# CLAUDE.md - BLIGHT: CUE

## Instructions for Claude

- **Every commit**: Update the Changelog section below with a summary of what changed and bump the version number if appropriate.
- **Every commit**: If the changes affect anything documented in README.md (trigger syntax, failure behavior, setup, project structure, AI provider info, etc.), update README.md in the same commit.

---

## Version

**v1.2.0**

---

## Project Overview

BLIGHT: CUE is a webhook listener that monitors Gitea repositories for markdown files containing `BLIGHT:` trigger lines. When a push arrives, it fetches the changed files, sends them to Google Gemini with the embedded instruction, replaces the trigger with the AI's response, and commits the result back. The whole cycle is automated, with no manual steps.

It is one module in a larger ecosystem called **BLIGHT**, where independent modules communicate through Gitea repositories as a shared data layer.

**Deployed on**: homelab at `/home/artanis/Documents/BLIGHT_CUE` as `blight-cue.service`

**Gitea instance**: `https://gitea.bunny-wyvern.ts.net`

**Default port**: `5010`

---

## Architecture

```
app.py            Flask webhook server: entry point, signature verification, background thread dispatch
processor.py      Trigger scanning, inline vs document-scope processing, retry logic, AI response sanitization
gitea_client.py   Gitea REST API wrapper: get_file / update_file
config.py         Loads .env into module-level constants
ai/
  base.py         AIProvider ABC: complete() and complete_document()
  gemini.py       Gemini 2.5 Flash-Lite implementation, two model instances (one system prompt per trigger type)
```

### Request flow

1. Gitea POST → `/webhook`
2. HMAC-SHA256 signature verified against `WEBHOOK_SECRET`
3. Push payload parsed: owner, repo, branch, and changed file paths extracted
4. Background thread spawned; 200 returned immediately to Gitea
5. For each `.md` file:
   - Fetch content + SHA via Gitea API (`?ref=`)
   - Run `processor.process_document()` (inline triggers first, doc-scope second)
   - If changed: commit back via Gitea API (same branch, SHA required)

### Trigger types

| Syntax | Scope | AI output replaces |
|---|---|---|
| `BLIGHT: ` | Inline | The trigger line only |
| `BLIGHT:: ` | Document | The entire file content |

Both are case-insensitive. When a file contains multiple triggers, inline ones are processed first in document order, then doc-scope ones sequentially (each sees the previous result).

### Loop prevention

All AI responses are passed through `_sanitize()` in `processor.py` before being written to the document. It rewrites any `BLIGHT:` in the response into a form that no longer matches a trigger, preventing the service's own commits from triggering another processing cycle.

---

## Key Design Decisions

- **Stateless**: no database, no persistent state between webhooks. Everything is derived from the Gitea API on demand.
- **Background threads**: webhook returns 200 immediately; processing happens async. Gitea will not retry on timeout.
- **SHA-based updates**: Gitea requires the current file SHA to update a file, which prevents race conditions if two pushes arrive close together.
- **Pluggable AI**: `AIProvider` ABC makes swapping backends a one-file change. Two methods required: `complete()` for inline, `complete_document()` for doc-scope.
- **Gemini model**: `gemini-2.5-flash-lite`, chosen for cost-effectiveness. Two separate model instances are used, each with a different system prompt tuned for inline vs whole-document output.
- **Self-update**: `_self_update()` in `app.py` runs `git pull --ff-only` on startup. Only executes under `__main__` (i.e. `python app.py`), not under a WSGI server.

---

## Environment Variables

Loaded from `.env` via `python-dotenv`. See `.env.example`.
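As a rough sketch of what "loads `.env` into module-level constants" could look like, the snippet below reads the variables listed in the table that follows from any mapping, defaulting to `os.environ`. The `Config` dataclass and `load_config` helper are hypothetical names for illustration, not taken from `config.py`. The sketch also assumes `python-dotenv` (or the shell) has already populated the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variables the table marks as required.
REQUIRED = ("GITEA_URL", "GITEA_TOKEN", "GEMINI_API_KEY", "WEBHOOK_SECRET")


@dataclass(frozen=True)
class Config:  # hypothetical container, not part of the project
    gitea_url: str
    gitea_token: str
    gemini_api_key: str
    webhook_secret: str
    webhook_port: int = 5010


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build a Config from env, failing fast when a required variable is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Config(
        # The table notes "No trailing slash", so strip one defensively.
        gitea_url=env["GITEA_URL"].rstrip("/"),
        gitea_token=env["GITEA_TOKEN"],
        gemini_api_key=env["GEMINI_API_KEY"],
        webhook_secret=env["WEBHOOK_SECRET"],
        webhook_port=int(env.get("WEBHOOK_PORT", "5010")),
    )
```

Failing at startup on a missing variable is a design choice of this sketch; it surfaces misconfiguration before the first webhook arrives rather than inside a background thread.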
| Variable | Required | Default | Notes |
|---|---|---|---|
| `GITEA_URL` | Yes | - | No trailing slash |
| `GITEA_TOKEN` | Yes | - | Needs repo read/write |
| `GEMINI_API_KEY` | Yes | - | From Google AI Studio |
| `WEBHOOK_SECRET` | Yes | - | Must match Gitea webhook config |
| `WEBHOOK_PORT` | No | `5010` | Port Flask binds to |

---

## Running Locally

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env   # fill in your values
python app.py
```

---

## Dependencies

| Package | Purpose |
|---|---|
| `flask>=3.0.0` | Webhook HTTP server |
| `python-dotenv>=1.0.0` | `.env` loading |
| `requests>=2.31.0` | Gitea REST API calls |
| `google-generativeai>=0.8.0` | Gemini SDK |

---

## Things to Keep in Mind

- **No tests exist yet.** Be careful with changes to `processor.py`: the trigger regex and two-pass processing logic are easy to break subtly. When in doubt, trace through a concrete example mentally before changing.
- **`_sanitize()` must stay on all AI return paths.** If new call sites are added to `_call_with_retry` or new retry functions are introduced, make sure sanitization is applied before returning.
- **`gitea_client.update_file` requires `branch`.** Never call it without a valid branch string, or the write will go to the default branch silently.
- **The `BLIGHT:` commit message prefix is intentional.** It makes BLIGHT: CUE's commits easy to identify in Gitea's history. Don't change it to something that wouldn't match `BLIGHT:` (the sanitizer would catch it but it's still confusing).
- **`INLINE_PATTERN` uses a negative lookahead `(?!:)`** to exclude `BLIGHT::` from matching as an inline trigger. If you touch the regex, verify this still holds.
- **Background threads are daemon threads.** They will be killed if the main process exits, so AI calls still running at shutdown will be lost.

---

## Adding a New AI Provider

1. Create `ai/.py` and subclass `AIProvider` from `ai/base.py`.
2. Implement both `complete(document, instruction)` and `complete_document(document, instruction)`.
3. In `processor.py`, replace `GeminiProvider()` with your new class.
4. Export it from `ai/__init__.py`.

---

## Changelog

### v1.2.0

- Fixed loop re-processing: AI responses are sanitized before being written to the document so that any `BLIGHT:` they contain no longer matches a trigger, preventing the service's own commits from triggering another processing cycle.
- Branch awareness: branch is now extracted from `refs/heads/` in the push payload and passed to all Gitea read/write calls, so pushes to non-default branches are handled correctly.
- Commit messages now include the file path (e.g. `BLIGHT: process triggers in notes/todo.md`).

### v1.1.0

- Added `BLIGHT::` (double colon) document-scope trigger syntax. Unlike `BLIGHT:`, which replaces only the trigger line, `BLIGHT::` replaces the entire file content with the AI's rewritten document.
- Multiple `BLIGHT::` triggers in one file are processed sequentially, each operating on the result of the previous.
- Inline `BLIGHT:` triggers are always processed before `BLIGHT::` triggers.
- Both trigger types are now case-insensitive (`blight:`, `BLIGHT:`, `Blight::`, etc. all match).
- Failure comments updated to two-line format:
  ```
  ```
- Added `complete_document()` to `AIProvider` ABC and `GeminiProvider`, with a dedicated system prompt instructing the model to return the full rewritten document.

### v1.0.0 - Initial release

- Flask webhook server listening for Gitea push events.
- HMAC-SHA256 signature verification on all incoming webhooks.
- Scans changed `.md` files for `BLIGHT: ` trigger lines.
- Sends full document + instruction to Google Gemini 2.5 Flash-Lite.
- Replaces trigger line with AI response in-place and commits back to Gitea.
- Retry logic: 3 attempts with exponential backoff (1s, 2s, 4s).
- Processes webhooks in background threads to return 200 immediately.
- Deduplicates file paths across multiple commits in a single push.
- Self-updates on startup via `git pull --ff-only`.
- Pluggable `AIProvider` ABC for swapping AI backends.
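
Two of the v1.0.0 items above, signature verification and path deduplication, can be sketched roughly as follows. This is an illustration under assumptions, not the code in `app.py`. The function names are hypothetical, the signature is assumed to arrive as a hex digest of the raw request body, and the payload fields `commits`, `added`, and `modified` are assumed from the usual shape of a Gitea push event.

```python
import hashlib
import hmac
from typing import Dict, List


def verify_signature(body: bytes, secret: str, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of body under secret."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak how much matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


def changed_markdown_paths(payload: dict) -> List[str]:
    """Collect unique .md paths across all commits in a push, in first-seen order."""
    seen: Dict[str, None] = {}  # dict preserves insertion order
    for commit in payload.get("commits", []):
        for key in ("added", "modified"):  # assumed field names
            for path in commit.get(key) or []:
                if path.endswith(".md"):
                    seen.setdefault(path, None)
    return list(seen)
```

Deduplicating before any fetch means a file touched by several commits in one push is processed once, against its latest content on the branch.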