Flesh out CLAUDE.md with architecture, design notes, and dev reference
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
121
CLAUDE.md
@@ -13,6 +13,127 @@
---

## Project Overview

BLIGHT: CUE is a webhook listener that monitors Gitea repositories for Markdown files containing `BLIGHT:` trigger lines. When a push arrives, it fetches the changed files, sends them to Google Gemini with the embedded instruction, replaces the trigger with the AI's response, and commits the result back, with no manual steps.

It is one module in a larger ecosystem called **BLIGHT**, where independent modules communicate through Gitea repositories as a shared data layer.

- **Deployed on**: homelab at `/home/artanis/Documents/BLIGHT_CUE` as `blight-cue.service`
- **Gitea instance**: `https://gitea.bunny-wyvern.ts.net`
- **Default port**: `5010`

---

## Architecture

```
app.py              Flask webhook server: entry point, signature verification,
                    background thread dispatch
processor.py        Trigger scanning, inline vs document-scope processing,
                    retry logic, AI response sanitization
gitea_client.py     Gitea REST API wrapper: get_file / update_file
config.py           Loads .env into module-level constants
ai/
  base.py           AIProvider ABC: complete() and complete_document()
  gemini.py         Gemini 2.5 Flash-Lite implementation, two model instances
                    (one system prompt per trigger type)
```

### Request flow

1. Gitea POSTs to `/webhook`
2. HMAC-SHA256 signature verified against `WEBHOOK_SECRET`
3. Push payload parsed: owner, repo, branch, changed file paths extracted
4. Background thread spawned; 200 returned immediately to Gitea
5. For each `.md` file:
   - Fetch content + SHA via Gitea API (`?ref=<branch>`)
   - `processor.process_document()`: inline triggers first, doc-scope second
   - If changed: commit back via Gitea API (same branch, SHA required)

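Steps 2 and 3 above can be sketched roughly as follows. This is not the code in `app.py`; the function names are hypothetical, and the payload fields (`ref`, `commits[].added`, `commits[].modified`) and the hex-encoded signature (which Gitea sends in the `X-Gitea-Signature` header) are assumptions based on Gitea's push webhook format.

```python
import hashlib
import hmac
import json


def verify_signature(secret: str, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 hex digest of body."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


def branch_from_ref(ref: str) -> str:
    """Turn 'refs/heads/main' into 'main'; other refs are returned unchanged."""
    prefix = "refs/heads/"
    return ref[len(prefix):] if ref.startswith(prefix) else ref


def changed_markdown_paths(payload: dict) -> list[str]:
    """Collect added or modified .md paths from a push payload, in first-seen order."""
    seen: dict[str, None] = {}
    for commit in payload.get("commits", []):
        for key in ("added", "modified"):
            for path in commit.get(key) or []:
                if path.lower().endswith(".md"):
                    seen.setdefault(path, None)
    return list(seen)


# Simulate the body and signature Gitea would send for a push (made-up values).
secret = "example-secret"
payload = {
    "ref": "refs/heads/main",
    "repository": {"name": "notes", "owner": {"login": "example-owner"}},
    "commits": [
        {"added": ["docs/a.md"], "modified": ["README.md", "app.py"]},
        {"added": [], "modified": ["docs/a.md"]},
    ],
}
body = json.dumps(payload).encode("utf-8")
signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()

signature_ok = verify_signature(secret, body, signature)
tampered_ok = verify_signature(secret, body + b" ", signature)
branch = branch_from_ref(payload["ref"])
paths = changed_markdown_paths(payload)
```

The signature must be computed over the raw request bytes, not over re-serialized JSON, otherwise whitespace differences would break verification.
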
### Trigger types

| Syntax | Scope | AI output replaces |
|---|---|---|
| `BLIGHT: <instruction>` | Inline | The trigger line only |
| `BLIGHT:: <instruction>` | Document | The entire file content |

Both are case-insensitive. When a file contains multiple triggers, inline triggers are processed first in document order, then doc-scope triggers sequentially (each sees the result of the previous one).

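A minimal sketch of how the two trigger types might be told apart. The patterns below are assumptions, not the ones in `processor.py`: in particular, `DOC_PATTERN`, `scan_triggers`, and the requirement that a trigger starts its line are hypothetical.

```python
import re

# Hypothetical patterns. (?!:) keeps "BLIGHT::" from matching as an inline trigger.
INLINE_PATTERN = re.compile(r"^\s*BLIGHT:(?!:)\s*(?P<instruction>.+)$", re.IGNORECASE)
DOC_PATTERN = re.compile(r"^\s*BLIGHT::\s*(?P<instruction>.+)$", re.IGNORECASE)


def scan_triggers(text: str) -> tuple[list[tuple[int, str]], list[tuple[int, str]]]:
    """Return (inline, doc_scope) lists of (line_index, instruction), in document order."""
    inline, doc_scope = [], []
    for index, line in enumerate(text.splitlines()):
        doc_match = DOC_PATTERN.match(line)
        if doc_match:
            doc_scope.append((index, doc_match.group("instruction").strip()))
            continue
        inline_match = INLINE_PATTERN.match(line)
        if inline_match:
            inline.append((index, inline_match.group("instruction").strip()))
    return inline, doc_scope


sample = "\n".join([
    "# Notes",
    "blight:: translate this file to French",
    "Some text.",
    "BLIGHT: summarize the paragraph above",
    "Blight: add a closing line",
])
inline, doc_scope = scan_triggers(sample)
```

A processor following the order described above would work through `inline` first, then apply each entry of `doc_scope` to the result of the previous step.
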
### Loop prevention

All AI responses are passed through `_sanitize()` in `processor.py` before being written to the document. `_sanitize()` rewrites any `BLIGHT:` in the response into a look-alike form that the trigger patterns do not match, so the service's own commits cannot trigger another processing cycle.

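One way such a sanitizer could work is sketched below. The replacement used here (a zero-width space between the keyword and the colon) is purely an assumption for illustration; the actual substitution performed by `_sanitize()` may differ.

```python
import re

ZERO_WIDTH_SPACE = "\u200b"

# Case-insensitive match for the keyword immediately followed by a colon.
_TRIGGER_WORD = re.compile(r"(BLIGHT)(:)", re.IGNORECASE)


def sanitize(text: str) -> str:
    """Break every 'BLIGHT:' so it no longer matches a trigger pattern."""
    return _TRIGGER_WORD.sub(
        lambda match: match.group(1) + ZERO_WIDTH_SPACE + match.group(2), text
    )


ai_response = "Done. Next, add `BLIGHT: expand this` or blight:: rewrite."
cleaned = sanitize(ai_response)

# After sanitizing, a case-insensitive search for the trigger prefix finds nothing.
still_triggers = re.search(r"BLIGHT:", cleaned, re.IGNORECASE) is not None
```

Because the keyword and the colon are no longer adjacent, running the sanitizer twice leaves the text unchanged.
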
---

## Key Design Decisions

- **Stateless**: no database, no persistent state between webhooks. Everything is derived from the Gitea API on demand.
- **Background threads**: the webhook returns 200 immediately and processing happens asynchronously. Gitea will not retry on timeout.
- **SHA-based updates**: Gitea requires the current file SHA to update a file, which guards against lost updates if two pushes arrive close together.
- **Pluggable AI**: the `AIProvider` ABC makes swapping backends a one-file change. Two methods are required: `complete()` for inline triggers, `complete_document()` for doc-scope triggers.
- **Gemini model**: `gemini-2.5-flash-lite`, chosen for cost-effectiveness. Two separate model instances are used, each with a different system prompt tuned for inline vs whole-document output.
- **Self-update**: `_self_update()` in `app.py` runs `git pull --ff-only` on startup. It only executes under `__main__` (i.e. `python app.py`), not under a WSGI server.

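To illustrate the SHA-based update, here is a rough sketch of the request an update call might build. It is not the code in `gitea_client.py`: `build_update_request` is a hypothetical helper, and the endpoint and body fields are assumptions based on Gitea's repository contents API. The request is only constructed here, not sent.

```python
import base64
from urllib.parse import quote


def build_update_request(base_url, owner, repo, path, content, sha, branch, message):
    """Return (url, json_body) for updating one file through the contents API."""
    url = f"{base_url}/api/v1/repos/{owner}/{repo}/contents/{quote(path)}"
    body = {
        # The API expects the new file content base64-encoded.
        "content": base64.b64encode(content.encode("utf-8")).decode("ascii"),
        # SHA of the version that was fetched; a stale value should make the update fail.
        "sha": sha,
        # Always explicit, so the write never falls back to the default branch.
        "branch": branch,
        "message": message,
    }
    return url, body


# Made-up values for illustration only.
url, body = build_update_request(
    "https://gitea.example.com",
    "example-owner",
    "notes",
    "docs/a.md",
    "# Title\n",
    "abc123",
    "main",
    "BLIGHT: process docs/a.md",
)
```

A client would then send this with something like `requests.put(url, json=body, headers={"Authorization": f"token {token}"})`.
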
---

## Environment Variables

Loaded from `.env` via `python-dotenv`. See `.env.example`.

| Variable | Required | Default | Notes |
|---|---|---|---|
| `GITEA_URL` | Yes | none | No trailing slash |
| `GITEA_TOKEN` | Yes | none | Needs repo read/write |
| `GEMINI_API_KEY` | Yes | none | From Google AI Studio |
| `WEBHOOK_SECRET` | Yes | none | Must match Gitea webhook config |
| `WEBHOOK_PORT` | No | `5010` | Port Flask binds to |

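The table above could be enforced at startup along these lines. This is a sketch, not the contents of `config.py`: `load_config` is a hypothetical helper, and it reads from a plain mapping so it runs without `python-dotenv` (the real module would presumably call `load_dotenv()` first).

```python
import os

REQUIRED = ("GITEA_URL", "GITEA_TOKEN", "GEMINI_API_KEY", "WEBHOOK_SECRET")


def load_config(env=None):
    """Validate required variables and apply the documented default port."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED}
    # The table asks for no trailing slash, so normalize defensively.
    config["GITEA_URL"] = config["GITEA_URL"].rstrip("/")
    config["WEBHOOK_PORT"] = int(env.get("WEBHOOK_PORT", "5010"))
    return config


# Made-up values for illustration only.
example_env = {
    "GITEA_URL": "https://gitea.example.com/",
    "GITEA_TOKEN": "token-value",
    "GEMINI_API_KEY": "key-value",
    "WEBHOOK_SECRET": "secret-value",
}
config = load_config(example_env)
```

Failing fast on a missing variable keeps a misconfigured service from accepting webhooks it cannot verify or act on.
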
---

## Running Locally

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env   # fill in your values
python app.py
```

---

## Dependencies

| Package | Purpose |
|---|---|
| `flask>=3.0.0` | Webhook HTTP server |
| `python-dotenv>=1.0.0` | `.env` loading |
| `requests>=2.31.0` | Gitea REST API calls |
| `google-generativeai>=0.8.0` | Gemini SDK |

---

## Things to Keep in Mind

- **No tests exist yet.** Be careful with changes to `processor.py`: the trigger regex and two-pass processing logic are easy to break subtly. When in doubt, trace through a concrete example before changing anything.
- **`_sanitize()` must stay on all AI return paths.** If new call sites of `_call_with_retry` are added or new retry functions are introduced, make sure sanitization is applied before returning.
- **`gitea_client.update_file` requires `branch`.** Never call it without a valid branch string, or the write will silently go to the default branch.
- **The `BLIGHT:` commit message prefix is intentional.** It makes BLIGHT: CUE's commits easy to identify in Gitea's history. Don't change it to something that wouldn't match `BLIGHT:` (the sanitizer would catch it, but it's still confusing).
- **`INLINE_PATTERN` uses a negative lookahead `(?!:)`** to exclude `BLIGHT::` from matching as an inline trigger. If you touch the regex, verify this still holds.
- **Background threads are daemon threads.** They are killed if the main process exits, so AI calls still in flight at shutdown are lost.

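One way to keep sanitization on every return path is to give the retry helper a single exit point. The sketch below is not the actual `_call_with_retry`; its signature, the retry count, and the stand-in sanitizer are all assumptions.

```python
import time


def call_with_retry(call, sanitize, attempts=3, delay_seconds=0.0):
    """Run call() up to `attempts` times; sanitize the result on the single return path."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            result = call()
        except Exception as error:  # a real implementation would catch narrower errors
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)
            continue
        # Only one place returns AI text, so sanitization cannot be skipped.
        return sanitize(result)
    raise RuntimeError(f"AI call failed after {attempts} attempts") from last_error


calls = {"count": 0}


def flaky_call():
    """Fail twice, then succeed, to simulate a transient provider error."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated transient failure")
    return "Try BLIGHT: again"


# Stand-in sanitizer for the demo; the real one is _sanitize() in processor.py.
result = call_with_retry(flaky_call, lambda text: text.replace("BLIGHT:", "BLIGHT :"))
```

Since there are no tests yet, a small harness like `flaky_call` is also a cheap way to check retry behavior without calling the real API.
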
---

## Adding a New AI Provider

1. Create `ai/<name>.py` and subclass `AIProvider` from `ai/base.py`.
2. Implement both `complete(document, instruction)` and `complete_document(document, instruction)`.
3. In `processor.py`, replace `GeminiProvider()` with your new class.
4. Export it from `ai/__init__.py`.

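As a rough illustration of steps 1 and 2, the sketch below redefines an assumed version of the base class so it runs on its own. The real `AIProvider` lives in `ai/base.py` and may differ; the `str` return types and the `EchoProvider` class are hypothetical.

```python
from abc import ABC, abstractmethod


class AIProvider(ABC):
    """Assumed shape of the base class; the real one lives in ai/base.py."""

    @abstractmethod
    def complete(self, document: str, instruction: str) -> str:
        """Return replacement text for one inline trigger line."""

    @abstractmethod
    def complete_document(self, document: str, instruction: str) -> str:
        """Return the full replacement content for a doc-scope trigger."""


class EchoProvider(AIProvider):
    """Hypothetical offline provider that makes no network calls."""

    def complete(self, document: str, instruction: str) -> str:
        return f"[inline] {instruction}"

    def complete_document(self, document: str, instruction: str) -> str:
        return f"{document.rstrip()}\n\n[document] {instruction}\n"


provider = EchoProvider()
inline_result = provider.complete("# Notes\n", "summarize")
doc_result = provider.complete_document("# Notes\n", "add a footer")
```

An offline provider like this can also stand in for Gemini when exercising `processor.py` by hand.
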
---

## Changelog

### v1.2.0