Compare commits


2 Commits

Author SHA1 Message Date
00acf9d029 Flesh out CLAUDE.md with architecture, design notes, and dev reference
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 12:52:30 -05:00
a80f945701 Add CLAUDE.md with versioned changelog, update README
- Created CLAUDE.md with instructions to update changelog and README on
  every commit, version number (v1.2.0), and full changelog from v1.0.0
- Updated README: document-scope BLIGHT:: syntax, processing order,
  two-line failure format, complete_document() in AI provider section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 12:45:01 -05:00
2 changed files with 196 additions and 8 deletions

CLAUDE.md Normal file

@@ -0,0 +1,166 @@
# CLAUDE.md — BLIGHT: CUE
## Instructions for Claude
- **Every commit**: Update the Changelog section below with a summary of what changed and bump the version number if appropriate.
- **Every commit**: If the changes affect anything documented in README.md (trigger syntax, failure behavior, setup, project structure, AI provider info, etc.), update README.md in the same commit.
---
## Version
**v1.2.0**
---
## Project Overview
BLIGHT: CUE is a webhook listener that monitors Gitea repositories for markdown files containing `BLIGHT:` trigger lines. When a push arrives, it fetches the changed files, sends them to Google Gemini with the embedded instruction, replaces the trigger with the AI's response, and commits the result back — fully automated, no manual steps.
It is one module in a larger ecosystem called **BLIGHT**, where independent modules communicate through Gitea repositories as a shared data layer.
**Deployed on**: homelab at `/home/artanis/Documents/BLIGHT_CUE` as `blight-cue.service`
**Gitea instance**: `https://gitea.bunny-wyvern.ts.net`
**Default port**: `5010`
---
## Architecture
```
app.py            Flask webhook server — entry point, signature verification,
                  background thread dispatch
processor.py      Trigger scanning, inline vs document-scope processing,
                  retry logic, AI response sanitization
gitea_client.py   Gitea REST API wrapper — get_file / update_file
config.py         Loads .env into module-level constants
ai/
  base.py         AIProvider ABC — complete() and complete_document()
  gemini.py       Gemini 2.5 Flash-Lite implementation, two model instances
                  (one system prompt per trigger type)
```
### Request flow
1. Gitea POST → `/webhook`
2. HMAC-SHA256 signature verified against `WEBHOOK_SECRET`
3. Push payload parsed — owner, repo, branch, changed file paths extracted
4. Background thread spawned; 200 returned immediately to Gitea
5. For each `.md` file:
   - Fetch content + SHA via Gitea API (`?ref=<branch>`)
   - `processor.process_document()` — inline triggers first, doc-scope second
   - If changed: commit back via Gitea API (same branch, SHA required)
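Steps 2 and 3 can be sketched with the standard library alone. This is a minimal sketch, not the code in `app.py`: `verify_signature` and `parse_push` are hypothetical names, and the sketch assumes Gitea sends the hex HMAC-SHA256 of the raw body in the `X-Gitea-Signature` header and lists changed paths under each commit's `added` and `modified` keys. Check both against the Gitea version in use.

```python
import hashlib
import hmac
import json


def verify_signature(secret: str, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of body under secret."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected, signature_hex)


def parse_push(body: bytes) -> tuple[str, str, str, list[str]]:
    """Extract owner, repo, branch and de-duplicated changed .md paths."""
    payload = json.loads(body)
    owner, _, repo = payload["repository"]["full_name"].partition("/")
    branch = payload["ref"].removeprefix("refs/heads/")
    paths: list[str] = []
    for commit in payload.get("commits", []):
        for path in commit.get("added", []) + commit.get("modified", []):
            # The same file may appear in several commits of one push.
            if path.endswith(".md") and path not in paths:
                paths.append(path)
    return owner, repo, branch, paths
```

The signature is checked against the raw request bytes, before any JSON parsing, because re-serializing the payload would change the bytes that were signed.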
### Trigger types
| Syntax | Scope | AI output replaces |
|---|---|---|
| `BLIGHT: <instruction>` | Inline | The trigger line only |
| `BLIGHT:: <instruction>` | Document | The entire file content |
Both are case-insensitive. When a file contains multiple triggers, inline ones are processed first in document order, then doc-scope ones sequentially, each seeing the result of the previous one.
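The distinction between the two trigger types can be illustrated with a pair of regular expressions. These patterns are assumptions written for illustration, not copies of the ones in `processor.py`; in particular, requiring the trigger to start its line is a guess. The `(?!:)` lookahead is what stops `BLIGHT::` from also matching as an inline trigger.

```python
import re
from typing import Optional

# Hypothetical patterns; the real ones live in processor.py.
INLINE_PATTERN = re.compile(r"^[ \t]*BLIGHT:(?!:)[ \t]*(\S.*)$", re.IGNORECASE)
DOC_PATTERN = re.compile(r"^[ \t]*BLIGHT::[ \t]*(\S.*)$", re.IGNORECASE)


def classify(line: str) -> Optional[tuple[str, str]]:
    """Return (scope, instruction) for a trigger line, or None otherwise."""
    match = DOC_PATTERN.match(line)
    if match:
        return "document", match.group(1).strip()
    match = INLINE_PATTERN.match(line)
    if match:
        return "inline", match.group(1).strip()
    return None
```

A two-pass processor along these lines would first handle every line classified as `"inline"`, then re-scan the result for `"document"` lines.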
### Loop prevention
All AI responses are passed through `_sanitize()` in `processor.py` before being written to the document. This replaces `BLIGHT:` with `BLIGHT&#58;`, preventing the service's own commits from triggering another processing cycle.
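A sanitizer of this kind fits in a few lines. The sketch below is a hypothetical stand-in for `_sanitize()`, not its actual body; matching case-insensitively is an assumption, made because the triggers themselves are case-insensitive.

```python
import re

# Hypothetical stand-in for _sanitize() in processor.py.
_TRIGGER_PREFIX = re.compile(r"(blight):", re.IGNORECASE)


def sanitize(text: str) -> str:
    """Replace the colon after every BLIGHT with the HTML entity &#58;."""
    return _TRIGGER_PREFIX.sub(r"\1&#58;", text)
```

Because `&#58;` renders as a colon in HTML output, the text looks the same once rendered but no longer matches a trigger pattern on the next push.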
---
## Key Design Decisions
- **Stateless**: no database, no persistent state between webhooks. Everything is derived from the Gitea API on demand.
- **Background threads**: the webhook returns 200 immediately and processing happens asynchronously. Gitea does not retry a delivery that times out, so a slow AI call must never block the response.
- **SHA-based updates**: Gitea requires the current file SHA to update a file, which guards against race conditions when two pushes arrive close together.
- **Pluggable AI**: `AIProvider` ABC makes swapping backends a one-file change. Two methods required: `complete()` for inline, `complete_document()` for doc-scope.
- **Gemini model**: `gemini-2.5-flash-lite` — chosen for cost-effectiveness. Two separate model instances are used, each with a different system prompt tuned for inline vs whole-document output.
- **Self-update**: `_self_update()` in `app.py` runs `git pull --ff-only` on startup. Only executes under `__main__` (i.e. `python app.py`), not under a WSGI server.
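To make the SHA-based update concrete, here is a sketch of the request such a write might send. It assumes Gitea's `PUT /api/v1/repos/{owner}/{repo}/contents/{filepath}` endpoint with a base64-encoded `content` field; `build_update_request` is a hypothetical helper and not the actual `gitea_client.update_file`.

```python
import base64


def build_update_request(
    base_url: str,
    owner: str,
    repo: str,
    path: str,
    content: str,
    sha: str,
    branch: str,
    message: str,
) -> tuple[str, dict]:
    """Return (url, json_body) for a PUT to Gitea's contents endpoint."""
    url = f"{base_url}/api/v1/repos/{owner}/{repo}/contents/{path}"
    body = {
        "content": base64.b64encode(content.encode("utf-8")).decode("ascii"),
        "sha": sha,        # blob SHA returned by the earlier GET of this file
        "branch": branch,  # always explicit, so the write cannot drift to the default branch
        "message": message,
    }
    return url, body
```

The pair could then be sent with something like `requests.put(url, json=body, headers={"Authorization": f"token {token}"})`, where `token` holds the value of `GITEA_TOKEN`.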
---
## Environment Variables
Loaded from `.env` via `python-dotenv`. See `.env.example`.
| Variable | Required | Default | Notes |
|---|---|---|---|
| `GITEA_URL` | Yes | — | No trailing slash |
| `GITEA_TOKEN` | Yes | — | Needs repo read/write |
| `GEMINI_API_KEY` | Yes | — | From Google AI Studio |
| `WEBHOOK_SECRET` | Yes | — | Must match Gitea webhook config |
| `WEBHOOK_PORT` | No | `5010` | Port Flask binds to |
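The table translates into a small loader. This sketch differs from the real `config.py`, which is described as exposing module-level constants: `load_config` is a hypothetical function that returns a dict so it can be exercised without touching the process environment. In the service, `load_dotenv()` from `python-dotenv` would presumably run first to populate `os.environ`.

```python
import os
from collections.abc import Mapping

REQUIRED = ("GITEA_URL", "GITEA_TOKEN", "GEMINI_API_KEY", "WEBHOOK_SECRET")


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read the settings from the table; raise if a required one is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    config: dict = {name: env[name] for name in REQUIRED}
    # The table asks for no trailing slash; strip one defensively.
    config["GITEA_URL"] = config["GITEA_URL"].rstrip("/")
    config["WEBHOOK_PORT"] = int(env.get("WEBHOOK_PORT", "5010"))
    return config
```

Failing at startup on a missing variable is a design choice of the sketch; it surfaces a bad `.env` before the first webhook arrives.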
---
## Running Locally
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env # fill in your values
python app.py
```
---
## Dependencies
| Package | Purpose |
|---|---|
| `flask>=3.0.0` | Webhook HTTP server |
| `python-dotenv>=1.0.0` | `.env` loading |
| `requests>=2.31.0` | Gitea REST API calls |
| `google-generativeai>=0.8.0` | Gemini SDK |
---
## Things to Keep in Mind
- **No tests exist yet.** Be careful with changes to `processor.py`: the trigger regex and two-pass processing logic are easy to break subtly. When in doubt, trace a concrete example through the code before changing anything.
- **`_sanitize()` must stay on all AI return paths.** If new call sites are added to `_call_with_retry` or new retry functions are introduced, make sure sanitization is applied before returning.
- **`gitea_client.update_file` requires `branch`.** Always pass a valid branch string; otherwise the write silently goes to the default branch.
- **The `BLIGHT:` commit message prefix is intentional** — it makes BLIGHT: CUE's commits easy to identify in Gitea's history. Don't change it to something that wouldn't match `BLIGHT:` (the sanitizer would catch it but it's still confusing).
- **`INLINE_PATTERN` uses a negative lookahead `(?!:)`** to exclude `BLIGHT::` from matching as an inline trigger. If you touch the regex, verify this still holds.
- **Background threads are daemon threads** — they will be killed if the main process exits. Long-running AI calls on shutdown will be lost.
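The daemon-thread behavior mentioned in the last point can be seen in a short sketch. `dispatch` is a hypothetical helper, not a function from `app.py`; it shows the general pattern of handing work to a daemon thread and returning at once.

```python
import threading
from typing import Any, Callable


def dispatch(handler: Callable[..., Any], *args: Any) -> threading.Thread:
    """Run handler(*args) on a daemon thread and return immediately."""
    worker = threading.Thread(target=handler, args=args, daemon=True)
    worker.start()
    return worker
```

Because the thread is a daemon, the interpreter does not wait for it at exit. A caller that wants in-flight work to finish can keep the returned `Thread` and call `join(timeout=...)` during shutdown.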
---
## Adding a New AI Provider
1. Create `ai/<name>.py` and subclass `AIProvider` from `ai/base.py`.
2. Implement both `complete(document, instruction)` and `complete_document(document, instruction)`.
3. In `processor.py`, replace `GeminiProvider()` with your new class.
4. Export it from `ai/__init__.py`.
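As an illustration of steps 1 and 2, the sketch below defines a provider that calls no model at all. The `AIProvider` class here is a stand-in so the example runs on its own; whether the real ABC in `ai/base.py` marks its methods with `@abstractmethod` is an assumption. `EchoProvider` is a hypothetical name.

```python
from abc import ABC, abstractmethod


class AIProvider(ABC):
    """Stand-in for the ABC in ai/base.py so the sketch is self-contained."""

    @abstractmethod
    def complete(self, document: str, instruction: str) -> str:
        """Return text to insert in place of an inline trigger."""

    @abstractmethod
    def complete_document(self, document: str, instruction: str) -> str:
        """Return the full rewritten document for a doc-scope trigger."""


class EchoProvider(AIProvider):
    """Hypothetical provider that echoes its input instead of calling a model."""

    def complete(self, document: str, instruction: str) -> str:
        return f"[inline] {instruction}"

    def complete_document(self, document: str, instruction: str) -> str:
        return f"[document] {instruction}\n{document}"
```

A provider like this can be handy for dry runs of the webhook path without spending API quota.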
---
## Changelog
### v1.2.0
- Fixed loop re-processing: AI responses are sanitized by replacing `BLIGHT:` with `BLIGHT&#58;` before being written to the document, preventing the service's own commits from triggering another processing cycle.
- Branch awareness: branch is now extracted from `refs/heads/<branch>` in the push payload and passed to all Gitea read/write calls, so pushes to non-default branches are handled correctly.
- Commit messages now include the file path (e.g. `BLIGHT: process triggers in notes/todo.md`).
### v1.1.0
- Added `BLIGHT::` (double colon) document-scope trigger syntax. Unlike `BLIGHT:` which replaces only the trigger line, `BLIGHT::` replaces the entire file content with the AI's rewritten document.
- Multiple `BLIGHT::` triggers in one file are processed sequentially, each operating on the result of the previous.
- Inline `BLIGHT:` triggers are always processed before `BLIGHT::` triggers.
- Both trigger types are now case-insensitive (`blight:`, `BLIGHT:`, `Blight::`, etc. all match).
- Failure comments updated to two-line format:
```
<!-- BLIGHT_FAILED: <instruction> -->
<!-- BLIGHT_ERROR: <error message> -->
```
- Added `complete_document()` to `AIProvider` ABC and `GeminiProvider`, with a dedicated system prompt instructing the model to return the full rewritten document.
### v1.0.0 — Initial release
- Flask webhook server listening for Gitea push events.
- HMAC-SHA256 signature verification on all incoming webhooks.
- Scans changed `.md` files for `BLIGHT: <instruction>` trigger lines.
- Sends full document + instruction to Google Gemini 2.5 Flash-Lite.
- Replaces trigger line with AI response in-place and commits back to Gitea.
- Retry logic: 3 attempts with exponential backoff (1s, 2s, 4s).
- Processes webhooks in background threads to return 200 immediately.
- Deduplicates file paths across multiple commits in a single push.
- Self-updates on startup via `git pull --ff-only`.
- Pluggable `AIProvider` ABC for swapping AI backends.
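The retry behavior and the two-line failure format from the changelog can be combined in one sketch. `call_with_retry` is a hypothetical function and not the `_call_with_retry` in `processor.py`. It sleeps only between attempts, so with three attempts only the 1s and 2s delays are used; whether the service also waits 4s after the final failure is not stated, so that part is an assumption.

```python
import time
from typing import Callable


def call_with_retry(
    call: Callable[[], str],
    instruction: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call call() up to `attempts` times, doubling the delay after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:  # broad on purpose: any provider error is retried
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # Every attempt failed: return the two-line failure comment instead.
    return (
        f"<!-- BLIGHT_FAILED: {instruction} -->\n"
        f"<!-- BLIGHT_ERROR: {last_error} -->"
    )
```

Passing `sleep` as a parameter keeps the function testable without real waiting.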


@@ -2,7 +2,7 @@
 A module in the **BLIGHT** ecosystem.
-BLIGHT: CUE monitors Gitea repositories containing markdown files. When a push is received, it scans changed files for `BLIGHT:` trigger lines, sends the surrounding document to an AI model along with the instruction, and writes the result back to the file in-place — fully automated.
+BLIGHT: CUE monitors Gitea repositories containing markdown files. When a push is received, it scans changed files for `BLIGHT:` trigger lines, sends the document to an AI model along with the instruction, and writes the result back to the file in-place — fully automated.
 ---
@@ -10,16 +10,25 @@ BLIGHT: CUE monitors Gitea repositories containing markdown files. When a push i
 1. You write a `BLIGHT:` trigger anywhere in a markdown file and push it to Gitea.
 2. Gitea sends a webhook POST to this server.
-3. The server fetches the file, finds all `BLIGHT:` triggers, and processes them one by one.
-4. Each trigger is replaced with the AI's response at the exact position of the trigger line.
-5. The updated file is committed back to the repo automatically.
+3. The server fetches the file, finds all `BLIGHT:` triggers, and processes them.
+4. Each trigger is replaced with the AI's response, then the updated file is committed back automatically.
 ### Trigger Syntax
+Triggers are case-insensitive — `BLIGHT:`, `blight:`, `Blight::`, etc. all work.
+**Inline trigger** — replaces only the trigger line with the AI's response:
 ```
 BLIGHT: <your instruction here>
 ```
+**Document-scope trigger** — replaces the entire file with the AI's rewritten version:
+```
+BLIGHT:: <your instruction here>
+```
 Examples:
 ```markdown
@@ -31,22 +40,30 @@ BLIGHT: Explain the key differences between は and が based on the paragraph a
 ```
 ```markdown
-BLIGHT: Spell check this entire document and list any errors found.
+BLIGHT: Write a conclusion paragraph for this document.
 ```
 ```markdown
-BLIGHT: Write a conclusion paragraph for this document.
+BLIGHT:: Spellcheck and lightly reformat this entire document.
 ```
+### Processing Order
+When a file contains multiple triggers:
+1. All inline (`BLIGHT:`) triggers are processed first, in document order.
+2. All document-scope (`BLIGHT::`) triggers are processed next, in document order — each one operates on the result of the previous.
 ### Failure Behavior
 If the AI call fails after 3 attempts, the trigger is replaced with:
 ```html
 <!-- BLIGHT_FAILED: your original instruction -->
+<!-- BLIGHT_ERROR: <error message> -->
 ```
-You can re-trigger processing by editing the file to restore the original `BLIGHT:` line and pushing again.
+You can re-trigger processing by editing the file to restore the original trigger line and pushing again.
 ---
@@ -224,12 +241,17 @@ from abc import ABC, abstractmethod
 class AIProvider(ABC):
     def complete(self, document: str, instruction: str) -> str:
+        """Return text to insert in place of an inline BLIGHT: trigger."""
+        ...
+    def complete_document(self, document: str, instruction: str) -> str:
+        """Return the full rewritten document for a BLIGHT:: trigger."""
         ...
 ```
 To add a new provider (e.g. OpenRouter):
-1. Create `ai/openrouter.py` and implement `AIProvider`.
+1. Create `ai/openrouter.py` and implement both methods of `AIProvider`.
 2. In `processor.py`, replace `GeminiProvider()` with your new class.
 ---