Compare commits

5 Commits

Author SHA1 Message Date
00acf9d029 Flesh out CLAUDE.md with architecture, design notes, and dev reference
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 12:52:30 -05:00
a80f945701 Add CLAUDE.md with versioned changelog, update README
- Created CLAUDE.md with instructions to update changelog and README on
  every commit, version number (v1.2.0), and full changelog from v1.0.0
- Updated README: document-scope BLIGHT:: syntax, processing order,
  two-line failure format, complete_document() in AI provider section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 12:45:01 -05:00
6d9dc5982b Updated .gitignore
2026-03-16 12:40:29 -05:00
bc2299824e Fix loop re-processing, branch awareness, and commit message clarity
- Sanitize AI responses by replacing BLIGHT: with BLIGHT&#58; to prevent
  the service's own commits from triggering another processing cycle
- Pass branch (extracted from refs/heads/<branch>) through to Gitea get/update
  calls so pushes to non-default branches are read and written correctly
- Commit message now includes the file path: "BLIGHT: process triggers in <path>"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 12:39:05 -05:00
cf71fb4464 Add BLIGHT:: document-scope triggers and case-insensitive matching
- Split TRIGGER_PATTERN into INLINE_PATTERN (BLIGHT:) and DOCUMENT_PATTERN
  (BLIGHT::), both case-insensitive
- Inline triggers replace only the trigger line (existing behaviour)
- Document-scope triggers replace the entire file; multiple BLIGHT:: triggers
  in one file are processed sequentially, each seeing the previous result
- Updated FAILED_TEMPLATE to two-line format with BLIGHT_FAILED and BLIGHT_ERROR
- Added complete_document() to AIProvider ABC and GeminiProvider with a
  dedicated system prompt instructing the model to return the full document

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 12:36:32 -05:00
8 changed files with 316 additions and 40 deletions

1
.gitignore vendored

@@ -2,3 +2,4 @@
 .venv/
 __pycache__/
 *.pyc
+.claude/

166
CLAUDE.md Normal file

@@ -0,0 +1,166 @@
# CLAUDE.md — BLIGHT: CUE
## Instructions for Claude
- **Every commit**: Update the Changelog section below with a summary of what changed and bump the version number if appropriate.
- **Every commit**: If the changes affect anything documented in README.md (trigger syntax, failure behavior, setup, project structure, AI provider info, etc.), update README.md in the same commit.
---
## Version
**v1.2.0**
---
## Project Overview
BLIGHT: CUE is a webhook listener that monitors Gitea repositories for markdown files containing `BLIGHT:` trigger lines. When a push arrives, it fetches the changed files, sends them to Google Gemini with the embedded instruction, replaces the trigger with the AI's response, and commits the result back — fully automated, no manual steps.
It is one module in a larger ecosystem called **BLIGHT**, where independent modules communicate through Gitea repositories as a shared data layer.
**Deployed on**: homelab at `/home/artanis/Documents/BLIGHT_CUE` as `blight-cue.service`
**Gitea instance**: `https://gitea.bunny-wyvern.ts.net`
**Default port**: `5010`
---
## Architecture
```
app.py Flask webhook server — entry point, signature verification,
background thread dispatch
processor.py Trigger scanning, inline vs document-scope processing,
retry logic, AI response sanitization
gitea_client.py Gitea REST API wrapper — get_file / update_file
config.py Loads .env into module-level constants
ai/
base.py AIProvider ABC — complete() and complete_document()
gemini.py Gemini 2.5 Flash-Lite implementation, two model instances
(one system prompt per trigger type)
```
### Request flow
1. Gitea POST → `/webhook`
2. HMAC-SHA256 signature verified against `WEBHOOK_SECRET`
3. Push payload parsed — owner, repo, branch, changed file paths extracted
4. Background thread spawned; 200 returned immediately to Gitea
5. For each `.md` file:
- Fetch content + SHA via Gitea API (`?ref=<branch>`)
- `processor.process_document()` — inline triggers first, doc-scope second
- If changed: commit back via Gitea API (same branch, SHA required)
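Step 2 of the flow above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function name and the explicit `secret` parameter are assumptions made so the example is self-contained, and it assumes the signature header carries a hex-encoded HMAC-SHA256 of the raw request body.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(payload: bytes, signature_header: Optional[str], secret: str) -> bool:
    """Return True if the header matches the hex HMAC-SHA256 of the raw body."""
    if not signature_header:
        # A missing or empty header is never valid.
        return False
    expected = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    # compare_digest does a constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(expected, signature_header.strip())
```

The check has to run on the raw request bytes, before any JSON parsing, because re-serialized JSON is not guaranteed to be byte-identical to what was signed.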
### Trigger types
| Syntax | Scope | AI output replaces |
|---|---|---|
| `BLIGHT: <instruction>` | Inline | The trigger line only |
| `BLIGHT:: <instruction>` | Document | The entire file content |
Both are case-insensitive. When a file contains multiple triggers, the inline ones are processed first in document order, then the doc-scope ones sequentially, each operating on the previous result.
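The two-pass order can be illustrated with a simplified sketch. The two patterns are the ones used in `processor.py`; the `process` function, the `Completer` alias, and the lambda completers are hypothetical stand-ins for the AI provider, and the sketch leaves out retries, failure comments, and sanitization.

```python
import re
from typing import Callable, Tuple

INLINE_PATTERN = re.compile(r"^BLIGHT:(?!:)\s+(.+)$", re.MULTILINE | re.IGNORECASE)
DOCUMENT_PATTERN = re.compile(r"^BLIGHT::\s+(.+)$", re.MULTILINE | re.IGNORECASE)

Completer = Callable[[str, str], str]  # (document, instruction) -> text


def process(content: str, inline_fn: Completer, document_fn: Completer) -> Tuple[str, bool]:
    changed = False

    # Pass 1: each inline trigger line is replaced by the completer's output.
    # The loop is bounded by the initial trigger count and re-searches each time,
    # because offsets shift after every replacement.
    for _ in range(len(INLINE_PATTERN.findall(content))):
        match = INLINE_PATTERN.search(content)
        if not match:
            break
        replacement = inline_fn(content, match.group(1).strip())
        content = content[:match.start()] + replacement + content[match.end():]
        changed = True

    # Pass 2: each document-scope trigger rewrites the whole file,
    # operating on the result of the previous one.
    for _ in range(len(DOCUMENT_PATTERN.findall(content))):
        match = DOCUMENT_PATTERN.search(content)
        if not match:
            break
        start, end = match.start(), match.end()
        if end < len(content) and content[end] == "\n":
            end += 1  # drop the trigger line's trailing newline as well
        content = document_fn(content[:start] + content[end:], match.group(1).strip())
        changed = True

    return content, changed


sample = "title\nBLIGHT: add intro\nblight:: shout\nbody\n"
result, changed = process(
    sample,
    inline_fn=lambda doc, instruction: f"<{instruction}>",
    document_fn=lambda doc, instruction: doc.upper(),
)
print(result)
```

With these stand-in completers, the inline trigger is expanded first, so the document-scope pass sees (and upper-cases) the already expanded text.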
### Loop prevention
All AI responses are passed through `_sanitize()` in `processor.py` before being written to the document. This replaces `BLIGHT:` with `BLIGHT&#58;`, preventing the service's own commits from triggering another processing cycle.
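A small sketch of the idea, reusing the patterns from `processor.py`. The public name `sanitize` and the sample text are assumptions for illustration only.

```python
import re

SANITIZE_PATTERN = re.compile(r"BLIGHT:", re.IGNORECASE)
INLINE_PATTERN = re.compile(r"^BLIGHT:(?!:)\s+(.+)$", re.MULTILINE | re.IGNORECASE)
DOCUMENT_PATTERN = re.compile(r"^BLIGHT::\s+(.+)$", re.MULTILINE | re.IGNORECASE)


def sanitize(text: str) -> str:
    """Replace every 'BLIGHT:' (any case) with an HTML-entity form that no longer matches."""
    return SANITIZE_PATTERN.sub("BLIGHT&#58;", text)


ai_output = "Try this:\nblight: summarize\nBLIGHT:: rewrite everything"
safe = sanitize(ai_output)
print(safe)
```

Because the first colon is the one replaced, a double-colon trigger becomes `BLIGHT&#58;:`, which neither pattern matches, so a single substitution appears to cover both trigger types.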
---
## Key Design Decisions
- **Stateless**: no database, no persistent state between webhooks. Everything is derived from the Gitea API on demand.
- **Background threads**: webhook returns 200 immediately; processing happens async. Gitea will not retry on timeout.
- **SHA-based updates**: Gitea requires the current file SHA on every update, so a write based on stale content is rejected instead of silently overwriting a newer push when two pushes arrive close together.
- **Pluggable AI**: `AIProvider` ABC makes swapping backends a one-file change. Two methods required: `complete()` for inline, `complete_document()` for doc-scope.
- **Gemini model**: `gemini-2.5-flash-lite` — chosen for cost-effectiveness. Two separate model instances are used, each with a different system prompt tuned for inline vs whole-document output.
- **Self-update**: `_self_update()` in `app.py` runs `git pull --ff-only` on startup. Only executes under `__main__` (i.e. `python app.py`), not under a WSGI server.
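The SHA-based update and branch handling can be sketched as two small pure functions. The payload keys mirror what `gitea_client.update_file` sends; the helper names `branch_from_ref` and `build_update_payload` are hypothetical and exist only to keep the example self-contained and free of network calls.

```python
import base64
from typing import Optional


def branch_from_ref(ref: str) -> str:
    """Turn 'refs/heads/feature/x' into 'feature/x'; other refs are returned unchanged."""
    prefix = "refs/heads/"
    return ref[len(prefix):] if ref.startswith(prefix) else ref


def build_update_payload(
    path: str,
    content: str,
    sha: str,
    branch: str,
    commit_message: Optional[str] = None,
) -> dict:
    """Build the JSON body for a file update: base64 content, current SHA, target branch."""
    return {
        "message": commit_message or f"BLIGHT: process triggers in {path}",
        "content": base64.b64encode(content.encode("utf-8")).decode("ascii"),
        "sha": sha,  # SHA returned by the preceding read of the same file
        "branch": branch,
    }
```

Note that a ref such as `refs/tags/v1` passes through `branch_from_ref` unchanged, so a caller may want to skip processing when the ref is not a branch.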
---
## Environment Variables
Loaded from `.env` via `python-dotenv`. See `.env.example`.
| Variable | Required | Default | Notes |
|---|---|---|---|
| `GITEA_URL` | Yes | — | No trailing slash |
| `GITEA_TOKEN` | Yes | — | Needs repo read/write |
| `GEMINI_API_KEY` | Yes | — | From Google AI Studio |
| `WEBHOOK_SECRET` | Yes | — | Must match Gitea webhook config |
| `WEBHOOK_PORT` | No | `5010` | Port Flask binds to |
---
## Running Locally
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env # fill in your values
python app.py
```
---
## Dependencies
| Package | Purpose |
|---|---|
| `flask>=3.0.0` | Webhook HTTP server |
| `python-dotenv>=1.0.0` | `.env` loading |
| `requests>=2.31.0` | Gitea REST API calls |
| `google-generativeai>=0.8.0` | Gemini SDK |
---
## Things to Keep in Mind
- **No tests exist yet.** Be careful with changes to `processor.py` — the trigger regex and two-pass processing logic are easy to break subtly. When in doubt, trace through a concrete example mentally before changing.
- **`_sanitize()` must stay on all AI return paths.** If new call sites are added to `_call_with_retry` or new retry functions are introduced, make sure sanitization is applied before returning.
- **`gitea_client.update_file` requires `branch`** — never call it without a valid branch string or the write will go to the default branch silently.
- **The `BLIGHT:` commit message prefix is intentional**: it makes BLIGHT: CUE's commits easy to identify in Gitea's history. Keep the prefix as `BLIGHT:`. It cannot re-trigger processing, because only the contents of `.md` files are scanned, not commit messages.
- **`INLINE_PATTERN` uses a negative lookahead `(?!:)`** to exclude `BLIGHT::` from matching as an inline trigger. If you touch the regex, verify this still holds.
- **Background threads are daemon threads** — they will be killed if the main process exits. Long-running AI calls on shutdown will be lost.
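The point about keeping sanitization on every success path can be illustrated with a reduced retry helper. This is a sketch under assumptions, not the real `_call_with_retry`: the `call` and `sleep` parameters are hypothetical injection points added so the example runs without an AI provider or real delays.

```python
import re
import time
from typing import Callable, Optional, Sequence

SANITIZE_PATTERN = re.compile(r"BLIGHT:", re.IGNORECASE)
FAILED_TEMPLATE = "<!-- BLIGHT_FAILED: {instruction} -->\n<!-- BLIGHT_ERROR: {error} -->"


def sanitize(text: str) -> str:
    return SANITIZE_PATTERN.sub("BLIGHT&#58;", text)


def call_with_retry(
    call: Callable[[str], str],
    instruction: str,
    *,
    max_retries: int = 3,
    delays: Sequence[float] = (1, 2, 4),
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try `call` up to max_retries times; sanitize on success, failure comment otherwise."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries):
        try:
            # Every successful return goes through sanitize().
            return sanitize(call(instruction))
        except Exception as exc:
            last_error = exc
            if attempt < max_retries - 1:
                sleep(delays[attempt])  # wait only between attempts, not after the last
    return FAILED_TEMPLATE.format(instruction=instruction, error=last_error)
```

Wrapping the provider call in `sanitize()` at the single return site means any new provider method routed through this helper is covered without extra work.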
---
## Adding a New AI Provider
1. Create `ai/<name>.py` and subclass `AIProvider` from `ai/base.py`.
2. Implement both `complete(document, instruction)` and `complete_document(document, instruction)`.
3. In `processor.py`, replace `GeminiProvider()` with your new class.
4. Export it from `ai/__init__.py`.
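As a rough illustration of steps 1 and 2, here is a self-contained sketch. The `AIProvider` class is re-declared so the example runs on its own; `EchoProvider` and `HalfProvider` are hypothetical classes that do not exist in the repository.

```python
from abc import ABC, abstractmethod


class AIProvider(ABC):
    """Stand-in for ai/base.py so this example runs on its own."""

    @abstractmethod
    def complete(self, document: str, instruction: str) -> str:
        """Return text to insert in place of an inline trigger."""

    @abstractmethod
    def complete_document(self, document: str, instruction: str) -> str:
        """Return the full rewritten document."""


class EchoProvider(AIProvider):
    """Hypothetical offline provider, e.g. for dry runs without an API key."""

    def complete(self, document: str, instruction: str) -> str:
        return f"(inline) {instruction}"

    def complete_document(self, document: str, instruction: str) -> str:
        return document  # returns the document unchanged


class HalfProvider(AIProvider):
    """Implements only one of the two required methods."""

    def complete(self, document: str, instruction: str) -> str:
        return ""


provider = EchoProvider()
print(provider.complete("# Notes", "summarize"))

try:
    HalfProvider()
except TypeError as exc:
    # The ABC refuses to instantiate a subclass that misses an abstract method.
    print("rejected:", exc)
```

Because both methods are abstract, forgetting `complete_document()` fails at instantiation time rather than when the first `BLIGHT::` trigger arrives.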
---
## Changelog
### v1.2.0
- Fixed loop re-processing: AI responses are sanitized by replacing `BLIGHT:` with `BLIGHT&#58;` before being written to the document, preventing the service's own commits from triggering another processing cycle.
- Branch awareness: branch is now extracted from `refs/heads/<branch>` in the push payload and passed to all Gitea read/write calls, so pushes to non-default branches are handled correctly.
- Commit messages now include the file path (e.g. `BLIGHT: process triggers in notes/todo.md`).
### v1.1.0
- Added `BLIGHT::` (double colon) document-scope trigger syntax. Unlike `BLIGHT:` which replaces only the trigger line, `BLIGHT::` replaces the entire file content with the AI's rewritten document.
- Multiple `BLIGHT::` triggers in one file are processed sequentially, each operating on the result of the previous.
- Inline `BLIGHT:` triggers are always processed before `BLIGHT::` triggers.
- Both trigger types are now case-insensitive (`blight:`, `BLIGHT:`, `Blight::`, etc. all match).
- Failure comments updated to two-line format:
```
<!-- BLIGHT_FAILED: <instruction> -->
<!-- BLIGHT_ERROR: <error message> -->
```
- Added `complete_document()` to `AIProvider` ABC and `GeminiProvider`, with a dedicated system prompt instructing the model to return the full rewritten document.
### v1.0.0 — Initial release
- Flask webhook server listening for Gitea push events.
- HMAC-SHA256 signature verification on all incoming webhooks.
- Scans changed `.md` files for `BLIGHT: <instruction>` trigger lines.
- Sends full document + instruction to Google Gemini 2.5 Flash-Lite.
- Replaces trigger line with AI response in-place and commits back to Gitea.
- Retry logic: 3 attempts with exponential backoff (1s, 2s, 4s).
- Processes webhooks in background threads to return 200 immediately.
- Deduplicates file paths across multiple commits in a single push.
- Self-updates on startup via `git pull --ff-only`.
- Pluggable `AIProvider` ABC for swapping AI backends.

View File

@@ -2,7 +2,7 @@
A module in the **BLIGHT** ecosystem. A module in the **BLIGHT** ecosystem.
BLIGHT: CUE monitors Gitea repositories containing markdown files. When a push is received, it scans changed files for `BLIGHT:` trigger lines, sends the surrounding document to an AI model along with the instruction, and writes the result back to the file in-place — fully automated. BLIGHT: CUE monitors Gitea repositories containing markdown files. When a push is received, it scans changed files for `BLIGHT:` trigger lines, sends the document to an AI model along with the instruction, and writes the result back to the file in-place — fully automated.
--- ---
@@ -10,16 +10,25 @@ BLIGHT: CUE monitors Gitea repositories containing markdown files. When a push i
1. You write a `BLIGHT:` trigger anywhere in a markdown file and push it to Gitea. 1. You write a `BLIGHT:` trigger anywhere in a markdown file and push it to Gitea.
2. Gitea sends a webhook POST to this server. 2. Gitea sends a webhook POST to this server.
3. The server fetches the file, finds all `BLIGHT:` triggers, and processes them one by one. 3. The server fetches the file, finds all `BLIGHT:` triggers, and processes them.
4. Each trigger is replaced with the AI's response at the exact position of the trigger line. 4. Each trigger is replaced with the AI's response, then the updated file is committed back automatically.
5. The updated file is committed back to the repo automatically.
### Trigger Syntax ### Trigger Syntax
Triggers are case-insensitive — `BLIGHT:`, `blight:`, `Blight::`, etc. all work.
**Inline trigger** — replaces only the trigger line with the AI's response:
``` ```
BLIGHT: <your instruction here> BLIGHT: <your instruction here>
``` ```
**Document-scope trigger** — replaces the entire file with the AI's rewritten version:
```
BLIGHT:: <your instruction here>
```
Examples: Examples:
```markdown ```markdown
@@ -31,22 +40,30 @@ BLIGHT: Explain the key differences between は and が based on the paragraph a
``` ```
```markdown ```markdown
BLIGHT: Spell check this entire document and list any errors found. BLIGHT: Write a conclusion paragraph for this document.
``` ```
```markdown ```markdown
BLIGHT: Write a conclusion paragraph for this document. BLIGHT:: Spellcheck and lightly reformat this entire document.
``` ```
### Processing Order
When a file contains multiple triggers:
1. All inline (`BLIGHT:`) triggers are processed first, in document order.
2. All document-scope (`BLIGHT::`) triggers are processed next, in document order — each one operates on the result of the previous.
### Failure Behavior ### Failure Behavior
If the AI call fails after 3 attempts, the trigger is replaced with: If the AI call fails after 3 attempts, the trigger is replaced with:
```html ```html
<!-- BLIGHT_FAILED: your original instruction --> <!-- BLIGHT_FAILED: your original instruction -->
<!-- BLIGHT_ERROR: <error message> -->
``` ```
You can re-trigger processing by editing the file to restore the original `BLIGHT:` line and pushing again. You can re-trigger processing by editing the file to restore the original trigger line and pushing again.
--- ---
@@ -224,12 +241,17 @@ from abc import ABC, abstractmethod
class AIProvider(ABC): class AIProvider(ABC):
def complete(self, document: str, instruction: str) -> str: def complete(self, document: str, instruction: str) -> str:
"""Return text to insert in place of an inline BLIGHT: trigger."""
...
def complete_document(self, document: str, instruction: str) -> str:
"""Return the full rewritten document for a BLIGHT:: trigger."""
... ...
``` ```
To add a new provider (e.g. OpenRouter): To add a new provider (e.g. OpenRouter):
1. Create `ai/openrouter.py` and implement `AIProvider`. 1. Create `ai/openrouter.py` and implement both methods of `AIProvider`.
2. In `processor.py`, replace `GeminiProvider()` with your new class. 2. In `processor.py`, replace `GeminiProvider()` with your new class.
--- ---

View File

@@ -4,13 +4,14 @@ from abc import ABC, abstractmethod
class AIProvider(ABC): class AIProvider(ABC):
"""Base class for all AI provider implementations. """Base class for all AI provider implementations.
To add a new provider, subclass this and implement `complete`, then To add a new provider, subclass this and implement `complete` and
instantiate your provider in `processor.py` instead of GeminiProvider. `complete_document`, then instantiate your provider in `processor.py`
instead of GeminiProvider.
""" """
@abstractmethod @abstractmethod
def complete(self, document: str, instruction: str) -> str: def complete(self, document: str, instruction: str) -> str:
"""Process an instruction in the context of a full document. """Process an inline instruction in the context of a full document.
Args: Args:
document: The full markdown document text (for context). document: The full markdown document text (for context).
@@ -19,3 +20,15 @@ class AIProvider(ABC):
Returns: Returns:
The text to insert in place of the trigger line. The text to insert in place of the trigger line.
""" """
@abstractmethod
def complete_document(self, document: str, instruction: str) -> str:
"""Apply a document-scope instruction and return the full rewritten document.
Args:
document: The full markdown document text.
instruction: The BLIGHT:: instruction extracted from the trigger line.
Returns:
The full rewritten document as a string.
"""

View File

@@ -2,7 +2,7 @@ import google.generativeai as genai
import config import config
from .base import AIProvider from .base import AIProvider
_SYSTEM_PROMPT = ( _INLINE_SYSTEM_PROMPT = (
"You are an inline document assistant. " "You are an inline document assistant. "
"The user will provide a markdown document and a specific instruction. " "The user will provide a markdown document and a specific instruction. "
"Your response must contain ONLY the text to be inserted into the document — " "Your response must contain ONLY the text to be inserted into the document — "
@@ -11,13 +11,26 @@ _SYSTEM_PROMPT = (
"Respond as if your output will be dropped directly into the middle of a document." "Respond as if your output will be dropped directly into the middle of a document."
) )
_DOCUMENT_SYSTEM_PROMPT = (
"You are a document editing assistant. "
"The user will provide a markdown document and a specific instruction. "
"Apply the instruction to the entire document and return the full rewritten document. "
"Your response must contain ONLY the rewritten document — "
"no preamble, no explanation, no meta-commentary, no markdown code fences. "
"Preserve the document's structure and formatting unless the instruction says otherwise."
)
class GeminiProvider(AIProvider): class GeminiProvider(AIProvider):
def __init__(self) -> None: def __init__(self) -> None:
genai.configure(api_key=config.GEMINI_API_KEY) genai.configure(api_key=config.GEMINI_API_KEY)
self._model = genai.GenerativeModel( self._inline_model = genai.GenerativeModel(
model_name="gemini-2.5-flash-lite", model_name="gemini-2.5-flash-lite",
system_instruction=_SYSTEM_PROMPT, system_instruction=_INLINE_SYSTEM_PROMPT,
)
self._document_model = genai.GenerativeModel(
model_name="gemini-2.5-flash-lite",
system_instruction=_DOCUMENT_SYSTEM_PROMPT,
) )
def complete(self, document: str, instruction: str) -> str: def complete(self, document: str, instruction: str) -> str:
@@ -25,5 +38,13 @@ class GeminiProvider(AIProvider):
f"DOCUMENT:\n\n{document}\n\n" f"DOCUMENT:\n\n{document}\n\n"
f"INSTRUCTION: {instruction}" f"INSTRUCTION: {instruction}"
) )
response = self._model.generate_content(prompt) response = self._inline_model.generate_content(prompt)
return response.text.strip()
def complete_document(self, document: str, instruction: str) -> str:
prompt = (
f"DOCUMENT:\n\n{document}\n\n"
f"INSTRUCTION: {instruction}"
)
response = self._document_model.generate_content(prompt)
return response.text.strip() return response.text.strip()

11
app.py

@@ -32,17 +32,17 @@ def _verify_signature(payload: bytes, signature_header: str | None) -> bool:
return hmac.compare_digest(expected, signature_header.strip()) return hmac.compare_digest(expected, signature_header.strip())
def _handle_push(owner: str, repo: str, changed_files: list[str]) -> None: def _handle_push(owner: str, repo: str, branch: str, changed_files: list[str]) -> None:
"""Process all changed markdown files in a push event.""" """Process all changed markdown files in a push event."""
for file_path in changed_files: for file_path in changed_files:
if not file_path.endswith(".md"): if not file_path.endswith(".md"):
continue continue
logger.info("Checking %s/%s: %s", owner, repo, file_path) logger.info("Checking %s/%s@%s: %s", owner, repo, branch, file_path)
try: try:
content, sha = gitea_client.get_file(owner, repo, file_path) content, sha = gitea_client.get_file(owner, repo, file_path, branch)
updated, changed = processor.process_document(content) updated, changed = processor.process_document(content)
if changed: if changed:
gitea_client.update_file(owner, repo, file_path, updated, sha) gitea_client.update_file(owner, repo, file_path, updated, sha, branch)
logger.info("Updated %s", file_path) logger.info("Updated %s", file_path)
else: else:
logger.info("No BLIGHT triggers found in %s", file_path) logger.info("No BLIGHT triggers found in %s", file_path)
@@ -65,6 +65,7 @@ def webhook():
data = json.loads(payload) data = json.loads(payload)
owner = data["repository"]["owner"]["login"] owner = data["repository"]["owner"]["login"]
repo = data["repository"]["name"] repo = data["repository"]["name"]
branch = data.get("ref", "").removeprefix("refs/heads/")
# Collect unique file paths from all commits in the push # Collect unique file paths from all commits in the push
seen: set[str] = set() seen: set[str] = set()
@@ -81,7 +82,7 @@ def webhook():
# Process in background so we return 200 to Gitea immediately # Process in background so we return 200 to Gitea immediately
thread = threading.Thread( thread = threading.Thread(
target=_handle_push, target=_handle_push,
args=(owner, repo, changed_files), args=(owner, repo, branch, changed_files),
daemon=True, daemon=True,
) )
thread.start() thread.start()

View File

@@ -10,7 +10,7 @@ def _headers() -> dict:
} }
def get_file(owner: str, repo: str, path: str) -> tuple[str, str]: def get_file(owner: str, repo: str, path: str, branch: str) -> tuple[str, str]:
"""Fetch a file's decoded content and its SHA from Gitea. """Fetch a file's decoded content and its SHA from Gitea.
Returns: Returns:
@@ -18,7 +18,7 @@ def get_file(owner: str, repo: str, path: str) -> tuple[str, str]:
required for the subsequent update call. required for the subsequent update call.
""" """
url = f"{config.GITEA_URL}/api/v1/repos/{owner}/{repo}/contents/{path}" url = f"{config.GITEA_URL}/api/v1/repos/{owner}/{repo}/contents/{path}"
response = requests.get(url, headers=_headers(), timeout=30) response = requests.get(url, headers=_headers(), params={"ref": branch}, timeout=30)
response.raise_for_status() response.raise_for_status()
data = response.json() data = response.json()
content = base64.b64decode(data["content"]).decode("utf-8") content = base64.b64decode(data["content"]).decode("utf-8")
@@ -31,14 +31,16 @@ def update_file(
path: str, path: str,
content: str, content: str,
sha: str, sha: str,
commit_message: str = "BLIGHT: process triggers", branch: str,
commit_message: str | None = None,
) -> None: ) -> None:
"""Write updated file content back to Gitea.""" """Write updated file content back to Gitea."""
url = f"{config.GITEA_URL}/api/v1/repos/{owner}/{repo}/contents/{path}" url = f"{config.GITEA_URL}/api/v1/repos/{owner}/{repo}/contents/{path}"
payload = { payload = {
"message": commit_message, "message": commit_message or f"BLIGHT: process triggers in {path}",
"content": base64.b64encode(content.encode("utf-8")).decode("ascii"), "content": base64.b64encode(content.encode("utf-8")).decode("ascii"),
"sha": sha, "sha": sha,
"branch": branch,
} }
response = requests.put(url, headers=_headers(), json=payload, timeout=30) response = requests.put(url, headers=_headers(), json=payload, timeout=30)
response.raise_for_status() response.raise_for_status()

View File

@@ -5,8 +5,15 @@ from ai import GeminiProvider
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
TRIGGER_PATTERN = re.compile(r"^BLIGHT:\s+(.+)$", re.MULTILINE) # Inline trigger: BLIGHT: <instruction> (single colon, case-insensitive)
FAILED_TEMPLATE = "<!-- BLIGHT_FAILED: {instruction} -->" INLINE_PATTERN = re.compile(r"^BLIGHT:(?!:)\s+(.+)$", re.MULTILINE | re.IGNORECASE)
# Document-scope trigger: BLIGHT:: <instruction> (double colon, case-insensitive)
DOCUMENT_PATTERN = re.compile(r"^BLIGHT::\s+(.+)$", re.MULTILINE | re.IGNORECASE)
FAILED_TEMPLATE = "<!-- BLIGHT_FAILED: {instruction} -->\n<!-- BLIGHT_ERROR: {error} -->"
# Matches any BLIGHT: trigger in AI output that could cause a processing loop.
_SANITIZE_PATTERN = re.compile(r"BLIGHT:", re.IGNORECASE)
_MAX_RETRIES = 3 _MAX_RETRIES = 3
_RETRY_DELAYS = [1, 2, 4] # seconds between attempts _RETRY_DELAYS = [1, 2, 4] # seconds between attempts
@@ -17,43 +24,81 @@ _provider = GeminiProvider()
def process_document(content: str) -> tuple[str, bool]: def process_document(content: str) -> tuple[str, bool]:
"""Scan content for BLIGHT triggers and process each one. """Scan content for BLIGHT triggers and process each one.
Inline triggers (BLIGHT:) are processed first in document order, each
replacing only the trigger line. Document-scope triggers (BLIGHT::) are
processed next in document order, each replacing the entire file content
and operating on the result of the previous.
Returns: Returns:
(updated_content, changed) where changed is True if any triggers (updated_content, changed) where changed is True if any triggers
were found and the content was modified. were found and the content was modified.
""" """
triggers = list(TRIGGER_PATTERN.finditer(content)) has_inline = bool(INLINE_PATTERN.search(content))
if not triggers: has_document = bool(DOCUMENT_PATTERN.search(content))
if not has_inline and not has_document:
return content, False return content, False
# Process triggers one by one. After each replacement the string length
# may change, so we re-search on the updated content each iteration.
changed = False changed = False
for _ in range(len(triggers)):
match = TRIGGER_PATTERN.search(content) # --- Pass 1: inline triggers ---
# Re-search after each replacement since string length may change.
inline_count = len(INLINE_PATTERN.findall(content))
for _ in range(inline_count):
match = INLINE_PATTERN.search(content)
if not match: if not match:
break break
instruction = match.group(1).strip() instruction = match.group(1).strip()
trigger_line = match.group(0) logger.info("Processing inline trigger: %s", instruction)
logger.info("Processing trigger: %s", instruction)
replacement = _call_with_retry(content, instruction) replacement = _call_with_retry(content, instruction, document_scope=False)
content = content[:match.start()] + replacement + content[match.end():] content = content[:match.start()] + replacement + content[match.end():]
changed = True changed = True
# --- Pass 2: document-scope triggers ---
# Each trigger operates on the result of the previous.
doc_count = len(DOCUMENT_PATTERN.findall(content))
for _ in range(doc_count):
match = DOCUMENT_PATTERN.search(content)
if not match:
break
instruction = match.group(1).strip()
logger.info("Processing document-scope trigger: %s", instruction)
# Remove the trigger line before passing to AI so it doesn't appear
# in the rewritten document. Also consume the trailing newline that
# follows the trigger line, if present.
trigger_start, trigger_end = match.start(), match.end()
if trigger_end < len(content) and content[trigger_end] == "\n":
trigger_end += 1
content_without_trigger = content[:trigger_start] + content[trigger_end:]
result = _call_with_retry(content_without_trigger, instruction, document_scope=True)
if result.startswith("<!-- BLIGHT_FAILED:"):
# On failure, replace the trigger line with the failure comment. Slice from
# match.end() so the newline that followed the trigger is kept.
content = content[:trigger_start] + result + content[match.end():]
else:
content = result
changed = True
return content, changed return content, changed
def _call_with_retry(document: str, instruction: str) -> str: def _call_with_retry(document: str, instruction: str, *, document_scope: bool) -> str:
"""Call the AI provider with up to _MAX_RETRIES attempts. """Call the AI provider with up to _MAX_RETRIES attempts.
Returns the AI response on success, or a BLIGHT_FAILED comment on Returns the AI response on success, or BLIGHT_FAILED/BLIGHT_ERROR comments
exhausted retries. on exhausted retries.
""" """
last_error: Exception | None = None last_error: Exception | None = None
for attempt in range(_MAX_RETRIES): for attempt in range(_MAX_RETRIES):
try: try:
return _provider.complete(document, instruction) if document_scope:
return _sanitize(_provider.complete_document(document, instruction))
return _sanitize(_provider.complete(document, instruction))
except Exception as exc: except Exception as exc:
last_error = exc last_error = exc
if attempt < _MAX_RETRIES - 1: if attempt < _MAX_RETRIES - 1:
@@ -74,4 +119,9 @@ def _call_with_retry(document: str, instruction: str) -> str:
instruction, instruction,
last_error, last_error,
) )
return FAILED_TEMPLATE.format(instruction=instruction) return FAILED_TEMPLATE.format(instruction=instruction, error=last_error)
def _sanitize(text: str) -> str:
"""Defuse any BLIGHT: trigger patterns in AI output to prevent loop re-processing."""
return _SANITIZE_PATTERN.sub("BLIGHT&#58;", text)