Initial Commit with Plan

2025-08-24 14:49:15 -05:00
commit 797fb638f4
3 changed files with 134 additions and 0 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,16 @@
 # Agent Constraints and Directives
 This document outlines the operational constraints and directives for the AI agent, Blight.
 ## Core Directives
 *   [ ] **Task Completion:** Upon completing a task listed in `SPEC.md` or `PROGRESS.md`, I will mark the corresponding checkbox as completed (`[x]`).
 *   [ ] **Self-Correction:** If a test fails, I will analyze the error and attempt to fix the code until the test passes.
 *   [ ] **Continuous Improvement:** I will update this file (`AGENTS.md`) with any new constraints or core files that are added to the project.
 ## File-Specific Directives
 *   **`PROMPT.md`**: This file contains the original user request and project goal. I will use it as a primary reference for the project's objectives.
 *   **`SPEC.md`**: This file contains the high-level project plan and architecture. I will adhere to the design and workflow outlined in this document.
 *   **`PROGRESS.md`**: This file will contain a detailed, step-by-step plan for implementing the project. I will create this file after `SPEC.md` is approved.
 *   **`AGENTS.md`**: This file (the one you are reading now) contains my operational constraints and directives. I will consult this file to ensure I am following the rules.
--- a/PROMPT.md
+++ b/PROMPT.md
@@ -0,0 +1,53 @@
 You are a Senior Software Architect and a technical project planner specializing in AI agent development. Your task is to generate a comprehensive project plan to build a fully self-hosted, CLI-based coding agent. The final output must be a markdown file outlining a clear, step-by-step development strategy.
 Project Goal:
 Develop a single-agent system that takes natural language input from the command line and autonomously creates, tests, and refines code scripts until they successfully fulfill the initial request. The agent must operate in a continuous loop, re-evaluating its work based on test results and refining the script until it matches the original goal.
 Core Technology Stack:
    Framework: LangChain, leveraging the LangGraph library for its advanced control over stateful, multi-step workflows with loops.   
    Model: A locally hosted Phi4-mini large language model.
    Interface: A command-line interface (CLI).
 Project Plan Requirements:
 Your plan must include the following sections, detailing the architecture and implementation steps:
 1. Architectural Design & Workflow (LangGraph Graph):
 Describe the agent's core logic as a directed graph using LangGraph. Define the key nodes and edges, and how they connect to enable the required looping behavior.
    Nodes: Define nodes for each distinct step in the agent's process. At a minimum, include:
        plan_task: Analyzes the user's natural language input and breaks it down into a formal, executable plan.
        generate_code: Writes the initial code script based on the plan.
        execute_and_test: Runs the generated script in a sandboxed environment, capturing the output and any errors.
        analyze_results: Evaluates the test results to determine if the goal has been met.
        refine_code: If the goal is not met, a node for debugging and generating a refined script.
    Edges: Describe the flow of logic between the nodes. The plan must explicitly show how the analyze_results node creates a conditional edge that either leads to the final output or loops back to the refine_code and generate_code nodes for another iteration. This is a critical component of the looping behavior.   
 2. Tooling Strategy:
 Identify and define the specific Tools that the agent will need to accomplish its tasks. LangChain's modular design allows for creating custom tools from any function or API. The plan should specify:  
    CodeExecutionTool: A tool for safely executing the generated code.
    FileManagementTool: Tools for reading and writing scripts to a local file system.
    TestRunnerTool: A tool to run specific test cases against the code.
 3. Development Phases & Milestones:
 Break the project into clear, actionable phases.
    Phase 1: Foundation & Tooling: Focus on setting up the local environment, integrating the Phi4-mini model, and building the essential custom tools.
    Phase 2: Implementing the LangGraph Workflow: Implement the core graph-based logic described in the plan, focusing on getting a basic, single-loop process working.
    Phase 3: CLI & Error Handling: Integrate the workflow with a command-line interface and add robust error handling and persistence to the LangGraph state, ensuring a smooth user experience.
 4. Final Deliverables:
 The final output of this project plan should be a detailed markdown document that can be used as a blueprint for development, ensuring all key components and their interactions are clearly defined.
--- a/SPEC.md
+++ b/SPEC.md
@@ -0,0 +1,65 @@
 # Project: Self-Hosted, CLI-Based Coding Agent
 This document outlines the specification for building a self-hosted, CLI-based coding agent that can autonomously create, test, and refine code scripts.
 ## 1. Architectural Design & Workflow (LangGraph Graph)
 The core of the agent will be a LangGraph graph that defines the agent's state and logic flow.
 *   **Nodes:**
    *   `[ ] plan_task`: This node will receive the user's natural language input and create a formal, executable plan. This plan will be a sequence of steps for the agent to follow.
    *   `[ ] generate_code`: This node will take the plan and generate the initial code script.
    *   `[ ] execute_and_test`: This node will execute the generated code in a sandboxed environment and run tests against it. It will capture the output, errors, and test results.
    *   `[ ] analyze_results`: This node will analyze the results from the `execute_and_test` node to determine if the code meets the requirements of the plan.
    *   `[ ] refine_code`: If the `analyze_results` node determines that the code is not yet correct, this node will be responsible for debugging the code and generating a refined version.
 *   **Edges:**
    *   `[ ]` The graph will start at the `plan_task` node.
    *   `[ ]` From `plan_task`, the graph will proceed to `generate_code`.
    *   `[ ]` From `generate_code`, the graph will proceed to `execute_and_test`.
    *   `[ ]` From `execute_and_test`, the graph will proceed to `analyze_results`.
    *   `[ ]` The `analyze_results` node will have a conditional edge.
        *   If the code is correct, the graph will terminate and output the final code.
         *   If the code is incorrect, the graph will route to the `refine_code` node, which will then pass the refined version to the `generate_code` node to start the loop again.
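
The node and edge list above can be sketched as a small state machine. This is a dependency-free stand-in written in plain Python, not LangGraph code: the `AgentState` fields, the `MAX_ITERATIONS` cap, the `route_after_analysis` helper, and the stubbed node bodies are all assumptions made for illustration. In a real build, each node would call the model or a tool, and the routing function would be registered as the conditional edge.

```python
from typing import Callable, Dict, TypedDict


class AgentState(TypedDict):
    # Hypothetical state shape; the spec does not define one.
    request: str
    plan: str
    code: str
    feedback: str
    passed: bool
    iterations: int


END = "END"
MAX_ITERATIONS = 5  # assumed safety cap so the loop always terminates


def plan_task(state: AgentState) -> AgentState:
    return {**state, "plan": f"1. Write code for: {state['request']}"}


def generate_code(state: AgentState) -> AgentState:
    # Stub for the model call: emits a buggy script until feedback exists.
    body = "a + b" if state["feedback"] else "a - b"
    return {**state, "code": f"def add(a, b):\n    return {body}\n"}


def execute_and_test(state: AgentState) -> AgentState:
    namespace: dict = {}
    exec(state["code"], namespace)  # NOT sandboxed; illustration only
    return {**state, "passed": namespace["add"](2, 3) == 5}


def analyze_results(state: AgentState) -> AgentState:
    return {**state, "iterations": state["iterations"] + 1}


def refine_code(state: AgentState) -> AgentState:
    # Stub for debugging: records why the last attempt failed.
    return {**state, "feedback": "add(2, 3) did not return 5"}


def route_after_analysis(state: AgentState) -> str:
    """Conditional edge: finish on success or at the cap, else refine."""
    if state["passed"] or state["iterations"] >= MAX_ITERATIONS:
        return END
    return "refine_code"


NODES: Dict[str, Callable[[AgentState], AgentState]] = {
    "plan_task": plan_task,
    "generate_code": generate_code,
    "execute_and_test": execute_and_test,
    "analyze_results": analyze_results,
    "refine_code": refine_code,
}

# Unconditional edges, one per bullet in the list above.
EDGES: Dict[str, str] = {
    "plan_task": "generate_code",
    "generate_code": "execute_and_test",
    "execute_and_test": "analyze_results",
    "refine_code": "generate_code",
}


def run(request: str) -> AgentState:
    state: AgentState = {
        "request": request, "plan": "", "code": "",
        "feedback": "", "passed": False, "iterations": 0,
    }
    current = "plan_task"
    while current != END:
        state = NODES[current](state)
        if current == "analyze_results":
            current = route_after_analysis(state)
        else:
            current = EDGES[current]
    return state
```

Calling `run("add two numbers")` walks the graph once with the buggy stub, takes the `refine_code` branch, and terminates on the second pass. The iteration cap is not part of the spec; it is included because an unconditional retry loop around a model can otherwise run forever.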
 ## 2. Tooling Strategy
 The agent will be equipped with the following tools:
 *   `[ ]` **CodeExecutionTool**: A tool for safely executing the generated code in a sandboxed environment. This will likely involve using Docker or a similar containerization technology.
 *   `[ ]` **FileManagementTool**: A set of tools for reading and writing files to the local filesystem. This will be necessary for the agent to create, modify, and save the code scripts it is working on.
 *   `[ ]` **TestRunnerTool**: A tool for running specific test cases against the generated code. This will be used to verify the correctness of the code.
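
A minimal sketch of the three tools as plain functions, assuming Python scripts and using only the standard library. The function names, the `ExecutionResult` type, and the stdout-comparison test are hypothetical. Running a child process with a timeout is not a sandbox; it only stands in for the containerized execution the spec anticipates.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExecutionResult:
    returncode: int
    stdout: str
    stderr: str


def write_script(path: Path, source: str) -> Path:
    """FileManagementTool (write): save a script, creating parent dirs."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(source, encoding="utf-8")
    return path


def read_script(path: Path) -> str:
    """FileManagementTool (read): load a script from disk."""
    return path.read_text(encoding="utf-8")


def execute_script(path: Path, timeout_s: float = 10.0) -> ExecutionResult:
    """CodeExecutionTool: run a script in a child interpreter.

    The timeout guards against hangs but provides no isolation.
    """
    try:
        proc = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return ExecutionResult(-1, "", f"timed out after {timeout_s}s")
    return ExecutionResult(proc.returncode, proc.stdout, proc.stderr)


def run_test_case(path: Path, expected_stdout: str) -> bool:
    """TestRunnerTool: pass if the script exits 0 and prints the expected text."""
    result = execute_script(path)
    return (
        result.returncode == 0
        and result.stdout.strip() == expected_stdout.strip()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = write_script(Path(tmp) / "hello.py", "print('hello')\n")
        print(run_test_case(script, "hello"))
```

Keeping each tool a plain function makes it straightforward to wrap later in whatever tool interface the agent framework expects, and to swap the subprocess call for a container invocation without touching the callers.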
 ## 3. Development Phases & Milestones
 The project will be developed in the following phases:
 *   **Phase 1: Foundation & Tooling**
    *   `[ ]` Set up the local development environment.
    *   `[ ]` Download and set up the Phi4-mini model.
    *   `[ ]` Implement the `CodeExecutionTool`.
    *   `[ ]` Implement the `FileManagementTool`.
    *   `[ ]` Implement the `TestRunnerTool`.
 *   **Phase 2: Implementing the LangGraph Workflow**
    *   `[ ]` Implement the `plan_task` node.
    *   `[ ]` Implement the `generate_code` node.
    *   `[ ]` Implement the `execute_and_test` node.
    *   `[ ]` Implement the `analyze_results` node.
    *   `[ ]` Implement the `refine_code` node.
    *   `[ ]` Connect the nodes and implement the conditional looping logic.
 *   **Phase 3: CLI & Error Handling**
    *   `[ ]` Create a command-line interface (CLI) for interacting with the agent.
    *   `[ ]` Implement robust error handling throughout the system.
    *   `[ ]` Implement persistence for the LangGraph state, so that the agent can be stopped and restarted without losing its progress.
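
The CLI and persistence items in Phase 3 could fit together roughly as follows. This is a sketch under stated assumptions: the `--state-file` flag, the JSON file format, and the `start_or_resume` helper are hypothetical. LangGraph has its own checkpointing mechanism, which would likely replace the hand-rolled JSON file here; this version only illustrates the stop-and-resume behavior.

```python
import argparse
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        prog="agent", description="Self-hosted coding agent (sketch)."
    )
    parser.add_argument("request", help="natural language task description")
    parser.add_argument(
        "--state-file",
        type=Path,
        default=Path("agent_state.json"),
        help="where progress is saved between runs (hypothetical flag)",
    )
    return parser.parse_args(argv)


def save_state(path: Path, state: Dict[str, Any]) -> None:
    # Write to a temp file first, then swap it in, so an interrupted
    # write never leaves a half-written state file behind.
    tmp_path = path.with_suffix(path.suffix + ".tmp")
    tmp_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    os.replace(tmp_path, path)


def load_state(path: Path) -> Optional[Dict[str, Any]]:
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def start_or_resume(args: argparse.Namespace) -> Dict[str, Any]:
    """Resume saved progress for the same request, else start fresh."""
    saved = load_state(args.state_file)
    if saved is not None and saved.get("request") == args.request:
        return saved
    return {"request": args.request, "iterations": 0, "passed": False}


if __name__ == "__main__":
    cli_args = parse_args()
    current_state = start_or_resume(cli_args)
    # The graph would run here, calling save_state after each node.
    save_state(cli_args.state_file, current_state)
```

Saving after every node rather than only at exit means an interrupted run loses at most one step of work.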
 ## 4. Final Deliverables
 *   `[ ]` A detailed markdown document (`SPEC.md`) that can be used as a blueprint for development.
 *   `[ ]` The source code for the self-hosted, CLI-based coding agent.
 *   `[ ]` A `README.md` file with instructions on how to set up and run the agent.
 ## Additional Notes for Astra/Inanis
 - A collection of AI Prompts from various vendors
    - [system-prompts-and-models-of-ai-tools](https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools/tree/main/Kiro)