LLM-Powered-Monitoring-Agent/PROMPT.md
Spencer 07c768a4cf feat: Implement data retention policy
- Replaced `data_storage.py` with `database.py` to use SQLite instead of a JSON file for data storage.
- Added an `enforce_retention_policy` function to `database.py` to delete data older than 7 days.
- Called this function in the main monitoring loop in `monitor_agent.py`.
- Added Docker container monitoring.
- Updated `.gitignore` to ignore `monitoring.db`.
2025-09-15 13:12:05 -05:00


You are an expert Python developer agent. Your task is to create an initial, self-contained Python script named `monitor_agent.py` for a self-hosted LLM monitoring system. The script should perform the core monitoring loop: collect data, analyze it with a local LLM, and trigger alerts if an anomaly is detected.

The script should be ready to run on an Ubuntu server with the following environment:
- Ollama is already installed and running.
- The `llama3.1:8b` model has been pulled and is available locally.
- Python 3.8 or newer is installed.

**Required Python Libraries:**

The script must use the following libraries. Please include a `requirements.txt` file in your response.
- `ollama` (for LLM inference)
- `discord-webhook` (for Discord integration)
- `requests` (for Home Assistant integration)
- `syslog-rfc5424-parser` (for parsing syslog)
- `apachelogs` (for parsing Apache logs)
- `jc` (for parsing CLI tool output)
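
A minimal `requirements.txt` covering exactly these six libraries (unpinned; add version pins if you need reproducible installs) could be:

```text
ollama
discord-webhook
requests
syslog-rfc5424-parser
apachelogs
jc
```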

**Core Tasks:**

**1. Configuration:**

- Create a `config.py` or a section at the top of the main script to define configuration variables. Use placeholders for sensitive information.
  - `DISCORD_WEBHOOK_URL`: Placeholder for the Discord webhook URL.
  - `HOME_ASSISTANT_URL`: Placeholder for the Home Assistant server URL (e.g., `http://192.168.1.50:8123`).
  - `HOME_ASSISTANT_TOKEN`: Placeholder for the Home Assistant Long-Lived Access Token.
  - `GOOGLE_HOME_SPEAKER_ID`: Placeholder for the Home Assistant `media_player` entity ID for the Google Home speaker (e.g., `media_player.kitchen_speaker`).
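
For example, the top of `config.py` might read as follows; every value is a placeholder built from the examples above, not a working endpoint or secret:

```python
# config.py — placeholder configuration for the monitoring agent.
# Replace every value below with your real endpoints and secrets
# before running the agent.

# Discord webhook that receives anomaly reports.
DISCORD_WEBHOOK_URL = "https://discord.com/api/webhooks/REPLACE_ME"

# Base URL of the Home Assistant instance (no trailing slash).
HOME_ASSISTANT_URL = "http://192.168.1.50:8123"

# Long-Lived Access Token created in the Home Assistant user profile.
HOME_ASSISTANT_TOKEN = "REPLACE_WITH_LONG_LIVED_TOKEN"

# media_player entity that should speak the alert aloud.
GOOGLE_HOME_SPEAKER_ID = "media_player.kitchen_speaker"
```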

**2. Data Ingestion & Parsing Functions:**

- Create a function `get_system_logs()` that simulates collecting and parsing system logs.
  - The function should use the `syslog-rfc5424-parser` library to process a mock log entry or read from a placeholder log file.
  - The output should be a Python dictionary or JSON object.
  - **Example data to parse:** `{"timestamp": "2025-08-15T12:00:00Z", "log": "Failed login attempt for user 'root' from 10.0.0.1"}`
- Create a function `get_network_metrics()` that simulates collecting and parsing network data.
  - The function should use a tool like `ping` to generate output, and then the `jc` library to parse it into a structured format.[1, 2]
  - The output should be a Python dictionary or JSON object.
  - **Example data to parse:** `{"packets_transmitted": 3, "packets_received": 3, "packet_loss_percent": 0.0, "round_trip_ms_avg": 25.5}`
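
The two ingestion functions might be sketched as below. The `SyslogMessage.parse(...).as_dict()` and `jc.parse("ping", ...)` calls reflect my understanding of those libraries' APIs; the mock RFC 5424 line and the `summarize_ping()` helper are illustrative additions not named in the spec, and the third-party imports are kept inside the functions so the module still loads if a parser is missing:

```python
import subprocess


def get_system_logs(raw_line=None):
    """Parse one RFC 5424 syslog line into a dict (mock input by default)."""
    # Mock entry standing in for a real read from /var/log or journald.
    raw_line = raw_line or (
        "<34>1 2025-08-15T12:00:00Z myhost sshd 1001 - - "
        "Failed login attempt for user 'root' from 10.0.0.1"
    )
    from syslog_rfc5424_parser import SyslogMessage  # assumed API
    return SyslogMessage.parse(raw_line).as_dict()


def get_network_metrics(host="8.8.8.8"):
    """Ping a host and let jc turn the CLI output into structured data."""
    out = subprocess.run(
        ["ping", "-c", "3", host], capture_output=True, text=True
    ).stdout
    import jc  # assumed API: jc.parse(parser_name, command_output)
    return summarize_ping(jc.parse("ping", out))


def summarize_ping(parsed):
    """Reduce jc's full ping structure to the fields the LLM needs."""
    return {
        "packets_transmitted": parsed.get("packets_transmitted"),
        "packets_received": parsed.get("packets_received"),
        "packet_loss_percent": parsed.get("packet_loss_percent"),
        "round_trip_ms_avg": parsed.get("round_trip_ms_avg"),
    }
```

Trimming the parsed ping output down to a few fields keeps the LLM prompt short and focused on values that actually indicate network health.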

**3. LLM Interaction Function:**

- Create a function `analyze_data_with_llm(data)`. This function will take the structured data as input and send it to Ollama.
  - Inside this function, use the `ollama.generate()` method to interact with the LLM.[3, 4]
  - The prompt provided to the LLM is critical for its performance.[5] Construct a comprehensive prompt using the `data` input.
  - **The LLM Prompt Template:**
    - **Role:** `You are a dedicated and expert system administrator. Your primary role is to identify anomalies and provide concise, actionable reports.` [6, 5]
    - **Instruction:** `Analyze the following system and network data for any activity that appears out of place or different. Consider unusual values, errors, or unexpected patterns as anomalies.` [6, 7]
    - **Context:** `Here is the system data in JSON format for your analysis: {structured_data_as_string}`
    - **Output Request:** `If you find an anomaly, provide a report as a single, coherent, natural language paragraph. The report must clearly state the anomaly, its potential cause, and its severity (e.g., high, medium, low). If no anomaly is found, respond with "OK".` [6, 5, 8]
    - **Reasoning Hint:** `Think step by step to come to your conclusion. This is very important.` [9]
  - The function should return the LLM's raw response text.
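
One way to sketch this function is to split the prompt assembly into a pure `build_prompt()` helper (an addition for testability, not named in the spec); the wording follows the template above, and the `ollama.generate()` call shape reflects my understanding of the ollama Python client:

```python
import json

MODEL_NAME = "llama3.1:8b"  # the model named in the environment section


def build_prompt(data):
    """Assemble the role/instruction/context/output prompt from the template."""
    return (
        "You are a dedicated and expert system administrator. Your primary "
        "role is to identify anomalies and provide concise, actionable reports.\n\n"
        "Analyze the following system and network data for any activity that "
        "appears out of place or different. Consider unusual values, errors, "
        "or unexpected patterns as anomalies.\n\n"
        "Here is the system data in JSON format for your analysis: "
        f"{json.dumps(data)}\n\n"
        "If you find an anomaly, provide a report as a single, coherent, "
        "natural language paragraph. The report must clearly state the "
        "anomaly, its potential cause, and its severity (e.g., high, medium, "
        'low). If no anomaly is found, respond with "OK".\n\n'
        "Think step by step to come to your conclusion. This is very important."
    )


def analyze_data_with_llm(data):
    """Send the combined monitoring data to the local Ollama model."""
    import ollama  # imported lazily so the prompt builder stays testable
    response = ollama.generate(model=MODEL_NAME, prompt=build_prompt(data))
    return response["response"]  # raw text of the model's reply
```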

**4. Alerting Functions:**

- Create a function `send_discord_alert(message)`.
  - This function should use the `discord-webhook` library to send the `message` to the configured `DISCORD_WEBHOOK_URL`.[10, 11]
- Create a function `send_google_home_alert(message)`.
  - This function should use the `requests` library to make a `POST` request to the Home Assistant REST API.[12, 13]
  - Use the `/api/services/tts/speak` endpoint.
  - The JSON payload for the request must contain the `entity_id` of the TTS engine (e.g., `tts.google_en_com`), the `media_player_entity_id`, and the `message` to be spoken.
  - Add a comment to the code noting that long or complex messages should be simplified for better Text-to-Speech delivery.
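
A sketch of both alert functions follows. The placeholder defaults and the `build_tts_payload()` helper are illustrative additions; `DiscordWebhook(...).execute()` and the `tts.speak` payload shape reflect my understanding of the discord-webhook library and the Home Assistant REST API, and the third-party imports are deferred so the payload builder can be exercised without them installed:

```python
def send_discord_alert(message,
                       webhook_url="https://discord.com/api/webhooks/REPLACE_ME"):
    """Post the anomaly report to Discord via the configured webhook."""
    from discord_webhook import DiscordWebhook  # assumed API
    DiscordWebhook(url=webhook_url, content=message).execute()


def build_tts_payload(message, speaker_id="media_player.kitchen_speaker"):
    """Build the JSON body for Home Assistant's tts.speak service."""
    # NOTE: keep spoken alerts short — long or complex messages degrade
    # badly when read aloud by Text-to-Speech.
    return {
        "entity_id": "tts.google_en_com",      # TTS engine entity
        "media_player_entity_id": speaker_id,  # speaker that plays the alert
        "message": message,
    }


def send_google_home_alert(message,
                           ha_url="http://192.168.1.50:8123",
                           token="REPLACE_WITH_LONG_LIVED_TOKEN"):
    """Ask Home Assistant to speak the alert on the Google Home speaker."""
    import requests
    requests.post(
        f"{ha_url}/api/services/tts/speak",
        headers={"Authorization": f"Bearer {token}"},
        json=build_tts_payload(message),
        timeout=10,
    )
```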

**5. Main Script Logic:**

- Implement a main execution loop that runs periodically (e.g., every 5 minutes).
- The loop should:
  - Call `get_system_logs()` to get the latest system data.
  - Call `get_network_metrics()` to get the latest network data.
  - Combine the data and pass it to `analyze_data_with_llm()`.
  - Check the LLM's response. If the response is not `"OK"`, treat it as an anomaly report.
  - If an anomaly is detected, call `send_discord_alert()` and `send_google_home_alert()` with the LLM's report.
- Include a simple `time.sleep()` within the loop to control the monitoring frequency.
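
The loop above could be wired together roughly as follows. `is_anomaly()`, the injectable function lists, and the `max_cycles` escape hatch are additions for testability, not part of the original spec:

```python
import time

CHECK_INTERVAL_SECONDS = 300  # run the pipeline every 5 minutes


def is_anomaly(llm_response):
    """Anything other than a bare "OK" is treated as an anomaly report."""
    return llm_response.strip().upper() != "OK"


def run_monitoring_loop(collect_fns, analyze_fn, alert_fns,
                        interval=CHECK_INTERVAL_SECONDS, max_cycles=None):
    """Collect → analyze → alert, forever (or for max_cycles iterations)."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        # Gather data from every collector, keyed by collector name.
        data = {fn.__name__: fn() for fn in collect_fns}
        report = analyze_fn(data)
        if is_anomaly(report):
            for alert in alert_fns:
                alert(report)
        cycles += 1
        time.sleep(interval)  # control the monitoring frequency
```

In `monitor_agent.py` the entry point would then be something like `run_monitoring_loop([get_system_logs, get_network_metrics], analyze_data_with_llm, [send_discord_alert, send_google_home_alert])`.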

**Notes:**

- The code should be well-commented to explain each section of the pipeline.