feat: Implement monitor agent

2025-08-15 14:04:09 -05:00
commit 89902dfd6b
5 changed files with 267 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,3 @@
 GEMINI.md
 PROGRESS.md
 SPEC.md
--- a/PROMPT.md
+++ b/PROMPT.md
@@ -0,0 +1,76 @@
 You are an expert Python developer agent. Your task is to create an initial, self-contained Python script named `monitor_agent.py` for a self-hosted, LLM-based monitoring system. The script should perform the core monitoring loop: collect data, analyze it with a local LLM, and trigger alerts if an anomaly is detected.
 The script should be ready to run on an Ubuntu server with the following environment:
 - Ollama is already installed and running.
 - The `llama3.1:8b` model has been pulled and is available locally.
 - Python 3.8 or newer is installed.
 **Required Python Libraries:**
 The script must use the following libraries. Please include a `requirements.txt` file in your response.
 - `ollama` (for LLM inference)
 - `discord-webhook` (for Discord integration)
 - `requests` (for Home Assistant integration)
 - `syslog-rfc5424-parser` (for parsing syslog)
 - `apachelogs` (for parsing Apache logs)
 - `jc` (for parsing CLI tool output)
 **Core Tasks:**
 **1. Configuration:**
 - Create a `config.py` or a section at the top of the main script to define configuration variables. Use placeholders for sensitive information.
 - `DISCORD_WEBHOOK_URL`: Placeholder for the Discord webhook URL.
 - `HOME_ASSISTANT_URL`: Placeholder for the Home Assistant server URL (e.g., `http://192.168.1.50:8123`).
 - `HOME_ASSISTANT_TOKEN`: Placeholder for the Home Assistant Long-Lived Access Token.
 - `GOOGLE_HOME_SPEAKER_ID`: Placeholder for the Home Assistant `media_player` entity ID for the Google Home speaker (e.g., `media_player.kitchen_speaker`).
 **2. Data Ingestion & Parsing Functions:**
 - Create a function `get_system_logs()` that simulates collecting and parsing system logs.
  - The function should use the `syslog-rfc5424-parser` library to process a mock log entry or read from a placeholder log file.
  - The output should be a Python dictionary or JSON object.
  - **Example data to parse:** `{"timestamp": "2025-08-15T12:00:00Z", "log": "Failed login attempt for user 'root' from 10.0.0.1"}`
 - Create a function `get_network_metrics()` that simulates collecting and parsing network data.
  - The function should use a tool like `ping` to generate output, and then the `jc` library to parse it into a structured format.[1, 2]
  - The output should be a Python dictionary or JSON object.
  - **Example data to parse:** `{"packets_transmitted": 3, "packets_received": 3, "packet_loss_percent": 0.0, "round_trip_ms_avg": 25.5}`
 **3. LLM Interaction Function:**
 - Create a function `analyze_data_with_llm(data)`. This function will take the structured data as input and send it to Ollama.
 - Inside this function, use the `ollama.generate()` method to interact with the LLM.[3, 4]
 - The prompt provided to the LLM is critical for its performance.[5] Construct a comprehensive prompt using the `data` input.
 - **The LLM Prompt Template:**
  - **Role:** `You are a dedicated and expert system administrator. Your primary role is to identify anomalies and provide concise, actionable reports.` [6, 5]
  - **Instruction:** `Analyze the following system and network data for any activity that appears out of place or different. Consider unusual values, errors, or unexpected patterns as anomalies.` [6, 7]
  - **Context:** `Here is the system data in JSON format for your analysis: {structured_data_as_string}`
  - **Output Request:** `If you find an anomaly, provide a report as a single, coherent, natural language paragraph. The report must clearly state the anomaly, its potential cause, and its severity (e.g., high, medium, low). If no anomaly is found, respond with "OK".` [6, 5, 8]
  - **Reasoning Hint:** `Think step by step to come to your conclusion. This is very important.` [9]
 - The function should return the LLM's raw response text.
 **4. Alerting Functions:**
 - Create a function `send_discord_alert(message)`.
  - This function should use the `discord-webhook` library to send the `message` to the configured `DISCORD_WEBHOOK_URL`.[10, 11]
 - Create a function `send_google_home_alert(message)`.
  - This function should use the `requests` library to make a `POST` request to the Home Assistant REST API.[12, 13]
  - Use the `/api/services/tts/speak` endpoint.
  - The JSON payload for the request must contain the `entity_id` of the TTS engine (e.g., `tts.google_en_com`), the `media_player_entity_id`, and the `message` to be spoken.
  - Add a comment to the code noting that long or complex messages should be simplified for better Text-to-Speech delivery.
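One way to act on the Text-to-Speech note above is to shorten the report before it is spoken. The sketch below is an illustration only: `simplify_for_tts` and its `max_chars` default are hypothetical names and values, not part of Home Assistant or of the required script interface. It collapses whitespace, keeps the first sentence, and truncates at a word boundary if that sentence is still too long.

```python
import re


def simplify_for_tts(message: str, max_chars: int = 200) -> str:
    """Return a short, single-sentence version of message for speech output.

    Hypothetical helper: the 200-character default is an assumed limit,
    not a Home Assistant requirement.
    """
    # Collapse newlines and repeated spaces into single spaces.
    text = re.sub(r"\s+", " ", message).strip()
    # Keep only the first sentence: split at the first '.', '!' or '?'
    # that is followed by whitespace (so dots inside "10.0.0.1" are kept).
    first_sentence = re.split(r"(?<=[.!?])\s", text, maxsplit=1)[0]
    if len(first_sentence) <= max_chars:
        return first_sentence
    # Truncate at the last word boundary that fits within the limit.
    truncated = first_sentence[:max_chars].rsplit(" ", 1)[0]
    return truncated.rstrip(",;:") + "."
```

With this sketch, the full report could still go to Discord while only the shortened text is passed to `send_google_home_alert()`.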
 **5. Main Script Logic:**
 - Implement a main execution loop that runs periodically (e.g., every 5 minutes).
 - The loop should:
  - Call `get_system_logs()` to get the latest system data.
  - Call `get_network_metrics()` to get the latest network data.
  - Combine the data and pass it to `analyze_data_with_llm()`.
  - Check the LLM's response. If the response is not "OK", treat it as an anomaly report.
  - If an anomaly is detected, call `send_discord_alert()` and `send_google_home_alert()` with the LLM's report.
 - Include a simple `time.sleep()` within the loop to control the monitoring frequency.
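The response check in the loop above could be sketched as follows. This is a minimal sketch under stated assumptions: `is_anomaly` is a hypothetical helper name, and it assumes the model may surround "OK" with whitespace, quotes, or trailing punctuation, or answer in a different letter case. Anything else, including an empty reply, is treated as an anomaly report.

```python
def is_anomaly(llm_response: str) -> bool:
    """Return True unless the response is an "OK" verdict.

    Hypothetical helper: tolerates whitespace, quotes, backticks, and a
    trailing period or exclamation mark around "OK", in any letter case.
    """
    # Strip surrounding whitespace first, then quotes and punctuation.
    verdict = llm_response.strip().strip("\"'`.!").strip()
    return verdict.upper() != "OK"
```

Because the prompt also asks the model to think step by step, a reply may contain reasoning before the verdict; this sketch does not attempt to handle that case.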
 **Notes:**
 - The code should be well-commented to explain each section of the pipeline.
--- a/config.py
+++ b/config.py
@@ -0,0 +1,11 @@
 # Configuration for the Monitor Agent
 # Discord Webhook URL for alerts
 DISCORD_WEBHOOK_URL = "YOUR_DISCORD_WEBHOOK_URL_HERE"
 # Home Assistant Configuration
 HOME_ASSISTANT_URL = "http://YOUR_HOME_ASSISTANT_IP:8123"
 HOME_ASSISTANT_TOKEN = "YOUR_HOME_ASSISTANT_LONG_LIVED_ACCESS_TOKEN"
 # Google Home Speaker Entity ID in Home Assistant
 GOOGLE_HOME_SPEAKER_ID = "media_player.your_google_home_speaker_entity_id"
--- a/monitor_agent.py
+++ b/monitor_agent.py
@@ -0,0 +1,171 @@
 import json
 import subprocess
 import time
 from syslog_rfc5424_parser import ParseError, SyslogMessage
 import jc
 import ollama
 from discord_webhook import DiscordWebhook, DiscordEmbed
 import requests
 import config
 # --- Data Ingestion & Parsing Functions ---
 def get_system_logs():
    """
    Simulates collecting and parsing a system log entry.
    This function uses a mock syslog entry and parses it using the
    syslog-rfc5424-parser library.
    Returns:
        dict: A dictionary representing the parsed log entry.
    """
    mock_log_entry = '<165>1 2025-08-15T12:00:00Z my-host app-name - - [meta sequenceId="1"] { "log": "Failed login attempt for user \'root\' from 10.0.0.1" }'
    try:
        # The library exposes parsing through SyslogMessage.parse(),
        # which raises ParseError on malformed input
        parsed_log = SyslogMessage.parse(mock_log_entry)
    except ParseError as e:
        return {"error": f"Unable to parse syslog entry: {e}"}
    # The message body is available as the `msg` attribute. It is a string,
    # so we try to parse it as JSON and fall back to the raw text otherwise
    if parsed_log.msg:
        try:
            log_content = json.loads(parsed_log.msg)
            # We could merge the log content with the parsed header fields for a more complete picture
            # For now, we'll just return the content of the log message
            return log_content
        except json.JSONDecodeError:
            return {"log": parsed_log.msg}
    return {}
 def get_network_metrics():
    """
    Simulates collecting and parsing network data by running the ping command.
    This function uses the `ping` command to generate network statistics
    and the `jc` library to parse the output into a structured format.
    Returns:
        dict: A dictionary containing the parsed network metrics.
    """
    # We ping a reliable address, like Google's DNS, 3 times.
    # The '-c 3' argument is for Linux/macOS. For Windows, it would be '-n 3'.
    # Since the target is an Ubuntu server, we'll use '-c'.
    try:
        ping_output = subprocess.run(['ping', '-c', '3', '8.8.8.8'], capture_output=True, text=True, check=True).stdout
        parsed_metrics = jc.parse('ping', ping_output)
        # jc's ping parser returns one dictionary of summary statistics
        # (with per-reply details under "responses"), not a list
        if parsed_metrics:
            return parsed_metrics
    except (subprocess.CalledProcessError, FileNotFoundError) as e:
        # Handle cases where ping fails or is not installed
        return {"error": str(e)}
    return {}
 # --- LLM Interaction Function ---
 def analyze_data_with_llm(data):
    """
    Analyzes the given data with a local LLM to detect anomalies.
    Args:
        data (dict): The structured data to analyze.
    Returns:
        str: The raw response text from the LLM.
    """
    structured_data_as_string = json.dumps(data, indent=2)
    prompt = f"""Role: You are a dedicated and expert system administrator. Your primary role is to identify anomalies and provide concise, actionable reports.
 Instruction: Analyze the following system and network data for any activity that appears out of place or different. Consider unusual values, errors, or unexpected patterns as anomalies.
 Context: Here is the system data in JSON format for your analysis: {structured_data_as_string}
 Output Request: If you find an anomaly, provide a report as a single, coherent, natural language paragraph. The report must clearly state the anomaly, its potential cause, and its severity (e.g., high, medium, low). If no anomaly is found, respond with "OK".
 Reasoning Hint: Think step by step to come to your conclusion. This is very important."""
    try:
        response = ollama.generate(
            model="llama3.1:8b",
            prompt=prompt
        )
        return response['response'].strip()
    except Exception as e:
        return f"Error communicating with Ollama: {e}"
 # --- Alerting Functions ---
 def send_discord_alert(message):
    """
    Sends an alert message to a Discord webhook.
    Args:
        message (str): The message to send.
    """
    if config.DISCORD_WEBHOOK_URL == "YOUR_DISCORD_WEBHOOK_URL_HERE":
        print("Skipping Discord alert: Webhook URL not configured.")
        return
    webhook = DiscordWebhook(url=config.DISCORD_WEBHOOK_URL)
    embed = DiscordEmbed(title="Anomaly Detected!", description=message, color='FF0000')
    webhook.add_embed(embed)
    try:
        response = webhook.execute()
        print("Discord alert sent.")
    except Exception as e:
        print(f"Error sending Discord alert: {e}")
 def send_google_home_alert(message):
    """
    Sends an alert message to a Google Home speaker via Home Assistant.
    Args:
        message (str): The message to be spoken.
    """
    # Long or complex messages should be simplified for better Text-to-Speech delivery.
    if config.HOME_ASSISTANT_URL == "http://YOUR_HOME_ASSISTANT_IP:8123":
        print("Skipping Google Home alert: Home Assistant URL not configured.")
        return
    url = f"{config.HOME_ASSISTANT_URL}/api/services/tts/speak"
    headers = {
        "Authorization": f"Bearer {config.HOME_ASSISTANT_TOKEN}",
        "Content-Type": "application/json",
    }
    payload = {
        "entity_id": "tts.google_en_com",
        "media_player_entity_id": config.GOOGLE_HOME_SPEAKER_ID,
        "message": message,
    }
    try:
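        # A timeout keeps an unreachable Home Assistant server from blocking the monitoring loop
        response = requests.post(url, headers=headers, json=payload, timeout=10)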
        response.raise_for_status()  # Raise an exception for bad status codes
        print("Google Home alert sent.")
    except requests.exceptions.RequestException as e:
        print(f"Error sending Google Home alert: {e}")
 # --- Main Script Logic ---
 def main():
    """
    The main execution loop for the monitoring agent.
    """
    while True:
        print("--- Running Monitoring Cycle ---")
        system_logs = get_system_logs()
        network_metrics = get_network_metrics()
        combined_data = {
            "system_logs": system_logs,
            "network_metrics": network_metrics
        }
        llm_response = analyze_data_with_llm(combined_data)
        print(f"LLM Response: {llm_response}")
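        # Tolerate trivial variations of the verdict such as "ok", "OK." or a quoted "OK"
        if llm_response.strip(" \t\n\"'.").upper() != "OK":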
            print("Anomaly detected, sending alerts...")
            send_discord_alert(llm_response)
            send_google_home_alert(llm_response)
        print("--- Cycle Complete, sleeping for 5 minutes ---")
        time.sleep(300) # 300 seconds = 5 minutes
 if __name__ == "__main__":
    main()
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1,6 @@
 ollama
 discord-webhook
 requests
 syslog-rfc5424-parser
 apachelogs
 jc