LLM-Powered-Monitoring-Agent/PROMPT.md
Spencer 07c768a4cf feat: Implement data retention policy
- Replaced `data_storage.py` with `database.py` to use SQLite instead of a JSON file for data storage.
- Added an `enforce_retention_policy` function to `database.py` to delete data older than 7 days.
- Called this function in the main monitoring loop in `monitor_agent.py`.
- Added Docker container monitoring.
- Updated `.gitignore` to ignore `monitoring.db`.
2025-09-15 13:12:05 -05:00


You are an expert Python developer agent. Your task is to create an initial, self-contained Python script named monitor_agent.py for a self-hosted LLM monitoring system. The script should perform the core monitoring loop: collect data, analyze it with a local LLM, and trigger alerts if an anomaly is detected.

The script should be ready to run on an Ubuntu server with the following environment:

  • Ollama is already installed and running.
  • The llama3.1:8b model has been pulled and is available locally.
  • Python 3.8 or newer is installed.

Required Python Libraries: The script must use the following libraries. Please include a requirements.txt file in your response.

  • ollama (for LLM inference)
  • discord-webhook (for Discord integration)
  • requests (for Home Assistant integration)
  • syslog-rfc5424-parser (for parsing syslog)
  • apachelogs (for parsing Apache logs)
  • jc (for parsing CLI tool output)

Core Tasks:

1. Configuration:

  • Create a config.py or a section at the top of the main script to define configuration variables. Use placeholders for sensitive information.
  • DISCORD_WEBHOOK_URL: Placeholder for the Discord webhook URL.
  • HOME_ASSISTANT_URL: Placeholder for the Home Assistant server URL (e.g., http://192.168.1.50:8123).
  • HOME_ASSISTANT_TOKEN: Placeholder for the Home Assistant Long-Lived Access Token.
  • GOOGLE_HOME_SPEAKER_ID: Placeholder for the Home Assistant media_player entity ID for the Google Home speaker (e.g., media_player.kitchen_speaker).
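The configuration section described above might be sketched as follows; every value is a placeholder to be replaced by the operator, and the example IP and entity ID are taken from the examples given in this document:

```python
# config.py -- central configuration for monitor_agent.py.
# All values below are placeholders; replace them before running the agent.

# Discord webhook URL for text alerts.
DISCORD_WEBHOOK_URL = "https://discord.com/api/webhooks/YOUR_ID/YOUR_TOKEN"

# Base URL of the Home Assistant server (no trailing slash).
HOME_ASSISTANT_URL = "http://192.168.1.50:8123"

# Long-Lived Access Token created in the Home Assistant user profile.
HOME_ASSISTANT_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"

# media_player entity ID of the Google Home speaker used for spoken alerts.
GOOGLE_HOME_SPEAKER_ID = "media_player.kitchen_speaker"
```

Keeping these in a separate config.py means the main script can `from config import *` (or import them individually) and secrets stay out of the monitoring logic.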

2. Data Ingestion & Parsing Functions:

  • Create a function get_system_logs() that simulates collecting and parsing system logs.
    • The function should use the syslog-rfc5424-parser library to process a mock log entry or read from a placeholder log file.
    • The output should be a Python dictionary or JSON object.
    • Example data to parse: {"timestamp": "2025-08-15T12:00:00Z", "log": "Failed login attempt for user 'root' from 10.0.0.1"}
  • Create a function get_network_metrics() that simulates collecting and parsing network data.
    • The function should use a tool like ping to generate output, and then the jc library to parse it into a structured format.[1, 2]
    • The output should be a Python dictionary or JSON object.
    • Example data to parse: {"packets_transmitted": 3, "packets_received": 3, "packet_loss_percent": 0.0, "round_trip_ms_avg": 25.5}
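Since the task explicitly asks for *simulated* collection, a minimal sketch of the two ingestion functions can return the example data directly as dicts, with comments marking where the real parser calls would go (the library calls named in the comments are the ones this document requires):

```python
def get_system_logs() -> dict:
    """Simulate collecting and parsing a system log entry.

    In the full agent this would read real log lines and parse RFC 5424
    entries (e.g. with syslog_rfc5424_parser.SyslogMessage.parse); here
    the mock entry from the task description is returned directly.
    """
    return {
        "timestamp": "2025-08-15T12:00:00Z",
        "log": "Failed login attempt for user 'root' from 10.0.0.1",
    }


def get_network_metrics() -> dict:
    """Simulate collecting and parsing network metrics.

    In the full agent this would run ping via subprocess and hand its raw
    output to jc (e.g. jc.parse('ping', raw_output)); here the parsed
    result from the task description is mocked.
    """
    return {
        "packets_transmitted": 3,
        "packets_received": 3,
        "packet_loss_percent": 0.0,
        "round_trip_ms_avg": 25.5,
    }
```

Returning plain dicts from both functions means the combined snapshot can later be serialized with `json.dumps()` before being embedded in the LLM prompt.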

3. LLM Interaction Function:

  • Create a function analyze_data_with_llm(data). This function will take the structured data as input and send it to Ollama.
  • Inside this function, use the ollama.generate() method to interact with the LLM.[3, 4]
  • The prompt provided to the LLM is critical for its performance.[5] Construct a comprehensive prompt using the data input.
  • The LLM Prompt Template:
    • Role: You are a dedicated and expert system administrator. Your primary role is to identify anomalies and provide concise, actionable reports. [6, 5]
  • Instruction: Analyze the following system and network data for any activity that appears out of place or deviates from expected behavior. Consider unusual values, errors, or unexpected patterns as anomalies. [6, 7]
    • Context: Here is the system data in JSON format for your analysis: {structured_data_as_string}
    • Output Request: If you find an anomaly, provide a report as a single, coherent, natural language paragraph. The report must clearly state the anomaly, its potential cause, and its severity (e.g., high, medium, low). If no anomaly is found, respond with "OK". [6, 5, 8]
    • Reasoning Hint: Think step by step to come to your conclusion. This is very important. [9]
  • The function should return the LLM's raw response text.
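Assembling the template above, the analysis function might look like the sketch below. The prompt builder is split out so it can be inspected without a running Ollama instance, and the `ollama` import is deferred for the same reason; the model name comes from the environment description:

```python
import json

MODEL_NAME = "llama3.1:8b"

# Prompt template assembled from the role, instruction, context, output
# request, and reasoning hint specified above.
PROMPT_TEMPLATE = """You are a dedicated and expert system administrator. \
Your primary role is to identify anomalies and provide concise, actionable reports.

Analyze the following system and network data for any activity that appears \
out of place. Consider unusual values, errors, or unexpected patterns as anomalies.

Here is the system data in JSON format for your analysis:
{data}

If you find an anomaly, provide a report as a single, coherent, natural \
language paragraph that clearly states the anomaly, its potential cause, and \
its severity (high, medium, or low). If no anomaly is found, respond with "OK".

Think step by step to come to your conclusion. This is very important."""


def build_prompt(data: dict) -> str:
    """Render the monitoring prompt with the structured data embedded as JSON."""
    return PROMPT_TEMPLATE.format(data=json.dumps(data, indent=2))


def analyze_data_with_llm(data: dict) -> str:
    """Send the structured data to the local Ollama model; return raw text."""
    import ollama  # deferred so build_prompt() works without Ollama installed
    response = ollama.generate(model=MODEL_NAME, prompt=build_prompt(data))
    return response["response"]
```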

4. Alerting Functions:

  • Create a function send_discord_alert(message).
    • This function should use the discord-webhook library to send the message to the configured DISCORD_WEBHOOK_URL.[10, 11]
  • Create a function send_google_home_alert(message).
    • This function should use the requests library to make a POST request to the Home Assistant REST API.[12, 13]
    • Use the /api/services/tts/speak endpoint.
    • The JSON payload for the request must contain the entity_id of the TTS engine (e.g., tts.google_en_com), the media_player_entity_id, and the message to be spoken.
    • Add a comment to the code noting that long or complex messages should be simplified for better Text-to-Speech delivery.
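The two alert functions might be sketched as follows. The payload builder is separated out so it can be checked without network access; the endpoint and field names are the ones listed above, the default speaker and TTS engine IDs are the examples from this document, and the imports are deferred so the pure helper stays importable on its own:

```python
def build_tts_payload(message: str,
                      speaker_id: str = "media_player.kitchen_speaker",
                      tts_engine: str = "tts.google_en_com") -> dict:
    """Build the JSON body for Home Assistant's /api/services/tts/speak call."""
    # NOTE: long or complex messages should be simplified before this point;
    # Text-to-Speech delivery degrades on multi-clause technical sentences.
    return {
        "entity_id": tts_engine,
        "media_player_entity_id": speaker_id,
        "message": message,
    }


def send_discord_alert(message: str, webhook_url: str) -> None:
    """Post the alert text to Discord via the configured webhook."""
    from discord_webhook import DiscordWebhook  # deferred for testability
    DiscordWebhook(url=webhook_url, content=message).execute()


def send_google_home_alert(message: str, ha_url: str, ha_token: str) -> None:
    """POST the TTS payload to the Home Assistant REST API."""
    import requests  # deferred for testability
    requests.post(
        f"{ha_url}/api/services/tts/speak",
        headers={
            "Authorization": f"Bearer {ha_token}",
            "Content-Type": "application/json",
        },
        json=build_tts_payload(message),
        timeout=10,
    )
```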

5. Main Script Logic:

  • Implement a main execution loop that runs periodically (e.g., every 5 minutes).
  • The loop should:
    • Call get_system_logs() to get the latest system data.
    • Call get_network_metrics() to get the latest network data.
    • Combine the data and pass it to analyze_data_with_llm().
    • Check the LLM's response. If the response is not "OK", treat it as an anomaly report.
    • If an anomaly is detected, call send_discord_alert() and send_google_home_alert() with the LLM's report.
  • Include a simple time.sleep() within the loop to control the monitoring frequency.
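The loop above can be sketched with the collection, analysis, and alerting steps injected as plain callables, which keeps a single monitoring pass testable without Ollama or network access (`run_cycle` and `main_loop` are illustrative names, not part of the task's required function list):

```python
import time

CHECK_INTERVAL_SECONDS = 300  # run the pipeline every 5 minutes


def run_cycle(collectors, analyze, alerters) -> bool:
    """Run one monitoring pass; return True if an anomaly was reported."""
    # Merge all collected data into a single snapshot for the LLM.
    snapshot = {}
    for collect in collectors:
        snapshot.update(collect())
    report = analyze(snapshot).strip()
    if report == "OK":
        return False
    # Anything other than "OK" is treated as an anomaly report.
    for alert in alerters:
        alert(report)
    return True


def main_loop(collectors, analyze, alerters,
              interval: int = CHECK_INTERVAL_SECONDS) -> None:
    """Periodic monitoring loop; time.sleep() controls the frequency."""
    # In the full script the arguments would be
    # [get_system_logs, get_network_metrics], analyze_data_with_llm, and
    # [send_discord_alert, send_google_home_alert].
    while True:
        run_cycle(collectors, analyze, alerters)
        time.sleep(interval)
```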

Notes:

  • The code should be well-commented to explain each section of the pipeline.