# Project Specification: LLM-Powered Monitoring Agent

## 1. Project Goal

The primary goal of this project is to develop a self-contained Python script, `monitor_agent.py`, that functions as a monitoring agent. This agent will collect system and network data, use a locally hosted Large Language Model (LLM) to analyze the data for anomalies, and send alerts through Discord and Home Assistant if an anomaly is detected.

## 2. Core Components

The project will be composed of the following files:
- **`monitor_agent.py`**: The main Python script containing the core logic for data collection, analysis, and alerting.
- **`config.py`**: A configuration file to store sensitive information and settings, such as API keys and URLs.
- **`requirements.txt`**: A file listing all the necessary Python libraries for the project.
- **`README.md`**: A documentation file providing an overview of the project, setup instructions, and usage examples.
- **`.gitignore`**: A file to specify which files and directories should be ignored by Git.
- **`PROGRESS.md`**: A file to track the development progress of the project.
- **`data_storage.py`**: Handles loading, storing, and calculating baselines from historical data.
- **`CONSTRAINTS.md`**: Defines constraints and guidelines for the LLM's analysis.
- **`known_issues.json`**: A JSON file containing a list of known issues to be considered by the LLM.
- **`AGENTS.md`**: Documents the human and autonomous agents involved in the project.

## 3. Functional Requirements
### 3.1. Configuration

- The agent must load configuration from `config.py`.
- The configuration shall include placeholders for:
  - `DISCORD_WEBHOOK_URL`
  - `HOME_ASSISTANT_URL`
  - `HOME_ASSISTANT_TOKEN`
  - `GOOGLE_HOME_SPEAKER_ID`
  - `DAILY_RECAP_TIME`
  - `NMAP_TARGETS`
  - `NMAP_SCAN_OPTIONS`
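A minimal sketch of what `config.py` might look like; every value below is an illustrative placeholder, not a working credential or a mandated default:

```python
# config.py -- illustrative placeholders only; replace with real values.
DISCORD_WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"
HOME_ASSISTANT_URL = "http://homeassistant.local:8123"
HOME_ASSISTANT_TOKEN = "<long-lived-access-token>"
GOOGLE_HOME_SPEAKER_ID = "media_player.living_room_speaker"
DAILY_RECAP_TIME = "21:00"           # 24-hour HH:MM
NMAP_TARGETS = "192.168.1.0/24"
NMAP_SCAN_OPTIONS = "-T4 -F"         # fast scan of the most common ports
```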
### 3.2. Data Ingestion and Parsing

- The agent must be able to collect and parse system logs (syslog and auth.log).
- The agent must be able to collect and parse network metrics.
- Parsing must produce a structured result (a JSON object or Python dictionary).
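As one possible sketch of the parsing step, a traditional-syslog auth.log line can be reduced to a dictionary with a single regular expression. The line format and field names here are assumptions; RFC 5424 syslog would instead go through the `syslog-rfc5424-parser` library listed in the requirements.

```python
import re
from typing import Optional

# Assumed traditional syslog layout, e.g.:
#   "Jan 12 03:14:07 myhost sshd[812]: Failed password for root from 10.0.0.5 port 55432 ssh2"
AUTH_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<process>[\w./-]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)

def parse_auth_line(line: str) -> Optional[dict]:
    """Parse one auth.log line into a structured dict, or None if it doesn't match."""
    match = AUTH_RE.match(line.strip())
    return match.groupdict() if match else None
```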
### 3.3. Monitored Metrics

- **CPU Temperature**: The agent will monitor the CPU temperature.
- **GPU Temperature**: The agent will monitor the GPU temperature.
- **System Login Attempts**: The agent will monitor system login attempts.
- **Network Scan Results (Nmap)**: The agent will periodically perform Nmap scans to discover hosts and open ports, logging detailed information including IP addresses, host status, and open ports with service details.
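The Nmap requirement could be sketched with `python-nmap` as follows. `summarize_host` is a hypothetical helper introduced here only to show the fields the spec asks for (IP address, host status, open ports with service details):

```python
from typing import Dict, List

def summarize_host(ip: str, state: str, tcp_ports: Dict[int, dict]) -> dict:
    """Reduce one host's python-nmap data to the fields the spec logs."""
    return {
        "ip": ip,
        "status": state,
        "open_ports": [
            {"port": port, "service": info.get("name", "unknown")}
            for port, info in sorted(tcp_ports.items())
            if info.get("state") == "open"
        ],
    }

def run_nmap_scan(targets: str, options: str) -> List[dict]:
    """Scan `targets` and return one summary dict per discovered host."""
    import nmap  # `python-nmap` package; also needs the nmap binary installed
    nm = nmap.PortScanner()
    nm.scan(hosts=targets, arguments=options)
    return [
        summarize_host(
            host,
            nm[host].state(),
            nm[host]["tcp"] if "tcp" in nm[host].all_protocols() else {},
        )
        for host in nm.all_hosts()
    ]
```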
### 3.4. LLM Analysis

- The agent must use a local LLM (via Ollama) to analyze the collected data.
- The agent must construct a specific prompt to guide the LLM in identifying anomalies, incorporating historical baselines and known issues.
- The LLM's response will be a structured JSON object with `severity` (high, medium, low, none) and `reason` fields.
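A sketch of how this analysis step might look with the `ollama` client. The prompt wording and the fallback behaviour of `parse_llm_verdict` are assumptions, not requirements fixed by the spec:

```python
import json

VALID_SEVERITIES = {"high", "medium", "low", "none"}

def parse_llm_verdict(raw: str) -> dict:
    """Validate the LLM reply against the {"severity", "reason"} schema.
    Falls back to severity "none" when the reply is not usable JSON."""
    try:
        verdict = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return {"severity": "none", "reason": "unparseable LLM response"}
    if verdict.get("severity") not in VALID_SEVERITIES:
        verdict["severity"] = "none"
    verdict.setdefault("reason", "")
    return verdict

def analyze(data: dict, baseline: dict, known_issues: list) -> dict:
    """Send collected data to the local Ollama instance and parse the verdict."""
    import ollama  # requires a running local Ollama instance
    prompt = (
        "You are a monitoring agent. Compare the current data against the "
        "baseline, ignoring any known issues, and reply ONLY with JSON of the "
        'form {"severity": "high|medium|low|none", "reason": "..."}.\n'
        f"Current data: {json.dumps(data)}\n"
        f"Baseline: {json.dumps(baseline)}\n"
        f"Known issues: {json.dumps(known_issues)}"
    )
    reply = ollama.chat(model="llama3.1:8b",
                        messages=[{"role": "user", "content": prompt}])
    return parse_llm_verdict(reply["message"]["content"])
```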
### 3.5. Alerting

- The agent must be able to send alerts to a Discord webhook.
- The agent must be able to trigger a text-to-speech (TTS) alert on a Google Home speaker via Home Assistant.
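Both channels could be driven as sketched below. `format_alert` is a hypothetical helper, and the `tts/google_translate_say` service name is an assumption that depends on which TTS integration the Home Assistant instance actually has configured:

```python
def format_alert(verdict: dict) -> str:
    """Turn an LLM verdict into a one-line, human-readable alert."""
    return f"[{verdict['severity'].upper()}] anomaly detected: {verdict['reason']}"

def send_discord_alert(webhook_url: str, message: str) -> None:
    """Post the alert to Discord (requires the `discord-webhook` package)."""
    from discord_webhook import DiscordWebhook
    DiscordWebhook(url=webhook_url, content=message).execute()

def send_tts_alert(ha_url: str, token: str, speaker_id: str, message: str) -> None:
    """Ask Home Assistant to speak the alert on the configured speaker."""
    import requests
    requests.post(
        f"{ha_url}/api/services/tts/google_translate_say",
        headers={"Authorization": f"Bearer {token}"},
        json={"entity_id": speaker_id, "message": message},
        timeout=10,
    )
```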
### 3.6. Alerting Logic

- Immediate alerts (Discord and Home Assistant) will only be sent for "high" severity anomalies.
- A daily recap of all anomalies (high, medium, and low) will be sent at a configurable time.
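This routing rule can be captured in one small function; a sketch assuming the recap is accumulated in an in-memory list:

```python
def route_anomaly(verdict: dict, recap_queue: list) -> bool:
    """Queue every real anomaly for the daily recap; return True only when
    an immediate alert (Discord + Home Assistant) should also fire."""
    severity = verdict.get("severity", "none")
    if severity == "none":
        return False
    recap_queue.append(verdict)
    return severity == "high"
```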
### 3.7. Main Loop

- The agent will run in a continuous loop.
- The loop will execute the data collection, analysis, and alerting steps periodically.
- The frequency of the monitoring loop will be configurable.
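The loop might be structured as below. The single-pass `test_mode` behaviour matches section 7, while the callable parameters and the 300-second default interval are assumptions:

```python
import time

def main_loop(collect, analyze, alert, interval_seconds=300, test_mode=False):
    """Run the collect -> analyze -> alert cycle; a single pass in test mode."""
    while True:
        data = collect()
        verdict = analyze(data)
        if verdict.get("severity") == "high":
            alert(verdict)  # immediate alerts fire for "high" severity only
        if test_mode:
            break
        time.sleep(interval_seconds)
```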
## 4. Data Storage and Baselining

- **4.1. Data Storage**: The agent will store historical monitoring data in a JSON file (`monitoring_data.json`).
- **4.2. Baselining**: The agent will calculate baseline averages for key metrics (e.g., RTT, packet loss, temperatures, open ports) from the stored historical data. This baseline will be used by the LLM to improve anomaly detection accuracy.
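A sketch of the baselining step in `data_storage.py`; the sample-dict metric keys are assumed names, not ones fixed by the spec:

```python
import json
from statistics import mean

def load_history(path="monitoring_data.json"):
    """Load historical samples; an empty history if the file doesn't exist yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return []

def compute_baseline(history, metrics=("rtt_ms", "packet_loss_pct", "cpu_temp_c", "gpu_temp_c")):
    """Average each metric across all stored samples that report it."""
    baseline = {}
    for metric in metrics:
        values = [sample[metric] for sample in history if metric in sample]
        if values:
            baseline[metric] = round(mean(values), 2)
    return baseline
```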
## 5. Technical Requirements

- **Language**: Python 3.8+
- **LLM**: `llama3.1:8b` running on a local Ollama instance.
- **Prerequisites**: `nmap`, `lm-sensors`
- **Libraries**:
  - `ollama`
  - `discord-webhook`
  - `requests`
  - `syslog-rfc5424-parser`
  - `pingparsing`
  - `python-nmap`

## 6. Project Structure
```
/
├── .gitignore
├── AGENTS.md
├── config.py
├── CONSTRAINTS.md
├── data_storage.py
├── known_issues.json
├── log_position.txt
├── auth_log_position.txt
├── monitor_agent.py
├── PROMPT.md
├── README.md
├── requirements.txt
├── PROGRESS.md
└── SPEC.md
```
## 7. Testing and Debugging

The script includes a test mode that runs the monitoring cycle once instead of continuously. To enable it, set the `TEST_MODE` variable in `config.py` to `True`; when finished testing, set it back to `False`.
## 8. Future Enhancements

### 8.1. Process Monitoring

**Description:** The agent will be able to monitor a list of critical processes to ensure they are running. If a process is not running, an anomaly will be generated.
**Implementation Plan:**

1. **Configuration:** Add a new list variable to `config.py` named `PROCESSES_TO_MONITOR` which will contain the names of the processes to be monitored.
2. **Data Ingestion:** Create a new function in `monitor_agent.py` called `get_running_processes()` that uses the `psutil` library to get a list of all running processes.
3. **Data Analysis:** In `analyze_data_locally()`, compare the list of running processes with the `PROCESSES_TO_MONITOR` list from the configuration. If a configured process is not found among the running processes, generate a "high" severity anomaly.
4. **LLM Integration:** The existing `generate_llm_report()` function will be used to generate a report for the new anomaly type.
5. **Alerting:** The existing alerting system will be used to send alerts for the new anomaly type.
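The steps above could be sketched as follows; `find_missing_processes` is a hypothetical name for the check that step 3 places inside `analyze_data_locally()`:

```python
def get_running_processes():
    """Return the set of names of all currently running processes."""
    import psutil  # requires the `psutil` package
    return {p.info["name"] for p in psutil.process_iter(["name"])}

def find_missing_processes(processes_to_monitor, running):
    """Any configured process not currently running becomes a high-severity anomaly."""
    return [
        {"severity": "high", "reason": f"critical process not running: {name}"}
        for name in processes_to_monitor
        if name not in running
    ]
```

Note that `psutil` would also need to be added to `requirements.txt`, which does not list it today.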
### 8.2. Docker Container Monitoring

**Description:** The agent will be able to monitor a list of critical Docker containers to ensure they are running and healthy. If a container is not running or is in an unhealthy state, an anomaly will be generated.

**Implementation Plan:**
1. **Configuration:** Add a new list variable to `config.py` named `DOCKER_CONTAINERS_TO_MONITOR` which will contain the names of the Docker containers to be monitored.
2. **Data Ingestion:** Create a new function in `monitor_agent.py` called `get_docker_container_status()` that uses the `docker` Python library to get the status of all containers, including stopped ones.
3. **Data Analysis:** In `analyze_data_locally()`, iterate through the `DOCKER_CONTAINERS_TO_MONITOR` list and check each container's status. If a monitored container is missing or its status is not "running", generate a "high" severity anomaly.
4. **LLM Integration:** The existing `generate_llm_report()` function will be used to generate a report for the new anomaly type.
5. **Alerting:** The existing alerting system will be used to send alerts for the new anomaly type.
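A matching sketch for the Docker plan; `find_unhealthy_containers` is a hypothetical helper name, and the status strings follow the Docker SDK for Python (`running`, `exited`, and so on):

```python
def get_docker_container_status():
    """Map container name -> status ("running", "exited", ...) for all containers."""
    import docker  # `docker` package; needs access to the Docker socket
    client = docker.from_env()
    return {c.name: c.status for c in client.containers.list(all=True)}

def find_unhealthy_containers(containers_to_monitor, statuses):
    """A monitored container that is absent or not in the "running" state
    becomes a high-severity anomaly."""
    return [
        {"severity": "high",
         "reason": f"container {name!r} is {statuses.get(name, 'missing')}"}
        for name in containers_to_monitor
        if statuses.get(name) != "running"
    ]
```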