Updated Docs
47
AGENTS.md
Normal file
@@ -0,0 +1,47 @@
# AGENTS.md

This document outlines the autonomous and human agents involved in the LLM-Powered Monitoring Agent project.

## Human Agents

### Inanis

- **Role**: Primary Operator, Project Owner
- **Responsibilities**:
  - Defines project goals and requirements.
  - Provides high-level guidance and approval for major changes.
  - Reviews agent outputs and provides feedback.
  - Manages overall project direction.
- **Contact**: [If Inanis wants to provide contact info, it would go here]

## Autonomous Agents

### Blight (LLM-Powered Monitoring Agent)

- **Role**: Autonomous Monitoring and Anomaly Detection Agent
- **Type**: Large Language Model (LLM) based agent
- **Capabilities**:
  - Collects system and network metrics (logs, temperatures, network performance, Nmap scans).
  - Analyzes collected data against historical baselines.
  - Detects anomalies using an integrated LLM (Llama3.1).
  - Generates actionable reports on detected anomalies.
  - Sends alerts via Discord and Google Home.
  - Provides daily recaps of events.
- **Interaction**:
  - Receives instructions and context from Inanis via CLI.
  - Provides analysis and reports in JSON format.
  - Operates continuously in the background (unless in test mode).
- **Dependencies**:
  - `ollama` (for LLM inference)
  - `nmap`
  - `lm-sensors`
  - Python libraries (as listed in `requirements.txt`)
- **Configuration**: Configured via `config.py`, `CONSTRAINTS.md`, and `known_issues.json`.
- **Status**: Operational and continuously evolving.

## Agent Interactions

- **Inanis -> Blight**: Inanis provides high-level tasks, reviews Blight's output, and refines its behavior through code modifications and configuration updates.
- **Blight -> Inanis**: Blight reports detected anomalies, system status, and daily summaries to Inanis through configured alerting channels (Discord, Google Home) and logs.
- **Blight <-> System**: Blight interacts with the local operating system to collect data (reading logs, running commands like `sensors` and `nmap`).
- **Blight <-> LLM**: Blight sends collected and processed data to the local Ollama LLM for intelligent analysis and receives anomaly reports.
75
PROGRESS.md
@@ -13,52 +13,57 @@

## Phase 2: Data Storage

9. [x] Create `data_storage.py`
10. [x] Implement data storage functions in `data_storage.py`
11. [x] Update `monitor_agent.py` to use data storage
12. [x] Update `SPEC.md` to reflect data storage functionality
9. [x] Implement data storage functions in `data_storage.py`
10. [x] Update `monitor_agent.py` to use data storage
11. [x] Update `SPEC.md` to reflect data storage functionality

## Phase 3: Expanded Monitoring

13. [x] Implement CPU temperature monitoring
14. [x] Implement GPU temperature monitoring
15. [x] Implement system login attempt monitoring
16. [x] Update `monitor_agent.py` to include new metrics
17. [x] Update `SPEC.md` to reflect new metrics
18. [x] Extend `calculate_baselines` to include system temps
12. [x] Implement CPU temperature monitoring
13. [x] Implement GPU temperature monitoring
14. [x] Implement system login attempt monitoring
15. [x] Update `monitor_agent.py` to include new metrics
16. [x] Update `SPEC.md` to reflect new metrics
17. [x] Extend `calculate_baselines` to include system temps

## Phase 4: Troubleshooting

19. [x] Investigated and resolved issue with `jc` library
20. [x] Removed `jc` library as a dependency
21. [x] Implemented manual parsing of `sensors` command output
18. [x] Investigated and resolved issue with `jc` library
19. [x] Removed `jc` library as a dependency
20. [x] Implemented manual parsing of `sensors` command output
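
The manual `sensors` parsing mentioned in Phase 4 could look roughly like the sketch below. It is not taken from `monitor_agent.py`: the function name `parse_sensors_output`, the regular expression, and the sample text are assumptions made for illustration, and real `sensors` output varies by hardware and driver.

```python
import re

# Matches temperature lines such as "Core 0:   +43.0°C  (high = +80.0°C, ...)".
# Only the first reading on the line (the current value) is captured.
_TEMP_LINE = re.compile(r"^(?P<label>[^:]+):\s+\+?(?P<value>-?\d+(?:\.\d+)?)°C")


def parse_sensors_output(text):
    """Return {label: temperature_in_celsius} for each temperature line.

    Hypothetical helper: labels that repeat across chips (for example
    "temp1") overwrite each other in this simplified version.
    """
    readings = {}
    for line in text.splitlines():
        match = _TEMP_LINE.match(line.strip())
        if match:
            readings[match.group("label").strip()] = float(match.group("value"))
    return readings


# Illustrative sample only; not captured from the project's host.
SAMPLE = """\
coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +45.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:        +43.0°C  (high = +80.0°C, crit = +100.0°C)
"""

if __name__ == "__main__":
    print(parse_sensors_output(SAMPLE))
```

In practice the text would come from running the `sensors` command once (for example through `subprocess.run`) and passing its standard output to the parser, which matches the later refactoring goal of calling `sensors` only once per cycle.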

## Tasks Already Done
## Phase 5: Network Scanning (Nmap Integration)

[x] Ensure we aren't using mock data for `get_system_logs()` and `get_network_metrics()`
[x] Improve `get_system_logs()` to read new lines since last check
[x] Improve `get_network_metrics()` by using a library like `pingparsing`
[x] Ensure we are including `CONSTRAINTS.md` in our `analyze_data_with_llm()` function
[x] Summarize the entire report into a single sentence to send to Home Assistant
[x] Figure out why Home Assistant isn't using the speaker
21. [x] Add `python-nmap` to `requirements.txt` and install.
22. [x] Define `NMAP_TARGETS` and `NMAP_SCAN_OPTIONS` in `config.py`.
23. [x] Create a new function `get_nmap_scan_results()` in `monitor_agent.py`:
    * [x] Use `python-nmap` to perform a scan on the defined targets with the specified options.
    * [x] Return the parsed results.
24. [x] Integrate `get_nmap_scan_results()` into the main monitoring loop:
    * [x] Call this function periodically (e.g., less frequently than other metrics).
    * [x] Add the `nmap` results to the `combined_data` dictionary.
25. [x] Update `data_storage.py` to store `nmap` results.
26. [x] Extend `calculate_baselines()` in `data_storage.py` to include `nmap` baselines:
    * [x] Compare current `nmap` results with historical data to identify changes.
27. [x] Modify `analyze_data_with_llm()` prompt to include `nmap` scan results for analysis.
28. [x] Consider how to handle `nmap` permissions.
29. [x] Improve Nmap data logging to include IP addresses, open ports, and service details.
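
Item 26 compares current `nmap` results with historical data. One way to express that comparison is a set difference per host, sketched below. The function name `diff_open_ports` and the `{host_ip: [ports]}` shape are assumptions for illustration; the actual structure stored by `data_storage.py` is not shown in this commit.

```python
def diff_open_ports(previous, current):
    """Compare two {host_ip: iterable_of_open_ports} mappings.

    Returns only the hosts whose set of open ports changed, with the
    newly opened and newly closed ports listed separately.
    """
    changes = {}
    for host in sorted(set(previous) | set(current)):
        before = set(previous.get(host, ()))
        after = set(current.get(host, ()))
        opened = sorted(after - before)
        closed = sorted(before - after)
        if opened or closed:
            changes[host] = {"opened": opened, "closed": closed}
    return changes


if __name__ == "__main__":
    # Hypothetical scan summaries, not real project data.
    baseline = {"192.168.1.10": [22, 80]}
    latest = {"192.168.1.10": [22, 62078], "192.168.1.20": [443]}
    print(diff_open_ports(baseline, latest))
```

A host that appears only in the latest scan shows up with all of its ports under `opened`, which gives the LLM prompt a compact description of what changed instead of two full scan dumps.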

## Keeping track of Current Objectives
## Phase 6: Code Refactoring and Documentation

30. [x] Remove duplicate `pingparsing` import in `monitor_agent.py`.
31. [x] Refactor `get_cpu_temperature` and `get_gpu_temperature` to call `sensors` command only once.
32. [x] Refactor `get_login_attempts` to use a position file for efficient log reading.
33. [x] Simplify JSON parsing in `analyze_data_with_llm`.
34. [x] Move LLM prompt to a separate function `build_llm_prompt`.
35. [x] Refactor main loop into smaller functions (`run_monitoring_cycle`, `main`).
36. [x] Create helper function in `data_storage.py` for calculating average metrics.
37. [x] Update `README.md` with current project status and improvements.
38. [x] Create `AGENTS.md` to document human and autonomous agents.
[x] Improve "high" priority detection by explicitly instructing LLM to output severity in structured JSON format.
[x] Implement dynamic contextual information (Known/Resolved Issues Feed) for LLM to improve severity detection.
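
The position-file approach from item 32 can be sketched as follows. This is a minimal illustration, not the project's `get_login_attempts`: the function name `read_new_lines` and the rotation check are assumptions, and the paths are passed in by the caller.

```python
import os


def read_new_lines(log_path, position_path):
    """Return lines appended to log_path since the offset saved in position_path."""
    try:
        with open(position_path, "r", encoding="utf-8") as fh:
            offset = int(fh.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        # No usable position yet: start from the beginning of the log.
        offset = 0

    # If the log is now shorter than the saved offset (for example after
    # rotation), start again from the top. This is an assumed policy.
    if offset > os.path.getsize(log_path):
        offset = 0

    # Binary mode keeps seek/tell offsets exact byte positions.
    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
        new_offset = log.tell()

    with open(position_path, "w", encoding="utf-8") as fh:
        fh.write(str(new_offset))

    return data.decode("utf-8", errors="replace").splitlines()
```

Each call reads only the bytes written since the previous call, so the cost of a monitoring cycle depends on how much the log grew rather than on its total size.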

## Network Scanning (Nmap Integration)
## TODO

1. [x] Add `python-nmap` to `requirements.txt` and install.
2. [x] Define `NMAP_TARGETS` and `NMAP_SCAN_OPTIONS` in `config.py`.
3. [x] Create a new function `get_nmap_scan_results()` in `monitor_agent.py`:
    * [x] Use `python-nmap` to perform a scan on the defined targets with the specified options.
    * [x] Return the parsed results.
4. [x] Integrate `get_nmap_scan_results()` into the main monitoring loop:
    * [x] Call this function periodically (e.g., less frequently than other metrics).
    * [x] Add the `nmap` results to the `combined_data` dictionary.
5. [x] Update `data_storage.py` to store `nmap` results.
6. [x] Extend `calculate_baselines()` in `data_storage.py` to include `nmap` baselines:
    * [x] Compare current `nmap` results with historical data to identify changes.
7. [x] Modify `analyze_data_with_llm()` prompt to include `nmap` scan results for analysis.
8. [x] Consider how to handle `nmap` permissions.
36
SPEC.md
@@ -14,6 +14,10 @@ The project will be composed of the following files:
- **`README.md`**: A documentation file providing an overview of the project, setup instructions, and usage examples.
- **`.gitignore`**: A file to specify which files and directories should be ignored by Git.
- **`PROGRESS.md`**: A file to track the development progress of the project.
- **`data_storage.py`**: Handles loading, storing, and calculating baselines from historical data.
- **`CONSTRAINTS.md`**: Defines constraints and guidelines for the LLM's analysis.
- **`known_issues.json`**: A JSON file containing a list of known issues to be considered by the LLM.
- **`AGENTS.md`**: Documents the human and autonomous agents involved in the project.

## 3. Functional Requirements

@@ -26,10 +30,12 @@ The project will be composed of the following files:
- `HOME_ASSISTANT_TOKEN`
- `GOOGLE_HOME_SPEAKER_ID`
- `DAILY_RECAP_TIME`
- `NMAP_TARGETS`
- `NMAP_SCAN_OPTIONS`

### 3.2. Data Ingestion and Parsing

- The agent must be able to collect and parse system logs.
- The agent must be able to collect and parse system logs (syslog and auth.log).
- The agent must be able to collect and parse network metrics.
- The parsing of this data should result in a structured format (JSON or Python dictionary).

@@ -38,24 +44,25 @@ The project will be composed of the following files:
- **CPU Temperature**: The agent will monitor the CPU temperature.
- **GPU Temperature**: The agent will monitor the GPU temperature.
- **System Login Attempts**: The agent will monitor system login attempts.
- **Network Scan Results (Nmap)**: The agent will periodically perform Nmap scans to discover hosts and open ports, logging detailed information including IP addresses, host status, and open ports with service details.

### 3.3. LLM Analysis
### 3.4. LLM Analysis

- The agent must use a local LLM (via Ollama) to analyze the collected data.
- The agent must construct a specific prompt to guide the LLM in identifying anomalies.
- The LLM's response will be either "OK" (no anomaly) or a natural language paragraph describing the anomaly, including a severity level (high, medium, low).
- The agent must construct a specific prompt to guide the LLM in identifying anomalies, incorporating historical baselines and known issues.
- The LLM's response will be a structured JSON object with `severity` (high, medium, low, none) and `reason` fields.
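
A structured response of this kind still needs validating before it drives alerting, since a local model can return malformed JSON or an unexpected severity. The sketch below shows one possible check; `parse_llm_response` and `ALLOWED_SEVERITIES` are hypothetical names, and the project's own parsing in `analyze_data_with_llm` may differ.

```python
import json

# Severity values listed in the requirement above.
ALLOWED_SEVERITIES = {"high", "medium", "low", "none"}


def parse_llm_response(raw_text):
    """Return (severity, reason), or raise ValueError if the reply is malformed."""
    try:
        payload = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM response is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise ValueError("LLM response must be a JSON object")

    # Normalise case so "High" and "high" are treated the same.
    severity = str(payload.get("severity", "")).lower()
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"Unexpected severity: {severity!r}")

    reason = str(payload.get("reason", ""))
    return severity, reason
```

Raising on malformed output lets the caller decide whether to retry the request, log the raw text, or skip the cycle, rather than silently treating a bad reply as "no anomaly".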

### 3.4. Alerting
### 3.5. Alerting

- The agent must be able to send alerts to a Discord webhook.
- The agent must be able to trigger a text-to-speech (TTS) alert on a Google Home speaker via Home Assistant.

### 3.5. Alerting Logic
### 3.6. Alerting Logic

- Immediate alerts (Discord and Home Assistant) will only be sent for "high" severity anomalies.
- A daily recap of all anomalies (high, medium, and low) will be sent at a configurable time.
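
The two alerting rules above can be expressed as a small routing function. This is a sketch under assumptions: `route_anomaly`, the `recap_queue` list, and the `send_immediate` callback are illustrative names, and the real agent sends through Discord and Home Assistant rather than a generic callback.

```python
def route_anomaly(report, recap_queue, send_immediate):
    """Apply the alerting rules to one analysis report.

    report          dict with "severity" and "reason" keys
    recap_queue     list collecting anomalies for the daily recap
    send_immediate  callable invoked only for "high" severity reports

    Returns True when an immediate alert was sent.
    """
    severity = report.get("severity", "none")
    if severity == "none":
        # Nothing anomalous: no alert and nothing to recap.
        return False

    # Every anomaly (high, medium, low) is kept for the daily recap.
    recap_queue.append(report)

    if severity == "high":
        send_immediate(report)
        return True
    return False
```

Passing the sender in as a callable keeps the rule itself easy to exercise in test mode without contacting a webhook or speaker.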

### 3.6. Main Loop
### 3.7. Main Loop

- The agent will run in a continuous loop.
- The loop will execute the data collection, analysis, and alerting steps periodically.

@@ -64,26 +71,33 @@ The project will be composed of the following files:
## 4. Data Storage and Baselining

- **4.1. Data Storage**: The agent will store historical monitoring data in a JSON file (`monitoring_data.json`).
- **4.2. Baselining**: The agent will calculate baseline averages for key metrics (e.g., RTT, packet loss) from the stored historical data. This baseline will be used by the LLM to improve anomaly detection accuracy.
- **4.2. Baselining**: The agent will calculate baseline averages for key metrics (e.g., RTT, packet loss, temperatures, open ports) from the stored historical data. This baseline will be used by the LLM to improve anomaly detection accuracy.
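
A baseline average over stored history could be computed along these lines. The helper name `average_metric` and the keys `rtt_avg` and `cpu_temp` are assumptions for illustration; the layout of `monitoring_data.json` is not shown in this commit.

```python
from statistics import mean


def average_metric(history, key):
    """Average `key` over the history entries that hold a numeric value for it.

    Entries missing the key (or holding a non-numeric value) are skipped.
    Returns None when no entry has a usable value.
    """
    values = [
        entry[key]
        for entry in history
        if isinstance(entry.get(key), (int, float))
    ]
    return mean(values) if values else None


if __name__ == "__main__":
    # Hypothetical history records, not real project data.
    history = [{"rtt_avg": 10.0}, {"rtt_avg": 20.0}, {"cpu_temp": 50}]
    print(average_metric(history, "rtt_avg"))
```

Skipping incomplete entries matters here because metrics such as Nmap results are collected less often than the others, so not every stored record carries every key.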

## 5. Technical Requirements

- **Language**: Python 3.8+
- **LLM**: `llama3.1:8b` running on a local Ollama instance.
- **Prerequisites**: `nmap`, `lm-sensors`
- **Libraries**:
  - `ollama`
  - `discord-webhook`
  - `requests`
  - `syslog-rfc5424-parser`
  - `apachelogs`
  - `jc`
  - `pingparsing`
  - `python-nmap`

## 6. Project Structure

```
/
├── .gitignore
├── AGENTS.md
├── config.py
├── CONSTRAINTS.md
├── data_storage.py
├── known_issues.json
├── log_position.txt
├── auth_log_position.txt
├── monitor_agent.py
├── PROMPT.md
├── README.md
@@ -93,4 +107,4 @@ The project will be composed of the following files:
```

## 7. Testing and Debugging
The script has a test mode that runs a single monitoring cycle instead of looping continuously. To enable it, set the `TEST_MODE` variable in `config.py` to `True`. When testing is finished, set the variable back to `False`.

@@ -12,7 +12,15 @@
        "resolution": "In networking, timing values lower than the average are often good things, and do not need to be considered an anomaly"
    },
    {
        "issue": "Port 62078 opens periodically",
        "resolution": "Port 62078 is used by Apple devices for syncing communication between each other. It is normal for it to appear on Apple devices"
        "issue": "Port 62078 is open",
        "resolution": "Port 62078 is used by Apple devices for syncing communication between each other. This is not an anomaly; it is expected and normal behavior used by Apple devices to communicate."
    },
    {
        "issue": "RTT averages are higher than average",
        "resolution": "Fluctuation is normal, and there is no need to report anything within 5s of the average RTT."
    },
    {
        "issue": "Temperatures are higher than average",
        "resolution": "Fluctuation is normal, and there is no need to report anything within 5deg Celsius of the average Temperature."
    }
]
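
For context on how a list like this might reach the model, the sketch below turns `issue`/`resolution` pairs into plain text for a prompt. It is an assumption about usage, not code from `monitor_agent.py`; `format_known_issues` is a hypothetical name, and only the two field names come from the file above.

```python
import json


def format_known_issues(json_text):
    """Render a JSON list of {"issue", "resolution"} objects as prompt text."""
    issues = json.loads(json_text)
    lines = []
    for entry in issues:
        lines.append(
            f"- Issue: {entry['issue']}\n  Resolution: {entry['resolution']}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Shortened, illustrative entry; the real file holds several.
    sample = '[{"issue": "Port 62078 is open", "resolution": "Expected on Apple devices."}]'
    print(format_known_issues(sample))
```

Keeping each resolution next to its issue gives the model an explicit rule to apply when the same condition shows up in the collected data.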