Compare commits

...

4 Commits

Author SHA1 Message Date
c5a446ea65 Updated Docs 2025-08-20 15:38:22 -05:00
b8b91880d6 Added log to .gitignore 2025-08-20 15:23:44 -05:00
e7730ebde5 Removed Log from Git 2025-08-20 15:23:28 -05:00
63ee043f34 Completed NMAP & Refactoring 2025-08-20 15:16:21 -05:00
10 changed files with 388 additions and 1457 deletions

2
.gitignore vendored
View File

@@ -2,3 +2,5 @@ __pycache__/*
__pycache__/
monitoring_data.json
log_position.txt
auth_log_position.txt
monitor_agent.log

47
AGENTS.md Normal file
View File

@@ -0,0 +1,47 @@
# AGENTS.md
This document outlines the autonomous and human agents involved in the LLM-Powered Monitoring Agent project.
## Human Agents
### Inanis
- **Role**: Primary Operator, Project Owner
- **Responsibilities**:
- Defines project goals and requirements.
- Provides high-level guidance and approval for major changes.
- Reviews agent outputs and provides feedback.
- Manages overall project direction.
- **Contact**: [If Inanis wants to provide contact info, it would go here]
## Autonomous Agents
### Blight (LLM-Powered Monitoring Agent)
- **Role**: Autonomous Monitoring and Anomaly Detection Agent
- **Type**: Large Language Model (LLM) based agent
- **Capabilities**:
- Collects system and network metrics (logs, temperatures, network performance, Nmap scans).
- Analyzes collected data against historical baselines.
- Detects anomalies using an integrated LLM (Llama3.1).
- Generates actionable reports on detected anomalies.
- Sends alerts via Discord and Google Home.
- Provides daily recaps of events.
- **Interaction**:
- Receives instructions and context from Inanis via CLI.
- Provides analysis and reports in JSON format.
- Operates continuously in the background (unless in test mode).
- **Dependencies**:
- `ollama` (for LLM inference)
- `nmap`
- `lm-sensors`
- Python libraries (as listed in `requirements.txt`)
- **Configuration**: Configured via `config.py`, `CONSTRAINTS.md`, and `known_issues.json`.
- **Status**: Operational and continuously evolving.
## Agent Interactions
- **Inanis -> Blight**: Inanis provides high-level tasks, reviews Blight's output, and refines its behavior through code modifications and configuration updates.
- **Blight -> Inanis**: Blight reports detected anomalies, system status, and daily summaries to Inanis through configured alerting channels (Discord, Google Home) and logs.
- **Blight <-> System**: Blight interacts with the local operating system to collect data (reading logs, running commands like `sensors` and `nmap`).
- **Blight <-> LLM**: Blight sends collected and processed data to the local Ollama LLM for intelligent analysis and receives anomaly reports.
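The exchange in the last point can be pictured as a small sketch, assuming the `ollama` Python client and the `llama3.1:8b` model used elsewhere in this project; the real prompt construction lives in `build_llm_prompt()` in `monitor_agent.py`.
```python
# Minimal sketch of the Blight -> LLM hand-off (not the full implementation).
import json
import ollama

def ask_llm(prompt: str) -> dict:
    """Send a prepared prompt to the local Ollama model and parse its JSON reply."""
    response = ollama.generate(model="llama3.1:8b", prompt=prompt)
    # The agent expects a single JSON object with "severity" and "reason" keys.
    return json.loads(response["response"])
```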

View File

@@ -13,52 +13,57 @@
## Phase 2: Data Storage
9. [x] Create `data_storage.py`
10. [x] Implement data storage functions in `data_storage.py`
11. [x] Update `monitor_agent.py` to use data storage
12. [x] Update `SPEC.md` to reflect data storage functionality
9. [x] Implement data storage functions in `data_storage.py`
10. [x] Update `monitor_agent.py` to use data storage
11. [x] Update `SPEC.md` to reflect data storage functionality
## Phase 3: Expanded Monitoring
13. [x] Implement CPU temperature monitoring
14. [x] Implement GPU temperature monitoring
15. [x] Implement system login attempt monitoring
16. [x] Update `monitor_agent.py` to include new metrics
17. [x] Update `SPEC.md` to reflect new metrics
18. [x] Extend `calculate_baselines` to include system temps
12. [x] Implement CPU temperature monitoring
13. [x] Implement GPU temperature monitoring
14. [x] Implement system login attempt monitoring
15. [x] Update `monitor_agent.py` to include new metrics
16. [x] Update `SPEC.md` to reflect new metrics
17. [x] Extend `calculate_baselines` to include system temps
## Phase 4: Troubleshooting
19. [x] Investigated and resolved issue with `jc` library
20. [x] Removed `jc` library as a dependency
21. [x] Implemented manual parsing of `sensors` command output
18. [x] Investigated and resolved issue with `jc` library
19. [x] Removed `jc` library as a dependency
20. [x] Implemented manual parsing of `sensors` command output
## Tasks Already Done
## Phase 5: Network Scanning (Nmap Integration)
[x] Ensure we aren't using mock data for get_system_logs() and get_network_metrics()
[x] Improve `get_system_logs()` to read new lines since last check
[x] Improve `get_network_metrics()` by using a library like `pingparsing`
[x] Ensure we are including CONSTRAINTS.md in our analyze_data_with_llm() function
[x] Summarize entire report into a single sentence to be said to Home Assistant
[x] Figure out why Home Assistant isn't using the speaker
21. [x] Add `python-nmap` to `requirements.txt` and install.
22. [x] Define `NMAP_TARGETS` and `NMAP_SCAN_OPTIONS` in `config.py`.
23. [x] Create a new function `get_nmap_scan_results()` in `monitor_agent.py`:
* [x] Use `python-nmap` to perform a scan on the defined targets with the specified options.
* [x] Return the parsed results.
24. [x] Integrate `get_nmap_scan_results()` into the main monitoring loop:
* [x] Call this function periodically (e.g., less frequently than other metrics).
* [x] Add the `nmap` results to the `combined_data` dictionary.
25. [x] Update `data_storage.py` to store `nmap` results.
26. [x] Extend `calculate_baselines()` in `data_storage.py` to include `nmap` baselines:
* [x] Compare current `nmap` results with historical data to identify changes.
27. [x] Modify `analyze_data_with_llm()` prompt to include `nmap` scan results for analysis.
28. [x] Consider how to handle `nmap` permissions.
29. [x] Improve Nmap data logging to include IP addresses, open ports, and service details.
## Keeping track of Current Objectives
## Phase 6: Code Refactoring and Documentation
30. [x] Remove duplicate `pingparsing` import in `monitor_agent.py`.
31. [x] Refactor `get_cpu_temperature` and `get_gpu_temperature` to call `sensors` command only once.
32. [x] Refactor `get_login_attempts` to use a position file for efficient log reading.
33. [x] Simplify JSON parsing in `analyze_data_with_llm`.
34. [x] Move LLM prompt to a separate function `build_llm_prompt`.
35. [x] Refactor main loop into smaller functions (`run_monitoring_cycle`, `main`).
36. [x] Create helper function in `data_storage.py` for calculating average metrics.
37. [x] Update `README.md` with current project status and improvements.
38. [x] Create `AGENTS.md` to document human and autonomous agents.
[x] Improve "high" priority detection by explicitly instructing LLM to output severity in structured JSON format.
[x] Implement dynamic contextual information (Known/Resolved Issues Feed) for LLM to improve severity detection.
## Network Scanning (Nmap Integration)
## TODO
1. [x] Add `python-nmap` to `requirements.txt` and install.
2. [x] Define `NMAP_TARGETS` and `NMAP_SCAN_OPTIONS` in `config.py`.
3. [x] Create a new function `get_nmap_scan_results()` in `monitor_agent.py`:
* [x] Use `python-nmap` to perform a scan on the defined targets with the specified options.
* [x] Return the parsed results.
4. [x] Integrate `get_nmap_scan_results()` into the main monitoring loop:
* [x] Call this function periodically (e.g., less frequently than other metrics).
* [x] Add the `nmap` results to the `combined_data` dictionary.
5. [x] Update `data_storage.py` to store `nmap` results.
6. [x] Extend `calculate_baselines()` in `data_storage.py` to include `nmap` baselines:
* [x] Compare current `nmap` results with historical data to identify changes.
7. [x] Modify `analyze_data_with_llm()` prompt to include `nmap` scan results for analysis.
8. [x] Consider how to handle `nmap` permissions.

147
README.md
View File

@@ -1,104 +1,93 @@
# LLM-Powered Monitoring Agent
This project is a self-hosted monitoring agent that uses a local Large Language Model (LLM) to detect anomalies in system and network data. It's designed to be a simple, self-contained Python script that can be easily deployed on a server.
This project implements an LLM-powered monitoring agent designed to continuously collect system and network data, analyze it against historical baselines, and alert on anomalies. The agent leverages a local Large Language Model (LLM) for intelligent anomaly detection and integrates with Discord and Google Home for notifications.
## 1. Installation
## Features
To get started, you'll need to have Python 3.8 or newer installed. Then, follow these steps:
- **System Log Monitoring**: Tracks new entries in `/var/log/syslog` and `/var/log/auth.log` (for login attempts).
- **Network Metrics**: Gathers network performance data by pinging a public IP (e.g., 8.8.8.8).
- **Hardware Monitoring**: Collects CPU and GPU temperature data.
- **Nmap Scanning**: Periodically performs network scans to discover hosts and open ports.
- **Historical Baseline Analysis**: Compares current data against a 24-hour rolling baseline to identify deviations (see the sketch after this list).
- **LLM-Powered Anomaly Detection**: Utilizes a local LLM (Ollama with Llama3.1) to analyze combined system data, baselines, and Nmap changes for anomalies.
- **Alerting**: Sends high-severity anomaly alerts to Discord and Google Home speakers (via Home Assistant).
- **Daily Recap**: Provides a daily summary of detected events.
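The 24-hour rolling baseline mentioned above amounts to a time filter over the stored records before averages are taken; a minimal sketch, assuming each record carries the ISO-8601 `timestamp` written by `data_storage.store_data()` (the actual filtering in `calculate_baselines()` may differ):
```python
# Hypothetical sketch of a 24-hour rolling window over stored records.
from datetime import datetime, timedelta, timezone

def last_24_hours(records: list) -> list:
    """Keep only records whose timestamp falls within the last 24 hours."""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=24)
    return [r for r in records if datetime.fromisoformat(r["timestamp"]) >= cutoff]
```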
1. **Clone the repository or download the files:**
## Recent Improvements
```bash
git clone <repository_url>
cd <repository_directory>
```
- **Enhanced Nmap Data Logging**: The Nmap scan results are now processed and stored in a more structured format, including:
- Discovered IP addresses.
- Status of each host.
- Detailed list of open ports for each host, including service, product, and version information.
This significantly improves the clarity and utility of Nmap data for anomaly detection.
- **Code Refactoring (`monitor_agent.py`)**:
- **Optimized Sensor Data Collection**: CPU and GPU temperature data are now collected with a single call to the `sensors` command, improving efficiency.
- **Efficient Login Attempt Logging**: The agent now tracks its position in `/var/log/auth.log`, preventing redundant reads of the entire file and improving performance for large log files.
- **Modular Main Loop**: The core monitoring logic has been broken down into smaller, more manageable functions, enhancing readability and maintainability.
- **Separated LLM Prompt Building**: The complex LLM prompt construction logic has been moved into a dedicated function, making `analyze_data_with_llm` more focused.
- **Code Refactoring (`data_storage.py`)**:
- **Streamlined Baseline Calculations**: Helper functions have been introduced to reduce code duplication and improve clarity in the calculation of average metrics for baselines.
2. **Create and activate a Python virtual environment:**
```bash
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
```
3. **Install the required Python libraries:**
```bash
pip install -r requirements.txt
```
## 2. Setup
Before running the agent, you need to configure it and ensure the necessary services are running.
## Setup and Installation
### Prerequisites
- **Ollama:** The agent requires that [Ollama](https://ollama.com/) is installed and running on the server.
- **LLM Model:** You must have the `llama3.1:8b` model pulled and available in Ollama. You can pull it with the following command:
- Python 3.x
- `ollama` installed and running with the `llama3.1:8b` model pulled (`ollama pull llama3.1:8b`)
- `nmap` installed
- `lm-sensors` installed (for CPU/GPU temperature monitoring)
- Discord webhook URL
- (Optional) Home Assistant instance with a long-lived access token and a Google Home speaker configured.
### Installation
1. Clone the repository:
```bash
ollama pull llama3.1:8b
git clone <repository_url>
cd LLM-Powered-Monitoring-Agent
```
2. Install Python dependencies:
```bash
pip install -r requirements.txt
```
3. Configure the agent:
- Open `config.py` and update the following variables:
- `DISCORD_WEBHOOK_URL`
- `HOME_ASSISTANT_URL` (if using Google Home alerts)
- `HOME_ASSISTANT_TOKEN` (if using Google Home alerts)
- `GOOGLE_HOME_SPEAKER_ID` (if using Google Home alerts)
- `NMAP_TARGETS` (e.g., "192.168.1.0/24" or "192.168.1.100")
- `NMAP_SCAN_OPTIONS` (default is "-sS -T4")
- `DAILY_RECAP_TIME` (e.g., "20:00" for 8 PM)
- `TEST_MODE` (set to `True` for a single run, `False` for continuous operation)
### Configuration
## Usage
All configuration is done in the `config.py` file. You will need to replace the placeholder values with your actual credentials and URLs.
- `DISCORD_WEBHOOK_URL`: Your Discord channel's webhook URL. This is used to send alerts.
- `HOME_ASSISTANT_URL`: The URL of your Home Assistant instance (e.g., `http://192.168.1.50:8123`).
- `HOME_ASSISTANT_TOKEN`: A Long-Lived Access Token for your Home Assistant instance. You can generate this in your Home Assistant profile settings.
- `GOOGLE_HOME_SPEAKER_ID`: The `media_player` entity ID for your Google Home speaker in Home Assistant (e.g., `media_player.kitchen_speaker`).
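For reference, a filled-in `config.py` might look like the sketch below; every value is a placeholder, and the authoritative list of settings is the file itself.
```python
# Illustrative placeholder values for config.py -- replace with your own.
DISCORD_WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"
HOME_ASSISTANT_URL = "http://192.168.1.50:8123"
HOME_ASSISTANT_TOKEN = "<long-lived-access-token>"
GOOGLE_HOME_SPEAKER_ID = "media_player.kitchen_speaker"
NMAP_TARGETS = "192.168.1.0/24"
NMAP_SCAN_OPTIONS = "-sS -T4"
DAILY_RECAP_TIME = "20:00"
TEST_MODE = False  # True runs a single cycle and exits
```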
## 3. Usage
Once the installation and setup are complete, you can run the monitoring agent with the following command:
To run the monitoring agent:
```bash
python monitor_agent.py
```
The script will start a continuous monitoring loop. Every 5 minutes, it will:
### Test Mode
1. Collect simulated system and network data.
2. Send the data to the local LLM for analysis.
3. If the LLM detects a **high-severity** anomaly, it will send an alert to your configured Discord channel and broadcast a message to your Google Home speaker via Home Assistant.
4. At the time specified in `DAILY_RECAP_TIME`, a summary of all anomalies for the day will be sent to the Discord channel.
Set `TEST_MODE = True` in `config.py` to run the agent once and exit. This is useful for testing configurations and initial setup.
The script will print its status and any detected anomalies to the console.
## Extending and Customizing
### Nmap Scans
- **Adding New Metrics**: You can add new data collection functions in `monitor_agent.py` and include their results in the `combined_data` dictionary (see the sketch after this list).
- **Customizing LLM Analysis**: Modify the `CONSTRAINTS.md` file to provide specific instructions or constraints to the LLM for anomaly detection.
- **Known Issues**: Update `known_issues.json` with any known or expected system behaviors to prevent the LLM from flagging them as anomalies.
- **Alerting Mechanisms**: Implement additional alerting functions (e.g., email, SMS) in `monitor_agent.py` and integrate them into the anomaly detection logic.
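As a concrete illustration of the first point, a hypothetical disk-usage collector could sit alongside the existing `get_*` functions; neither `get_disk_usage()` nor the `disk_usage` key exist in the current codebase.
```python
# Hypothetical new metric: disk usage of the root filesystem.
import shutil

def get_disk_usage():
    """Returns root filesystem usage as a percentage."""
    total, used, _free = shutil.disk_usage("/")
    return {"disk_usage_percent": round(used / total * 100, 1)}

# Inside the monitoring cycle it would be merged like any other metric:
# combined_data["disk_usage"] = get_disk_usage()
```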
The agent uses `nmap` to scan the network for open ports. By default, it uses a TCP SYN scan (`-sS`), which requires root privileges. If the script is not run as root, it will fall back to a TCP connect scan (`-sT`), which does not require root privileges but is slower and more likely to be detected.
## Project Structure
To run the agent with root privileges, use the `sudo` command:
```bash
sudo python monitor_agent.py
```
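A minimal sketch of the privilege check behind this fallback is shown below; the exact condition used in `get_nmap_scan_results()` may differ.
```python
# Sketch of the SYN-scan fallback: -sS needs root, so downgrade to -sT otherwise.
import os
import config

def choose_scan_options() -> str:
    scan_options = config.NMAP_SCAN_OPTIONS  # e.g. "-sS -T4"
    if os.geteuid() != 0:
        # Without root privileges, fall back to a TCP connect scan.
        scan_options = scan_options.replace("-sS", "-sT")
    return scan_options
```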
## 4. Features
### Priority System
The monitoring agent uses a priority system to classify anomalies. The LLM is instructed to return a severity level for each anomaly it detects. The possible severity levels are:
- **high**: Indicates a critical issue that requires immediate attention. An alert is sent to Discord and Google Home.
- **medium**: Indicates a non-critical issue that should be investigated. No alert is sent.
- **low**: Indicates a minor issue or a potential false positive. No alert is sent.
- **none**: Indicates that no anomaly was detected.
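For example, a high-severity finding comes back from the LLM as a single JSON object like the following (values are illustrative):
```json
{
    "severity": "high",
    "reason": "Host 192.168.2.45 has a newly opened port 23 (telnet) that is not present in the baseline."
}
```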
### Known Issues Feed
The agent uses a `known_issues.json` file to provide the LLM with a list of known issues and their resolutions. This helps the LLM to avoid flagging resolved or expected issues as anomalies.
You can add new issues to the `known_issues.json` file by following the existing format. Each issue should have an "issue" and a "resolution" key. For example:
```json
[
{
"issue": "CPU temperature spikes to 80C under heavy load",
"resolution": "This is normal behavior for this CPU model and is not a cause for concern."
}
]
```
**Note on Mock Data:** The current version of the script uses mock data for system logs and network metrics. To use this in a real-world scenario, you would need to replace the mock data with actual data from your systems.
- `monitor_agent.py`: Main script for data collection, LLM interaction, and alerting.
- `data_storage.py`: Handles loading, storing, and calculating baselines from historical data.
- `config.py`: Stores configurable parameters for the agent.
- `requirements.txt`: Lists Python dependencies.
- `CONSTRAINTS.md`: Defines constraints and guidelines for the LLM's analysis.
- `known_issues.json`: A JSON file containing a list of known issues to be considered by the LLM.
- `monitoring_data.json`: (Generated) Stores historical monitoring data.
- `log_position.txt`: (Generated) Stores the last read position for `/var/log/syslog`.
- `auth_log_position.txt`: (Generated) Stores the last read position for `/var/log/auth.log`.

37
SPEC.md
View File

@@ -14,6 +14,10 @@ The project will be composed of the following files:
- **`README.md`**: A documentation file providing an overview of the project, setup instructions, and usage examples.
- **`.gitignore`**: A file to specify which files and directories should be ignored by Git.
- **`PROGRESS.md`**: A file to track the development progress of the project.
- **`data_storage.py`**: Handles loading, storing, and calculating baselines from historical data.
- **`CONSTRAINTS.md`**: Defines constraints and guidelines for the LLM's analysis.
- **`known_issues.json`**: A JSON file containing a list of known issues to be considered by the LLM.
- **`AGENTS.md`**: Documents the human and autonomous agents involved in the project.
## 3. Functional Requirements
@@ -26,10 +30,12 @@ The project will be composed of the following files:
- `HOME_ASSISTANT_TOKEN`
- `GOOGLE_HOME_SPEAKER_ID`
- `DAILY_RECAP_TIME`
- `NMAP_TARGETS`
- `NMAP_SCAN_OPTIONS`
### 3.2. Data Ingestion and Parsing
- The agent must be able to collect and parse system logs.
- The agent must be able to collect and parse system logs (syslog and auth.log).
- The agent must be able to collect and parse network metrics.
- The parsing of this data should result in a structured format (JSON or Python dictionary).
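For reference, the structured record assembled each cycle in `monitor_agent.py` has roughly the shape sketched below; all values are illustrative.
```python
# Shape of the per-cycle record (keys match monitor_agent.py; values illustrative).
combined_data = {
    "timestamp": "2025-08-20T20:38:22+00:00",
    "system_logs": {"syslog": []},
    "network_metrics": {"rtt_avg": 14.2, "packet_loss_rate": 0.0},
    "cpu_temperature": {"cpu_temperature": 45.0},
    "gpu_temperature": {"gpu_temperature": 52.0},
    "login_attempts": {"failed_logins": []},
    "nmap_results": {"hosts": []},  # only present on cycles that run the Nmap scan
}
```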
@@ -38,24 +44,25 @@ The project will be composed of the following files:
- **CPU Temperature**: The agent will monitor the CPU temperature.
- **GPU Temperature**: The agent will monitor the GPU temperature.
- **System Login Attempts**: The agent will monitor system login attempts.
- **Network Scan Results (Nmap)**: The agent will periodically perform Nmap scans to discover hosts and open ports, logging detailed information including IP addresses, host status, and open ports with service details.
### 3.3. LLM Analysis
### 3.4. LLM Analysis
- The agent must use a local LLM (via Ollama) to analyze the collected data.
- The agent must construct a specific prompt to guide the LLM in identifying anomalies.
- The LLM's response will be either "OK" (no anomaly) or a natural language paragraph describing the anomaly, including a severity level (high, medium, low).
- The agent must construct a specific prompt to guide the LLM in identifying anomalies, incorporating historical baselines and known issues.
- The LLM's response will be a structured JSON object with `severity` (high, medium, low, none) and `reason` fields.
### 3.4. Alerting
### 3.5. Alerting
- The agent must be able to send alerts to a Discord webhook.
- The agent must be able to trigger a text-to-speech (TTS) alert on a Google Home speaker via Home Assistant.
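A minimal sketch of the Discord half of this requirement, using the `discord-webhook` library from the requirements list; the actual `send_discord_alert()` in `monitor_agent.py` may differ.
```python
# Sketch of a Discord alert via the discord-webhook package.
from discord_webhook import DiscordWebhook
import config

def send_discord_alert(message: str) -> None:
    """Post an alert message to the configured Discord channel."""
    webhook = DiscordWebhook(url=config.DISCORD_WEBHOOK_URL, content=message)
    webhook.execute()
```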
### 3.5. Alerting Logic
### 3.6. Alerting Logic
- Immediate alerts (Discord and Home Assistant) will only be sent for "high" severity anomalies.
- A daily recap of all anomalies (high, medium, and low) will be sent at a configurable time.
### 3.6. Main Loop
### 3.7. Main Loop
- The agent will run in a continuous loop.
- The loop will execute the data collection, analysis, and alerting steps periodically.
@@ -64,26 +71,33 @@ The project will be composed of the following files:
## 4. Data Storage and Baselining
- **4.1. Data Storage**: The agent will store historical monitoring data in a JSON file (`monitoring_data.json`).
- **4.2. Baselining**: The agent will calculate baseline averages for key metrics (e.g., RTT, packet loss) from the stored historical data. This baseline will be used by the LLM to improve anomaly detection accuracy.
- **4.2. Baselining**: The agent will calculate baseline averages for key metrics (e.g., RTT, packet loss, temperatures, open ports) from the stored historical data. This baseline will be used by the LLM to improve anomaly detection accuracy.
## 5. Technical Requirements
- **Language**: Python 3.8+
- **LLM**: `llama3.1:8b` running on a local Ollama instance.
- **Prerequisites**: `nmap`, `lm-sensors`
- **Libraries**:
- `ollama`
- `discord-webhook`
- `requests`
- `syslog-rfc5424-parser`
- `apachelogs`
- `jc`
- `pingparsing`
- `python-nmap`
## 6. Project Structure
```
/
├── .gitignore
├── AGENTS.md
├── config.py
├── CONSTRAINTS.md
├── data_storage.py
├── known_issues.json
├── log_position.txt
├── auth_log_position.txt
├── monitor_agent.py
├── PROMPT.md
├── README.md
@@ -91,3 +105,6 @@ The project will be composed of the following files:
├── PROGRESS.md
└── SPEC.md
```
## 7. Testing and Debugging
The script is equipped with a test mode that runs the script once instead of continuously. To enable it, set the `TEST_MODE` variable in `config.py` to `True`. Once finished testing, change the variable back to `False`.

View File

@@ -12,7 +12,7 @@ GOOGLE_HOME_SPEAKER_ID = "media_player.spencer_room_speaker"
DAILY_RECAP_TIME = "20:00"
# Nmap Configuration
NMAP_TARGETS = "192.168.1.0/24"
NMAP_TARGETS = "192.168.2.0/24"
NMAP_SCAN_OPTIONS = "-sS -T4"
# Test Mode (True to run once and exit, False to run continuously)

View File

@@ -16,6 +16,11 @@ def store_data(new_data):
with open(DATA_FILE, 'w') as f:
json.dump(data, f, indent=4)
def _calculate_average(data, key1, key2):
"""Helper function to calculate the average of a nested key in a list of dicts."""
values = [d[key1][key2] for d in data if key1 in d and key2 in d[key1] and d[key1][key2] != "N/A"]
return sum(values) / len(values) if values else 0
def calculate_baselines():
data = load_data()
if not data:
@@ -29,23 +34,23 @@ def calculate_baselines():
return {}
baseline_metrics = {
'avg_rtt': sum(d['network_metrics']['rtt_avg'] for d in recent_data if 'rtt_avg' in d['network_metrics']) / len(recent_data),
'packet_loss': sum(d['network_metrics']['packet_loss_rate'] for d in recent_data if 'packet_loss_rate' in d['network_metrics']) / len(recent_data),
'avg_cpu_temp': sum(d['cpu_temperature']['cpu_temperature'] for d in recent_data if d['cpu_temperature']['cpu_temperature'] != "N/A") / len(recent_data),
'avg_gpu_temp': sum(d['gpu_temperature']['gpu_temperature'] for d in recent_data if d['gpu_temperature']['gpu_temperature'] != "N/A") / len(recent_data),
'avg_rtt': _calculate_average(recent_data, 'network_metrics', 'rtt_avg'),
'packet_loss': _calculate_average(recent_data, 'network_metrics', 'packet_loss_rate'),
'avg_cpu_temp': _calculate_average(recent_data, 'cpu_temperature', 'cpu_temperature'),
'avg_gpu_temp': _calculate_average(recent_data, 'gpu_temperature', 'gpu_temperature'),
}
# Baseline for open ports from nmap scans
host_ports = {}
for d in recent_data:
if 'nmap_results' in d and 'scan' in d['nmap_results']:
for host, scan_data in d['nmap_results']['scan'].items():
if host not in host_ports:
host_ports[host] = set()
if 'tcp' in scan_data:
for port, port_data in scan_data['tcp'].items():
if port_data['state'] == 'open':
host_ports[host].add(port)
if 'nmap_results' in d and 'hosts' in d.get('nmap_results', {}):
for host_info in d['nmap_results']['hosts']:
host_ip = host_info['ip']
if host_ip not in host_ports:
host_ports[host_ip] = set()
for port_info in host_info.get('open_ports', []):
host_ports[host_ip].add(port_info['port'])
# Convert sets to sorted lists for JSON serialization
for host, ports in host_ports.items():

View File

@@ -1,10 +1,26 @@
[
{
"issue": "CPU temperature spikes to 90C under heavy load",
"resolution": "This is normal behavior for this CPU model and is not a cause for concern."
"issue": "CPU temperatures less then the average",
"resolution": "This is normal behavior for CPU's when not in use. Lower Temps are usually a good thing"
},
{
"issue": "Access attempts from unknown IP Addresses",
"resolution": "ufw has been enabled, and blocks all default connections by default. The only IP Addresses allowed are 192.168.2.0/24 and 100.64.0.0/10"
},
{
"issue": "Network Timing values are lower then average",
"resolution": "In networking, timing values lower then the average are often good things, and do not need to be considered an anomaly"
},
{
"issue": "Port 62078 is open",
"resolution": "Port 62078 is used in apple devices for syncing communcation between each other. This is not an amomaly, this is expected and normal behavior used by Apple Devices to communicate."
},
{
"issue": "RTT averages are higher then average",
"resolution": "Fluctuation is normal, and there is no need to report anything within 5s of the average RTT."
},
{
"issue": "Temperatures are higher then average",
"resolution": "Fluctuation is normal, and there is no need to report anything within 5deg Celcius of the average Temperature."
}
]

File diff suppressed because it is too large

View File

@@ -19,6 +19,7 @@ import config
from syslog_rfc5424_parser import parser
LOG_POSITION_FILE = 'log_position.txt'
AUTH_LOG_POSITION_FILE = 'auth_log_position.txt'
# --- Data Ingestion & Parsing Functions ---
@@ -54,7 +55,6 @@ def get_system_logs():
print(f"Error reading syslog: {e}")
return {"syslog": []}
import pingparsing
def get_network_metrics():
"""Gets network metrics by pinging 8.8.8.8."""
@@ -69,44 +69,58 @@ def get_network_metrics():
print(f"Error getting network metrics: {e}")
return {"error": "ping command failed"}
def get_cpu_temperature():
"""Gets the CPU temperature using the sensors command."""
def get_sensor_data():
"""Gets all sensor data at once."""
try:
sensors_output = subprocess.check_output(["sensors"], text=True)
# Use regex to find the CPU temperature
match = re.search(r"Package id 0:\s+\+([\d\.]+)", sensors_output)
if match:
return {"cpu_temperature": float(match.group(1))}
else:
return {"cpu_temperature": "N/A"}
return subprocess.check_output(["sensors"], text=True)
except (subprocess.CalledProcessError, FileNotFoundError):
print("Error: 'sensors' command not found. Please install lm-sensors.")
return None
def get_cpu_temperature(sensors_output):
"""Gets the CPU temperature from the sensors output."""
if not sensors_output:
return {"cpu_temperature": "N/A"}
# Use regex to find the CPU temperature
match = re.search(r"Package id 0:\s+\+([\d\.]+)", sensors_output)
if match:
return {"cpu_temperature": float(match.group(1))}
else:
return {"cpu_temperature": "N/A"}
def get_gpu_temperature():
"""Gets the GPU temperature using the sensors command."""
try:
sensors_output = subprocess.check_output(["sensors"], text=True)
# Use regex to find the GPU temperature for amdgpu
match = re.search(r"edge:\s+\+([\d\.]+)", sensors_output)
def get_gpu_temperature(sensors_output):
"""Gets the GPU temperature from the sensors output."""
if not sensors_output:
return {"gpu_temperature": "N/A"}
# Use regex to find the GPU temperature for amdgpu
match = re.search(r"edge:\s+\+([\d\.]+)", sensors_output)
if match:
return {"gpu_temperature": float(match.group(1))}
else:
# if amdgpu not found, try radeon
match = re.search(r"temp1:\s+\+([\d\.]+)", sensors_output)
if match:
return {"gpu_temperature": float(match.group(1))}
else:
# if amdgpu not found, try radeon
match = re.search(r"temp1:\s+\+([\d\.]+)", sensors_output)
if match:
return {"gpu_temperature": float(match.group(1))}
else:
return {"gpu_temperature": "N/A"}
except (subprocess.CalledProcessError, FileNotFoundError):
print("Error: 'sensors' command not found. Please install lm-sensors.")
return {"gpu_temperature": "N/A"}
return {"gpu_temperature": "N/A"}
def get_login_attempts():
"""Gets system login attempts from /var/log/auth.log."""
"""Gets system login attempts from /var/log/auth.log since the last check."""
try:
last_position = 0
if os.path.exists(AUTH_LOG_POSITION_FILE):
with open(AUTH_LOG_POSITION_FILE, 'r') as f:
last_position = int(f.read())
with open("/var/log/auth.log", "r") as f:
f.seek(last_position)
log_lines = f.readlines()
current_position = f.tell()
with open(AUTH_LOG_POSITION_FILE, 'w') as f:
f.write(str(current_position))
failed_logins = []
for line in log_lines:
@@ -122,7 +136,7 @@ def get_login_attempts():
return {"failed_logins": []}
def get_nmap_scan_results():
"""Performs an Nmap scan and returns the results."""
"""Performs an Nmap scan and returns a structured summary."""
try:
nm = nmap.PortScanner()
scan_options = config.NMAP_SCAN_OPTIONS
@@ -131,47 +145,37 @@ def get_nmap_scan_results():
scan_options = scan_options.replace("-sS", "-sT")
scan_results = nm.scan(hosts=config.NMAP_TARGETS, arguments=scan_options)
return scan_results
# Process the results into a more structured format
processed_results = {"hosts": []}
if "scan" in scan_results:
for host, scan_data in scan_results["scan"].items():
host_info = {
"ip": host,
"status": scan_data.get("status", {}).get("state", "unknown"),
"open_ports": []
}
if "tcp" in scan_data:
for port, port_data in scan_data["tcp"].items():
if port_data.get("state") == "open":
host_info["open_ports"].append({
"port": port,
"service": port_data.get("name", ""),
"product": port_data.get("product", ""),
"version": port_data.get("version", "")
})
processed_results["hosts"].append(host_info)
return processed_results
except Exception as e:
print(f"Error performing Nmap scan: {e}")
return {"error": "Nmap scan failed"}
# --- LLM Interaction Function ---
def analyze_data_with_llm(data, baselines):
"""Analyzes data with the local LLM."""
with open("CONSTRAINTS.md", "r") as f:
constraints = f.read()
with open("known_issues.json", "r") as f:
known_issues = json.load(f)
# Compare current nmap results with baseline
nmap_changes = {"new_hosts": [], "changed_ports": {}}
if "nmap_results" in data and "host_ports" in baselines:
current_hosts = set(data["nmap_results"].get("scan", {}).keys())
baseline_hosts = set(baselines["host_ports"].keys())
# New hosts
nmap_changes["new_hosts"] = sorted(list(current_hosts - baseline_hosts))
# Changed ports on existing hosts
for host in current_hosts.intersection(baseline_hosts):
current_ports = set()
if "tcp" in data["nmap_results"]["scan"][host]:
for port, port_data in data["nmap_results"]["scan"][host]["tcp"].items():
if port_data["state"] == "open":
current_ports.add(port)
baseline_ports = set(baselines["host_ports"].get(host, []))
newly_opened = sorted(list(current_ports - baseline_ports))
newly_closed = sorted(list(baseline_ports - current_ports))
if newly_opened or newly_closed:
nmap_changes["changed_ports"][host] = {"opened": newly_opened, "closed": newly_closed}
prompt = f"""
def build_llm_prompt(data, baselines, nmap_changes, constraints, known_issues):
"""Builds the prompt for the LLM analysis."""
return f"""
**Role:** You are a dedicated and expert system administrator. Your primary role is to identify anomalies and provide concise, actionable reports.
**Instruction:** Analyze the following system and network data for any activity that appears out of place or different. Consider unusual values, errors, or unexpected patterns as anomalies. Compare the current data with the historical baseline data to identify significant deviations. Consult the known issues feed to avoid flagging resolved or expected issues. Pay special attention to the Nmap scan results for any new or unexpected open ports.
@@ -191,35 +195,64 @@ def analyze_data_with_llm(data, baselines):
**Constraints and Guidelines:**
{constraints}
**Output Request:** If you find an anomaly, provide a report as a single JSON object with two keys: "severity" and "reason". The "severity" must be one of "high", "medium", "low", or "none". The "reason" must be a natural language explanation of the anomaly. If no anomaly is found, return a single JSON object with "severity" set to "none" and "reason" as an empty string. Do not wrap the JSON in markdown or any other formatting.
**Output Request:** If you find an anomaly, provide a report as a single JSON object with two keys: "severity" and "reason". The "severity" must be one of "high", "medium", "low", or "none". The "reason" must be a natural language explanation of the anomaly. Please include specific values if the anomaly has them. If no anomaly is found, return a single JSON object with "severity" set to "none" and "reason" as an empty string. Do not wrap the JSON in markdown or any other formatting.
**Reasoning Hint:** Think step by step to come to your conclusion. This is very important.
"""
def analyze_data_with_llm(data, baselines):
"""Analyzes data with the local LLM."""
with open("CONSTRAINTS.md", "r") as f:
constraints = f.read()
with open("known_issues.json", "r") as f:
known_issues = json.load(f)
# Compare current nmap results with baseline
nmap_changes = {"new_hosts": [], "changed_ports": {}}
if "nmap_results" in data and "host_ports" in baselines:
current_hosts_info = {host['ip']: host for host in data["nmap_results"].get("hosts", [])}
current_hosts = set(current_hosts_info.keys())
baseline_hosts = set(baselines["host_ports"].keys())
# New hosts
nmap_changes["new_hosts"] = sorted(list(current_hosts - baseline_hosts))
# Changed ports on existing hosts
for host_ip in current_hosts.intersection(baseline_hosts):
current_ports = set(p['port'] for p in current_hosts_info[host_ip].get("open_ports", []))
baseline_ports = set(baselines["host_ports"].get(host_ip, []))
newly_opened = sorted(list(current_ports - baseline_ports))
newly_closed = sorted(list(baseline_ports - current_ports))
if newly_opened or newly_closed:
nmap_changes["changed_ports"][host_ip] = {"opened": newly_opened, "closed": newly_closed}
prompt = build_llm_prompt(data, baselines, nmap_changes, constraints, known_issues)
try:
response = ollama.generate(model="llama3.1:8b", prompt=prompt)
# Sanitize the response to ensure it's valid JSON
sanitized_response = response['response'].strip()
# Find the first '{' and the last '}' to extract the JSON object
start_index = sanitized_response.find('{')
end_index = sanitized_response.rfind('}')
if start_index != -1 and end_index != -1:
json_string = sanitized_response[start_index:end_index+1]
try:
# Extract JSON from the response
try:
# Find the first '{' and the last '}' to extract the JSON object
start_index = sanitized_response.find('{')
end_index = sanitized_response.rfind('}')
if start_index != -1 and end_index != -1:
json_string = sanitized_response[start_index:end_index+1]
return json.loads(json_string)
except json.JSONDecodeError:
# If parsing a single object fails, try parsing as a list
try:
json_list = json.loads(json_string)
if isinstance(json_list, list) and json_list:
return json_list[0] # Return the first object in the list
except json.JSONDecodeError as e:
print(f"Error decoding LLM response: {e}")
# Fallback for invalid JSON
return {{"severity": "low", "reason": response['response'].strip()}} # type: ignore
else:
# Handle cases where the response is not valid JSON
print(f"LLM returned a non-JSON response: {sanitized_response}")
return {{"severity": "low", "reason": sanitized_response}} # type: ignore
else:
# Handle cases where the response is not valid JSON
print(f"LLM returned a non-JSON response: {sanitized_response}")
return {"severity": "low", "reason": sanitized_response}
except json.JSONDecodeError as e:
print(f"Error decoding LLM response: {e}")
# Fallback for invalid JSON
return {"severity": "low", "reason": sanitized_response}
except Exception as e:
print(f"Error interacting with LLM: {e}")
return None
@@ -272,84 +305,68 @@ def send_google_home_alert(message):
daily_events = []
if __name__ == "__main__":
if config.TEST_MODE:
print("Running in test mode...")
system_logs = get_system_logs()
network_metrics = get_network_metrics()
cpu_temp = get_cpu_temperature()
gpu_temp = get_gpu_temperature()
login_attempts = get_login_attempts()
def run_monitoring_cycle(nmap_scan_counter):
"""Runs a single monitoring cycle."""
print("Running monitoring cycle...")
system_logs = get_system_logs()
network_metrics = get_network_metrics()
sensors_output = get_sensor_data()
cpu_temp = get_cpu_temperature(sensors_output)
gpu_temp = get_gpu_temperature(sensors_output)
login_attempts = get_login_attempts()
nmap_results = None
if nmap_scan_counter == 0:
nmap_results = get_nmap_scan_results()
if system_logs and network_metrics:
combined_data = {
"timestamp": datetime.now(timezone.utc).isoformat(),
"system_logs": system_logs,
"network_metrics": network_metrics,
"cpu_temperature": cpu_temp,
"gpu_temperature": gpu_temp,
"login_attempts": login_attempts,
"nmap_results": nmap_results
}
data_storage.store_data(combined_data)
nmap_scan_counter = (nmap_scan_counter + 1) % 4 # Run nmap scan every 4th cycle (20 minutes)
llm_response = analyze_data_with_llm(combined_data, data_storage.calculate_baselines())
if system_logs and network_metrics:
combined_data = {
"timestamp": datetime.now(timezone.utc).isoformat(),
"system_logs": system_logs,
"network_metrics": network_metrics,
"cpu_temperature": cpu_temp,
"gpu_temperature": gpu_temp,
"login_attempts": login_attempts
}
if llm_response and llm_response.get('severity') != "none":
print(f"Anomaly detected: {llm_response.get('reason')}")
if llm_response.get('severity') == "high":
send_discord_alert(llm_response.get('reason'))
send_google_home_alert(llm_response.get('reason'))
else:
print("No anomaly detected.")
if nmap_results:
combined_data["nmap_results"] = nmap_results
data_storage.store_data(combined_data)
llm_response = analyze_data_with_llm(combined_data, data_storage.calculate_baselines())
if llm_response and llm_response.get('severity') != "none":
daily_events.append(llm_response.get('reason'))
if llm_response.get('severity') == "high":
send_discord_alert(llm_response.get('reason'))
send_google_home_alert(llm_response.get('reason'))
return nmap_scan_counter
def main():
"""Main function to run the monitoring agent."""
if config.TEST_MODE:
print("Running in test mode...")
run_monitoring_cycle(0)
else:
nmap_scan_counter = 0
while True:
print("Running monitoring cycle...")
system_logs = get_system_logs()
network_metrics = get_network_metrics()
cpu_temp = get_cpu_temperature()
gpu_temp = get_gpu_temperature()
login_attempts = get_login_attempts()
nmap_results = None
if nmap_scan_counter == 0:
nmap_results = get_nmap_scan_results()
nmap_scan_counter = (nmap_scan_counter + 1) % 4 # Run nmap scan every 4th cycle (20 minutes)
if system_logs and network_metrics:
combined_data = {
"timestamp": datetime.now(timezone.utc).isoformat(),
"system_logs": system_logs,
"network_metrics": network_metrics,
"cpu_temperature": cpu_temp,
"gpu_temperature": gpu_temp,
"login_attempts": login_attempts
}
if nmap_results:
combined_data["nmap_results"] = nmap_results
data_storage.store_data(combined_data)
llm_response = analyze_data_with_llm(combined_data, data_storage.calculate_baselines())
if llm_response and llm_response.get('severity') != "none":
daily_events.append(llm_response.get('reason'))
if llm_response.get('severity') == "high":
send_discord_alert(llm_response.get('reason'))
send_google_home_alert(llm_response.get('reason'))
nmap_scan_counter = run_monitoring_cycle(nmap_scan_counter)
# Daily Recap Logic
current_time = time.strftime("%H:%M")
if current_time == config.DAILY_RECAP_TIME and daily_events:
if current_time == config.DAILY_RECAP_TIME and daily_events: # type: ignore
recap_message = "\n".join(daily_events)
send_discord_alert(f"**Daily Recap:**\n{recap_message}")
daily_events = [] # Reset for the next day
time.sleep(300) # Run every 5 minutes
if __name__ == "__main__":
main()