Completed NMAP & Refactoring
This commit is contained in:
1 .gitignore (vendored)
@@ -2,3 +2,4 @@ __pycache__/*
__pycache__/
monitoring_data.json
log_position.txt
auth_log_position.txt
147 README.md
@@ -1,104 +1,93 @@
# LLM-Powered Monitoring Agent

This project is a self-hosted monitoring agent that uses a local Large Language Model (LLM) to detect anomalies in system and network data. It's designed to be a simple, self-contained Python script that can be easily deployed on a server.
This project implements an LLM-powered monitoring agent designed to continuously collect system and network data, analyze it against historical baselines, and alert on anomalies. The agent leverages a local Large Language Model (LLM) for intelligent anomaly detection and integrates with Discord and Google Home for notifications.

## 1. Installation
## Features

To get started, you'll need to have Python 3.8 or newer installed. Then, follow these steps:

- **System Log Monitoring**: Tracks new entries in `/var/log/syslog` and `/var/log/auth.log` (for login attempts).
- **Network Metrics**: Gathers network performance data by pinging a public IP (e.g., 8.8.8.8).
- **Hardware Monitoring**: Collects CPU and GPU temperature data.
- **Nmap Scanning**: Periodically performs network scans to discover hosts and open ports.
- **Historical Baseline Analysis**: Compares current data against a 24-hour rolling baseline to identify deviations.
- **LLM-Powered Anomaly Detection**: Utilizes a local LLM (Ollama with Llama3.1) to analyze combined system data, baselines, and Nmap changes for anomalies.
- **Alerting**: Sends high-severity anomaly alerts to Discord and Google Home speakers (via Home Assistant).
- **Daily Recap**: Provides a daily summary of detected events.
1. **Clone the repository or download the files:**

## Recent Improvements

```bash
git clone <repository_url>
cd <repository_directory>
```

- **Enhanced Nmap Data Logging**: The Nmap scan results are now processed and stored in a more structured format, including:
  - Discovered IP addresses.
  - Status of each host.
  - Detailed list of open ports for each host, including service, product, and version information.

  This significantly improves the clarity and utility of Nmap data for anomaly detection (an illustrative example of the stored format appears after this list).
- **Code Refactoring (`monitor_agent.py`)**:
  - **Optimized Sensor Data Collection**: CPU and GPU temperature data are now collected with a single call to the `sensors` command, improving efficiency.
  - **Efficient Login Attempt Logging**: The agent now tracks its position in `/var/log/auth.log`, preventing redundant reads of the entire file and improving performance for large log files.
  - **Modular Main Loop**: The core monitoring logic has been broken down into smaller, more manageable functions, enhancing readability and maintainability.
  - **Separated LLM Prompt Building**: The complex LLM prompt construction logic has been moved into a dedicated function, making `analyze_data_with_llm` more focused.
- **Code Refactoring (`data_storage.py`)**:
  - **Streamlined Baseline Calculations**: Helper functions have been introduced to reduce code duplication and improve clarity in the calculation of average metrics for baselines.
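As a rough illustration of the structured Nmap format described above, a single stored scan entry might look like the following (the IP, port, and service values here are hypothetical; only the field names come from the processing code):

```json
{
    "hosts": [
        {
            "ip": "192.168.2.10",
            "status": "up",
            "open_ports": [
                {"port": 22, "service": "ssh", "product": "OpenSSH", "version": "9.6"}
            ]
        }
    ]
}
```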
2. **Create and activate a Python virtual environment:**

```bash
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
```

3. **Install the required Python libraries:**

```bash
pip install -r requirements.txt
```

## 2. Setup

Before running the agent, you need to configure it and ensure the necessary services are running.
## Setup and Installation

### Prerequisites

- **Ollama:** The agent requires that [Ollama](https://ollama.com/) is installed and running on the server.
- **LLM Model:** You must have the `llama3.1:8b` model pulled and available in Ollama. You can pull it with the following command:
- Python 3.x
- `ollama` installed and running with the `llama3.1:8b` model pulled (`ollama pull llama3.1:8b`)
- `nmap` installed
- `lm-sensors` installed (for CPU/GPU temperature monitoring)
- Discord webhook URL
- (Optional) Home Assistant instance with a long-lived access token and a Google Home speaker configured.
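On Debian/Ubuntu-based systems, the external tools can typically be installed with the package manager (package names assumed to match your distribution's defaults):

```bash
# Install the scanner and sensor tooling used by the agent
sudo apt install nmap lm-sensors
# Detect available sensors once after installing lm-sensors
sudo sensors-detect
```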
### Installation

1. Clone the repository:

```bash
ollama pull llama3.1:8b
git clone <repository_url>
cd LLM-Powered-Monitoring-Agent
```

2. Install Python dependencies:

```bash
pip install -r requirements.txt
```

3. Configure the agent:
   - Open `config.py` and update the following variables (see the sketch after this list):
     - `DISCORD_WEBHOOK_URL`
     - `HOME_ASSISTANT_URL` (if using Google Home alerts)
     - `HOME_ASSISTANT_TOKEN` (if using Google Home alerts)
     - `GOOGLE_HOME_SPEAKER_ID` (if using Google Home alerts)
     - `NMAP_TARGETS` (e.g., "192.168.1.0/24" or "192.168.1.100")
     - `NMAP_SCAN_OPTIONS` (default is "-sS -T4")
     - `DAILY_RECAP_TIME` (e.g., "20:00" for 8 PM)
     - `TEST_MODE` (set to `True` for a single run, `False` for continuous operation)
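A minimal sketch of what an edited `config.py` might contain; every value below is a placeholder, and only the variable names are taken from the list above:

```python
# config.py -- illustrative placeholder values only; replace with your own.
DISCORD_WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"
HOME_ASSISTANT_URL = "http://192.168.1.50:8123"
HOME_ASSISTANT_TOKEN = "<long-lived-access-token>"
GOOGLE_HOME_SPEAKER_ID = "media_player.kitchen_speaker"
NMAP_TARGETS = "192.168.1.0/24"
NMAP_SCAN_OPTIONS = "-sS -T4"
DAILY_RECAP_TIME = "20:00"
TEST_MODE = False
```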
### Configuration
## Usage

All configuration is done in the `config.py` file. You will need to replace the placeholder values with your actual credentials and URLs.

- `DISCORD_WEBHOOK_URL`: Your Discord channel's webhook URL. This is used to send alerts.
- `HOME_ASSISTANT_URL`: The URL of your Home Assistant instance (e.g., `http://192.168.1.50:8123`).
- `HOME_ASSISTANT_TOKEN`: A Long-Lived Access Token for your Home Assistant instance. You can generate this in your Home Assistant profile settings.
- `GOOGLE_HOME_SPEAKER_ID`: The `media_player` entity ID for your Google Home speaker in Home Assistant (e.g., `media_player.kitchen_speaker`).
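For reference, a Google Home announcement can be triggered through Home Assistant's REST API roughly as sketched below. This uses the `tts.google_translate_say` service and is only an assumption about the approach; it is not necessarily how the script's `send_google_home_alert` function is implemented:

```python
import requests

import config


def speak_on_google_home(message):
    """Hypothetical helper: have Home Assistant speak a message on the configured speaker."""
    url = f"{config.HOME_ASSISTANT_URL}/api/services/tts/google_translate_say"
    headers = {"Authorization": f"Bearer {config.HOME_ASSISTANT_TOKEN}"}
    payload = {"entity_id": config.GOOGLE_HOME_SPEAKER_ID, "message": message}
    requests.post(url, headers=headers, json=payload, timeout=10)
```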
## 3. Usage

Once the installation and setup are complete, you can run the monitoring agent with the following command:
To run the monitoring agent:

```bash
python monitor_agent.py
```

The script will start a continuous monitoring loop. Every 5 minutes, it will:
### Test Mode

1. Collect simulated system and network data.
2. Send the data to the local LLM for analysis.
3. If the LLM detects a **high-severity** anomaly, it will send an alert to your configured Discord channel and broadcast a message to your Google Home speaker via Home Assistant.
4. At the time specified in `DAILY_RECAP_TIME`, a summary of all anomalies for the day will be sent to the Discord channel.
Set `TEST_MODE = True` in `config.py` to run the agent once and exit. This is useful for testing configurations and initial setup.

The script will print its status and any detected anomalies to the console.
## Extending and Customizing

### Nmap Scans
- **Adding New Metrics**: You can add new data collection functions in `monitor_agent.py` and include their results in the `combined_data` dictionary (see the sketch after this list).
- **Customizing LLM Analysis**: Modify the `CONSTRAINTS.md` file to provide specific instructions or constraints to the LLM for anomaly detection.
- **Known Issues**: Update `known_issues.json` with any known or expected system behaviors to prevent the LLM from flagging them as anomalies.
- **Alerting Mechanisms**: Implement additional alerting functions (e.g., email, SMS) in `monitor_agent.py` and integrate them into the anomaly detection logic.
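As a hedged sketch of the "Adding New Metrics" idea, a hypothetical disk-usage collector could sit alongside the existing collection functions and be merged into `combined_data`; the function name and fields below are illustrative and not part of the current code:

```python
import shutil


def get_disk_usage():
    """Hypothetical example metric: root filesystem usage as a percentage."""
    usage = shutil.disk_usage("/")
    return {"disk_used_percent": round(usage.used / usage.total * 100, 1)}


# Inside the monitoring cycle, after the other collectors run:
# combined_data["disk_usage"] = get_disk_usage()
```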
The agent uses `nmap` to scan the network for open ports. By default, it uses a TCP SYN scan (`-sS`), which requires root privileges. If the script is not run as root, it will fall back to a TCP connect scan (`-sT`), which does not require root privileges but is slower and more likely to be detected.
## Project Structure

To run the agent with root privileges, use the `sudo` command:

```bash
sudo python monitor_agent.py
```

## 4. Features

### Priority System

The monitoring agent uses a priority system to classify anomalies. The LLM is instructed to return a severity level for each anomaly it detects. The possible severity levels are:

- **high**: Indicates a critical issue that requires immediate attention. An alert is sent to Discord and Google Home.
- **medium**: Indicates a non-critical issue that should be investigated. No alert is sent.
- **low**: Indicates a minor issue or a potential false positive. No alert is sent.
- **none**: Indicates that no anomaly was detected.
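Since the LLM is asked to return its verdict as a single JSON object with `severity` and `reason` keys, a high-severity report might look like this (the reason text is purely illustrative):

```json
{
    "severity": "high",
    "reason": "Port 3389 is newly open on 192.168.2.10 and does not appear in the 24-hour baseline."
}
```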
### Known Issues Feed

The agent uses a `known_issues.json` file to provide the LLM with a list of known issues and their resolutions. This helps the LLM to avoid flagging resolved or expected issues as anomalies.

You can add new issues to the `known_issues.json` file by following the existing format. Each issue should have an "issue" and a "resolution" key. For example:

```json
[
    {
        "issue": "CPU temperature spikes to 80C under heavy load",
        "resolution": "This is normal behavior for this CPU model and is not a cause for concern."
    }
]
```

**Note on Mock Data:** The current version of the script uses mock data for system logs and network metrics. To use this in a real-world scenario, you would need to replace the mock data with actual data from your systems.
- `monitor_agent.py`: Main script for data collection, LLM interaction, and alerting.
- `data_storage.py`: Handles loading, storing, and calculating baselines from historical data.
- `config.py`: Stores configurable parameters for the agent.
- `requirements.txt`: Lists Python dependencies.
- `CONSTRAINTS.md`: Defines constraints and guidelines for the LLM's analysis.
- `known_issues.json`: A JSON file containing a list of known issues to be considered by the LLM.
- `monitoring_data.json`: (Generated) Stores historical monitoring data.
- `log_position.txt`: (Generated) Stores the last read position for `/var/log/syslog`.
- `auth_log_position.txt`: (Generated) Stores the last read position for `/var/log/auth.log`.
5 SPEC.md
@@ -90,4 +90,7 @@ The project will be composed of the following files:
├── requirements.txt
├── PROGRESS.md
└── SPEC.md
```
```

## 7. Testing and Debugging
The script is equipped with a test mode that runs the script only once rather than continuously. To enable it, set the `TEST_MODE` variable in `config.py` to `True`. Once finished testing, change the variable back to `False`.
@@ -12,7 +12,7 @@ GOOGLE_HOME_SPEAKER_ID = "media_player.spencer_room_speaker"
DAILY_RECAP_TIME = "20:00"

# Nmap Configuration
NMAP_TARGETS = "192.168.1.0/24"
NMAP_TARGETS = "192.168.2.0/24"
NMAP_SCAN_OPTIONS = "-sS -T4"

# Test Mode (True to run once and exit, False to run continuously)
@@ -16,6 +16,11 @@ def store_data(new_data):
    with open(DATA_FILE, 'w') as f:
        json.dump(data, f, indent=4)

def _calculate_average(data, key1, key2):
    """Helper function to calculate the average of a nested key in a list of dicts."""
    values = [d[key1][key2] for d in data if key1 in d and key2 in d[key1] and d[key1][key2] != "N/A"]
    return sum(values) / len(values) if values else 0

def calculate_baselines():
    data = load_data()
    if not data:
@@ -29,23 +34,23 @@ def calculate_baselines():
        return {}

    baseline_metrics = {
        'avg_rtt': sum(d['network_metrics']['rtt_avg'] for d in recent_data if 'rtt_avg' in d['network_metrics']) / len(recent_data),
        'packet_loss': sum(d['network_metrics']['packet_loss_rate'] for d in recent_data if 'packet_loss_rate' in d['network_metrics']) / len(recent_data),
        'avg_cpu_temp': sum(d['cpu_temperature']['cpu_temperature'] for d in recent_data if d['cpu_temperature']['cpu_temperature'] != "N/A") / len(recent_data),
        'avg_gpu_temp': sum(d['gpu_temperature']['gpu_temperature'] for d in recent_data if d['gpu_temperature']['gpu_temperature'] != "N/A") / len(recent_data),
        'avg_rtt': _calculate_average(recent_data, 'network_metrics', 'rtt_avg'),
        'packet_loss': _calculate_average(recent_data, 'network_metrics', 'packet_loss_rate'),
        'avg_cpu_temp': _calculate_average(recent_data, 'cpu_temperature', 'cpu_temperature'),
        'avg_gpu_temp': _calculate_average(recent_data, 'gpu_temperature', 'gpu_temperature'),
    }

    # Baseline for open ports from nmap scans
    host_ports = {}
    for d in recent_data:
        if 'nmap_results' in d and 'scan' in d['nmap_results']:
            for host, scan_data in d['nmap_results']['scan'].items():
                if host not in host_ports:
                    host_ports[host] = set()
                if 'tcp' in scan_data:
                    for port, port_data in scan_data['tcp'].items():
                        if port_data['state'] == 'open':
                            host_ports[host].add(port)
        if 'nmap_results' in d and 'hosts' in d.get('nmap_results', {}):
            for host_info in d['nmap_results']['hosts']:
                host_ip = host_info['ip']
                if host_ip not in host_ports:
                    host_ports[host_ip] = set()

                for port_info in host_info.get('open_ports', []):
                    host_ports[host_ip].add(port_info['port'])

    # Convert sets to sorted lists for JSON serialization
    for host, ports in host_ports.items():
@@ -1,10 +1,18 @@
[
    {
        "issue": "CPU temperature spikes to 90C under heavy load",
        "resolution": "This is normal behavior for this CPU model and is not a cause for concern."
        "issue": "CPU temperatures less than the average",
        "resolution": "This is normal behavior for CPUs when not in use. Lower temperatures are usually a good thing."
    },
    {
        "issue": "Access attempts from unknown IP addresses",
        "resolution": "ufw has been enabled and blocks all incoming connections by default. The only IP addresses allowed are 192.168.2.0/24 and 100.64.0.0/10."
    },
    {
        "issue": "Network timing values are lower than average",
        "resolution": "In networking, timing values lower than the average are often a good thing and do not need to be considered an anomaly."
    },
    {
        "issue": "Port 62078 opens periodically",
        "resolution": "Port 62078 is used by Apple devices for sync communication with each other. It is normal for it to appear on Apple devices."
    }
]
317 monitor_agent.py
@@ -19,6 +19,7 @@ import config
from syslog_rfc5424_parser import parser

LOG_POSITION_FILE = 'log_position.txt'
AUTH_LOG_POSITION_FILE = 'auth_log_position.txt'

# --- Data Ingestion & Parsing Functions ---

@@ -54,7 +55,6 @@ def get_system_logs():
        print(f"Error reading syslog: {e}")
        return {"syslog": []}

import pingparsing

def get_network_metrics():
    """Gets network metrics by pinging 8.8.8.8."""
@@ -69,45 +69,59 @@ def get_network_metrics():
        print(f"Error getting network metrics: {e}")
        return {"error": "ping command failed"}
def get_cpu_temperature():
    """Gets the CPU temperature using the sensors command."""
def get_sensor_data():
    """Gets all sensor data at once."""
    try:
        sensors_output = subprocess.check_output(["sensors"], text=True)
        # Use regex to find the CPU temperature
        match = re.search(r"Package id 0:\s+\+([\d\.]+)", sensors_output)
        if match:
            return {"cpu_temperature": float(match.group(1))}
        else:
            return {"cpu_temperature": "N/A"}
        return subprocess.check_output(["sensors"], text=True)
    except (subprocess.CalledProcessError, FileNotFoundError):
        print("Error: 'sensors' command not found. Please install lm-sensors.")
        return None

def get_cpu_temperature(sensors_output):
    """Gets the CPU temperature from the sensors output."""
    if not sensors_output:
        return {"cpu_temperature": "N/A"}
    # Use regex to find the CPU temperature
    match = re.search(r"Package id 0:\s+\+([\d\.]+)", sensors_output)
    if match:
        return {"cpu_temperature": float(match.group(1))}
    else:
        return {"cpu_temperature": "N/A"}

def get_gpu_temperature():
    """Gets the GPU temperature using the sensors command."""
    try:
        sensors_output = subprocess.check_output(["sensors"], text=True)
        # Use regex to find the GPU temperature for amdgpu
        match = re.search(r"edge:\s+\+([\d\.]+)", sensors_output)
def get_gpu_temperature(sensors_output):
    """Gets the GPU temperature from the sensors output."""
    if not sensors_output:
        return {"gpu_temperature": "N/A"}
    # Use regex to find the GPU temperature for amdgpu
    match = re.search(r"edge:\s+\+([\d\.]+)", sensors_output)
    if match:
        return {"gpu_temperature": float(match.group(1))}
    else:
        # if amdgpu not found, try radeon
        match = re.search(r"temp1:\s+\+([\d\.]+)", sensors_output)
        if match:
            return {"gpu_temperature": float(match.group(1))}
        else:
            # if amdgpu not found, try radeon
            match = re.search(r"temp1:\s+\+([\d\.]+)", sensors_output)
            if match:
                return {"gpu_temperature": float(match.group(1))}
            else:
                return {"gpu_temperature": "N/A"}
    except (subprocess.CalledProcessError, FileNotFoundError):
        print("Error: 'sensors' command not found. Please install lm-sensors.")
        return {"gpu_temperature": "N/A"}
    return {"gpu_temperature": "N/A"}

def get_login_attempts():
    """Gets system login attempts from /var/log/auth.log."""
    """Gets system login attempts from /var/log/auth.log since the last check."""
    try:
        last_position = 0
        if os.path.exists(AUTH_LOG_POSITION_FILE):
            with open(AUTH_LOG_POSITION_FILE, 'r') as f:
                last_position = int(f.read())

        with open("/var/log/auth.log", "r") as f:
            f.seek(last_position)
            log_lines = f.readlines()

            current_position = f.tell()

        with open(AUTH_LOG_POSITION_FILE, 'w') as f:
            f.write(str(current_position))

        failed_logins = []
        for line in log_lines:
            if "Failed password" in line:
@@ -122,7 +136,7 @@ def get_login_attempts():
        return {"failed_logins": []}
def get_nmap_scan_results():
    """Performs an Nmap scan and returns the results."""
    """Performs an Nmap scan and returns a structured summary."""
    try:
        nm = nmap.PortScanner()
        scan_options = config.NMAP_SCAN_OPTIONS
@@ -131,47 +145,37 @@ def get_nmap_scan_results():
            scan_options = scan_options.replace("-sS", "-sT")

        scan_results = nm.scan(hosts=config.NMAP_TARGETS, arguments=scan_options)
        return scan_results

        # Process the results into a more structured format
        processed_results = {"hosts": []}
        if "scan" in scan_results:
            for host, scan_data in scan_results["scan"].items():
                host_info = {
                    "ip": host,
                    "status": scan_data.get("status", {}).get("state", "unknown"),
                    "open_ports": []
                }
                if "tcp" in scan_data:
                    for port, port_data in scan_data["tcp"].items():
                        if port_data.get("state") == "open":
                            host_info["open_ports"].append({
                                "port": port,
                                "service": port_data.get("name", ""),
                                "product": port_data.get("product", ""),
                                "version": port_data.get("version", "")
                            })
                processed_results["hosts"].append(host_info)

        return processed_results
    except Exception as e:
        print(f"Error performing Nmap scan: {e}")
        return {"error": "Nmap scan failed"}
# --- LLM Interaction Function ---

def analyze_data_with_llm(data, baselines):
    """Analyzes data with the local LLM."""
    with open("CONSTRAINTS.md", "r") as f:
        constraints = f.read()

    with open("known_issues.json", "r") as f:
        known_issues = json.load(f)

    # Compare current nmap results with baseline
    nmap_changes = {"new_hosts": [], "changed_ports": {}}
    if "nmap_results" in data and "host_ports" in baselines:
        current_hosts = set(data["nmap_results"].get("scan", {}).keys())
        baseline_hosts = set(baselines["host_ports"].keys())

        # New hosts
        nmap_changes["new_hosts"] = sorted(list(current_hosts - baseline_hosts))

        # Changed ports on existing hosts
        for host in current_hosts.intersection(baseline_hosts):
            current_ports = set()
            if "tcp" in data["nmap_results"]["scan"][host]:
                for port, port_data in data["nmap_results"]["scan"][host]["tcp"].items():
                    if port_data["state"] == "open":
                        current_ports.add(port)

            baseline_ports = set(baselines["host_ports"].get(host, []))

            newly_opened = sorted(list(current_ports - baseline_ports))
            newly_closed = sorted(list(baseline_ports - current_ports))

            if newly_opened or newly_closed:
                nmap_changes["changed_ports"][host] = {"opened": newly_opened, "closed": newly_closed}

    prompt = f"""
def build_llm_prompt(data, baselines, nmap_changes, constraints, known_issues):
    """Builds the prompt for the LLM analysis."""
    return f"""
**Role:** You are a dedicated and expert system administrator. Your primary role is to identify anomalies and provide concise, actionable reports.

**Instruction:** Analyze the following system and network data for any activity that appears out of place or different. Consider unusual values, errors, or unexpected patterns as anomalies. Compare the current data with the historical baseline data to identify significant deviations. Consult the known issues feed to avoid flagging resolved or expected issues. Pay special attention to the Nmap scan results for any new or unexpected open ports.
@@ -191,35 +195,64 @@ def analyze_data_with_llm(data, baselines):
**Constraints and Guidelines:**
{constraints}

**Output Request:** If you find an anomaly, provide a report as a single JSON object with two keys: "severity" and "reason". The "severity" must be one of "high", "medium", "low", or "none". The "reason" must be a natural language explanation of the anomaly. If no anomaly is found, return a single JSON object with "severity" set to "none" and "reason" as an empty string. Do not wrap the JSON in markdown or any other formatting.
**Output Request:** If you find an anomaly, provide a report as a single JSON object with two keys: "severity" and "reason". The "severity" must be one of "high", "medium", "low", or "none". The "reason" must be a natural language explanation of the anomaly. Please include specific values if the anomaly has them. If no anomaly is found, return a single JSON object with "severity" set to "none" and "reason" as an empty string. Do not wrap the JSON in markdown or any other formatting.

**Reasoning Hint:** Think step by step to come to your conclusion. This is very important.
"""

def analyze_data_with_llm(data, baselines):
    """Analyzes data with the local LLM."""
    with open("CONSTRAINTS.md", "r") as f:
        constraints = f.read()

    with open("known_issues.json", "r") as f:
        known_issues = json.load(f)

    # Compare current nmap results with baseline
    nmap_changes = {"new_hosts": [], "changed_ports": {}}
    if "nmap_results" in data and "host_ports" in baselines:
        current_hosts_info = {host['ip']: host for host in data["nmap_results"].get("hosts", [])}
        current_hosts = set(current_hosts_info.keys())
        baseline_hosts = set(baselines["host_ports"].keys())

        # New hosts
        nmap_changes["new_hosts"] = sorted(list(current_hosts - baseline_hosts))

        # Changed ports on existing hosts
        for host_ip in current_hosts.intersection(baseline_hosts):
            current_ports = set(p['port'] for p in current_hosts_info[host_ip].get("open_ports", []))

            baseline_ports = set(baselines["host_ports"].get(host_ip, []))

            newly_opened = sorted(list(current_ports - baseline_ports))
            newly_closed = sorted(list(baseline_ports - current_ports))

            if newly_opened or newly_closed:
                nmap_changes["changed_ports"][host_ip] = {"opened": newly_opened, "closed": newly_closed}

    prompt = build_llm_prompt(data, baselines, nmap_changes, constraints, known_issues)

    try:
        response = ollama.generate(model="llama3.1:8b", prompt=prompt)
        # Sanitize the response to ensure it's valid JSON
        sanitized_response = response['response'].strip()
        # Find the first '{' and the last '}' to extract the JSON object
        start_index = sanitized_response.find('{')
        end_index = sanitized_response.rfind('}')
        if start_index != -1 and end_index != -1:
            json_string = sanitized_response[start_index:end_index+1]
            try:

        # Extract JSON from the response
        try:
            # Find the first '{' and the last '}' to extract the JSON object
            start_index = sanitized_response.find('{')
            end_index = sanitized_response.rfind('}')
            if start_index != -1 and end_index != -1:
                json_string = sanitized_response[start_index:end_index+1]
                return json.loads(json_string)
            except json.JSONDecodeError:
                # If parsing a single object fails, try parsing as a list
                try:
                    json_list = json.loads(json_string)
                    if isinstance(json_list, list) and json_list:
                        return json_list[0] # Return the first object in the list
                except json.JSONDecodeError as e:
                    print(f"Error decoding LLM response: {e}")
                    # Fallback for invalid JSON
                    return {"severity": "low", "reason": response['response'].strip()} # type: ignore
            else:
                # Handle cases where the response is not valid JSON
                print(f"LLM returned a non-JSON response: {sanitized_response}")
                return {"severity": "low", "reason": sanitized_response} # type: ignore
        else:
            # Handle cases where the response is not valid JSON
            print(f"LLM returned a non-JSON response: {sanitized_response}")
            return {"severity": "low", "reason": sanitized_response}
        except json.JSONDecodeError as e:
            print(f"Error decoding LLM response: {e}")
            # Fallback for invalid JSON
            return {"severity": "low", "reason": sanitized_response}

    except Exception as e:
        print(f"Error interacting with LLM: {e}")
        return None
@@ -272,84 +305,68 @@ def send_google_home_alert(message):

daily_events = []

if __name__ == "__main__":
def run_monitoring_cycle(nmap_scan_counter):
    """Runs a single monitoring cycle."""
    print("Running monitoring cycle...")
    system_logs = get_system_logs()
    network_metrics = get_network_metrics()
    sensors_output = get_sensor_data()
    cpu_temp = get_cpu_temperature(sensors_output)
    gpu_temp = get_gpu_temperature(sensors_output)
    login_attempts = get_login_attempts()

    nmap_results = None
    if nmap_scan_counter == 0:
        nmap_results = get_nmap_scan_results()

    nmap_scan_counter = (nmap_scan_counter + 1) % 4 # Run nmap scan every 4th cycle (20 minutes)

    if system_logs and network_metrics:
        combined_data = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "system_logs": system_logs,
            "network_metrics": network_metrics,
            "cpu_temperature": cpu_temp,
            "gpu_temperature": gpu_temp,
            "login_attempts": login_attempts
        }

        if nmap_results:
            combined_data["nmap_results"] = nmap_results

        data_storage.store_data(combined_data)

        llm_response = analyze_data_with_llm(combined_data, data_storage.calculate_baselines())

        if llm_response and llm_response.get('severity') != "none":
            daily_events.append(llm_response.get('reason'))
            if llm_response.get('severity') == "high":
                send_discord_alert(llm_response.get('reason'))
                send_google_home_alert(llm_response.get('reason'))
    return nmap_scan_counter

def main():
    """Main function to run the monitoring agent."""
    if config.TEST_MODE:
        print("Running in test mode...")
        system_logs = get_system_logs()
        network_metrics = get_network_metrics()
        cpu_temp = get_cpu_temperature()
        gpu_temp = get_gpu_temperature()
        login_attempts = get_login_attempts()
        nmap_results = get_nmap_scan_results()

        if system_logs and network_metrics:
            combined_data = {
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "system_logs": system_logs,
                "network_metrics": network_metrics,
                "cpu_temperature": cpu_temp,
                "gpu_temperature": gpu_temp,
                "login_attempts": login_attempts,
                "nmap_results": nmap_results
            }
            data_storage.store_data(combined_data)

            llm_response = analyze_data_with_llm(combined_data, data_storage.calculate_baselines())

            if llm_response and llm_response.get('severity') != "none":
                print(f"Anomaly detected: {llm_response.get('reason')}")
                if llm_response.get('severity') == "high":
                    send_discord_alert(llm_response.get('reason'))
                    send_google_home_alert(llm_response.get('reason'))
            else:
                print("No anomaly detected.")
        run_monitoring_cycle(0)
    else:
        nmap_scan_counter = 0
        while True:
            print("Running monitoring cycle...")
            system_logs = get_system_logs()
            network_metrics = get_network_metrics()
            cpu_temp = get_cpu_temperature()
            gpu_temp = get_gpu_temperature()
            login_attempts = get_login_attempts()

            nmap_results = None
            if nmap_scan_counter == 0:
                nmap_results = get_nmap_scan_results()

            nmap_scan_counter = (nmap_scan_counter + 1) % 4 # Run nmap scan every 4th cycle (20 minutes)

            if system_logs and network_metrics:
                combined_data = {
                    "timestamp": datetime.now(timezone.utc).isoformat(),
                    "system_logs": system_logs,
                    "network_metrics": network_metrics,
                    "cpu_temperature": cpu_temp,
                    "gpu_temperature": gpu_temp,
                    "login_attempts": login_attempts
                }

                if nmap_results:
                    combined_data["nmap_results"] = nmap_results

                data_storage.store_data(combined_data)

                llm_response = analyze_data_with_llm(combined_data, data_storage.calculate_baselines())

                if llm_response and llm_response.get('severity') != "none":
                    daily_events.append(llm_response.get('reason'))
                    if llm_response.get('severity') == "high":
                        send_discord_alert(llm_response.get('reason'))
                        send_google_home_alert(llm_response.get('reason'))
            nmap_scan_counter = run_monitoring_cycle(nmap_scan_counter)

            # Daily Recap Logic
            current_time = time.strftime("%H:%M")
            if current_time == config.DAILY_RECAP_TIME and daily_events:
            if current_time == config.DAILY_RECAP_TIME and daily_events: # type: ignore
                recap_message = "\n".join(daily_events)
                send_discord_alert(f"**Daily Recap:**\n{recap_message}")
                daily_events = [] # Reset for the next day

            time.sleep(300) # Run every 5 minutes

if __name__ == "__main__":
    main()