Completed NMAP & Refactoring
This commit is contained in:
147
README.md
@@ -1,104 +1,93 @@
# LLM-Powered Monitoring Agent
This project implements an LLM-powered monitoring agent designed to continuously collect system and network data, analyze it against historical baselines, and alert on anomalies. The agent leverages a local Large Language Model (LLM) for intelligent anomaly detection and integrates with Discord and Google Home for notifications.

## Features

- **System Log Monitoring**: Tracks new entries in `/var/log/syslog` and `/var/log/auth.log` (for login attempts).
- **Network Metrics**: Gathers network performance data by pinging a public IP (e.g., 8.8.8.8).
- **Hardware Monitoring**: Collects CPU and GPU temperature data.
- **Nmap Scanning**: Periodically performs network scans to discover hosts and open ports.
- **Historical Baseline Analysis**: Compares current data against a 24-hour rolling baseline to identify deviations.
- **LLM-Powered Anomaly Detection**: Utilizes a local LLM (Ollama with Llama3.1) to analyze combined system data, baselines, and Nmap changes for anomalies.
- **Alerting**: Sends high-severity anomaly alerts to Discord and Google Home speakers (via Home Assistant).
- **Daily Recap**: Provides a daily summary of detected events.

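The 24-hour rolling baseline can be thought of as a windowed average over stored history. The sketch below is illustrative only; the real logic lives in `data_storage.py`, and the record field names here are assumptions:

```python
import time

def rolling_baseline(history, field, window_seconds=24 * 3600, now=None):
    """Average a numeric field over records whose timestamp falls inside the window."""
    now = time.time() if now is None else now
    values = [r[field] for r in history
              if field in r and now - r["timestamp"] <= window_seconds]
    return sum(values) / len(values) if values else None
```

Records older than the window simply drop out of the average, so the baseline tracks the last day of behavior.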

## Recent Improvements

- **Enhanced Nmap Data Logging**: The Nmap scan results are now processed and stored in a more structured format, including:
  - Discovered IP addresses.
  - Status of each host.
  - Detailed list of open ports for each host, including service, product, and version information.

  This significantly improves the clarity and utility of Nmap data for anomaly detection.
- **Code Refactoring (`monitor_agent.py`)**:
  - **Optimized Sensor Data Collection**: CPU and GPU temperature data are now collected with a single call to the `sensors` command, improving efficiency.
  - **Efficient Login Attempt Logging**: The agent now tracks its position in `/var/log/auth.log`, preventing redundant reads of the entire file and improving performance on large log files.
  - **Modular Main Loop**: The core monitoring logic has been broken down into smaller, more focused functions, enhancing readability and maintainability.
  - **Separated LLM Prompt Building**: The LLM prompt construction logic has been moved into a dedicated function, keeping `analyze_data_with_llm` focused on analysis.
- **Code Refactoring (`data_storage.py`)**:
  - **Streamlined Baseline Calculations**: Helper functions reduce code duplication and improve clarity in the calculation of average baseline metrics.

## Setup and Installation

### Prerequisites

- Python 3.8 or newer
- `ollama` installed and running with the `llama3.1:8b` model pulled (`ollama pull llama3.1:8b`)
- `nmap` installed
- `lm-sensors` installed (for CPU/GPU temperature monitoring)
- A Discord webhook URL
- (Optional) Home Assistant instance with a long-lived access token and a Google Home speaker configured.
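A quick sanity check that the external tools above are on the `PATH` before the first run (a minimal sketch; this helper is not part of the agent itself):

```python
import shutil

def check_prerequisites(tools=("ollama", "nmap", "sensors")):
    """Return the required external commands that are missing from PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = check_prerequisites()
if missing:
    print("Missing required tools:", ", ".join(missing))
```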

### Installation

1. Clone the repository:
```bash
git clone <repository_url>
cd LLM-Powered-Monitoring-Agent
```
2. Create and activate a virtual environment (optional, but recommended) and install the Python dependencies:
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
3. Configure the agent:
- Open `config.py` and update the following variables:
  - `DISCORD_WEBHOOK_URL`: your Discord channel's webhook URL, used to send alerts.
  - `HOME_ASSISTANT_URL` (if using Google Home alerts): the URL of your Home Assistant instance (e.g., `http://192.168.1.50:8123`).
  - `HOME_ASSISTANT_TOKEN` (if using Google Home alerts): a Long-Lived Access Token, generated in your Home Assistant profile settings.
  - `GOOGLE_HOME_SPEAKER_ID` (if using Google Home alerts): the `media_player` entity ID of your speaker (e.g., `media_player.kitchen_speaker`).
  - `NMAP_TARGETS`: scan targets (e.g., "192.168.1.0/24" or "192.168.1.100").
  - `NMAP_SCAN_OPTIONS`: scan flags (default is "-sS -T4").
  - `DAILY_RECAP_TIME`: when to send the daily recap (e.g., "20:00" for 8 PM).
  - `TEST_MODE`: set to `True` for a single run, `False` for continuous operation.

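A filled-in `config.py` might look like the following (all values are placeholders; only the variable names come from the list above):

```python
# config.py -- placeholder values, replace with your own
DISCORD_WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"
HOME_ASSISTANT_URL = "http://192.168.1.50:8123"          # only needed for Google Home alerts
HOME_ASSISTANT_TOKEN = "<long-lived-access-token>"       # only needed for Google Home alerts
GOOGLE_HOME_SPEAKER_ID = "media_player.kitchen_speaker"  # only needed for Google Home alerts
NMAP_TARGETS = "192.168.1.0/24"
NMAP_SCAN_OPTIONS = "-sS -T4"
DAILY_RECAP_TIME = "20:00"  # 8 PM
TEST_MODE = True            # single run; set to False for continuous operation
```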

## Usage

To run the monitoring agent:

```bash
python monitor_agent.py
```

The script starts a continuous monitoring loop. Every 5 minutes, it will:

1. Collect system and network data.
2. Send the data to the local LLM for analysis.
3. If the LLM detects a **high-severity** anomaly, send an alert to your configured Discord channel and broadcast a message to your Google Home speaker via Home Assistant.
4. At the time specified in `DAILY_RECAP_TIME`, send a summary of the day's anomalies to the Discord channel.

The script prints its status and any detected anomalies to the console.

### Test Mode

Set `TEST_MODE = True` in `config.py` to run the agent once and exit. This is useful for testing configuration and initial setup.

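The high-severity gate described above can be sketched as follows (the notification helpers are stand-ins for the real Discord/Home Assistant calls in `monitor_agent.py`):

```python
sent = []

def send_discord_alert(msg):
    """Placeholder: the real agent posts to DISCORD_WEBHOOK_URL."""
    sent.append(("discord", msg))

def broadcast_google_home(msg):
    """Placeholder: the real agent calls Home Assistant's media player service."""
    sent.append(("google_home", msg))

def dispatch_alert(anomaly):
    """Only high-severity anomalies trigger notifications."""
    if anomaly.get("severity") == "high":
        send_discord_alert(anomaly["description"])
        broadcast_google_home(anomaly["description"])
        return True
    return False
```

Medium- and low-severity findings are recorded for the daily recap but do not page anyone, which keeps alert noise down.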
## Extending and Customizing

- **Adding New Metrics**: Add new data collection functions in `monitor_agent.py` and include their results in the `combined_data` dictionary.
- **Customizing LLM Analysis**: Modify the `CONSTRAINTS.md` file to give the LLM specific instructions or constraints for anomaly detection.
- **Known Issues**: Update `known_issues.json` with known or expected system behaviors to prevent the LLM from flagging them as anomalies.
- **Alerting Mechanisms**: Implement additional alerting functions (e.g., email, SMS) in `monitor_agent.py` and integrate them into the anomaly detection logic.

### Nmap Scans

The agent uses `nmap` to scan the network for open ports. By default it performs a TCP SYN scan (`-sS`), which requires root privileges. If the script is not run as root, it falls back to a TCP connect scan (`-sT`), which needs no special privileges but is slower and more likely to be detected.

To run the agent with root privileges, use `sudo`:

```bash
sudo python monitor_agent.py
```

## Priority System

The agent classifies anomalies by severity. The LLM is instructed to return one of the following levels for each anomaly it detects:

- **high**: A critical issue that requires immediate attention. An alert is sent to Discord and Google Home.
- **medium**: A non-critical issue that should be investigated. No alert is sent.
- **low**: A minor issue or a potential false positive. No alert is sent.
- **none**: No anomaly was detected.

## Known Issues Feed

The `known_issues.json` file gives the LLM a list of known issues and their resolutions, so resolved or expected behavior is not flagged as anomalous.

Add new entries by following the existing format; each entry has an "issue" and a "resolution" key. For example:

```json
[
  {
    "issue": "CPU temperature spikes to 80C under heavy load",
    "resolution": "This is normal behavior for this CPU model and is not a cause for concern."
  }
]
```

## Project Structure

- `monitor_agent.py`: Main script for data collection, LLM interaction, and alerting.
- `data_storage.py`: Handles loading, storing, and calculating baselines from historical data.
- `config.py`: Stores configurable parameters for the agent.
- `requirements.txt`: Lists Python dependencies.
- `CONSTRAINTS.md`: Defines constraints and guidelines for the LLM's analysis.
- `known_issues.json`: A JSON file containing a list of known issues to be considered by the LLM.
- `monitoring_data.json`: (Generated) Stores historical monitoring data.
- `log_position.txt`: (Generated) Stores the last read position for `/var/log/syslog`.
- `auth_log_position.txt`: (Generated) Stores the last read position for `/var/log/auth.log`.