LLM-Powered-Monitoring-Agent/PROGRESS.md
2025-09-14 22:01:00 -05:00


Project Progress

Phase 1: Initial Setup

  1. Create monitor_agent.py
  2. Create config.py
  3. Create requirements.txt
  4. Create README.md
  5. Create .gitignore
  6. Create SPEC.md
  7. Create PROMPT.md
  8. Create CONSTRAINTS.md

Phase 2: Data Storage

  1. Implement data storage functions in data_storage.py
  2. Update monitor_agent.py to use data storage
  3. Update SPEC.md to reflect data storage functionality

Phase 3: Expanded Monitoring

  1. Implement CPU temperature monitoring
  2. Implement GPU temperature monitoring
  3. Implement system login attempt monitoring
  4. Update monitor_agent.py to include new metrics
  5. Update SPEC.md to reflect new metrics
  6. Extend calculate_baselines to include system temps

Phase 4: Troubleshooting

  1. Investigated and resolved issue with jc library
  2. Removed jc library as a dependency
  3. Implemented manual parsing of sensors command output

Phase 5: Network Scanning (Nmap Integration)

  1. Add python-nmap to requirements.txt and install it.
  2. Define NMAP_TARGETS and NMAP_SCAN_OPTIONS in config.py.
  3. Create a new function get_nmap_scan_results() in monitor_agent.py:
    • Use python-nmap to perform a scan on the defined targets with the specified options.
    • Return the parsed results.
  4. Integrate get_nmap_scan_results() into the main monitoring loop:
    • Call this function periodically (e.g., less frequently than other metrics).
    • Add the nmap results to the combined_data dictionary.
  5. Update data_storage.py to store nmap results.
  6. Extend calculate_baselines() in data_storage.py to include nmap baselines:
    • Compare current nmap results with historical data to identify changes.
  7. Modify analyze_data_with_llm() prompt to include nmap scan results for analysis.
  8. Consider how to handle nmap permissions.
  9. Improve Nmap data logging to include IP addresses, open ports, and service details.
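Steps 3 and 9 above could look roughly like this. The target and option values are placeholders (the real ones live in config.py), and `summarize_scan` is a hypothetical helper that flattens the dict `nmap.PortScanner().scan()` returns into the IP/hostname/open-port/service shape step 9 calls for:

```python
# Placeholder values; the real ones are defined in config.py.
NMAP_TARGETS = "192.168.1.0/24"
NMAP_SCAN_OPTIONS = "-sT --top-ports 100"

def summarize_scan(scan_result):
    """Flatten a PortScanner().scan() result dict into
    {ip: {"hostname": ..., "open_ports": [...]}} for logging."""
    summary = {}
    for ip, info in scan_result.get("scan", {}).items():
        hostnames = info.get("hostnames", [])
        hostname = hostnames[0].get("name", "") if hostnames else ""
        open_ports = []
        for port, pdata in sorted(info.get("tcp", {}).items()):
            if pdata.get("state") == "open":
                open_ports.append(
                    {"port": port, "service": pdata.get("name", "")}
                )
        summary[ip] = {"hostname": hostname, "open_ports": open_ports}
    return summary

def get_nmap_scan_results():
    import nmap  # python-nmap; also needs the nmap binary on PATH
    nm = nmap.PortScanner()
    return summarize_scan(
        nm.scan(hosts=NMAP_TARGETS, arguments=NMAP_SCAN_OPTIONS)
    )
```

Note that unprivileged scans (step 8) limit nmap to TCP connect scans; SYN scans and some host-discovery options require root.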

Phase 6: Code Refactoring and Documentation

  1. Remove duplicate pingparsing import in monitor_agent.py.
  2. Refactor get_cpu_temperature and get_gpu_temperature to call sensors command only once.
  3. Refactor get_login_attempts to use a position file for efficient log reading.
  4. Simplify JSON parsing in analyze_data_with_llm.
  5. Move LLM prompt to a separate function build_llm_prompt.
  6. Refactor main loop into smaller functions (run_monitoring_cycle, main).
  7. Create helper function in data_storage.py for calculating average metrics.
  8. Update README.md with current project status and improvements.
  9. Create AGENTS.md to document human and autonomous agents.

Previous TODO

  • Improve "high" priority detection by explicitly instructing LLM to output severity in structured JSON format.
  • Implement dynamic contextual information (Known/Resolved Issues Feed) for LLM to improve severity detection.
  • Change baseline calculations to only use integers instead of floats.
  • Add a log file that only keeps records for the past 24 hours.
  • Log all LLM responses to the console.
  • Reduce alerts to only happen between 9 AM and midnight (12 AM).
  • Get hostnames of devices in Nmap scan.
  • Filter out RTT fluctuations below 10 seconds.
  • Filter out temperature fluctuations with differences less than 5 degrees.
  • Create a list of known port numbers and their applications for the LLM to check against when deciding whether an open port is a threat.
  • When calculating averages, round up to the nearest integer. Deliver only whole integers to the LLM; values with decimal points confuse it.
  • In the Discord message, include the exact specific details and the log entries of the problem that triggered the alert.
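The rounding rule above, combined with the Phase 6 averaging helper for data_storage.py, could be as simple as this sketch (`average_metric` is an illustrative name):

```python
import math

def average_metric(values):
    """Average raw samples and round UP to the nearest whole integer,
    so only integer metrics ever reach the LLM prompt."""
    if not values:
        return 0
    return math.ceil(sum(values) / len(values))
```

`math.ceil` rounds up rather than to the nearest integer, matching the "round up" wording above.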

Phase 7: Offloading Analysis from LLM

  1. Create a new function analyze_data_locally in monitor_agent.py.
    • [x] This function will take data, baselines, known_issues, and port_applications as input.
    • [x] It will contain the logic to compare current data with baselines and predefined thresholds.
    • [x] It will be responsible for identifying anomalies for various metrics (CPU/GPU temp, network RTT, failed logins, Nmap changes).
    • [x] It will return a list of dictionaries, where each dictionary represents an anomaly and contains 'severity' and 'reason' keys.
  2. Refactor analyze_data_with_llm into a new function called generate_llm_report.
    • [x] This function will take the list of anomalies from analyze_data_locally as input.
    • [x] It will construct a simple prompt to ask the LLM to generate a human-readable summary of the anomalies.
    • [x] The LLM will no longer be making analytical decisions.
  3. Update run_monitoring_cycle to orchestrate the new workflow.
    • [x] Call analyze_data_locally to get the list of anomalies.
    • [x] If anomalies are found, call generate_llm_report to create the report.
    • [x] Use the output of generate_llm_report for alerting.
  4. Remove the detailed analytical instructions from build_llm_prompt as they will be handled by analyze_data_locally.
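A minimal sketch of the local analyzer described in item 1. Every threshold, key name, and data shape here is illustrative (it folds in the 5-degree temperature filter and known-port check from the TODO list, but simplifies the inputs), not the real implementation:

```python
def analyze_data_locally(data, baselines, known_issues, port_applications):
    """Compare current data against baselines and fixed thresholds;
    return a list of {'severity': ..., 'reason': ...} anomalies."""
    anomalies = []

    # Temperatures: ignore fluctuations under 5 degrees.
    for key in ("cpu_temp", "gpu_temp"):
        current, baseline = data.get(key), baselines.get(key)
        if current is not None and baseline is not None:
            delta = current - baseline
            if delta >= 5:
                anomalies.append({
                    "severity": "high" if delta >= 15 else "medium",
                    "reason": f"{key} is {current} (baseline {baseline})",
                })

    # Failed logins: anything above baseline is suspicious.
    if data.get("failed_logins", 0) > baselines.get("failed_logins", 0):
        anomalies.append({
            "severity": "high",
            "reason": f"{data['failed_logins']} failed login attempts",
        })

    # Nmap: flag open ports missing from the known-applications list.
    for host, info in data.get("nmap", {}).items():
        for port in info.get("open_ports", []):
            if port not in port_applications:
                anomalies.append({
                    "severity": "medium",
                    "reason": f"Unrecognized open port {port} on {host}",
                })

    # Drop anomalies matching the known/resolved issues feed.
    return [a for a in anomalies
            if not any(issue in a["reason"] for issue in known_issues)]
```

With this split, generate_llm_report only has to turn the returned list into prose; severity is decided locally, which also resolves the earlier TODO about structured severity output.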

TODO