Refactor: Integrate scripts into a single application (v1.2.0)

2025-12-29 16:45:40 -06:00
parent 671741772f
commit 5bd154fb4e
6 changed files with 213 additions and 251 deletions


# Territory Analysis Tool v1.2.0
## Overview
This tool provides a complete pipeline for processing and analyzing territory data.
The workflow is managed by a command-line script that gives the user fine-grained control over the execution process.
## Installation
This tool requires Python 3 and has a few dependencies.
1. **Install dependencies:**
Navigate to this directory in your terminal and run the following command to install the required Python libraries:
```bash
pip install -r requirements.txt
```
## File Structure
All necessary files are located in this directory.
### Core Scripts
- `run_all.py`: The main command-line script to run the workflow. **This is the recommended entry point.**
- `process_territories.py`: (Step 1) Combines address and boundary data.
- `analysis.py`: (Step 2) Performs general territory analysis and generates `map.html`.
- `category_analysis.py`: (Step 2) Performs category-specific analysis and generates `category_map.html`.
### Input Data Files
- The tool is designed to work with any address and boundary CSV files.
- The example files `Okinawa Territory Jan 2026 - Addresses.csv` and `Okinawa Territory Jan 2026 - Boundaries.csv` are provided. Both can be exported from NW Scheduler: go to Export -> Territories and download them from there.
### Other Files
- `requirements.txt`: A list of Python dependencies.
## Usage
The entire workflow is managed through `run_all.py` using a command-line interface. You can see all available commands by running:
```bash
python run_all.py --help
```
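Internally, the CLI is built with `argparse` subcommands (introduced in v1.1.0). A minimal sketch of how such a parser could be laid out — the structure below is illustrative, not the actual contents of `run_all.py`:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Top-level parser with one subcommand per workflow stage.
    parser = argparse.ArgumentParser(
        prog="run_all.py", description="Territory analysis workflow"
    )
    sub = parser.add_subparsers(dest="command", required=True)

    proc = sub.add_parser("process", help="Combine address and boundary CSVs")
    proc.add_argument("--addresses", required=True)
    proc.add_argument("--boundaries", required=True)

    ana = sub.add_parser("analyze", help="Analyze a processed 'Final' CSV")
    ana.add_argument("--input", required=True)

    full = sub.add_parser("full-run", help="Process and analyze in one pass")
    full.add_argument("--addresses", required=True)
    full.add_argument("--boundaries", required=True)

    return parser

# Example: parse a full-run invocation.
args = build_parser().parse_args(
    ["full-run", "--addresses", "a.csv", "--boundaries", "b.csv"]
)
```

Each subcommand carries only the arguments it needs, so `--help` on any subcommand shows exactly the flags documented below.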
### Full Pipeline Run
To run the entire process from start to finish (process the raw files, then analyze them in memory), use the `full-run` command. This is the most common use case.
**Command:**
```bash
python run_all.py full-run --addresses <path_to_addresses.csv> --boundaries <path_to_boundaries.csv>
```
**Example:**
```bash
python run_all.py full-run --addresses "Okinawa Territory Jan 2026 - Addresses.csv" --boundaries "Okinawa Territory Jan 2026 - Boundaries.csv"
```
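Per the v1.2.0 changelog, `full-run` chains the two stages by passing a pandas DataFrame directly between imported functions instead of shelling out via `subprocess`. A rough sketch of that orchestration — the function bodies here are placeholders, not the real modules:

```python
import pandas as pd

def process(addresses_path: str, boundaries_path: str) -> pd.DataFrame:
    # Placeholder for process_territories: the real module builds the
    # combined "Final" table from the two input CSVs.
    return pd.DataFrame({"TerritoryID": [1, 2], "AddressCount": [10, 5]})

def analyze(final: pd.DataFrame) -> dict:
    # Placeholder for the analysis modules, which write reports and maps.
    return {
        "territories": len(final),
        "addresses": int(final["AddressCount"].sum()),
    }

def full_run(addresses_path: str, boundaries_path: str) -> dict:
    final = process(addresses_path, boundaries_path)  # DataFrame stays in memory
    return analyze(final)                             # no "Final" CSV is written

summary = full_run("Addresses.csv", "Boundaries.csv")
```

Keeping the intermediate result in memory is what lets `full-run` skip the intermediate "Final" CSV that a standalone `process` step would write.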
### Running Steps Individually
You can also run each step of the pipeline separately.
#### Step 1: Process Raw Files
To combine the address and boundary files and save the result to a single "Final" CSV, use the `process` command.
**Command:**
```bash
python run_all.py process --addresses <path_to_addresses.csv> --boundaries <path_to_boundaries.csv>
```
This will generate a new file named `Okinawa Territory <Mon Year> - Final.csv`.
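As described under Workflow Details, this step counts addresses per `TerritoryID` and merges the count into the boundary table. A minimal pandas sketch of that merge, using tiny inline stand-ins for the two CSVs (column names other than `TerritoryID` are assumptions):

```python
import io
import pandas as pd

# Inline stand-ins for the exported Addresses and Boundaries CSVs.
addresses_csv = io.StringIO(
    "TerritoryID,Address\n1,10 Main St\n1,12 Main St\n2,5 Hill Rd\n"
)
boundaries_csv = io.StringIO(
    "TerritoryID,Boundary\n1,poly-a\n2,poly-b\n3,poly-c\n"
)

addresses = pd.read_csv(addresses_csv)
boundaries = pd.read_csv(boundaries_csv)

# Count addresses per territory, then left-merge onto the boundaries so
# territories with no addresses are kept (their count becomes 0).
counts = addresses.groupby("TerritoryID").size().rename("AddressCount")
final = boundaries.merge(counts, on="TerritoryID", how="left")
final["AddressCount"] = final["AddressCount"].fillna(0).astype(int)
# The process step would then save `final` as the "Final" CSV.
```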
#### Step 2: Analyze a Processed File
To run the analysis and generate maps from a "Final" CSV file, use the `analyze` command.
**Command:**
```bash
python run_all.py analyze --input <path_to_final_file.csv>
```
**Example:**
```bash
python run_all.py analyze --input "Okinawa Territory Dec 2025 - Final.csv"
```
## Workflow Details
1. **Data Processing:** The `process_territories.py` module reads `Addresses.csv` to count addresses per `TerritoryID` and merges this count into the `Boundaries.csv` data. When run via the `process` command, it outputs a new CSV file named in the format `Okinawa Territory Mon Year - Final.csv`.
2. **Data Analysis:** The `analysis.py` and `category_analysis.py` modules take the "Final" data as input to generate reports and interactive maps.
## Output Files
- `Okinawa Territory <Mon Year> - Final.csv`: The consolidated data file.
- `analysis.md`: A markdown summary of the general territory analysis.
- `map.html`: An interactive map visualizing territories colored by address count.
- `category_map.html`: An interactive map visualizing territories colored by their category's total address count.
## Changelog
### v1.2.0 (Current)
- Refactored the tool from a collection of separate scripts into a single, integrated Python application.
- Replaced `subprocess` calls with direct function imports for improved performance and reliability.
- Integrated the `pandas` library for more efficient in-memory data processing.
- The `full-run` command now processes data in memory without writing an intermediate CSV file.
- Added a `requirements.txt` file for easier dependency management.
### v1.1.0
- Introduced a command-line interface with `argparse` to replace the interactive menu.
- Added `process`, `analyze`, and `full-run` commands.
- Allowed for dynamic input file paths via command-line arguments.
### v1.0.0
- Initial release with separate scripts for processing and analysis.
- Workflow managed by an interactive `run_all.py` script.
- Project structure consolidated into a single directory.
- Git repository initialized.