16 Commits

Author SHA1 Message Date
9917d44f5d docs: add HuggingFace cache troubleshooting to README
- Document HF_HOME environment variable for writable cache
- Add systemd service permission guidance for /tmp paths
- Troubleshooting steps for read-only file system errors
2026-02-26 15:56:09 -06:00
85a334a57b docs: update README with comprehensive effects documentation and bump version to 1.2.0
README Updates:
- Updated features list with all new capabilities
- Comprehensive Audio Effects section covering all 7 effects:
  - Pitch, Speed, Echo, Robot, Chorus, Tremolo Depth, Tremolo Rate
- Detailed effect ranges, defaults, and descriptions
- Effect application order documentation
- Performance notes and warnings
- Enhanced Preview with Effects section with examples
- Example effect combinations for users to try

Version Bump:
- Bumped __version__ from 1.1.0 to 1.2.0

Major features in 1.2.0:
- 4 new voice effects (echo, robot, chorus, tremolo)
- Unlimited effects with performance warnings
- Complete effects pipeline implementation
- Enhanced preview system
2026-01-31 17:33:28 -06:00
7e76deed3d feat: wire up all effects to audio processing pipeline
- Updated queue system to pass effects as dict instead of individual params
- Updated process_queue to handle effects_dict for previews
- Updated speak_message to extract all 7 effects from user settings
- Updated _generate_wav_bytes to accept effects dict and pass all params
- Updated _handle_voice_preview to use new effects dict system
- Effects now actually process the audio:
  - pitch, speed, echo, robot, chorus, tremolo_depth, tremolo_rate
- Fixed preview effect description to use preview_effects dict
2026-01-31 17:25:52 -06:00
795d5087e9 feat: add 4 new voice effects (echo, robot, chorus, tremolo)
- Removed MAX_ACTIVE_EFFECTS limit (effects unlimited)
- Added echo effect (0-100%): spatial delay/reverb
- Added robot effect (0-100%): ring modulation voice
- Added chorus effect (0-100%): multiple voices effect
- Added tremolo depth (0.0-1.0) and rate (0.0-10.0 Hz): amplitude modulation
- Effects apply in order: pitch → speed → echo → chorus → tremolo → robot
- Updated /effects command with all 7 effect choices
- Updated /effects list to display all 7 effects with emojis
- Updated warning system: warns when > 2 active effects
- Added validation and formatting for all new effects
- Updated voice_manager.py to handle all 7 effect storage/loading

Note: Cancel button for processing >10s not yet implemented
Note: Queue system needs updating to handle all effect parameters
2026-01-31 17:10:19 -06:00
4cb0a78486 fix: squeeze audio to 1D before applying effects
The TTS model returns a 2D array [samples, 1], but librosa.effects
functions expect 1D arrays. This was causing the warning:
'n_fft=2048 is too large for input signal of length=1'

Fix: Squeeze to 1D before effects, reshape back after.

Also moved the effects application logic to handle the shape
conversion properly.
2026-01-31 16:50:43 -06:00
f082c62a16 fix: use copy_global_to before guild sync for immediate command availability
The issue: Commands registered as global commands weren't being synced
when calling tree.sync(guild=...) because they weren't associated with
the specific guild context.

The fix: Call tree.copy_global_to(guild=...) before sync() to copy global
commands to each guild's context. This makes commands appear immediately
instead of requiring global sync (which can take up to 1 hour).

Reference: discord.py FAQ recommends copy_global_to for development
when you want immediate command availability in specific guilds.
2026-01-31 16:43:10 -06:00
85f3e79d2a debug: add comprehensive logging for command registration and sync
- Added _log_registered_commands() to list all commands in tree
- Added logging in __init__ to track command registration
- Enhanced on_ready() sync logging with detailed information
- Shows registered commands before and during sync
- Shows specific guild sync status with command counts
- Added error handling for Forbidden errors (missing permissions)
- Clear warnings when no guilds are synced
2026-01-31 16:40:23 -06:00
9f14e8c745 feat: add audio effects (pitch and speed control)
- Added new audio_effects.py module with pitch shift and speed change
- Pitch range: -12 to +12 semitones (higher = chipmunk, lower = deeper)
- Speed range: 0.5 to 2.0x (higher = faster, lower = slower)
- Maximum 2 active effects per user (performance optimization)
- Added /effects command group:
  - /effects list - Shows current effects with descriptions
  - /effects set pitch|speed <value> - Apply effects
  - /effects reset - Confirmation UI to clear all effects
- Effects persist across restarts in preferences.json
- Updated /voice preview to support optional pitch/speed parameters
- Effects applied in _generate_wav_bytes using librosa
- Added performance warnings when processing takes >1 second
- Updated README with effects documentation
2026-01-31 15:43:29 -06:00
4a2d72517f feat: add /voice preview command
- Added 8 random preview sample lines for voice testing
- New /voice preview <name> command to hear voices before selecting
- Previews play in queue like regular messages (no queue jumping)
- Preview does NOT change user's active voice preference
- Updated queue system to support voice override for previews
- Added documentation for new command in README
2026-01-31 15:06:45 -06:00
2403b431e9 chore: bump version to 1.1.0
Major features added since 1.0.0:
- Test Mode support for safe development
- Auto-updates dependencies on startup
- Multi-voice support with per-user preferences
- Voice persistence across restarts
- Hot-reload voices without restart
2026-01-31 14:47:52 -06:00
c5e3fd33c4 Added Test Mode 2026-01-31 14:42:08 -06:00
d0de47bdd7 fix: replace emoji characters with ASCII-safe markers for Windows compatibility
- Replace Unicode emoji (✓, ⚠️) with [OK] and [WARN] in audio_preprocessor.py
  to prevent UnicodeEncodeError on Windows console (cp1252 codec)
- Add auto-update dependencies function to bot.py for easier maintenance
- Remove setup_linux.sh (no longer needed)
- Update .gitignore to exclude VS Code launch.json
2026-01-31 13:54:27 -06:00
a46ddc9b21 Added Disconnect 2026-01-18 18:27:01 -06:00
736a819493 feat: Rename pockettts service to vox and improve numba caching
Renamed the systemd service from "pockettts" to "vox" for better branding and clarity.
Updated the  script to reflect the new service name.

Addressed numba caching issues when running as a systemd service:
- Created  to explicitly set  to a project-local directory ().
- Modified  to import  early in the execution flow.
- Updated the systemd service file to grant write permissions to the  directory.
- Added  to  to prevent caching files from being committed.
2026-01-18 18:09:10 -06:00
92dfcb1d39 feat: Implement multi-voice support and management
Refactor the TTS handling to support multiple, user-selectable voices. This replaces the previous single-voice system.

Key changes:
- Introduce VoiceManager to handle loading and managing voices from a dedicated oices/ directory.
- Add slash commands (/voice list, /set, /current, /refresh) for users to manage their personal TTS voice.
- Implement on-demand voice loading to improve startup time and memory usage.
- Remove the old 	ts_handler.py and single voice .wav files in favor of the new system.
- Update configuration to specify a voices directory instead of a single file path.
2026-01-18 17:24:12 -06:00
ae1c2a65d3 Initial commit 2026-01-18 17:08:37 -06:00