Pocket TTS Discord Bot
A Discord bot that reads messages aloud using Pocket TTS with voice cloning from a reference WAV file.
Features
- 🎤 Voice Cloning: Uses a reference WAV file to clone a voice
- 📝 Auto-read Messages: Automatically reads all messages from a configured text channel
- 🔊 Voice Channel Streaming: Streams generated audio to the voice channel where the message author is
- 📋 Message Queue: Messages are queued and spoken in order
- 🔄 Per-User Voice Selection: Each user can choose their own TTS voice via /voice commands
- 💾 Voice Persistence: User voice preferences are saved and restored on restart
- 🔄 Hot-reload Voices: Add new voices without restarting the bot using /voice refresh
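The voice persistence feature can be pictured as a small JSON file that maps each user ID to a voice name. The sketch below is only an illustration: the function names and the on-disk format are assumptions, not the bot's actual storage code.

```python
import json
from pathlib import Path


def load_preferences(path: Path) -> dict[str, str]:
    """Return the saved user-ID -> voice-name mapping, or {} if no file exists."""
    if not path.exists():
        return {}
    with path.open("r", encoding="utf-8") as fh:
        data = json.load(fh)
    # JSON object keys are always strings, so user IDs are kept as strings.
    return {str(user_id): str(voice) for user_id, voice in data.items()}


def save_preferences(path: Path, prefs: dict[str, str]) -> None:
    """Write to a temporary file first, then rename it over the real one."""
    tmp = path.with_name(path.name + ".tmp")
    with tmp.open("w", encoding="utf-8") as fh:
        json.dump(prefs, fh, indent=2, sort_keys=True)
    # Renaming within one directory avoids leaving a half-written file behind.
    tmp.replace(path)
```

Loading this mapping at startup and saving it after every voice change would be one way to restore preferences after a restart.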
Prerequisites
- Python 3.10+
- FFmpeg installed and available in PATH
- A Discord bot token
- A reference voice WAV file (3-10 seconds of clear speech recommended)
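The first two prerequisites can be checked from Python before starting the bot. This is a minimal sketch that uses only the standard library; `check_prerequisites` is a hypothetical helper name and is not part of the repository.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10)) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {wanted}+ required, found {found}")
    # shutil.which searches PATH the same way the shell would.
    if shutil.which("ffmpeg") is None:
        problems.append("FFmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```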
Installation
- Clone the repository:
    git clone <repository-url>
    cd PocketTTSBot
- Create a virtual environment:
    python -m venv venv
    # Windows
    venv\Scripts\activate
    # Linux/macOS
    source venv/bin/activate
- Install dependencies:
    pip install -r requirements.txt
- Install FFmpeg:
  - Windows: Download from ffmpeg.org and add to PATH
  - Linux: sudo apt install ffmpeg
  - macOS: brew install ffmpeg
Configuration
- Create a Discord Bot:
  - Go to Discord Developer Portal
  - Create a new application
  - Go to the "Bot" section and create a bot
  - Copy the bot token
  - Enable these Privileged Gateway Intents:
    - Message Content Intent
    - Server Members Intent (optional)
- Invite the Bot to your server:
  - Go to OAuth2 > URL Generator
  - Select scopes: bot
  - Select permissions: Connect, Speak, Send Messages, Read Message History
  - Use the generated URL to invite the bot
- Get Channel ID:
  - Enable Developer Mode in Discord (Settings > Advanced > Developer Mode)
  - Right-click the text channel you want to monitor and click "Copy ID"
- Create .env file:
    cp .env.example .env
  Edit .env with your values:
    DISCORD_TOKEN=your_bot_token_here
    TEXT_CHANNEL_ID=123456789012345678
    VOICES_DIR=./voices
    DEFAULT_VOICE=estinien
- Add voice reference files:
  - Create a voices/ directory: mkdir voices
  - Place .wav files in the voices/ directory
  - Each file should contain 3-10 seconds of clear speech
  - File names become voice names (e.g., MasterChief.wav → /voice set masterchief)
  - Higher quality audio = better voice cloning results
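To show how the four .env values fit together, here is a hedged sketch of a loader that validates them once they are present in the process environment. The `BotConfig` class, the `load_config` function, and the fallback to `./voices` are assumptions for illustration; how bot.py actually reads its settings is not shown in this README.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class BotConfig:
    token: str
    text_channel_id: int
    voices_dir: Path
    default_voice: str


def load_config(env: Mapping[str, str] = os.environ) -> BotConfig:
    """Build a BotConfig from environment-style key/value pairs."""
    token = env.get("DISCORD_TOKEN", "").strip()
    if not token:
        raise ValueError("DISCORD_TOKEN is not set")
    raw_channel = env.get("TEXT_CHANNEL_ID", "").strip()
    # Channel IDs copied from Discord are plain decimal integers.
    if not raw_channel.isdigit():
        raise ValueError("TEXT_CHANNEL_ID must be a numeric channel ID")
    return BotConfig(
        token=token,
        text_channel_id=int(raw_channel),
        # Assumed fallback; the example .env uses ./voices as well.
        voices_dir=Path(env.get("VOICES_DIR", "./voices")),
        # Lowercased to match the voice naming shown above (assumption).
        default_voice=env.get("DEFAULT_VOICE", "").strip().lower(),
    )
```

Failing early with a clear message makes a mistyped channel ID or a missing token easier to spot than a silent bot.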
Usage
- Start the bot:
    python bot.py
- Using the bot:
  - Join a voice channel in your Discord server
  - Type a message in the configured text channel
  - The bot will join your voice channel and read your message aloud
  - Messages are queued if the bot is already speaking
- Voice Commands (Slash Commands):
  - /voice list - Shows all available voices
  - /voice set <name> - Change your personal TTS voice
  - /voice current - Shows your current voice
  - /voice refresh - Re-scan for new voice files (no restart needed)
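The re-scan behind /voice refresh and the file-name-to-voice-name rule can be sketched as a directory walk. Both function names are hypothetical, and the fallback to the default voice when a saved voice has disappeared is an assumption about sensible behaviour, not a description of the bot's code.

```python
from pathlib import Path
from typing import Optional


def scan_voices(voices_dir: Path) -> dict[str, Path]:
    """Map lowercased file stems to .wav paths, e.g. MasterChief.wav -> masterchief."""
    if not voices_dir.is_dir():
        return {}
    voices = {}
    for path in sorted(voices_dir.iterdir()):
        if path.is_file() and path.suffix.lower() == ".wav":
            voices[path.stem.lower()] = path
    return voices


def resolve_voice(requested: str, voices: dict[str, Path], default: str) -> Optional[Path]:
    """Return the requested voice, else the default voice, else None."""
    key = requested.strip().lower()
    if key in voices:
        return voices[key]
    return voices.get(default.strip().lower())
```

Calling `scan_voices` again and swapping in the new dictionary is all a refresh would need in this sketch, which is why no restart is required.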
How It Works
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Text Channel │ --> │ Pocket TTS │ --> │ Voice Channel │
│ (configured) │ │ (generate) │ │ (user's VC) │
└─────────────────┘ └──────────────────┘ └─────────────────┘
▲
│
┌─────┴─────┐
│ voices/ │
│ per-user │
└───────────┘
- Bot monitors the configured text channel for new messages
- When a message is received, it's added to the queue
- The bot generates speech using Pocket TTS with the cloned voice
- Audio is streamed to the voice channel where the message author is
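The queueing step above can be illustrated with `asyncio.Queue` and a single worker, which guarantees that messages are spoken in arrival order. This is a sketch only: `speak_worker` and `fake_speak` are hypothetical names, and the real speech generation and voice-channel playback are replaced by a stub.

```python
import asyncio
import contextlib
from typing import Awaitable, Callable


async def speak_worker(
    queue: asyncio.Queue,
    speak: Callable[[str, str], Awaitable[None]],
) -> None:
    """Take (author, text) items one at a time so playback never overlaps."""
    while True:
        author, text = await queue.get()
        try:
            await speak(author, text)
        finally:
            queue.task_done()


async def demo() -> list[str]:
    spoken: list[str] = []

    async def fake_speak(author: str, text: str) -> None:
        # Stand-in for "generate audio, then stream it to the author's channel".
        await asyncio.sleep(0)
        spoken.append(f"{author}: {text}")

    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(speak_worker(queue, fake_speak))
    for item in [("alice", "hello"), ("bob", "hi"), ("alice", "bye")]:
        queue.put_nowait(item)
    await queue.join()  # wait until every queued message has been handled
    worker.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await worker
    return spoken


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A single consumer keeps the ordering guarantee simple: a message that arrives while the bot is speaking waits in the queue.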
Troubleshooting
Bot doesn't respond to messages
- Ensure Message Content Intent is enabled in Discord Developer Portal
- Check that the TEXT_CHANNEL_ID is correct
- Verify the bot has permissions to read the channel
No audio in voice channel
- Ensure FFmpeg is installed and in PATH
- Check that the bot has Connect and Speak permissions
- Verify that the selected voice's .wav file in voices/ is a valid WAV file
Voice quality issues
- Use a higher quality reference WAV file
- Ensure the reference audio is clear with minimal background noise
- Try a longer reference clip (5-10 seconds)
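Clip length is easy to check with the standard-library `wave` module. The sketch below assumes an uncompressed PCM WAV file, which is the only kind `wave` reads. The helper names are hypothetical, and the 3-10 second bounds come from the recommendation in this README.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the clip length in seconds (frame count divided by sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def clip_length_ok(path: str, min_s: float = 3.0, max_s: float = 10.0) -> bool:
    """True if the clip falls inside the recommended 3-10 second range."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

If `wave.open` raises `wave.Error`, the file is not a PCM WAV that the module can parse, which is a useful hint that the reference file may need to be re-exported.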
Linux Server Deployment
To run the bot as a service on a Linux server:
Quick Setup (Recommended)
# Make the setup script executable
chmod +x setup_linux.sh
# Run the setup script
./setup_linux.sh
The script will:
- Check system dependencies (Python 3.10+, FFmpeg, pip)
- Create a virtual environment and install dependencies
- Create a .env template if needed
- Optionally install and configure the systemd service
Manual Setup
- Install system dependencies:
    # Ubuntu/Debian
    sudo apt update
    sudo apt install python3 python3-pip python3-venv ffmpeg
    # Fedora
    sudo dnf install python3 python3-pip ffmpeg
    # Arch
    sudo pacman -S python python-pip ffmpeg
- Set up the project:
    cd /path/to/PocketTTSBot
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
- Configure the service:
  Edit pockettts.service:
  - Replace YOUR_USERNAME with your Linux username
  - Update paths if your bot is not in /home/YOUR_USERNAME/PocketTTSBot
- Install the service:
    sudo cp pockettts.service /etc/systemd/system/
    sudo systemctl daemon-reload
    sudo systemctl enable pockettts   # Start on boot
    sudo systemctl start pockettts    # Start now
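The placeholder replacement in the "Configure the service" step can also be scripted. The sketch below is illustrative: `fill_service_template` is a hypothetical helper, and `SAMPLE_UNIT` is a made-up two-line excerpt used only to demonstrate the substitution, not the contents of the repository's pockettts.service.

```python
PLACEHOLDER = "YOUR_USERNAME"

# Hypothetical excerpt for demonstration; the real unit file may differ.
SAMPLE_UNIT = (
    "User=YOUR_USERNAME\n"
    "WorkingDirectory=/home/YOUR_USERNAME/PocketTTSBot\n"
)


def fill_service_template(template: str, username: str) -> str:
    """Return the unit text with every placeholder replaced by `username`."""
    if not username or "/" in username or any(c.isspace() for c in username):
        raise ValueError(f"not a plausible Linux username: {username!r}")
    if PLACEHOLDER not in template:
        raise ValueError(f"template contains no {PLACEHOLDER} placeholder")
    return template.replace(PLACEHOLDER, username)


if __name__ == "__main__":
    print(fill_service_template(SAMPLE_UNIT, "tts"), end="")
```

If the bot lives outside the home directory, the paths still need a manual edit, since this sketch only substitutes the username.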
Service Management
# Check status
sudo systemctl status pockettts
# View logs (live)
journalctl -u pockettts -f
# View recent logs
journalctl -u pockettts --since "1 hour ago"
# Restart after changes
sudo systemctl restart pockettts
# Stop the bot
sudo systemctl stop pockettts
# Disable auto-start
sudo systemctl disable pockettts
Updating the Bot
cd /path/to/PocketTTSBot
git pull # If using git
source venv/bin/activate
pip install -r requirements.txt
sudo systemctl restart pockettts
License
MIT License