Pocket TTS Discord Bot
A Discord bot that reads messages aloud using Pocket TTS with voice cloning from a reference WAV file.
Features
- 🎤 Voice Cloning: Uses a reference WAV file to clone a voice
- 📝 Auto-read Messages: Automatically reads all messages from a configured text channel
- 🔊 Voice Channel Streaming: Streams generated audio to the voice channel the message author is currently in
- 📋 Message Queue: Messages are queued and spoken in order
Prerequisites
- Python 3.10+
- FFmpeg installed and available in PATH
- A Discord bot token
- A reference voice WAV file (3-10 seconds of clear speech recommended)
Installation
- Clone the repository:
    git clone <repository-url>
    cd PocketTTSBot
- Create a virtual environment:
    python -m venv venv
    # Windows
    venv\Scripts\activate
    # Linux/macOS
    source venv/bin/activate
- Install dependencies:
    pip install -r requirements.txt
- Install FFmpeg:
  - Windows: Download from ffmpeg.org and add to PATH
  - Linux: sudo apt install ffmpeg
  - macOS: brew install ffmpeg
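Before starting the bot, it can help to confirm that the interpreter version and FFmpeg requirements are met. The following is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper and is not part of this repository.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), ffmpeg_name="ffmpeg"):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        found = sys.version.split()[0]
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ is required, found {found}"
        )
    # shutil.which returns None when the executable cannot be found on PATH
    if shutil.which(ffmpeg_name) is None:
        problems.append(f"{ffmpeg_name} was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```

Running the script prints nothing when both checks pass.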
Configuration
- Create a Discord Bot:
  - Go to Discord Developer Portal
  - Create a new application
  - Go to the "Bot" section and create a bot
  - Copy the bot token
  - Enable these Privileged Gateway Intents:
    - Message Content Intent
    - Server Members Intent (optional)
- Invite the Bot to your server:
  - Go to OAuth2 > URL Generator
  - Select scopes: bot
  - Select permissions: Connect, Speak, Send Messages, Read Message History
  - Use the generated URL to invite the bot
- Get Channel ID:
  - Enable Developer Mode in Discord (Settings > Advanced > Developer Mode)
  - Right-click the text channel you want to monitor and click "Copy ID"
- Create .env file:
    cp .env.example .env
  Edit .env with your values:
    DISCORD_TOKEN=your_bot_token_here
    TEXT_CHANNEL_ID=123456789012345678
    VOICE_WAV_PATH=./voice.wav
- Add a voice reference file:
  - Place a WAV file named voice.wav in the project directory
  - The file should contain 3-10 seconds of clear speech
  - Higher-quality reference audio generally gives better voice cloning results
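The three settings above can be validated at startup so that a missing or malformed value fails early with a clear message. This is a sketch only: `BotConfig` and `load_config` are hypothetical names, and it assumes the variables are already present in the process environment. How bot.py actually loads the .env file is not shown in this README.

```python
import os
from dataclasses import dataclass
from pathlib import Path

REQUIRED_KEYS = ("DISCORD_TOKEN", "TEXT_CHANNEL_ID", "VOICE_WAV_PATH")


@dataclass(frozen=True)
class BotConfig:
    token: str
    text_channel_id: int
    voice_wav_path: Path


def load_config(env=None):
    """Build a BotConfig from a mapping of settings (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    try:
        # Discord channel IDs are large integers, so parse the string explicitly
        channel_id = int(env["TEXT_CHANNEL_ID"])
    except ValueError:
        raise ValueError("TEXT_CHANNEL_ID must be a numeric channel ID") from None
    return BotConfig(
        token=env["DISCORD_TOKEN"],
        text_channel_id=channel_id,
        voice_wav_path=Path(env["VOICE_WAV_PATH"]),
    )
```

Passing a plain dictionary instead of relying on `os.environ` makes the function easy to exercise in isolation.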
Usage
- Start the bot:
    python bot.py
- Using the bot:
  - Join a voice channel in your Discord server
  - Type a message in the configured text channel
  - The bot will join your voice channel and read your message aloud
  - Messages are queued if the bot is already speaking
How It Works
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Text Channel   │ --> │    Pocket TTS    │ --> │  Voice Channel  │
│  (configured)   │     │    (generate)    │     │   (user's VC)   │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                                 ▲
                                 │
                           ┌─────┴─────┐
                           │ voice.wav │
                           │ (speaker) │
                           └───────────┘
- Bot monitors the configured text channel for new messages
- When a message is received, it's added to the queue
- The bot generates speech using Pocket TTS with the cloned voice
- Audio is streamed to the voice channel the message author is currently in
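The queueing behaviour described above can be illustrated with a single asyncio worker that handles one message at a time, so speech never overlaps and messages are spoken in arrival order. This is a simplified sketch, not the code in bot.py: `speak_worker`, `synthesize`, and `play` are hypothetical names, and the demo substitutes stand-in coroutines for the real Pocket TTS and Discord playback calls.

```python
import asyncio


async def speak_worker(queue, synthesize, play):
    """Consume queued messages one at a time, in the order they arrived."""
    while True:
        text = await queue.get()
        try:
            audio = await synthesize(text)  # hypothetical TTS coroutine
            await play(audio)               # hypothetical playback coroutine
        finally:
            queue.task_done()


async def demo(messages):
    """Run the worker against stand-in coroutines and return what was 'played'."""
    spoken = []

    async def fake_synthesize(text):
        await asyncio.sleep(0)  # yield control, as real synthesis would
        return f"audio({text})"

    async def fake_play(audio):
        spoken.append(audio)

    queue = asyncio.Queue()
    worker = asyncio.create_task(speak_worker(queue, fake_synthesize, fake_play))
    for message in messages:
        queue.put_nowait(message)
    await queue.join()  # wait until every queued message has been handled
    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return spoken


if __name__ == "__main__":
    print(asyncio.run(demo(["hello", "world"])))
```

In a real bot, the message handler would put each incoming message on the queue, and the worker would run for the lifetime of the process.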
Troubleshooting
Bot doesn't respond to messages
- Ensure Message Content Intent is enabled in Discord Developer Portal
- Check that the TEXT_CHANNEL_ID is correct
- Verify the bot has permissions to read the channel
No audio in voice channel
- Ensure FFmpeg is installed and in PATH
- Check that the bot has Connect and Speak permissions
- Verify your voice.wav file is valid
Voice quality issues
- Use a higher quality reference WAV file
- Ensure the reference audio is clear with minimal background noise
- Try a longer reference clip (5-10 seconds)
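To check a reference clip against the recommended 3-10 second range, the standard-library `wave` module can report its duration, sample rate, and channel count. This is a minimal sketch with hypothetical helper names (`describe_wav`, `reference_warnings`). Note that `wave` reads only uncompressed PCM files and raises `wave.Error` for other encodings.

```python
import wave


def describe_wav(path):
    """Return (duration_seconds, sample_rate, channels) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        channels = wav.getnchannels()
    return frames / rate, rate, channels


def reference_warnings(path, min_seconds=3.0, max_seconds=10.0):
    """Return warnings when the clip falls outside the recommended length."""
    duration, _rate, _channels = describe_wav(path)
    warnings = []
    if duration < min_seconds:
        warnings.append(f"clip is {duration:.1f}s, shorter than {min_seconds:.0f}s")
    elif duration > max_seconds:
        warnings.append(f"clip is {duration:.1f}s, longer than {max_seconds:.0f}s")
    return warnings


if __name__ == "__main__":
    print(describe_wav("voice.wav"))
    print(reference_warnings("voice.wav"))
```

A `wave.Error` here suggests the file is not a plain PCM WAV, which may also explain a missing or distorted voice.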
License
MIT License