Initial commit
README.md (new file, 138 lines)

# Pocket TTS Discord Bot

A Discord bot that reads messages aloud using [Pocket TTS](https://github.com/kyutai-labs/pocket-tts) with voice cloning from a reference WAV file.

## Features

- 🎤 **Voice Cloning**: Uses a reference WAV file to clone a voice
- 📝 **Auto-read Messages**: Automatically reads all messages from a configured text channel
- 🔊 **Voice Channel Streaming**: Streams generated audio to the voice channel that the message author is in
- 📋 **Message Queue**: Messages are queued and spoken in order

## Prerequisites

- Python 3.10+
- FFmpeg installed and available in PATH
- A Discord bot token
- A reference voice WAV file (3-10 seconds of clear speech recommended)

## Installation

1. **Clone the repository**:
   ```bash
   git clone <repository-url>
   cd PocketTTSBot
   ```

2. **Create a virtual environment**:
   ```bash
   python -m venv venv

   # Windows
   venv\Scripts\activate

   # Linux/macOS
   source venv/bin/activate
   ```

3. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

4. **Install FFmpeg**:
   - **Windows**: Download from [ffmpeg.org](https://ffmpeg.org/download.html) and add to PATH
   - **Linux**: `sudo apt install ffmpeg`
   - **macOS**: `brew install ffmpeg`

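After finishing the steps above, it can help to confirm that the two system-level prerequisites are actually in place. The following is a minimal sketch that is not part of the bot itself; the function name `check_prerequisites` is hypothetical, and it only checks the Python version and whether an `ffmpeg` executable can be found on PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # taken from the Prerequisites list


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed.

    `version_info` and `which` are parameters only so the function is easy to test.
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    # shutil.which returns None when no matching executable is on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and environment; printing the returned list before starting the bot gives a quick sanity check.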
## Configuration

1. **Create a Discord Bot**:
   - Go to [Discord Developer Portal](https://discord.com/developers/applications)
   - Create a new application
   - Go to the "Bot" section and create a bot
   - Copy the bot token
   - Enable these Privileged Gateway Intents:
     - Message Content Intent
     - Server Members Intent (optional)

2. **Invite the Bot to your server**:
   - Go to OAuth2 > URL Generator
   - Select scopes: `bot`
   - Select permissions: `Connect`, `Speak`, `Send Messages`, `Read Message History`
   - Use the generated URL to invite the bot

3. **Get Channel ID**:
   - Enable Developer Mode in Discord (Settings > Advanced > Developer Mode)
   - Right-click the text channel you want to monitor and click "Copy Channel ID" (shown as "Copy ID" in older clients)

4. **Create `.env` file**:
   ```bash
   cp .env.example .env
   ```

   Edit `.env` with your values:
   ```env
   DISCORD_TOKEN=your_bot_token_here
   TEXT_CHANNEL_ID=123456789012345678
   VOICE_WAV_PATH=./voice.wav
   ```

5. **Add a voice reference file**:
   - Place a WAV file named `voice.wav` in the project directory
   - The file should contain 3-10 seconds of clear speech
   - Higher-quality audio yields better voice cloning results

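To make the role of the three `.env` settings concrete, here is a hedged sketch of how they could be parsed and validated using only the standard library. How `bot.py` actually loads its configuration is not shown in this README (it may well use a package such as `python-dotenv`), so `BotConfig`, `parse_env`, and `load_config` are hypothetical names for illustration.

```python
from dataclasses import dataclass
from pathlib import Path

REQUIRED_KEYS = ("DISCORD_TOKEN", "TEXT_CHANNEL_ID", "VOICE_WAV_PATH")


@dataclass(frozen=True)
class BotConfig:
    token: str
    text_channel_id: int
    voice_wav_path: Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_config(values: dict[str, str]) -> BotConfig:
    """Check that all three settings are present and convert them to typed fields."""
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise ValueError(f"missing settings: {', '.join(missing)}")
    if not values["TEXT_CHANNEL_ID"].isdecimal():
        raise ValueError("TEXT_CHANNEL_ID must be a numeric channel ID")
    return BotConfig(
        token=values["DISCORD_TOKEN"],
        text_channel_id=int(values["TEXT_CHANNEL_ID"]),
        voice_wav_path=Path(values["VOICE_WAV_PATH"]),
    )
```

For example, `load_config(parse_env(Path(".env").read_text()))` would raise a `ValueError` naming any setting that was left empty, which is easier to diagnose than a failure later at login time.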
## Usage

1. **Start the bot**:
   ```bash
   python bot.py
   ```

2. **Using the bot**:
   - Join a voice channel in your Discord server
   - Type a message in the configured text channel
   - The bot will join your voice channel and read your message aloud
   - Messages are queued if the bot is already speaking

## How It Works

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Text Channel   │ --> │   Pocket TTS     │ --> │  Voice Channel  │
│  (configured)   │     │   (generate)     │     │  (user's VC)    │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                                 ▲
                                 │
                           ┌─────┴─────┐
                           │ voice.wav │
                           │ (speaker) │
                           └───────────┘
```

1. The bot monitors the configured text channel for new messages
2. When a message is received, it's added to the queue
3. The bot generates speech using Pocket TTS with the cloned voice
4. Audio is streamed to the voice channel that the message author is in

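The queueing behaviour in steps 2 to 4 can be sketched with `asyncio.Queue`. This is an illustration of the pattern under stated assumptions, not the bot's actual code: `synthesize` is a hypothetical stand-in for the Pocket TTS call, `play` stands in for streaming to the voice channel, and `None` is used as a shutdown sentinel purely for the demo.

```python
import asyncio


async def synthesize(text: str) -> bytes:
    """Hypothetical stand-in for speech generation; returns fake audio bytes."""
    await asyncio.sleep(0)  # yield to the event loop, as real work would
    return text.encode("utf-8")


async def speaker_worker(queue: asyncio.Queue, play) -> None:
    """Handle one queued message at a time so playback never overlaps."""
    while True:
        text = await queue.get()
        try:
            if text is None:  # sentinel: stop the worker
                return
            audio = await synthesize(text)
            await play(audio)
        finally:
            queue.task_done()


async def demo(messages: list[str]) -> list[bytes]:
    """Queue some messages and return the audio chunks in the order they were played."""
    played = []

    async def play(audio: bytes) -> None:
        played.append(audio)  # a real bot would stream this to the voice channel

    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(speaker_worker(queue, play))
    for message in messages:
        queue.put_nowait(message)
    queue.put_nowait(None)
    await worker
    return played
```

Because a single worker awaits each `play` call before taking the next item, messages are spoken in arrival order even if several arrive while the bot is still speaking.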
## Troubleshooting

### Bot doesn't respond to messages
- Ensure Message Content Intent is enabled in Discord Developer Portal
- Check that the `TEXT_CHANNEL_ID` is correct
- Verify the bot has permissions to read the channel

### No audio in voice channel
- Ensure FFmpeg is installed and in PATH
- Check that the bot has Connect and Speak permissions
- Verify your `voice.wav` file is valid

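Discord represents permissions as a bit field, so the permission check above can be expressed as a few bitwise tests. The sketch below uses the bit positions from Discord's published permissions table for the four permissions named in the Configuration section; `missing_permissions` and `invite_permissions` are hypothetical helper names, and how you obtain the granted integer depends on the bot library in use.

```python
# Bit flags for the permissions this README asks for.
REQUIRED = {
    "Send Messages": 1 << 11,
    "Read Message History": 1 << 16,
    "Connect": 1 << 20,
    "Speak": 1 << 21,
}


def missing_permissions(granted: int) -> list[str]:
    """List the required permissions whose bits are not set in `granted`."""
    return [name for name, bit in REQUIRED.items() if not granted & bit]


def invite_permissions() -> int:
    """Combine all required bits into the integer used in an invite URL."""
    value = 0
    for bit in REQUIRED.values():
        value |= bit
    return value
```

The combined value is the number that the OAuth2 URL Generator places in the `permissions` query parameter when exactly these four boxes are ticked.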
### Voice quality issues
- Use a higher-quality reference WAV file
- Ensure the reference audio is clear with minimal background noise
- Try a longer reference clip (5-10 seconds)

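Before experimenting with different reference clips, it may be worth inspecting the one you have. The following sketch uses Python's standard `wave` module, which reads uncompressed PCM WAV files only, to report the clip's basic properties and compare its length with the 3-10 second recommendation from this README. The function names are hypothetical, and no claim is made here about which sample rate or channel layout Pocket TTS prefers.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate,
        }


def reference_warnings(info: dict, min_s: float = 3.0, max_s: float = 10.0) -> list[str]:
    """Return warnings when the clip length is outside the recommended range."""
    warnings = []
    duration = info["duration_s"]
    if not min_s <= duration <= max_s:
        warnings.append(
            f"clip is {duration:.1f} s; the README recommends {min_s:g}-{max_s:g} s"
        )
    return warnings
```

If `wave.open` raises `wave.Error`, the file is probably not plain PCM (for example, it may be compressed or use a floating-point format) and may need to be re-exported as a standard WAV.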
## License

MIT License