4.9 KiB
sidebar_position | title |
---|---|
11 | Integrating OpenedAI-Speech with Open WebUI using Docker Desktop |
Integrating openedai-speech
into Open WebUI using Docker Desktop
What is openedai-speech
?
openedai-speech is an OpenAI API compatible text-to-speech server that uses Coqui AI's xtts_v2
and/or Piper TTS
as the backend. It's a free, private, text-to-speech server that allows for custom voice cloning and is compatible with the OpenAI audio/speech API.
Prerequisites
- Docker Desktop installed on your system
- Open WebUI running in a Docker container
- A basic understanding of Docker and Docker Compose
Option 1: Using Docker Compose
Step 1: Create a new folder for the openedai-speech
service
Create a new folder, for example, openedai-speech-service
, to store the docker-compose.yml
and .env
files.
Step 2: Create a docker-compose.yml
file
In the openedai-speech-service
folder, create a new file named docker-compose.yml
with the following contents:
services:
server:
image: ghcr.io/matatonic/openedai-speech
container_name: openedai-speech
env_file: .env
ports:
- "8000:8000"
volumes:
- tts-voices:/app/voices
- tts-config:/app/config
# labels:
# - "com.centurylinklabs.watchtower.enable=true"
restart: unless-stopped
volumes:
tts-voices:
tts-config:
Step 3: Create an .env
file (optional)
In the same openedai-speech-service
folder, create a new file named .env
with the following contents:
TTS_HOME=voices
HF_HOME=voices
#PRELOAD_MODEL=xtts
#PRELOAD_MODEL=xtts_v2.0.2
#PRELOAD_MODEL=parler-tts/parler_tts_mini_v0.1
Step 4: Run docker-compose
to start the openedai-speech
service
Run the following command in the openedai-speech-service
folder to start the openedai-speech
service in detached mode:
docker compose up -d
This will start the openedai-speech
service in the background.
Option 2: Using Docker Run Commands
You can also use the following Docker run commands to start the openedai-speech
service in detached mode:
With GPU (Nvidia) support:
docker run -d --gpus=all -p 8000:8000 -v tts-voices:/app/voices -v tts-config:/app/config --name openedai-speech ghcr.io/matatonic/openedai-speech:latest
Alternative without GPU support:
docker run -d -p 8000:8000 -v tts-voices:/app/voices -v tts-config:/app/config --name openedai-speech ghcr.io/matatonic/openedai-speech-min:latest
Configuring Open WebUI
For more information on configuring Open WebUI to use openedai-speech
, including setting environment variables, see the Open WebUI documentation.
Step 5: Configure Open WebUI to use openedai-speech
Open the Open WebUI settings and navigate to the TTS Settings under Admin Panel > Settings > Audio. Add the following configuration:
- API Base URL:
http://host.docker.internal:8000/v1
- API Key:
sk-111111111
(note: this is a dummy API key, asopenedai-speech
doesn't require an API key; you can use whatever for this field)
Step 6: Choose a voice
Under Set Voice, you can choose from the following voices:
- alloy
- echo
- echo-alt
- fable
- onyx
- nova
- shimmer
Step 7: Enjoy naturally sounding voices
You should now be able to use the openedai-speech
integration with Open WebUI to generate naturally sounding voices.
Troubleshooting
If you encounter any issues, make sure that:
- The
openedai-speech
service is running and the port you set in the docker-compose.yml file is exposed. - The
host.docker.internal
hostname is resolvable from within the Open WebUI container.host.docker.internal
is required sinceopenedai-speech
is exposed vialocalhost
on your PC, butopen-webui
cannot normally access this from within its container. - The API key is set to a dummy value, as
openedai-speech
doesn't require an API key.
Additional Resources
For more information on openedai-speech
, please visit the GitHub repository.
Note: You can change the port number in the docker-compose.yml
file to any open and usable port, but make sure to update the API Base URL in Open WebUI Admin Audio settings accordingly.