diff --git a/docs/tutorial/openedai-speech-integration.md b/docs/tutorial/openedai-speech-integration.md
new file mode 100644
index 0000000..c631114
--- /dev/null
+++ b/docs/tutorial/openedai-speech-integration.md
@@ -0,0 +1,101 @@
+---
+sidebar_position: 11
+title: "Integrating OpenedAI-Speech with Open WebUI using Docker Desktop"
+---
+
+Integrating `openedai-speech` into Open WebUI using Docker Desktop
+================================================================
+
+**Prerequisites**
+---------------
+
+* Docker Desktop installed on your system
+* Open WebUI running in a Docker container
+* A basic understanding of Docker and Docker Compose
+
+**Step 1: Create a new folder for the `openedai-speech` service**
+---------------------------------------------------------
+
+Create a new folder, for example `openedai-speech-service`, to hold the `docker-compose.yml` and `.env` files.
+
+**Step 2: Create a `docker-compose.yml` file**
+------------------------------------------
+
+In the `openedai-speech-service` folder, create a new file named `docker-compose.yml` with the following contents:
+```yaml
+services:
+  server:
+    image: ghcr.io/matatonic/openedai-speech
+    container_name: openedai-speech
+    env_file: .env
+    ports:
+      - "8000:8000"
+    volumes:
+      - tts-voices:/app/voices
+      - tts-config:/app/config
+    # labels:
+    #  - "com.centurylinklabs.watchtower.enable=true"
+    restart: unless-stopped
+
+volumes:
+  tts-voices:
+  tts-config:
+```
+**Step 3: Create an `.env` file (optional)**
+-----------------------------------------
+
+In the same `openedai-speech-service` folder, create a new file named `.env` with the following contents. The contents are optional, but the file itself must exist because `docker-compose.yml` references it through `env_file`:
+```
+TTS_HOME=voices
+HF_HOME=voices
+#PRELOAD_MODEL=xtts
+#PRELOAD_MODEL=xtts_v2.0.2
+#PRELOAD_MODEL=parler-tts/parler_tts_mini_v0.1
+```
+**Step 4: Run `docker compose` to start the `openedai-speech` service**
+---------------------------------------------------------
+
+Run the following command in the `openedai-speech-service` folder to start the `openedai-speech` service in detached mode:
+```
+docker compose up -d
+```
+The service now runs in the background.
+
+**Step 5: Configure Open WebUI to use `openedai-speech`**
+---------------------------------------------------------
+
+In Open WebUI, go to Admin Panel > Settings > Audio and, under TTS Settings, enter the following configuration:
+
+* **API Base URL**: `http://host.docker.internal:8000/v1`
+* **API Key**: `sk-111111111` (this is a dummy key; `openedai-speech` doesn't require an API key, so any value works in this field)
+
+**Step 6: Choose a voice**
+-------------------------
+
+Under Set Voice, you can choose from the following voices:
+
+* alloy
+* echo
+* echo-alt
+* fable
+* onyx
+* nova
+* shimmer
+
+**Step 7: Enjoy natural-sounding voices**
+-----------------------------------------
+
+You should now be able to use the `openedai-speech` integration with Open WebUI to generate natural-sounding speech.
+
+**Troubleshooting**
+-------------------
+
+If you encounter any issues, make sure that:
+
+* The `openedai-speech` service is running and exposed on port 8000.
+* The `host.docker.internal` hostname is resolvable from within the Open WebUI container.
+* The API Base URL uses `host.docker.internal` rather than `localhost`: `openedai-speech` is exposed via `localhost` on your PC, which `open-webui` cannot normally reach from within its container.
+* The API key is set to a dummy value, as `openedai-speech` doesn't require an API key.
+
+Note: You can change the port number in the `docker-compose.yml` file to any open and usable port, but make sure to update the **API Base URL** in the Open WebUI Admin Audio settings accordingly.
:::
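
To confirm the first troubleshooting point (the service is running and reachable on port 8000) before changing anything in Open WebUI, a request from the host can help. The following is a minimal sketch, not part of the tutorial: it assumes the service exposes the OpenAI-style `/v1/audio/speech` endpoint and accepts `tts-1` as the model name, and the variable names `TTS_BASE_URL` and `TTS_VOICE` and the output file `speech.mp3` are illustrative choices. It uses `localhost` because it runs on the host, not inside the Open WebUI container.

```shell
# Assumed defaults: adjust the port if you changed it in docker-compose.yml.
TTS_BASE_URL="${TTS_BASE_URL:-http://localhost:8000/v1}"
TTS_VOICE="${TTS_VOICE:-alloy}"

# Build the JSON body for an OpenAI-style speech request (assumed format).
payload=$(printf '{"model": "tts-1", "input": "%s", "voice": "%s"}' \
  "Hello from openedai-speech" "$TTS_VOICE")

# Send the request and save the audio; print a hint instead of aborting on failure.
if curl --silent --show-error --fail --max-time 10 \
     -H "Content-Type: application/json" \
     -d "$payload" \
     -o speech.mp3 \
     "$TTS_BASE_URL/audio/speech"; then
  echo "Saved speech.mp3"
else
  echo "Request failed: check that the container is running and the port is published"
fi
```

If the request succeeds from the host but Open WebUI still cannot produce audio, the remaining suspects are the `host.docker.internal` hostname and the **API Base URL** value rather than the speech service itself.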