mirror of https://github.com/open-webui/docs (synced 2025-05-20 03:08:56 +00:00)
Merge pull request #366 from travisvn/patch-6
Update openai-edge-tts-integration.md
commit 5a05b5d7af
@@ -9,7 +9,16 @@ This tutorial is a community contribution and is not supported by the OpenWebUI
 
 # Integrating `openai-edge-tts` 🗣️ with Open WebUI
 
-## What is `openai-edge-tts`, and how is it different from `openedai-speech`?
+## What is `openai-edge-tts`?
 
+[OpenAI Edge TTS](https://github.com/travisvn/openai-edge-tts) is a text-to-speech API that mimics the OpenAI API endpoint, allowing for a direct substitute in scenarios where you can define the endpoint URL, like with Open WebUI.
+
+It uses the [edge-tts](https://github.com/rany2/edge-tts) package, which leverages the Edge browser's free "Read Aloud" feature to emulate a request to Microsoft / Azure in order to receive very high quality text-to-speech for free.
+
+[Sample the voices here](https://tts.travisvn.com)
+
+<details>
+<summary>How is it different from 'openedai-speech'?</summary>
+
 Similar to [openedai-speech](https://github.com/matatonic/openedai-speech), [openai-edge-tts](https://github.com/travisvn/openai-edge-tts) is a text-to-speech API endpoint that mimics the OpenAI API endpoint, allowing for a direct substitute in scenarios where the OpenAI Speech endpoint is callable and the server endpoint URL can be configured.
 
@@ -17,13 +26,12 @@ Similar to [openedai-speech](https://github.com/matatonic/openedai-speech), [ope
 
 `openai-edge-tts` is a simpler option that uses a Python package called `edge-tts` to generate the audio.
 
-`edge-tts` ([repo](https://github.com/rany2/edge-tts)) leverages the Edge browser's free "Read Aloud" feature to emulate a request to Microsoft / Azure in order to receive very high quality text-to-speech for free.
+</details>
 
 ## Requirements
 
 - Docker installed on your system
 - Open WebUI running
-- ffmpeg (Optional - Only required if opting to not use `mp3` format)
 
 ## ⚡️ Quick start
 
@@ -37,7 +45,7 @@ This will run the service at port 5050 with all the default configs
 
 ## Setting up Open WebUI to use `openai-edge-tts`
 
-- Open the Admin Panel and go to Settings -> Audio
+- Open the Admin Panel and go to `Settings` -> `Audio`
 - Set your TTS Settings to match the screenshot below
 - _Note: you can specify the TTS Voice here_
 
@@ -49,15 +57,11 @@ The default API key is the string `your_api_key_here`. You do not have to change
 
 **And that's it! You can end here**
 
-See the [Usage](#usage) section for request examples.
-
 # Please ⭐️ star the repo on GitHub if you find [OpenAI Edge TTS](https://github.com/travisvn/openai-edge-tts) useful
 
-:::tip
-You can define the environment variables directly in the `docker run` command. See [Quick Config for Docker](#-quick-config-for-docker) below.
-:::
 
-## Alternative Options
+<details>
+<summary>Running with Python</summary>
 
 ### 🐍 Running with Python
 
@@ -100,9 +104,9 @@ Create a `.env` file in the root directory and set the following variables:
 API_KEY=your_api_key_here
 PORT=5050
 
-DEFAULT_VOICE=en-US-AndrewNeural
+DEFAULT_VOICE=en-US-AvaNeural
 DEFAULT_RESPONSE_FORMAT=mp3
-DEFAULT_SPEED=1.2
+DEFAULT_SPEED=1.0
 
 DEFAULT_LANGUAGE=en-US
 
@@ -125,7 +129,10 @@ The server will start running at `http://localhost:5050`.
 
 You can now interact with the API at `http://localhost:5050/v1/audio/speech` and other available endpoints. See the [Usage](#usage) section for request examples.
 
-#### Usage
+</details>
 
+<details>
+<summary>Usage details</summary>
+
 ##### Endpoint: `/v1/audio/speech` (aliased with `/audio/speech`)
 
@@ -138,9 +145,9 @@ Generates audio from the input text. Available parameters:
 **Optional Parameters:**
 
 - **model** (string): Set to "tts-1" or "tts-1-hd" (default: `"tts-1"`).
-- **voice** (string): One of the OpenAI-compatible voices (alloy, echo, fable, onyx, nova, shimmer) or any valid `edge-tts` voice (default: `"en-US-AndrewNeural"`).
+- **voice** (string): One of the OpenAI-compatible voices (alloy, echo, fable, onyx, nova, shimmer) or any valid `edge-tts` voice (default: `"en-US-AvaNeural"`).
 - **response_format** (string): Audio format. Options: `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm` (default: `mp3`).
-- **speed** (number): Playback speed (0.25 to 4.0). Default is `1.2`.
+- **speed** (number): Playback speed (0.25 to 4.0). Default is `1.0`.
 
 :::tip
 You can browse available voices and listen to sample previews at [tts.travisvn.com](https://tts.travisvn.com)
@@ -203,6 +210,8 @@ Additionally, there are endpoints for **Azure AI Speech** and **ElevenLabs** for
 These can be disabled by setting the environment variable `EXPAND_API=False`.
 :::
 
+</details>
+
 ## 🐳 Quick Config for Docker
 
 You can configure the environment variables in the command used to run the project
@@ -211,9 +220,9 @@ You can configure the environment variables in the command used to run the proje
 docker run -d -p 5050:5050 \
 -e API_KEY=your_api_key_here \
 -e PORT=5050 \
--e DEFAULT_VOICE=en-US-AndrewNeural \
+-e DEFAULT_VOICE=en-US-AvaNeural \
 -e DEFAULT_RESPONSE_FORMAT=mp3 \
--e DEFAULT_SPEED=1.2 \
+-e DEFAULT_SPEED=1.0 \
 -e DEFAULT_LANGUAGE=en-US \
 -e REQUIRE_API_KEY=True \
 -e REMOVE_FILTER=False \
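
For anyone checking the new defaults against a running container, here is a minimal Python sketch of a request to `/v1/audio/speech` using the parameters listed in the usage hunk (`model`, `voice`, `response_format`, `speed`). It makes several assumptions the diff does not confirm: that the text is sent in an `input` field and the key as an `Authorization: Bearer` header, as in OpenAI's speech API, and that the server is reachable at `http://localhost:5050`. The helper names `build_speech_request` and `synthesize` are hypothetical and only for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5050"  # assumption: service running on the default port
API_KEY = "your_api_key_here"       # default key string quoted in the tutorial


def build_speech_request(text, voice="en-US-AvaNeural", response_format="mp3",
                         speed=1.0, model="tts-1"):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    # The tutorial documents 0.25 to 4.0 as the accepted speed range.
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    payload = {
        "model": model,
        "input": text,  # assumption: field name follows OpenAI's speech API
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # assumption: OpenAI-style bearer authentication
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def synthesize(text, out_path="speech.mp3", **options):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **options)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out_file:
        out_file.write(response.read())
    return out_path


# Example call (requires the server to be running):
# synthesize("Hello from Open WebUI!", voice="en-US-AvaNeural", speed=1.0)
```

Separating request construction from sending keeps the payload easy to inspect without a live server; omitting `voice` or `speed` from the payload would instead exercise the server-side defaults this commit changes.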