---
sidebar_position: 2
title: "🗨️ Kokoro-FastAPI Using Docker"
---

:::warning
This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for your specific use case. Want to contribute? Check out the contributing tutorial.
:::

## What is `Kokoro-FastAPI`?

[Kokoro-FastAPI](https://github.com/remsky/Kokoro-FastAPI) is a Dockerized FastAPI wrapper for the [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) text-to-speech model that implements the OpenAI speech API endpoint specification.

## Key Features

- OpenAI-compatible Speech endpoint with inline voice combination
- NVIDIA GPU-accelerated or CPU ONNX inference
- Streaming support with variable chunking
- Multiple audio format support (`.mp3`, `.wav`, `.opus`, `.flac`, `.aac`, `.pcm`)
- Integrated web interface at `localhost:8880/web` (or an additional container in the repo for Gradio)
- Phoneme endpoints for conversion and generation

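As a rough illustration of the Speech endpoint and audio formats listed above, the sketch below builds a request body using the standard OpenAI speech fields (`model`, `input`, `voice`, `response_format`). The helper names `json_escape`, `speech_payload`, and `speak_to_file` are made up for this example, and the `af_bella+af_sky` combined-voice syntax is an assumption about how inline voice combination is written; check the Kokoro-FastAPI repository before relying on it.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of Kokoro-FastAPI.

# Escape backslashes and double quotes so text is safe inside a JSON string.
# Sketch only: control characters such as newlines are not handled.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

# Build the JSON body for POST /v1/audio/speech.
# Args: text, voice (default af_bella), format (default mp3)
speech_payload() {
  local text voice format
  text=$(json_escape "$1")
  voice=${2:-af_bella}
  format=${3:-mp3}
  printf '{"model":"kokoro","input":"%s","voice":"%s","response_format":"%s"}' \
    "$text" "$voice" "$format"
}

# Send the request and save the audio. Only runs when called explicitly.
# Args: output file, then the same arguments as speech_payload.
speak_to_file() {
  local out=$1
  shift
  curl --silent --show-error --fail \
    -H "Content-Type: application/json" \
    -d "$(speech_payload "$@")" \
    --output "$out" \
    "http://localhost:8880/v1/audio/speech"
}
```

With a container running on port 8880, a call such as `speak_to_file hello.wav "Hello world" "af_bella+af_sky" wav` would request a WAV file using the assumed combined-voice syntax.
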
## Voices

- af
- af_bella
- af_irulan
- af_nicole
- af_sarah
- af_sky
- am_adam
- am_michael
- am_gurney
- bf_emma
- bf_isabella
- bm_george
- bm_lewis

## Languages

- en_us
- en_uk

## Requirements

- Docker installed on your system
- Open WebUI running
- For GPU support: NVIDIA GPU with CUDA 12.3
- For CPU-only: No special requirements

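The requirements above can be checked with a small preflight script before pulling an image. This is a minimal sketch: the function names are hypothetical, and finding `nvidia-smi` on the `PATH` is only a heuristic for "an NVIDIA GPU is present". It does not verify the CUDA version or that the NVIDIA Container Toolkit is configured.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helpers for illustration.

# Return success if the named command is on PATH.
has_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print the image to use: the GPU image when nvidia-smi is present,
# otherwise the CPU image. This is a heuristic, not a CUDA version check.
pick_kokoro_image() {
  if has_cmd nvidia-smi; then
    printf '%s\n' "ghcr.io/remsky/kokoro-fastapi-gpu"
  else
    printf '%s\n' "ghcr.io/remsky/kokoro-fastapi-cpu"
  fi
}

# Report what was found without starting any container.
preflight() {
  if ! has_cmd docker; then
    echo "docker not found on PATH" >&2
    return 1
  fi
  echo "docker found; suggested image: $(pick_kokoro_image)"
}
```

Running `preflight` prints a suggested image name, or fails with a message if Docker is missing.
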
## ⚡️ Quick start

### You can choose between GPU or CPU versions

### GPU Version (Requires NVIDIA GPU with CUDA 12.1)

Using docker run:

```bash
docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu
```

Or use docker compose: create a `docker-compose.yml` file and run `docker compose up`. For example:

```yaml
name: kokoro
services:
  kokoro-fastapi-gpu:
    ports:
      - 8880:8880
    image: ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.1
    restart: always
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu
```

:::info
You may need to install and configure [the NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
:::

### CPU Version (ONNX optimized inference)

With docker run:

```bash
docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu
```

With docker compose:

```yaml
name: kokoro
services:
  kokoro-fastapi-cpu:
    ports:
      - 8880:8880
    image: ghcr.io/remsky/kokoro-fastapi-cpu
    restart: always
```

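Once either container has been started, it can take a little while before the server answers requests. The sketch below polls a URL until it responds; `wait_for_url` is a hypothetical helper, and the default URL is the web interface address mentioned under Key Features. It only confirms that something answers on that port, not that the model has loaded correctly.

```bash
#!/usr/bin/env bash
# Hypothetical readiness check for illustration.

# Poll a URL until it answers without an HTTP error status.
# Args: url, attempts (default 30), delay in seconds (default 2)
wait_for_url() {
  local url=$1 attempts=${2:-30} delay=${3:-2} i
  for ((i = 1; i <= attempts; i++)); do
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example wrapper (not run on load): wait for the local web interface.
wait_for_kokoro() {
  wait_for_url "http://localhost:8880/web" "$@"
}
```

For example, `wait_for_kokoro 60 1 && echo "Kokoro-FastAPI is reachable"` waits up to about a minute.
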
## Setting up Open WebUI to use `Kokoro-FastAPI`

To use Kokoro-FastAPI with Open WebUI, follow these steps:

- Open the Admin Panel and go to `Settings` -> `Audio`
- Set your TTS Settings to match the following:
  - Text-to-Speech Engine: OpenAI
  - API Base URL: `http://localhost:8880/v1`
  - API Key: `not-needed`
  - TTS Model: `kokoro`
  - TTS Voice: `af_bella` (existing OpenAI voice names are also accepted and mapped for compatibility)

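Before saving these settings, it can help to confirm from a terminal that the same values produce audio. The sketch below mirrors the settings above in a request to the OpenAI-style `/audio/speech` path under the base URL. The variable and function names are hypothetical, and the input sentence and output file name are arbitrary.

```bash
#!/usr/bin/env bash
# Hypothetical smoke test; values copied from the settings list above.
TTS_BASE_URL="http://localhost:8880/v1"
TTS_API_KEY="not-needed"
TTS_MODEL="kokoro"
TTS_VOICE="af_bella"

# Join the base URL and the speech path, tolerating one trailing slash.
speech_endpoint() {
  printf '%s/audio/speech' "${1%/}"
}

# Request a short clip and save it. Only runs when called explicitly.
# Args: output file (default kokoro-test.mp3). Returns curl's exit status.
tts_smoke_test() {
  local out=${1:-kokoro-test.mp3}
  curl --silent --show-error --fail \
    -H "Authorization: Bearer $TTS_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"$TTS_MODEL\",\"input\":\"Hello from Open WebUI\",\"voice\":\"$TTS_VOICE\"}" \
    --output "$out" \
    "$(speech_endpoint "$TTS_BASE_URL")"
}
```

If `tts_smoke_test` succeeds and the resulting file plays, the same values should work in the Open WebUI audio settings, provided Open WebUI can reach that address from where it runs.
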
:::info
The default API key is the string `not-needed`. You do not have to change that value if you do not need the added security.
:::

## Building the Docker Container

```bash
git clone https://github.com/remsky/Kokoro-FastAPI.git
cd Kokoro-FastAPI
cd docker/cpu # or docker/gpu
docker compose up --build
```

**That's it!**

For more information on building the Docker container, including changing ports, please refer to the [Kokoro-FastAPI](https://github.com/remsky/Kokoro-FastAPI) repository.