docs/kokoro-web-integration.md at cd17aba84e988e302ba6e7a77a4bbcb24c2b52c0

mirror of https://github.com/open-webui/docs synced 2025-05-20 03:08:56 +00:00

Luis Eduardo f162c2f818 feat: Add tutorial for Kokoro Web integration with Open WebUI for TTS capabilities

2025-03-16 04:05:12 +00:00


sidebar_position: 2
title: 🗨️ Kokoro Web - Effortless TTS for Open WebUI

:::warning This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for your specific use case. Want to contribute? Check out the contributing tutorial. :::

What is `Kokoro Web`?

Kokoro Web provides a lightweight, OpenAI-compatible API for the Kokoro-82M text-to-speech model. It plugs into Open WebUI as a text-to-speech engine, so your AI conversations can be read aloud in natural-sounding voices.

🚀 Two-Step Integration

1. Deploy Kokoro Web API (One Command)

services:
  kokoro-web:
    image: ghcr.io/eduardolat/kokoro-web:latest
    ports:
      - "3000:3000"
    environment:
      # Change this to any secret key to use as your OpenAI compatible API key
      - KW_SECRET_API_KEY=your-api-key
    volumes:
      - ./kokoro-cache:/kokoro/cache
    restart: unless-stopped

Save the configuration above as docker-compose.yml, then run: docker compose up -d
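
Once the container is up, a quick request can confirm that the API responds before you connect it to Open WebUI. The following is a minimal sketch, assuming Kokoro Web exposes the OpenAI-style `POST /audio/speech` route under `/api/v1` and accepts the same `model`, `voice`, and `input` JSON fields as the OpenAI API. The helper names `build_speech_request` and `synthesize`, and the output file name, are hypothetical; check the GitHub repository for the actual request format.

```python
import json
import urllib.request


def build_speech_request(
    text,
    base_url="http://localhost:3000/api/v1",
    api_key="your-api-key",  # the KW_SECRET_API_KEY value from the compose file
    model="model_q8f16",
    voice="af_heart",
):
    """Build (but do not send) an OpenAI-style text-to-speech request.

    Assumption: the route and JSON fields mirror the OpenAI audio API.
    """
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, out_path="kokoro-test.mp3", **request_options):
    """Send the request and save the returned audio bytes to out_path.

    Assumption: the response body is raw audio; the .mp3 extension is a
    guess based on the OpenAI API default and may need adjusting.
    """
    request = build_speech_request(text, **request_options)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return out_path
```

With the container running, calling `synthesize("Hello from Kokoro Web")` should write an audio file you can play back. An authentication error would suggest that the key does not match `KW_SECRET_API_KEY`.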

2. Connect Open WebUI (30 Seconds)

In Open WebUI, go to Admin Panel → Settings → Audio and configure:
- Text-to-Speech Engine: OpenAI
- API Base URL: http://localhost:3000/api/v1
  (If Open WebUI itself runs in Docker: http://host.docker.internal:3000/api/v1, because localhost inside a container refers to that container, not to the host)
- API Key: your-api-key (the KW_SECRET_API_KEY value from step 1)
- TTS Model: model_q8f16 (best balance of size and quality)
- TTS Voice: af_heart (the default: a warm, natural English voice). You can change this to any other voice or formula from the Kokoro Web Demo.
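
The only setting that depends on your deployment is the API Base URL. As a small sketch of that choice, the hypothetical helpers below (`kokoro_base_url` and `audio_settings` are illustrative names, not part of Open WebUI or Kokoro Web) assemble the values listed above, assuming Kokoro Web is published on port 3000 of the host as in step 1.

```python
def kokoro_base_url(open_webui_in_docker: bool, port: int = 3000) -> str:
    """Return the API Base URL to enter in the Open WebUI audio settings.

    Inside a container, "localhost" is the container itself, so a
    containerized Open WebUI reaches the host via host.docker.internal.
    """
    host = "host.docker.internal" if open_webui_in_docker else "localhost"
    return f"http://{host}:{port}/api/v1"


def audio_settings(open_webui_in_docker: bool, api_key: str = "your-api-key") -> dict:
    """Collect the values from this tutorial, keyed by their field labels."""
    return {
        "Text-to-Speech Engine": "OpenAI",
        "API Base URL": kokoro_base_url(open_webui_in_docker),
        "API Key": api_key,
        "TTS Model": "model_q8f16",
        "TTS Voice": "af_heart",
    }
```

On Linux, `host.docker.internal` is not always defined by default; starting the Open WebUI container with `--add-host=host.docker.internal:host-gateway` is one common way to make it resolve.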

That's it! Open WebUI can now read responses aloud using Kokoro voices.

🌍 Supported Languages

Kokoro Web supports 8 languages with specific voices optimized for each:

- English (US) - en-us
- English (UK) - en-gb
- Japanese - ja
- Chinese - cmn
- Spanish - es-419
- Hindi - hi
- Italian - it
- Portuguese (Brazil) - pt-br

Each language has dedicated voices for optimal pronunciation and natural flow. See the GitHub repository for the complete list of language-specific voices or use the Kokoro Web Demo to preview and create your own custom voices instantly.

💾 Optimized Models for Any Hardware

Choose the model that fits your hardware needs:

| Model ID | Optimization | Size | Ideal For |
|---|---|---|---|
| model_q8f16 | Mixed precision | 86 MB | Recommended - Best balance |
| model_quantized | 8-bit | 92.4 MB | Good CPU performance |
| model_uint8f16 | Mixed precision | 114 MB | Better quality on mid-range CPUs |
| model_q4f16 | 4-bit & fp16 weights | 154 MB | Higher quality, still efficient |
| model_fp16 | fp16 | 163 MB | Premium quality |
| model_uint8 | 8-bit & mixed | 177 MB | Balanced option |
| model_q4 | 4-bit matmul | 305 MB | High quality option |
| model | fp32 | 326 MB | Maximum quality (slower) |
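
If download size or memory is the main constraint, the table can be narrowed down mechanically. The sketch below copies the model IDs and sizes from the table and lists the models that fit within a size budget; `models_within` is a hypothetical helper. It filters on size only and makes no claim about relative quality.

```python
# (model ID, size in MB), copied from the table above in the same order.
MODEL_SIZES_MB = [
    ("model_q8f16", 86.0),
    ("model_quantized", 92.4),
    ("model_uint8f16", 114.0),
    ("model_q4f16", 154.0),
    ("model_fp16", 163.0),
    ("model_uint8", 177.0),
    ("model_q4", 305.0),
    ("model", 326.0),
]


def models_within(budget_mb: float) -> list:
    """Return the IDs of all models whose size is at most budget_mb."""
    return [model_id for model_id, size_mb in MODEL_SIZES_MB if size_mb <= budget_mb]
```

For example, `models_within(100)` leaves `model_q8f16` and `model_quantized` as candidates for the TTS Model field.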

✨ Try Before You Install

Visit the Kokoro Web Demo to preview all voices instantly. This demo:

- Runs 100% in your browser - No server required
- Free forever - No usage limits or registration needed
- Zero installation - Just visit the website and start creating
- All features included - Test any voice or language immediately

Need More Help?

For additional options, voice customization guides, and advanced settings, visit the GitHub repository.

Enjoy natural AI voices in your Open WebUI conversations!
