STT documentation

2025-06-16 11:28:36 +00:00 · 2025-03-02 22:44:14 +02:00 · 2025-03-02 22:44:14 +02:00 · fd46e595aa
commit fd46e595aa
parent 5fac1e1af3
9 changed files with 85 additions and 1 deletions
--- a/docs/tutorials/speech-to-text/env-variables.md
+++ b/docs/tutorials/speech-to-text/env-variables.md
@@ -0,0 +1,25 @@
+---
+sidebar_position: 2
+title: "Environment Variables"
+---
+
+
+# Environment Variables List
+
+
+:::info
+For a complete list of all Open WebUI environment variables, see the [Environment Variable Configuration](/docs/getting-started/env-configuration) page.
+:::
+
+The following is a summary of the environment variables for speech to text (STT).
+
+## Environment Variables For Speech To Text (STT)
+
+| Variable | Description |
+|----------|-------------|
+| `WHISPER_MODEL` | Sets the Whisper model to use for local Speech-to-Text |
+| `WHISPER_MODEL_DIR` | Specifies the directory to store Whisper model files |
+| `AUDIO_STT_ENGINE` | Specifies the Speech-to-Text engine to use (empty for local Whisper, or `openai`) |
+| `AUDIO_STT_MODEL` | Specifies the Speech-to-Text model for OpenAI-compatible endpoints |
+| `AUDIO_STT_OPENAI_API_BASE_URL` | Sets the OpenAI-compatible base URL for Speech-to-Text |
+| `AUDIO_STT_OPENAI_API_KEY` | Sets the OpenAI API key for Speech-to-Text |
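As a rough illustration of how the variables in the table relate to each other, the sketch below reads them from the environment and flags one likely misconfiguration. It is not Open WebUI's own code: the helper name `summarize_stt_env` is hypothetical, and the rule that an API key should be present when `AUDIO_STT_ENGINE` is `openai` is an assumption made for this example.

```python
import os

# Variable names are taken from the table above.
STT_VARIABLES = (
    "WHISPER_MODEL",
    "WHISPER_MODEL_DIR",
    "AUDIO_STT_ENGINE",
    "AUDIO_STT_MODEL",
    "AUDIO_STT_OPENAI_API_BASE_URL",
    "AUDIO_STT_OPENAI_API_KEY",
)


def summarize_stt_env(env):
    """Return (settings, warnings) for the STT variables found in `env`.

    Hypothetical helper for illustration only; unset variables are
    reported as empty strings rather than guessed defaults.
    """
    settings = {name: env.get(name, "") for name in STT_VARIABLES}
    warnings = []
    # Assumption of this sketch: the `openai` engine needs an API key.
    if settings["AUDIO_STT_ENGINE"] == "openai":
        if not settings["AUDIO_STT_OPENAI_API_KEY"]:
            warnings.append(
                "AUDIO_STT_OPENAI_API_KEY is not set but AUDIO_STT_ENGINE is 'openai'"
            )
    return settings, warnings


if __name__ == "__main__":
    settings, warnings = summarize_stt_env(os.environ)
    for name, value in settings.items():
        # Never print the key itself, only whether it is present.
        if name.endswith("_API_KEY") and value:
            shown = "<set>"
        else:
            shown = value or "<unset>"
        print(f"{name}={shown}")
    for warning in warnings:
        print("warning:", warning)
```

Running the script in the same shell or container that launches Open WebUI shows which STT variables are visible to the process. An empty `AUDIO_STT_ENGINE` corresponds to local Whisper, as described in the table.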
--- a/docs/tutorials/speech-to-text/stt-config.md
+++ b/docs/tutorials/speech-to-text/stt-config.md
@@ -1,4 +1,63 @@
 ---
 sidebar_position: 1
 title: "🗨️  Configuration"
---
+---
+
+Open WebUI supports local, browser, and remote speech to text.
+
+![alt text](../../../static/images/tutorials/stt/image.png)
+
+![alt text](../../../static/images/tutorials/stt/stt-providers.png)
+
+## Cloud / Remote Speech To Text Providers
+
+The following cloud speech to text providers are currently supported. API keys can be configured as environment variables (OpenAI) or on the admin settings page (both providers).
+
+ | Service  | API Key Required |
+ | ------------- | ------------- |
+ | OpenAI  | ✅ |
+ | Deepgram  | ✅ |
+
+The Web API option provides STT via the browser's built-in STT provider.
+
+## Configuring Your STT Provider
+
+To configure a speech to text provider:
+
+- Navigate to the admin settings  
+- Choose Audio
+- Provide an API key and choose a model from the dropdown
+
+![alt text](../../../static/images/tutorials/stt/stt-config.png)
+
+## User-Level Settings
+
+In addition to the instance settings provisioned in the admin panel, there are also a couple of user-level settings that provide additional functionality.
+
+*   **STT Settings:** Contains settings related to Speech-to-Text functionality.
+*   **Speech-to-Text Engine:** Determines the engine used for speech recognition (Default or Web API).
+ 
+
+![alt text](../../../static/images/tutorials/stt/user-settings.png)
+
+## Using STT
+
+Speech to text lets you "write" prompts using your voice, and it works on both desktop and mobile devices.
+
+To use STT, simply click on the microphone icon:
+
+![alt text](../../../static/images/tutorials/stt/stt-operation.png)
+
+A live audio waveform will indicate successful voice capture:
+
+![alt text](../../../static/images/tutorials/stt/stt-in-progress.png)
+
+## STT Mode Operation
+
+Once your recording has begun you can:
+
+- Click on the tick icon to save the recording. If auto-send after completion is enabled, the recording is sent for completion automatically; otherwise you can send it manually.
+- Click on the 'x' icon to abort the recording (for example, to start a fresh one) and exit the recording interface.
+
+![alt text](../../../static/images/tutorials/stt/endstt.png)
+
--- a/static/images/tutorials/stt/endstt.png
+++ b/static/images/tutorials/stt/endstt.png
--- a/static/images/tutorials/stt/image.png
+++ b/static/images/tutorials/stt/image.png
--- a/static/images/tutorials/stt/stt-config.png
+++ b/static/images/tutorials/stt/stt-config.png
--- a/static/images/tutorials/stt/stt-in-progress.png
+++ b/static/images/tutorials/stt/stt-in-progress.png
--- a/static/images/tutorials/stt/stt-operation.png
+++ b/static/images/tutorials/stt/stt-operation.png
--- a/static/images/tutorials/stt/stt-providers.png
+++ b/static/images/tutorials/stt/stt-providers.png
--- a/static/images/tutorials/stt/user-settings.png
+++ b/static/images/tutorials/stt/user-settings.png