STT documentation
25
docs/tutorials/speech-to-text/env-variables.md
Normal file
@ -0,0 +1,25 @@
|
||||
---
sidebar_position: 2
title: "Environment Variables"
---
|
||||
|
||||
|
||||
# Environment Variables List
|
||||
|
||||
|
||||
:::info
|
||||
For a complete list of all Open WebUI environment variables, see the [Environment Variable Configuration](/docs/getting-started/env-configuration) page.
|
||||
:::
|
||||
|
||||
The following is a summary of the environment variables for speech to text (STT).
|
||||
|
||||
# Environment Variables For Speech To Text (STT)
|
||||
|
||||
| Variable | Description |
|----------|-------------|
| `WHISPER_MODEL` | Sets the Whisper model to use for local Speech-to-Text |
| `WHISPER_MODEL_DIR` | Specifies the directory to store Whisper model files |
| `AUDIO_STT_ENGINE` | Specifies the Speech-to-Text engine to use (empty for local Whisper, or `openai`) |
| `AUDIO_STT_MODEL` | Specifies the Speech-to-Text model for OpenAI-compatible endpoints |
| `AUDIO_STT_OPENAI_API_BASE_URL` | Sets the OpenAI-compatible base URL for Speech-to-Text |
| `AUDIO_STT_OPENAI_API_KEY` | Sets the OpenAI API key for Speech-to-Text |
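
To sanity-check a deployment's environment before starting the server, the variables above can be inspected with a few lines of Python. This is a minimal sketch, not Open WebUI's own code: the helper name `summarize_stt_env` and the `local-whisper` label are hypothetical, and treating both the base URL and the API key as required when `AUDIO_STT_ENGINE` is `openai` is an assumption.

```python
import os
from typing import Mapping

# Variable names taken from the table above.
STT_VARIABLES = (
    "WHISPER_MODEL",
    "WHISPER_MODEL_DIR",
    "AUDIO_STT_ENGINE",
    "AUDIO_STT_MODEL",
    "AUDIO_STT_OPENAI_API_BASE_URL",
    "AUDIO_STT_OPENAI_API_KEY",
)


def summarize_stt_env(env: Mapping[str, str]) -> dict:
    """Hypothetical helper: report which STT variables are set and flag gaps."""
    engine = env.get("AUDIO_STT_ENGINE", "").strip()
    summary = {
        # An empty engine means local Whisper; the label itself is illustrative.
        "engine": engine or "local-whisper",
        "set": [name for name in STT_VARIABLES if env.get(name)],
        "missing": [],
    }
    if engine == "openai":
        # Assumption: a remote engine needs both a base URL and an API key.
        for name in ("AUDIO_STT_OPENAI_API_BASE_URL", "AUDIO_STT_OPENAI_API_KEY"):
            if not env.get(name):
                summary["missing"].append(name)
    return summary


if __name__ == "__main__":
    print(summarize_stt_env(os.environ))
```

Passing a plain mapping instead of reading `os.environ` inside the function keeps the check easy to run against a candidate configuration before it is deployed.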
|
@ -1,4 +1,63 @@
|
||||
---
sidebar_position: 1
title: "🗨️ Configuration"
---
|
||||
---
|
||||
|
||||
Open WebUI supports local, browser, and remote speech to text.
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
## Cloud / Remote Speech To Text Providers
|
||||
|
||||
The following cloud speech to text providers are currently supported. API keys can be configured as environment variables (OpenAI only) or in the admin settings page (both providers).
|
||||
|
||||
| Service | API Key Required |
| ------------- | ------------- |
| OpenAI | ✅ |
| Deepgram | ✅ |
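
To illustrate why an API key and a base URL are needed for the OpenAI option, the sketch below builds (but does not send) the kind of request an OpenAI-compatible transcription endpoint expects: a multipart `POST` to `/audio/transcriptions` with `file` and `model` fields and a bearer token. The function name and the example values are hypothetical, and this is not necessarily how Open WebUI issues the request internally.

```python
import uuid
import urllib.request


def build_transcription_request(
    base_url: str, api_key: str, model: str, filename: str, audio: bytes
) -> urllib.request.Request:
    """Build a multipart POST for <base_url>/audio/transcriptions (not sent)."""
    boundary = uuid.uuid4().hex
    # Form field carrying the model name.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()
    # Header of the form field carrying the raw audio bytes.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + audio + closing

    return urllib.request.Request(
        base_url.rstrip("/") + "/audio/transcriptions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace them with a real endpoint, key, and audio file.
    req = build_transcription_request(
        "https://example.invalid/v1", "sk-placeholder", "whisper-1", "clip.wav", b"RIFF"
    )
    print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`; it is left out here so the example runs without network access or a valid key.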
|
||||
|
||||
The Web API option provides STT through the browser's built-in speech recognition provider.
|
||||
|
||||
## Configuring Your STT Provider
|
||||
|
||||
To configure a speech to text provider:
|
||||
|
||||
- Navigate to the admin settings
- Choose Audio
- Provide an API key and choose a model from the dropdown
|
||||
|
||||

|
||||
|
||||
## User-Level Settings
|
||||
|
||||
In addition to the instance settings provisioned in the admin panel, there are also a couple of user-level settings that can provide additional functionality.
|
||||
|
||||
* **STT Settings:** Contains settings related to Speech-to-Text functionality.
  * **Speech-to-Text Engine:** Determines the engine used for speech recognition (Default or Web API).
|
||||
|
||||
|
||||

|
||||
|
||||
## Using STT
|
||||
|
||||
Speech to text lets you "write" prompts with your voice, and it works well on both desktop and mobile devices.
|
||||
|
||||
To use STT, simply click on the microphone icon:
|
||||
|
||||

|
||||
|
||||
A live audio waveform will indicate successful voice capture:
|
||||
|
||||

|
||||
|
||||
## STT Mode Operation
|
||||
|
||||
Once your recording has begun you can:
|
||||
|
||||
- Click the tick icon to save the recording. If auto-send after completion is enabled, the transcript is sent for completion automatically; otherwise, you can send it manually.
- Click the 'x' icon to abort the recording (for example, to start a fresh one) and exit the recording interface.
|
||||
|
||||

|
||||
|
||||
|
BIN
static/images/tutorials/stt/endstt.png
Normal file
After Width: | Height: | Size: 10 KiB |
BIN
static/images/tutorials/stt/image.png
Normal file
After Width: | Height: | Size: 82 KiB |
BIN
static/images/tutorials/stt/stt-config.png
Normal file
After Width: | Height: | Size: 20 KiB |
BIN
static/images/tutorials/stt/stt-in-progress.png
Normal file
After Width: | Height: | Size: 39 KiB |
BIN
static/images/tutorials/stt/stt-operation.png
Normal file
After Width: | Height: | Size: 15 KiB |
BIN
static/images/tutorials/stt/stt-providers.png
Normal file
After Width: | Height: | Size: 10 KiB |
BIN
static/images/tutorials/stt/user-settings.png
Normal file
After Width: | Height: | Size: 67 KiB |