diff --git a/docs/tutorials/speech-to-text/env-variables.md b/docs/tutorials/speech-to-text/env-variables.md
new file mode 100644
index 0000000..5e60336
--- /dev/null
+++ b/docs/tutorials/speech-to-text/env-variables.md
@@ -0,0 +1,25 @@
+---
+sidebar_position: 2
+title: "Environment Variables"
+---
+
+
+# Environment Variables List
+
+
+:::info
+For a complete list of all Open WebUI environment variables, see the [Environment Variable Configuration](/docs/getting-started/env-configuration) page.
+:::
+
+The following is a summary of the environment variables for speech to text (STT).
+
+## Environment Variables For Speech To Text (STT)
+
+| Variable | Description |
+|----------|-------------|
+| `WHISPER_MODEL` | Sets the Whisper model to use for local Speech-to-Text |
+| `WHISPER_MODEL_DIR` | Specifies the directory to store Whisper model files |
+| `AUDIO_STT_ENGINE` | Specifies the Speech-to-Text engine to use (empty for local Whisper, or `openai`) |
+| `AUDIO_STT_MODEL` | Specifies the Speech-to-Text model for OpenAI-compatible endpoints |
+| `AUDIO_STT_OPENAI_API_BASE_URL` | Sets the OpenAI-compatible base URL for Speech-to-Text |
+| `AUDIO_STT_OPENAI_API_KEY` | Sets the OpenAI API key for Speech-to-Text |
\ No newline at end of file
diff --git a/docs/tutorials/speech-to-text/stt-config.md b/docs/tutorials/speech-to-text/stt-config.md
index ae7f95b..ea612ce 100644
--- a/docs/tutorials/speech-to-text/stt-config.md
+++ b/docs/tutorials/speech-to-text/stt-config.md
@@ -1,4 +1,63 @@
 ---
 sidebar_position: 1
 title: "🗨️ Configuration"
----
\ No newline at end of file
+---
+
+Open WebUI supports local, browser, and remote speech to text.
+
+![alt text](../../../static/images/tutorials/stt/image.png)
+
+![alt text](../../../static/images/tutorials/stt/stt-providers.png)
+
+## Cloud / Remote Speech To Text Providers
+
+The following cloud speech to text providers are currently supported. API keys can be configured as environment variables (OpenAI only) or in the admin settings page (both providers).
+
+ | Service | API Key Required |
+ | ------------- | ------------- |
+ | OpenAI | ✅ |
+ | Deepgram | ✅ |
+
+ The Web API option provides STT via the browser's built-in STT provider.
+
+## Configuring Your STT Provider
+
+To configure a speech to text provider:
+
+- Navigate to the admin settings
+- Choose Audio
+- Provide an API key and choose a model from the dropdown
+
+![alt text](../../../static/images/tutorials/stt/stt-config.png)
+
+## User-Level Settings
+
+In addition to the instance settings provisioned in the admin panel, there are also a couple of user-level settings that provide additional functionality.
+
+* **STT Settings:** Contains settings related to Speech-to-Text functionality.
+* **Speech-to-Text Engine:** Determines the engine used for speech recognition (Default or Web API).
+
+
+![alt text](../../../static/images/tutorials/stt/user-settings.png)
+
+## Using STT
+
+Speech to text lets you "write" prompts using your voice, and it works on both desktop and mobile devices.
+
+To use STT, simply click on the microphone icon:
+
+![alt text](../../../static/images/tutorials/stt/stt-operation.png)
+
+A live audio waveform will indicate successful voice capture:
+
+![alt text](../../../static/images/tutorials/stt/stt-in-progress.png)
+
+## STT Mode Operation
+
+Once your recording has begun you can:
+
+- Click on the tick icon to save the recording (if auto-send after completion is enabled, the prompt is sent for completion automatically; otherwise you can send it manually)
+- Click on the 'x' icon to abort the recording and escape the recording interface (for example, if you wish to start a fresh recording)
+
+![alt text](../../../static/images/tutorials/stt/endstt.png)
+
diff --git a/static/images/tutorials/stt/endstt.png b/static/images/tutorials/stt/endstt.png
new file mode 100644
index 0000000..6fd73da
Binary files /dev/null and b/static/images/tutorials/stt/endstt.png differ
diff --git a/static/images/tutorials/stt/image.png b/static/images/tutorials/stt/image.png
new file mode 100644
index 0000000..6fee0e5
Binary files /dev/null and b/static/images/tutorials/stt/image.png differ
diff --git a/static/images/tutorials/stt/stt-config.png b/static/images/tutorials/stt/stt-config.png
new file mode 100644
index 0000000..b578f20
Binary files /dev/null and b/static/images/tutorials/stt/stt-config.png differ
diff --git a/static/images/tutorials/stt/stt-in-progress.png b/static/images/tutorials/stt/stt-in-progress.png
new file mode 100644
index 0000000..6ce6e01
Binary files /dev/null and b/static/images/tutorials/stt/stt-in-progress.png differ
diff --git a/static/images/tutorials/stt/stt-operation.png b/static/images/tutorials/stt/stt-operation.png
new file mode 100644
index 0000000..4b3d1f5
Binary files /dev/null and b/static/images/tutorials/stt/stt-operation.png differ
diff --git a/static/images/tutorials/stt/stt-providers.png b/static/images/tutorials/stt/stt-providers.png
new file mode 100644
index 0000000..ed8927c
Binary files /dev/null and b/static/images/tutorials/stt/stt-providers.png differ
diff --git a/static/images/tutorials/stt/user-settings.png b/static/images/tutorials/stt/user-settings.png
new file mode 100644
index 0000000..224c04e
Binary files /dev/null and b/static/images/tutorials/stt/user-settings.png differ
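
The environment variables listed in `env-variables.md` fall into two groups: local Whisper settings and OpenAI-compatible settings, selected by `AUDIO_STT_ENGINE`. A minimal Python sketch of how the two groups might be assembled follows. The variable names come from the table in this diff. The helper name `build_stt_env` is hypothetical, and every value (`base`, `/data/whisper`, `whisper-1`, the base URL) is an illustrative assumption, not a documented default.

```python
# Sketch only: variable names are taken from the env-variables.md table;
# all values are illustrative placeholders, not documented defaults.

def build_stt_env(engine: str = "", api_key: str = "") -> dict[str, str]:
    """Return STT environment variables for local Whisper ("") or "openai"."""
    if engine not in ("", "openai"):
        raise ValueError(f"unsupported AUDIO_STT_ENGINE value: {engine!r}")

    env = {"AUDIO_STT_ENGINE": engine}
    if engine == "":
        # Local Whisper: only the model name and storage directory apply.
        env["WHISPER_MODEL"] = "base"               # assumed model name
        env["WHISPER_MODEL_DIR"] = "/data/whisper"  # hypothetical path
    else:
        # OpenAI-compatible endpoint: an API key is required.
        if not api_key:
            raise ValueError("AUDIO_STT_OPENAI_API_KEY must not be empty")
        env["AUDIO_STT_MODEL"] = "whisper-1"        # assumed model name
        env["AUDIO_STT_OPENAI_API_BASE_URL"] = "https://api.openai.com/v1"  # assumed URL
        env["AUDIO_STT_OPENAI_API_KEY"] = api_key
    return env


if __name__ == "__main__":
    # Print the local-Whisper group in KEY=value form, e.g. for an env file.
    for name, value in build_stt_env().items():
        print(f"{name}={value}")
```

Calling `build_stt_env("openai", api_key="...")` yields the remote group instead. Keeping the two groups separate reflects the table's description that an empty engine selects local Whisper.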