From c56b6de5e8a98bebaecaa81b2ac1528f8544bfe4 Mon Sep 17 00:00:00 2001
From: nathaniel
Date: Thu, 1 May 2025 21:45:45 +0100
Subject: [PATCH 1/4] Added documentation for WHISPER_LANGUAGE environment variable

---
 docs/getting-started/env-configuration.md      | 6 ++++++
 docs/tutorials/speech-to-text/env-variables.md | 1 +
 2 files changed, 7 insertions(+)

diff --git a/docs/getting-started/env-configuration.md b/docs/getting-started/env-configuration.md
index 98421e5..899dad1 100644
--- a/docs/getting-started/env-configuration.md
+++ b/docs/getting-started/env-configuration.md
@@ -1871,6 +1871,12 @@ Using a remote Playwright browser via `PLAYWRIGHT_WS_URL` can be beneficial for:
 - Default: `False`
 - Description: Toggles automatic update of the Whisper model.
 
+#### `WHISPER_LANGUAGE`
+
+- Type: `str`
+- Default: `None`
+- Description: Specifies the language Whisper uses for TTS. Whisper predicts the language by default. To revert to default behaviour, unset this variable.
+
 ### Speech-to-Text (OpenAI)
 
 #### `AUDIO_STT_ENGINE`
diff --git a/docs/tutorials/speech-to-text/env-variables.md b/docs/tutorials/speech-to-text/env-variables.md
index e20be96..ae01849 100644
--- a/docs/tutorials/speech-to-text/env-variables.md
+++ b/docs/tutorials/speech-to-text/env-variables.md
@@ -19,6 +19,7 @@ The following is a summary of the environment variables for speech to text (STT)
 |----------|-------------|
 | `WHISPER_MODEL` | Sets the Whisper model to use for local Speech-to-Text |
 | `WHISPER_MODEL_DIR` | Specifies the directory to store Whisper model files |
+| `WHISPER_LANGUAGE` | Specifies the Speech-to-Text language to use (language is predicted unless set) |
 | `AUDIO_STT_ENGINE` | Specifies the Speech-to-Text engine to use (empty for local Whisper, or `openai`) |
 | `AUDIO_STT_MODEL` | Specifies the Speech-to-Text model for OpenAI-compatible endpoints |
 | `AUDIO_STT_OPENAI_API_BASE_URL` | Sets the OpenAI-compatible base URL for Speech-to-Text |

From 060b49c30c6363f0b7b1092af918a1f9c2ab69f2 Mon Sep 17 00:00:00 2001
From: nathaniel
Date: Thu, 1 May 2025 22:26:22 +0100
Subject: [PATCH 2/4] Fixed a typo (TTS should have been STT)

---
 docs/getting-started/env-configuration.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/getting-started/env-configuration.md b/docs/getting-started/env-configuration.md
index 899dad1..979569b 100644
--- a/docs/getting-started/env-configuration.md
+++ b/docs/getting-started/env-configuration.md
@@ -1875,7 +1875,7 @@ Using a remote Playwright browser via `PLAYWRIGHT_WS_URL` can be beneficial for:
 
 - Type: `str`
 - Default: `None`
-- Description: Specifies the language Whisper uses for TTS. Whisper predicts the language by default. To revert to default behaviour, unset this variable.
+- Description: Specifies the language Whisper uses for STT. Whisper predicts the language by default. To revert to default behaviour, unset this variable.
 
 ### Speech-to-Text (OpenAI)
 

From 9b309e3c484d9a55b02ccacdd0166fbc67f92917 Mon Sep 17 00:00:00 2001
From: nathaniel
Date: Mon, 5 May 2025 17:44:31 +0100
Subject: [PATCH 3/4] Adjustment to WHISPER_LANGUAGE docs to mention expected input format (ISO 639-2)

---
 docs/getting-started/env-configuration.md      | 2 +-
 docs/tutorials/speech-to-text/env-variables.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/getting-started/env-configuration.md b/docs/getting-started/env-configuration.md
index 979569b..f1279c6 100644
--- a/docs/getting-started/env-configuration.md
+++ b/docs/getting-started/env-configuration.md
@@ -1875,7 +1875,7 @@ Using a remote Playwright browser via `PLAYWRIGHT_WS_URL` can be beneficial for:
 
 - Type: `str`
 - Default: `None`
-- Description: Specifies the language Whisper uses for STT. Whisper predicts the language by default. To revert to default behaviour, unset this variable.
+- Description: Specifies the ISO 639-2 language Whisper uses for STT. Whisper predicts the language by default.
 
 ### Speech-to-Text (OpenAI)
 
diff --git a/docs/tutorials/speech-to-text/env-variables.md b/docs/tutorials/speech-to-text/env-variables.md
index ae01849..01efa79 100644
--- a/docs/tutorials/speech-to-text/env-variables.md
+++ b/docs/tutorials/speech-to-text/env-variables.md
@@ -19,7 +19,7 @@ The following is a summary of the environment variables for speech to text (STT)
 |----------|-------------|
 | `WHISPER_MODEL` | Sets the Whisper model to use for local Speech-to-Text |
 | `WHISPER_MODEL_DIR` | Specifies the directory to store Whisper model files |
-| `WHISPER_LANGUAGE` | Specifies the Speech-to-Text language to use (language is predicted unless set) |
+| `WHISPER_LANGUAGE` | Specifies the ISO 639-2 Speech-to-Text language to use for Whisper (language is predicted unless set) |
 | `AUDIO_STT_ENGINE` | Specifies the Speech-to-Text engine to use (empty for local Whisper, or `openai`) |
 | `AUDIO_STT_MODEL` | Specifies the Speech-to-Text model for OpenAI-compatible endpoints |
 | `AUDIO_STT_OPENAI_API_BASE_URL` | Sets the OpenAI-compatible base URL for Speech-to-Text |

From cb4b24bf4747c63271cffb662a1c9f7bcad62330 Mon Sep 17 00:00:00 2001
From: nathaniel
Date: Mon, 5 May 2025 18:08:06 +0100
Subject: [PATCH 4/4] Adjustment to WHISPER_LANGUAGE docs - Corrected ISO 639-2 to ISO 639-1 for most cases. Added exception for some languages

---
 docs/getting-started/env-configuration.md      | 2 +-
 docs/tutorials/speech-to-text/env-variables.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/getting-started/env-configuration.md b/docs/getting-started/env-configuration.md
index f1279c6..e137cbe 100644
--- a/docs/getting-started/env-configuration.md
+++ b/docs/getting-started/env-configuration.md
@@ -1875,7 +1875,7 @@ Using a remote Playwright browser via `PLAYWRIGHT_WS_URL` can be beneficial for:
 
 - Type: `str`
 - Default: `None`
-- Description: Specifies the ISO 639-2 language Whisper uses for STT. Whisper predicts the language by default.
+- Description: Specifies the ISO 639-1 language Whisper uses for STT (ISO 639-2 for Hawaiian and Cantonese). Whisper predicts the language by default.
 
 ### Speech-to-Text (OpenAI)
 
diff --git a/docs/tutorials/speech-to-text/env-variables.md b/docs/tutorials/speech-to-text/env-variables.md
index 01efa79..d0cf355 100644
--- a/docs/tutorials/speech-to-text/env-variables.md
+++ b/docs/tutorials/speech-to-text/env-variables.md
@@ -19,7 +19,7 @@ The following is a summary of the environment variables for speech to text (STT)
 |----------|-------------|
 | `WHISPER_MODEL` | Sets the Whisper model to use for local Speech-to-Text |
 | `WHISPER_MODEL_DIR` | Specifies the directory to store Whisper model files |
-| `WHISPER_LANGUAGE` | Specifies the ISO 639-2 Speech-to-Text language to use for Whisper (language is predicted unless set) |
+| `WHISPER_LANGUAGE` | Specifies the ISO 639-1 (ISO 639-2 for Hawaiian and Cantonese) Speech-to-Text language to use for Whisper (language is predicted unless set) |
 | `AUDIO_STT_ENGINE` | Specifies the Speech-to-Text engine to use (empty for local Whisper, or `openai`) |
 | `AUDIO_STT_MODEL` | Specifies the Speech-to-Text model for OpenAI-compatible endpoints |
 | `AUDIO_STT_OPENAI_API_BASE_URL` | Sets the OpenAI-compatible base URL for Speech-to-Text |
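
Reviewer note: for illustration only, the behaviour these patches document (an unset `WHISPER_LANGUAGE` means Whisper predicts the language) could be sketched as below. `resolve_whisper_language` is a hypothetical helper, not a function from the project; treating an empty string as unset and lower-casing the code are assumptions of this sketch, not confirmed behaviour.

```python
import os
from typing import Mapping, Optional


def resolve_whisper_language(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the configured language code, or None to let Whisper predict it.

    Hypothetical helper: the name and the normalisation rules are assumptions.
    """
    source = os.environ if env is None else env
    # Assumption: surrounding whitespace and letter case are not significant.
    raw = source.get("WHISPER_LANGUAGE", "").strip().lower()
    # Assumption: an empty value behaves like an unset variable.
    return raw or None


if __name__ == "__main__":
    # "en" is a two-letter code; "haw" (Hawaiian) is one of the three-letter exceptions.
    for sample in ({}, {"WHISPER_LANGUAGE": "en"}, {"WHISPER_LANGUAGE": "haw"}):
        print(sample, "->", resolve_whisper_language(sample))
```

Passing a mapping instead of reading `os.environ` directly keeps the sketch easy to exercise without touching the real environment.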