From 452c447edc1fd25d90b9f06e6d40108527b8c053 Mon Sep 17 00:00:00 2001
From: Rory <16675082+roryeckel@users.noreply.github.com>
Date: Sat, 1 Feb 2025 00:01:30 -0600
Subject: [PATCH 1/3] Document "RAG_WEB_LOADER" env-configuration

---
 docs/getting-started/env-configuration.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/docs/getting-started/env-configuration.md b/docs/getting-started/env-configuration.md
index 605699b..fedca9c 100644
--- a/docs/getting-started/env-configuration.md
+++ b/docs/getting-started/env-configuration.md
@@ -1170,6 +1170,20 @@ When enabling `GOOGLE_DRIVE_INTEGRATION`, ensure that you have configured `GOOGL
   - `bing` - Uses the [Bing](https://www.bing.com/) search engine.
 - Persistence: This environment variable is a `PersistentConfig` variable.
 
+#### `RAG_WEB_LOADER`
+
+- Type: `str`
+- Default: `safe_web`
+- Description: Specifies the loader to use for retrieving and processing web content. Options include:
+  - `safe_web` - Uses the `requests` module with enhanced error handling.
+  - `playwright` - Uses Playwright (backed by Chromium) for more advanced web page rendering and interaction.
+
+:::info
+
+Choosing `playwright` is beneficial when dealing with JavaScript-heavy websites, while `safe_web` is suitable for static content. Dependencies will be automatically installed on launch of the Open WebUI instance.
+ +::: + #### `SEARXNG_QUERY_URL` - Type: `str` From b9bf34e0bd98c1b7ee015dee24faf0185a28ac08 Mon Sep 17 00:00:00 2001 From: Rory <16675082+roryeckel@users.noreply.github.com> Date: Sun, 2 Feb 2025 19:20:04 -0600 Subject: [PATCH 2/3] Document "PLAYWRIGHT_WS_URI" env-configuration --- docs/getting-started/env-configuration.md | 25 +++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/docs/getting-started/env-configuration.md b/docs/getting-started/env-configuration.md index fedca9c..96993a3 100644 --- a/docs/getting-started/env-configuration.md +++ b/docs/getting-started/env-configuration.md @@ -1170,17 +1170,38 @@ When enabling `GOOGLE_DRIVE_INTEGRATION`, ensure that you have configured `GOOGL - `bing` - Uses the [Bing](https://www.bing.com/) search engine. - Persistence: This environment variable is a `PersistentConfig` variable. +### Web Loader Configuration + #### `RAG_WEB_LOADER` - Type: `str` - Default: `safe_web` - Description: Specifies the loader to use for retrieving and processing web content. Options include: - `safe_web` - Uses the `requests` module with enhanced error handling. - - `playwright` - Uses Playwright (backed by Chromium) for more advanced web page rendering and interaction. + - `playwright` - Uses Playwright for more advanced web page rendering and interaction. +- Persistence: This environment variable is a `PersistentConfig` variable. :::info -Choosing `playwright` is beneficial when dealing with JavaScript-heavy websites, while `safe_web` is suitable for static content. Dependencies will be automatically installed on launch of the Open WebUI instance. +When using `playwright`, you have two options: +1. If `PLAYWRIGHT_WS_URI` is not set, Playwright with Chromium dependencies will be automatically installed in the Open WebUI container on launch. +2. If `PLAYWRIGHT_WS_URI` is set, Open WebUI will connect to a remote browser instance instead of installing dependencies locally. 
+
+:::
+
+#### `PLAYWRIGHT_WS_URI`
+
+- Type: `str`
+- Default: `None`
+- Description: Specifies the WebSocket URI of a remote Playwright browser instance. When set, Open WebUI will use this remote browser instead of installing browser dependencies locally. This is particularly useful in containerized environments where you want to keep the Open WebUI container lightweight and separate browser concerns. Example: `ws://playwright:3000`
+- Persistence: This environment variable is a `PersistentConfig` variable.
+
+:::tip
+
+Using a remote Playwright browser via `PLAYWRIGHT_WS_URI` can be beneficial for:
+- Reducing the size of the Open WebUI container
+- Using a different browser other than the default Chromium
+- Connecting to a non-headless (GUI) browser
 
 :::
 

From 65bd521613af5eb359fab94f075878c28e965935 Mon Sep 17 00:00:00 2001
From: Rory <16675082+roryeckel@users.noreply.github.com>
Date: Mon, 17 Feb 2025 21:37:32 -0600
Subject: [PATCH 3/3] Rename "RAG_WEB_LOADER" to "RAG_WEB_LOADER_ENGINE" in env-configuration documentation

---
 docs/getting-started/env-configuration.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/getting-started/env-configuration.md b/docs/getting-started/env-configuration.md
index 96993a3..0a0cca9 100644
--- a/docs/getting-started/env-configuration.md
+++ b/docs/getting-started/env-configuration.md
@@ -1172,7 +1172,7 @@ When enabling `GOOGLE_DRIVE_INTEGRATION`, ensure that you have configured `GOOGL
   - `bing` - Uses the [Bing](https://www.bing.com/) search engine.
 - Persistence: This environment variable is a `PersistentConfig` variable.
 
 ### Web Loader Configuration
 
-#### `RAG_WEB_LOADER`
+#### `RAG_WEB_LOADER_ENGINE`
 
 - Type: `str`
 - Default: `safe_web`