diff --git a/docs/tutorial/continue-dev.md b/docs/tutorial/continue-dev.md
index 84302c3..b5d360f 100644
--- a/docs/tutorial/continue-dev.md
+++ b/docs/tutorial/continue-dev.md
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 12
+sidebar_position: 13
 title: "Continue.dev VSCode Extension with Open WebUI"
 ---
 
diff --git a/docs/tutorial/ipex_llm.md b/docs/tutorial/ipex_llm.md
index ea1196b..741b712 100644
--- a/docs/tutorial/ipex_llm.md
+++ b/docs/tutorial/ipex_llm.md
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 10
+sidebar_position: 11
 title: "Local LLM Setup with IPEX-LLM on Intel GPU"
 ---
 
diff --git a/docs/tutorial/openedai-speech-integration.md b/docs/tutorial/openedai-speech-integration.md
index 6b3bcc3..907e81f 100644
--- a/docs/tutorial/openedai-speech-integration.md
+++ b/docs/tutorial/openedai-speech-integration.md
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 11
+sidebar_position: 12
 title: "TTS - OpenedAI-Speech using Docker"
 ---
 
diff --git a/docs/tutorial/slim_down.md b/docs/tutorial/slim_down.md
new file mode 100644
index 0000000..1c39c2f
--- /dev/null
+++ b/docs/tutorial/slim_down.md
@@ -0,0 +1,26 @@
+---
+sidebar_position: 10
+title: "Reduce RAM usage"
+---
+
+# Reduce RAM usage
+
+If you are deploying this image in a RAM-constrained environment, there are a few things you can do to slim it down.
+
+On a Raspberry Pi 4 (arm64) running v0.3.10, this reduced idle memory consumption from over 1 GB to roughly 200 MB (as observed with `docker container stats`).
+
+## TL;DR
+
+Set the following environment variables (or the corresponding UI settings for an existing deployment): `RAG_EMBEDDING_ENGINE: ollama`, `AUDIO_STT_ENGINE: openai`.
+
+## Longer explanation
+
+Much of the memory consumption comes from loaded ML models. Even if you are using an external language model (OpenAI or an unbundled Ollama), models may still be loaded for additional purposes.
+
+As of v0.3.10 these include:
+* Speech-to-text (Whisper by default)
+* RAG embedding engine (defaults to a local SentenceTransformers model)
+* Image generation engine (disabled by default)
+
+The first two are enabled and set to local models by default. You can change the models in the admin panel (RAG: in the Documents category, set the engine to Ollama or OpenAI; speech-to-text: in the Audio section, choose OpenAI or WebAPI).
+If you are deploying a fresh Docker image, you can also set them with the following environment variables: `RAG_EMBEDDING_ENGINE: ollama`, `AUDIO_STT_ENGINE: openai`. Note that these environment variables have no effect if a `config.json` already exists.
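For a fresh deployment, the two environment variables from the new `slim_down.md` tutorial could be set in a compose file along these lines. This is a sketch, not part of the diff: the service name, image tag, port mapping, and volume name are assumptions.

```yaml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      # Offload RAG embeddings to an external Ollama instance
      # instead of loading the local SentenceTransformers model
      - RAG_EMBEDDING_ENGINE=ollama
      # Use an OpenAI-compatible endpoint for speech-to-text
      # instead of the local Whisper model
      - AUDIO_STT_ENGINE=openai
    volumes:
      - open-webui:/app/backend/data

volumes:
  open-webui:
```

As the tutorial notes, these variables only take effect on a fresh deployment; once a `config.json` exists, change the same settings in the admin panel instead.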