mirror of https://github.com/open-webui/docs (synced 2025-05-20 11:18:42 +00:00)
Merge pull request #159 from 9SMTM6/main

Add wiki tutorial on how to slim down the RAM usage

commit 366e5aa023
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 12
+sidebar_position: 13
 title: "Continue.dev VSCode Extension with Open WebUI"
 ---

@@ -1,5 +1,5 @@
 ---
-sidebar_position: 10
+sidebar_position: 11
 title: "Local LLM Setup with IPEX-LLM on Intel GPU"
 ---

@@ -1,5 +1,5 @@
 ---
-sidebar_position: 11
+sidebar_position: 12
 title: "TTS - OpenedAI-Speech using Docker"
 ---

docs/tutorial/slim_down.md (new file, 26 lines)
@@ -0,0 +1,26 @@
---
sidebar_position: 10
title: "Reduce RAM usage"
---

# Reduce RAM usage

If you are deploying this image in a RAM-constrained environment, there are a few things you can do to slim down its memory footprint.

On a Raspberry Pi 4 (arm64) running v0.3.10, these steps reduced idle memory consumption from over 1 GB to roughly 200 MB (as observed with `docker container stats`).
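
If you want to reproduce this measurement, something like the following works (a minimal sketch; the container name `open-webui` is an assumption, so adjust it to whatever your container is actually called):

```bash
# One-off snapshot of the container's memory usage.
# Assumes the container is named "open-webui"; adjust to your deployment.
docker container stats --no-stream open-webui

# Or leave it running to watch usage while the service idles:
docker container stats open-webui
```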

## TLDR

Set the following environment variables (or the respective UI settings for an existing deployment): `RAG_EMBEDDING_ENGINE: ollama`, `AUDIO_STT_ENGINE: openai`.
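
As a minimal sketch of what that looks like for a fresh Docker deployment (the image name, port mapping, volume, and container name below are assumptions based on the standard Open WebUI run instructions; adjust them to your setup):

```bash
# Sketch: start Open WebUI with the lighter external engines configured.
# Image name, port mapping, volume, and container name are assumptions;
# adjust them to match your own deployment.
docker run -d \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -e RAG_EMBEDDING_ENGINE=ollama \
  -e AUDIO_STT_ENGINE=openai \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

Keep in mind that these variables only take effect on a fresh deployment; once a `config.json` exists, change the corresponding settings in the admin panel instead (see below).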

## Longer explanation

Much of the memory consumption is due to loaded ML models. Even if you are using an external language model (OpenAI or unbundled Ollama), several models may still be loaded for additional purposes.

As of v0.3.10 this includes:

* Speech-to-text (Whisper by default)
* RAG embedding engine (defaults to a local SentenceTransformers model)
* Image generation engine (disabled by default)

The first two are enabled and set to local models by default. You can change the models in the admin panel (RAG embedding: in the Documents category, set the engine to Ollama or OpenAI; speech-to-text: in the Audio section, choose OpenAI or WebAPI).
If you are deploying a fresh Docker image, you can also set them with the following environment variables: `RAG_EMBEDDING_ENGINE: ollama`, `AUDIO_STT_ENGINE: openai`. Note that these environment variables have no effect if a `config.json` already exists.