From f74125c1be3c4ed3fe35e6ee7443cefb6920b1a3 Mon Sep 17 00:00:00 2001
From: Athanasios Oikonomou
Date: Sat, 3 May 2025 07:51:30 +0300
Subject: [PATCH] docs: add documentation for DOCLING_OCR_ENGINE and DOCLING_OCR_LANG environment variables

- Documented DOCLING_OCR_ENGINE with default value and supported engines
- Documented DOCLING_OCR_LANG with default languages and engine-specific note
---
 docs/getting-started/env-configuration.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/docs/getting-started/env-configuration.md b/docs/getting-started/env-configuration.md
index 98421e5..fad8599 100644
--- a/docs/getting-started/env-configuration.md
+++ b/docs/getting-started/env-configuration.md
@@ -1207,6 +1207,22 @@ When using Pinecone as the vector store, the following environment variables are
 - Description: Specifies the URL for the Docling server.
 - Persistence: This environment variable is a `PersistentConfig` variable.
 
+#### `DOCLING_OCR_ENGINE`
+
+- Type: `str`
+- Default: `tesseract`
+- Description: Specifies the OCR engine used by Docling.
+  Supported values include: `tesseract` (default), `easyocr`, `ocrmac`, `rapidocr`, and `tesserocr`.
+- Persistence: This environment variable is a `PersistentConfig` variable.
+
+#### `DOCLING_OCR_LANG`
+
+- Type: `str`
+- Default: `eng,fra,deu,spa` (when using the default `tesseract` engine)
+- Description: Specifies the OCR language(s) to be used with the configured `DOCLING_OCR_ENGINE`.
+  The format and available language codes depend on the selected OCR engine.
+- Persistence: This environment variable is a `PersistentConfig` variable.
+
 ## Retrieval Augmented Generation (RAG)
 
 #### `RAG_EMBEDDING_ENGINE`