docs: add documentation for DOCLING_OCR_ENGINE and DOCLING_OCR_LANG environment variables

- Documented DOCLING_OCR_ENGINE with default value and supported engines
- Documented DOCLING_OCR_LANG with default languages and engine-specific note
Athanasios Oikonomou 2025-05-03 07:51:30 +03:00 committed by GitHub
parent a164415416
commit f74125c1be


@@ -1207,6 +1207,22 @@ When using Pinecone as the vector store, the following environment variables are
- Description: Specifies the URL for the Docling server.
- Persistence: This environment variable is a `PersistentConfig` variable.
#### `DOCLING_OCR_ENGINE`
- Type: `str`
- Default: `tesseract`
- Description: Specifies the OCR engine used by Docling.
Supported values include: `tesseract` (default), `easyocr`, `ocrmac`, `rapidocr`, and `tesserocr`.
- Persistence: This environment variable is a `PersistentConfig` variable.
#### `DOCLING_OCR_LANG`
- Type: `str`
- Default: `eng,fra,deu,spa` (when using the default `tesseract` engine)
- Description: Specifies the OCR language(s) to be used with the configured `DOCLING_OCR_ENGINE`.
The format and available language codes depend on the selected OCR engine.
- Persistence: This environment variable is a `PersistentConfig` variable.
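
As a rough illustration of how these two variables fit together, the following Python sketch reads both values with the documented defaults and checks the engine name against the list above. It is not the application's actual implementation: the helper name `read_docling_ocr_settings` is hypothetical, and splitting `DOCLING_OCR_LANG` on commas is an assumption based on the default value shown above.

```python
import os

# Engine names as listed in the DOCLING_OCR_ENGINE description above.
SUPPORTED_OCR_ENGINES = {"tesseract", "easyocr", "ocrmac", "rapidocr", "tesserocr"}


def read_docling_ocr_settings(env=None):
    """Return (engine, languages) from an environment mapping.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env

    # Fall back to the documented default engine.
    engine = env.get("DOCLING_OCR_ENGINE", "tesseract").strip()
    if engine not in SUPPORTED_OCR_ENGINES:
        raise ValueError(f"Unsupported DOCLING_OCR_ENGINE: {engine!r}")

    # Fall back to the documented default languages for tesseract.
    raw_langs = env.get("DOCLING_OCR_LANG", "eng,fra,deu,spa")

    # Assumption: languages are comma-separated; empty entries are dropped.
    languages = [code.strip() for code in raw_langs.split(",") if code.strip()]
    return engine, languages


if __name__ == "__main__":
    # Language codes differ per engine, so they are passed through unchanged.
    example_env = {"DOCLING_OCR_ENGINE": "easyocr", "DOCLING_OCR_LANG": "en, fr"}
    print(read_docling_ocr_settings(example_env))
```

Because the language codes depend on the engine, the sketch does not validate them; it only normalizes whitespace and passes them on.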
## Retrieval Augmented Generation (RAG)
#### `RAG_EMBEDDING_ENGINE`