72 Commits

Author SHA1 Message Date
Athanasios Oikonomou
657162e96d feat(ocr): add support for Docling OCR engine and language configuration
This commit adds support for configuring the OCR engine and language(s) for Docling.
Configuration can be set via the environment variables `DOCLING_OCR_ENGINE` and `DOCLING_OCR_LANG`, or through the UI.

Fixes #13133
2025-05-03 00:32:06 +03:00
Timothy Jaeryang Baek
48a23ce3fe refac: web/rag config 2025-04-12 16:33:36 -07:00
hurxxxx
7c828015d3 fix: ReindexKnowledgeFilesConfirmDialog 2025-04-08 00:53:11 +09:00
hurxxxx
4e545d432b feat: add new admin func - reindex knowledge files 2025-04-08 00:44:10 +09:00
Patrick Wachter
1ac6879268 Add Mistral OCR integration and configuration support 2025-04-01 14:24:33 +02:00
Timothy Jaeryang Baek
737f41dd2e refac 2025-03-28 13:18:44 -07:00
Timothy Jaeryang Baek
402d32ccfd refac 2025-03-28 13:17:43 -07:00
Timothy Jaeryang Baek
0413c747a9 refac: hide hybrid option with full context mode 2025-03-28 13:16:56 -07:00
Timothy Jaeryang Baek
4a79320253 chore: format 2025-03-27 01:40:28 -07:00
Timothy Jaeryang Baek
9d834a8e90 Merge branch 'dev' into k_reranker 2025-03-26 20:50:31 -07:00
Timothy Jaeryang Baek
3186aeac08 chore: format 2025-03-18 06:39:37 -07:00
Fabio Polito
0aa42615f9 Merge remote-tracking branch 'upstream/dev' into docling_context_extraction_engine
merge upstream
2025-03-08 18:52:51 +00:00
orenzhang
72ea6dd9f1 refactor(lint): code lint 2025-03-07 19:59:09 +08:00
orenzhang
92fb1109b6 i18n(common): add i18n translation 2025-03-06 20:16:34 +08:00
Marko Henning
41a4cf7106 Added new k_reranker parameter 2025-03-06 10:47:57 +01:00
Fabio Polito
2982893d0d fix: format fixes 2025-03-06 00:39:00 +00:00
Fabio Polito
9aa407dbd2 feat: merge with main 2025-03-05 22:04:34 +00:00
Timothy Jaeryang Baek
57010901e6 enh: bypass embedding and retrieval 2025-02-26 15:42:19 -08:00
Timothy Jaeryang Baek
1c2e36f1b7 refac 2025-02-26 13:59:08 -08:00
Timothy Jaeryang Baek
fa91d83ac3 refac: documents settings ui 2025-02-26 13:48:56 -08:00
Timothy Jaeryang Baek
9f27d7710b chore: format 2025-02-25 01:46:08 -08:00
hurxxxx
4cc3102758 feat: onedrive file picker integration 2025-02-25 01:47:07 +09:00
Timothy Jaeryang Baek
ab1b910d80 Merge pull request #10486 from Micca/feature/document_intelligence_support
Feat: Adding Support for Azure AI Document Intelligence for Content Extraction (Revised)
2025-02-21 10:56:18 -08:00
Timothy Jaeryang Baek
81715f6553 enh: RAG full context mode 2025-02-18 21:14:58 -08:00
Timothy Jaeryang Baek
e3fa48b6ce chore: tailwind v4 migration 2025-02-15 19:27:25 -08:00
Fabio Polito
2419ef06a0 feat: docling support for document preprocessing 2025-02-14 12:08:03 +00:00
Mazurek Michal
35f3824932 feat: Implement Document Intelligence as Content Extraction Engine 2025-02-07 13:44:47 +01:00
Timothy Jaeryang Baek
a863f98c53 refac: toast error 2025-01-20 22:41:32 -08:00
Timothy Jaeryang Baek
f8269de947 fix 2024-12-24 20:10:52 -07:00
Timothy Jaeryang Baek
50f36a5262 refac: styling 2024-12-19 20:56:16 -08:00
Timothy Jaeryang Baek
0f6d302760 refac 2024-12-18 18:04:56 -08:00
Taylor Wilsdon
1120f4d09a npm run format 2024-12-18 13:32:46 -05:00
Taylor Wilsdon
0dc75363aa Add configurable Google Drive toggle in the Documents admin section along with necessary config scaffolding 2024-12-18 13:25:57 -05:00
Taylor Wilsdon (aider)
5c149c3aa2 style: Align Google Drive switch to the right side of text 2024-12-18 13:24:13 -05:00
Taylor Wilsdon
d43ca803ca feat: Add Google Drive integration toggle to document settings 2024-12-18 13:24:11 -05:00
Timothy Jaeryang Baek
20321e5271 refac: ollama setting for rag 2024-11-18 14:19:56 -08:00
Timothy Jaeryang Baek
227cca35e8 enh: knowledge access control 2024-11-16 16:51:55 -08:00
Timothy Jaeryang Baek
f9412f72f1 refac: styling 2024-11-16 01:54:40 -08:00
Timothy Jaeryang Baek
4eb8b1450c refac 2024-11-15 22:09:06 -08:00
Timothy J. Baek
47e377967e refac: styling 2024-10-21 00:05:27 -07:00
Timothy J. Baek
4b357a7b62 refac: styling 2024-10-20 18:49:30 -07:00
Timothy J. Baek
e8c629a2e2 refac: styling 2024-10-19 23:17:47 -07:00
Timothy J. Baek
eef9045dcc chore: format 2024-10-15 09:22:03 -07:00
Timothy J. Baek
586e005f0f enh: token text splitter support 2024-10-13 04:24:13 -07:00
Timothy J. Baek
5ffd216fca refac 2024-10-13 03:02:02 -07:00
Peter De-Ath
885b9f1ece refactor: Update GenerateEmbeddingsForm to support batch processing
refactor: Update embedding batch size handling in RAG configuration

refactor: add query_doc query caching

refactor: update logging statements in generate_chat_completion function

change embedding_batch_size to Optional
2024-10-08 00:04:35 +01:00
Timothy J. Baek
79c005a041 refac: deprecate docs_dir 2024-10-04 18:22:55 -07:00
Timothy J. Baek
a6c797d4c2 refac: process docs dir 2024-10-04 17:22:00 -07:00
Timothy J. Baek
08969ecf89 refac: rename projects -> knowledge 2024-10-01 22:45:04 -07:00
Timothy J. Baek
c5eb0a9732 refac: documents -> projects 2024-10-01 17:35:35 -07:00