Athanasios Oikonomou
657162e96d
feat(ocr): add support for Docling OCR engine and language configuration
...
This commit adds support for configuring the OCR engine and language(s) for Docling.
Configuration can be set via the environment variables `DOCLING_OCR_ENGINE` and `DOCLING_OCR_LANG`, or through the UI.
Fixes #13133
2025-05-03 00:32:06 +03:00
Tim Jaeryang Baek
e87f2669fa
Merge pull request #13191 from tth37/feat_firecrawl_search_engine
...
feat: Add Firecrawl search engine
2025-04-29 08:38:28 -07:00
Tim Jaeryang Baek
7b863465a9
Merge pull request #13311 from stephen304/yacy-support
...
feat: Yacy search support
2025-04-29 08:35:10 -07:00
Stephen Smith
240d91d38d
Add yacy config for user/pass, automatically add yacy json api path
2025-04-26 22:28:30 -04:00
Stephen Smith
0f73b96616
first pass at yacy support copied from searxng
2025-04-26 14:07:13 -04:00
tth37
92dbeb1939
feat: Add Firecrawl search engine
2025-04-24 14:57:28 +08:00
Timothy Jaeryang Baek
732d7aee70
enh: sentence transformers env vars
...
Co-Authored-By: DrZoidberg09 <96449693+drzoidberg09@users.noreply.github.com>
2025-04-24 01:55:18 +09:00
Timothy Jaeryang Baek
09874ab83d
fix: FireCrawlLoader
2025-04-24 01:40:34 +09:00
Timothy Jaeryang Baek
43efff0fe6
refac
2025-04-22 23:22:50 +09:00
Tim Jaeryang Baek
87844a8042
Merge pull request #12822 from tth37/feat_external_search_loader
...
feat: Support for Self-Hosted/External Web Search/Loader Engines
2025-04-18 23:51:27 -07:00
Youggls
9669cd3454
fix: use run_in_threadpool for search_web to prevent blocking
...
Used fastapi's run_in_threadpool function to execute the search_web function,
preventing the synchronous function from blocking the entire web search process.
2025-04-17 17:23:20 +08:00
tth37
85f8e91288
feat: Allow admin editing external search/loader settings
2025-04-14 18:19:26 +08:00
Timothy Jaeryang Baek
70718dda90
refac
2025-04-13 22:31:43 -07:00
tth37
839ba22c90
feat: Backend for Self-Hosted/External Web Search/Loader Engines
2025-04-14 01:49:05 +08:00
Timothy Jaeryang Baek
888b468576
fix
2025-04-12 23:00:34 -07:00
Timothy Jaeryang Baek
4dafbbccfc
fix: rag template display issue
2025-04-12 22:55:24 -07:00
tth37
8d53f1e770
fix: small bugs on updated web/rag settings
2025-04-13 12:55:50 +08:00
Timothy Jaeryang Baek
48a23ce3fe
refac: web/rag config
2025-04-12 16:33:36 -07:00
tth37
5eac5960ef
feat: Add frontend configuration for web loader
2025-04-12 17:13:30 +08:00
Youggls
3e2a6df1fb
feat: Add sougou web search API for backend, add config panel in for frontend.
2025-04-10 14:51:44 +08:00
Timothy Jaeryang Baek
914eb49767
chore: include accelerate dependency
2025-04-06 17:44:05 -07:00
Timothy Jaeryang Baek
cbe2056587
fix: audio file upload response issue
2025-04-06 17:31:50 -07:00
Timothy Jaeryang Baek
f243e523a6
refac
2025-04-06 15:52:38 -07:00
Timothy Jaeryang Baek
155dbd5a66
refac
2025-04-06 15:45:48 -07:00
Timothy Jaeryang Baek
9825d03602
Merge pull request #12507 from Ithanil/fix_web_result_collection_source_ids
...
fix: fix web results all getting the same source id when using embedding and retrieval
2025-04-06 15:43:21 -07:00
Jan Kessler
a506a1a61e
only keep URLs as sources for which the content could actually be retrieved
2025-04-06 20:31:12 +02:00
Jan Kessler
4476060044
fix web results all getting the same source id when using embedding and retrieval
2025-04-06 15:51:05 +02:00
Marko Henning
3b2b6e183d
Added missing parameter for query_doc_with_hybrid_search.
2025-04-04 15:30:57 +02:00
Timothy Jaeryang Baek
94bf49440d
enh: unload hybrid model if set to False
2025-04-02 18:15:14 -07:00
Patrick Wachter
1ac6879268
Add Mistral OCR integration and configuration support
2025-04-01 14:24:33 +02:00
Timothy Jaeryang Baek
cafc5413f5
refac
2025-03-31 14:13:27 -07:00
Timothy Jaeryang Baek
d542881ee4
refac
2025-03-30 21:55:20 -07:00
Timothy Jaeryang Baek
433b5bddc1
Merge pull request #8594 from jayteaftw/main
...
feat: Support for instruct/prefixing embeddings
2025-03-30 21:54:44 -07:00
Timothy Jaeryang Baek
4a79320253
chore: format
2025-03-27 01:40:28 -07:00
Timothy Jaeryang Baek
9d834a8e90
Merge branch 'dev' into k_reranker
2025-03-26 20:50:31 -07:00
Marko Henning
41a4cf7106
Added new k_reranker parameter
2025-03-06 10:47:57 +01:00
Fabio Polito
9aa407dbd2
feat: merge with main
2025-03-05 22:04:34 +00:00
Timothy Jaeryang Baek
efe8c4ca69
chore: format
2025-03-01 07:28:00 -08:00
Timothy Jaeryang Baek
d0ddb0637e
enh: web embed bypass embedding and retrieval support
2025-02-27 16:34:05 -08:00
Timothy Jaeryang Baek
1b56a8f3cb
Merge pull request #10864 from kurtdami/perplexity_integration
...
feat: add perplexity integration to web search
2025-02-27 13:51:03 -08:00
kurtdami
b061775932
feat: add perplexity integration to web search
2025-02-27 00:30:48 -08:00
Timothy Jaeryang Baek
57010901e6
enh: bypass embedding and retrieval
2025-02-26 15:42:19 -08:00
Timothy Jaeryang Baek
78a8ef8e66
refac: audio file handling
2025-02-26 13:09:52 -08:00
Timothy Jaeryang Baek
3be5e3129b
Merge pull request #10752 from NovoNordisk-OpenSource/yvedeng/standardize-logging
...
refactor: replace print statements with logging
2025-02-25 10:53:02 -08:00
Yifang Deng
0e5d5ecb81
refactor: replace print statements with logging for better error tracking
2025-02-25 15:53:55 +01:00
hurxxxx
4cc3102758
feat: onedrive file picker integration
2025-02-25 01:47:07 +09:00
Timothy Jaeryang Baek
b14e75dd6c
feat: added Trust Proxy Environment switch in Web Search admin settings tab.
...
Co-Authored-By: harry zhou <67385896+harryzhou2000@users.noreply.github.com>
2025-02-21 13:40:11 -08:00
Timothy Jaeryang Baek
ab1b910d80
Merge pull request #10486 from Micca/feature/document_intelligence_support
...
Feat: Adding Support for Azure AI Document Intelligence for Content Extraction (Revised)
2025-02-21 10:56:18 -08:00
Timothy Jaeryang Baek
81715f6553
enh: RAG full context mode
2025-02-18 21:14:58 -08:00
Rory
10e0c81de9
Merge remote-tracking branch 'upstream/dev' into playwright
...
# Conflicts:
# backend/open_webui/retrieval/web/utils.py
# backend/open_webui/routers/retrieval.py
2025-02-17 21:53:39 -06:00