Commit Graph

107 Commits

Author SHA1 Message Date
Derek Wischusen
42be1f956a Add Azure OpenAI embedding support 2025-05-19 22:58:04 -04:00
Timothy Jaeryang Baek
2bd7db12a2 enh: ALLOWED_FILE_EXTENSIONS ui 2025-05-16 21:05:52 +04:00
Timothy Jaeryang Baek
8732b64b6b feat: external document loader support
2025-05-14 22:28:40 +04:00
Timothy Jaeryang Baek
de70d0cb64 feat: docling do picture description support 2025-05-14 21:26:49 +04:00
hwzhuhao
6f869ded43 feat:Add vector type and vector factory class for vector database integration 2025-05-14 21:30:50 +08:00
Timothy Jaeryang Baek
6f635d8b7d refac 2025-05-10 19:16:09 +04:00
Timothy Jaeryang Baek
be912f1529 refac 2025-05-10 18:29:04 +04:00
Timothy Jaeryang Baek
d5fd3b3600 feat: external reranker
Co-Authored-By: Brendan Campbell <20541191+bcambs09@users.noreply.github.com>
2025-05-10 18:25:20 +04:00
Timothy Jaeryang Baek
34ec10a78c refac: web search performance
Co-Authored-By: Mabeck <64421281+mmabeck@users.noreply.github.com>
2025-05-10 17:54:41 +04:00
tth37
c95a65a4bd fix: Duplicate web search urls 2025-05-09 20:06:35 +08:00
Timothy Jaeryang Baek
b50dcb1862 refac: remove duplicate urls 2025-05-07 22:25:18 +04:00
Athanasios Oikonomou
657162e96d feat(ocr): add support for Docling OCR engine and language configuration
This commit adds support for configuring the OCR engine and language(s) for Docling.
Configuration can be set via the environment variables `DOCLING_OCR_ENGINE` and `DOCLING_OCR_LANG`, or through the UI.

Fixes #13133
2025-05-03 00:32:06 +03:00
Tim Jaeryang Baek
e87f2669fa Merge pull request #13191 from tth37/feat_firecrawl_search_engine
feat: Add Firecrawl search engine
2025-04-29 08:38:28 -07:00
Tim Jaeryang Baek
7b863465a9 Merge pull request #13311 from stephen304/yacy-support
feat: Yacy search support
2025-04-29 08:35:10 -07:00
Stephen Smith
240d91d38d Add yacy config for user/pass, automatically add yacy json api path 2025-04-26 22:28:30 -04:00
Stephen Smith
0f73b96616 first pass at yacy support copied from searxng 2025-04-26 14:07:13 -04:00
tth37
92dbeb1939 feat: Add Firecrawl search engine 2025-04-24 14:57:28 +08:00
Timothy Jaeryang Baek
732d7aee70 enh: sentence transformers env vars
Co-Authored-By: DrZoidberg09 <96449693+drzoidberg09@users.noreply.github.com>
2025-04-24 01:55:18 +09:00
Timothy Jaeryang Baek
09874ab83d fix: FireCrawlLoader 2025-04-24 01:40:34 +09:00
Timothy Jaeryang Baek
43efff0fe6 refac 2025-04-22 23:22:50 +09:00
Tim Jaeryang Baek
87844a8042 Merge pull request #12822 from tth37/feat_external_search_loader
feat: Support for Self-Hosted/External Web Search/Loader Engines
2025-04-18 23:51:27 -07:00
Youggls
9669cd3454 fix: use run_in_threadpool for search_web to prevent blocking
Used fastapi's run_in_threadpool function to execute the search_web function,
preventing the synchronous function from blocking the entire web search process.
2025-04-17 17:23:20 +08:00
tth37
85f8e91288 feat: Allow admin editing external search/loader settings 2025-04-14 18:19:26 +08:00
Timothy Jaeryang Baek
70718dda90 refac 2025-04-13 22:31:43 -07:00
tth37
839ba22c90 feat: Backend for Self-Hosted/External Web Search/Loader Engines 2025-04-14 01:49:05 +08:00
Timothy Jaeryang Baek
888b468576 fix 2025-04-12 23:00:34 -07:00
Timothy Jaeryang Baek
4dafbbccfc fix: rag template display issue 2025-04-12 22:55:24 -07:00
tth37
8d53f1e770 fix: small bugs on updated web/rag settings 2025-04-13 12:55:50 +08:00
Timothy Jaeryang Baek
48a23ce3fe refac: web/rag config 2025-04-12 16:33:36 -07:00
tth37
5eac5960ef feat: Add frontend configuration for web loader 2025-04-12 17:13:30 +08:00
Youggls
3e2a6df1fb feat: Add sougou web search API for backend, add config panel in for frontend. 2025-04-10 14:51:44 +08:00
Timothy Jaeryang Baek
914eb49767 chore: include accelerate dependency 2025-04-06 17:44:05 -07:00
Timothy Jaeryang Baek
cbe2056587 fix: audio file upload response issue 2025-04-06 17:31:50 -07:00
Timothy Jaeryang Baek
f243e523a6 refac 2025-04-06 15:52:38 -07:00
Timothy Jaeryang Baek
155dbd5a66 refac 2025-04-06 15:45:48 -07:00
Timothy Jaeryang Baek
9825d03602 Merge pull request #12507 from Ithanil/fix_web_result_collection_source_ids
fix: fix web results all getting the same source id when using embedding and retrieval
2025-04-06 15:43:21 -07:00
Jan Kessler
a506a1a61e only keep URLs as sources for which the content could actually be retrieved 2025-04-06 20:31:12 +02:00
Jan Kessler
4476060044 fix web results all getting the same source id when using embedding and retrieval 2025-04-06 15:51:05 +02:00
Marko Henning
3b2b6e183d Added missing parameter for query_doc_with_hybrid_search. 2025-04-04 15:30:57 +02:00
Timothy Jaeryang Baek
94bf49440d enh: unload hybrid model if set to False 2025-04-02 18:15:14 -07:00
Patrick Wachter
1ac6879268 Add Mistral OCR integration and configuration support 2025-04-01 14:24:33 +02:00
Timothy Jaeryang Baek
cafc5413f5 refac 2025-03-31 14:13:27 -07:00
Timothy Jaeryang Baek
d542881ee4 refac 2025-03-30 21:55:20 -07:00
Timothy Jaeryang Baek
433b5bddc1 Merge pull request #8594 from jayteaftw/main
feat: Support for instruct/prefixing embeddings
2025-03-30 21:54:44 -07:00
Timothy Jaeryang Baek
4a79320253 chore: format 2025-03-27 01:40:28 -07:00
Timothy Jaeryang Baek
9d834a8e90 Merge branch 'dev' into k_reranker 2025-03-26 20:50:31 -07:00
Marko Henning
41a4cf7106 Added new k_reranker parameter 2025-03-06 10:47:57 +01:00
Fabio Polito
9aa407dbd2 feat: merge with main 2025-03-05 22:04:34 +00:00
Timothy Jaeryang Baek
efe8c4ca69 chore: format 2025-03-01 07:28:00 -08:00
Timothy Jaeryang Baek
d0ddb0637e enh: web embed bypass embedding and retrieval support 2025-02-27 16:34:05 -08:00