Tim Jaeryang Baek
042c37ea34
Merge pull request #14311 from Hisma/marker-api-content-extraction
...
feat: Marker api content extraction support
2025-05-29 02:21:13 +04:00
Timothy Jaeryang Baek
4461122a0e
fix: /api/v1/retrieval/query/collection endpoint
2025-05-28 18:45:47 +04:00
Hisma
a9405cc101
feat: Marker api content extraction support
2025-05-27 00:44:07 -04:00
Tim Jaeryang Baek
e663b90a9f
Merge pull request #14069 from Ithanil/bm25_weight
...
feat: Configurable weight for BM25Retriever during hybrid search
2025-05-24 01:13:03 +04:00
Jan Kessler
e70dd33233
rename BM25_WEIGHT -> HYBRID_BM25_WEIGHT
2025-05-23 22:06:44 +02:00
Timothy Jaeryang Baek
2eca6f6414
feat: bypass web loader in web search
...
Co-Authored-By: Perry Li <peiyaoli@mail.nankai.edu.cn>
Co-Authored-By: WilliamGates <3852641+williamgateszhao@users.noreply.github.com>
2025-05-23 02:30:35 +04:00
Jan Kessler
308d8ac04a
make bm25_weight a regular parameter of query_doc.. / get_sources_from_files functions
2025-05-20 11:46:32 +02:00
Jan Kessler
b5ddaf6417
make weight for bm25 retriever in hybrid search ui-configurable
2025-05-20 10:39:31 +02:00
Timothy Jaeryang Baek
2bd7db12a2
enh: ALLOWED_FILE_EXTENSIONS ui
2025-05-16 21:05:52 +04:00
Timothy Jaeryang Baek
8732b64b6b
feat: external document loader support
2025-05-14 22:28:40 +04:00
Timothy Jaeryang Baek
de70d0cb64
feat: docling do picture description support
2025-05-14 21:26:49 +04:00
hwzhuhao
6f869ded43
feat: Add vector type and vector factory class for vector database integration
2025-05-14 21:30:50 +08:00
Timothy Jaeryang Baek
6f635d8b7d
refac
2025-05-10 19:16:09 +04:00
Timothy Jaeryang Baek
be912f1529
refac
2025-05-10 18:29:04 +04:00
Timothy Jaeryang Baek
d5fd3b3600
feat: external reranker
...
Co-Authored-By: Brendan Campbell <20541191+bcambs09@users.noreply.github.com>
2025-05-10 18:25:20 +04:00
Timothy Jaeryang Baek
34ec10a78c
refac: web search performance
...
Co-Authored-By: Mabeck <64421281+mmabeck@users.noreply.github.com>
2025-05-10 17:54:41 +04:00
tth37
c95a65a4bd
fix: Duplicate web search urls
2025-05-09 20:06:35 +08:00
Timothy Jaeryang Baek
b50dcb1862
refac: remove duplicate urls
2025-05-07 22:25:18 +04:00
Athanasios Oikonomou
657162e96d
feat(ocr): add support for Docling OCR engine and language configuration
...
This commit adds support for configuring the OCR engine and language(s) for Docling.
Configuration can be set via the environment variables `DOCLING_OCR_ENGINE` and `DOCLING_OCR_LANG`, or through the UI.
Fixes #13133
2025-05-03 00:32:06 +03:00
Tim Jaeryang Baek
e87f2669fa
Merge pull request #13191 from tth37/feat_firecrawl_search_engine
...
feat: Add Firecrawl search engine
2025-04-29 08:38:28 -07:00
Tim Jaeryang Baek
7b863465a9
Merge pull request #13311 from stephen304/yacy-support
...
feat: Yacy search support
2025-04-29 08:35:10 -07:00
Stephen Smith
240d91d38d
Add yacy config for user/pass, automatically add yacy json api path
2025-04-26 22:28:30 -04:00
Stephen Smith
0f73b96616
first pass at yacy support copied from searxng
2025-04-26 14:07:13 -04:00
tth37
92dbeb1939
feat: Add Firecrawl search engine
2025-04-24 14:57:28 +08:00
Timothy Jaeryang Baek
732d7aee70
enh: sentence transformers env vars
...
Co-Authored-By: DrZoidberg09 <96449693+drzoidberg09@users.noreply.github.com>
2025-04-24 01:55:18 +09:00
Timothy Jaeryang Baek
09874ab83d
fix: FireCrawlLoader
2025-04-24 01:40:34 +09:00
Timothy Jaeryang Baek
43efff0fe6
refac
2025-04-22 23:22:50 +09:00
Tim Jaeryang Baek
87844a8042
Merge pull request #12822 from tth37/feat_external_search_loader
...
feat: Support for Self-Hosted/External Web Search/Loader Engines
2025-04-18 23:51:27 -07:00
Youggls
9669cd3454
fix: use run_in_threadpool for search_web to prevent blocking
...
Used fastapi's run_in_threadpool function to execute the search_web function,
preventing the synchronous function from blocking the entire web search process.
2025-04-17 17:23:20 +08:00
tth37
85f8e91288
feat: Allow admin editing external search/loader settings
2025-04-14 18:19:26 +08:00
Timothy Jaeryang Baek
70718dda90
refac
2025-04-13 22:31:43 -07:00
tth37
839ba22c90
feat: Backend for Self-Hosted/External Web Search/Loader Engines
2025-04-14 01:49:05 +08:00
Timothy Jaeryang Baek
888b468576
fix
2025-04-12 23:00:34 -07:00
Timothy Jaeryang Baek
4dafbbccfc
fix: rag template display issue
2025-04-12 22:55:24 -07:00
tth37
8d53f1e770
fix: small bugs on updated web/rag settings
2025-04-13 12:55:50 +08:00
Timothy Jaeryang Baek
48a23ce3fe
refac: web/rag config
2025-04-12 16:33:36 -07:00
tth37
5eac5960ef
feat: Add frontend configuration for web loader
2025-04-12 17:13:30 +08:00
Youggls
3e2a6df1fb
feat: Add sougou web search API for backend, add config panel for frontend.
2025-04-10 14:51:44 +08:00
Timothy Jaeryang Baek
914eb49767
chore: include accelerate dependency
2025-04-06 17:44:05 -07:00
Timothy Jaeryang Baek
cbe2056587
fix: audio file upload response issue
2025-04-06 17:31:50 -07:00
Timothy Jaeryang Baek
f243e523a6
refac
2025-04-06 15:52:38 -07:00
Timothy Jaeryang Baek
155dbd5a66
refac
2025-04-06 15:45:48 -07:00
Timothy Jaeryang Baek
9825d03602
Merge pull request #12507 from Ithanil/fix_web_result_collection_source_ids
...
fix: fix web results all getting the same source id when using embedding and retrieval
2025-04-06 15:43:21 -07:00
Jan Kessler
a506a1a61e
only keep URLs as sources for which the content could actually be retrieved
2025-04-06 20:31:12 +02:00
Jan Kessler
4476060044
fix web results all getting the same source id when using embedding and retrieval
2025-04-06 15:51:05 +02:00
Marko Henning
3b2b6e183d
Added missing parameter for query_doc_with_hybrid_search.
2025-04-04 15:30:57 +02:00
Timothy Jaeryang Baek
94bf49440d
enh: unload hybrid model if set to False
2025-04-02 18:15:14 -07:00
Patrick Wachter
1ac6879268
Add Mistral OCR integration and configuration support
2025-04-01 14:24:33 +02:00
Timothy Jaeryang Baek
cafc5413f5
refac
2025-03-31 14:13:27 -07:00
Timothy Jaeryang Baek
d542881ee4
refac
2025-03-30 21:55:20 -07:00