Timothy Jaeryang Baek
|
91a455a284
|
chore: format
|
2025-04-12 16:35:11 -07:00 |
|
Timothy Jaeryang Baek
|
48a23ce3fe
|
refac: web/rag config
|
2025-04-12 16:33:36 -07:00 |
|
Tim Jaeryang Baek
|
62ef0bad6f
|
Merge pull request #12680 from lucyknada/patch-1
fix #12678
|
2025-04-10 08:46:41 -07:00 |
|
Timothy Jaeryang Baek
|
63e5200e2f
|
refac
|
2025-04-10 08:46:12 -07:00 |
|
Youggls
|
3e2a6df1fb
|
feat: Add sougou web search API for backend, add config panel in for frontend.
|
2025-04-10 14:51:44 +08:00 |
|
lucy
|
bc295546cd
|
fix #12678
|
2025-04-10 07:23:34 +02:00 |
|
Tim Jaeryang Baek
|
2575dac4ed
|
Merge pull request #12604 from maurerle/ddg_improve_stacktrace
**fix** improve stack trace of duckduckgo exception
|
2025-04-08 13:03:57 -07:00 |
|
Robert Norberg
|
2337b36609
|
add debug logging to RAG utils
|
2025-04-08 12:08:32 -04:00 |
|
Florian Maurer
|
760ea3f4af
|
duckduckgo: backend api has been deprecated since december
also increase duckduckgo-search version
see 3ee8e08b1c
|
2025-04-08 14:02:06 +02:00 |
|
Florian Maurer
|
337c7caafa
|
improve stack trace of duckduckgo exception
* fix search_results out of scope
* ddgs.text does already always return a list
|
2025-04-08 13:52:23 +02:00 |
|
Timothy Jaeryang Baek
|
65ed76abe1
|
refac: embedding prefix
|
2025-04-06 17:17:24 -07:00 |
|
Timothy Jaeryang Baek
|
ef787e4a79
|
Merge pull request #12486 from FabioPolito24/text-file-handling-docling
fix: text file handling with docling
|
2025-04-05 09:55:51 -07:00 |
|
Fabio Polito
|
cd0a1b4852
|
fix: fix for text file handling with docling
|
2025-04-05 16:44:08 +00:00 |
|
Juan Calderon-Perez
|
324550423c
|
Fix formatting issues
|
2025-04-05 10:03:24 -04:00 |
|
Phlogi
|
8cf8121812
|
Update utils.py
Avoid running any tasks for collections that failed to fetch data (have assigned None)
|
2025-04-05 10:41:21 +02:00 |
|
Patrick Wachter
|
0ac00b9256
|
refactor: update import path for MistralLoader
|
2025-04-02 13:56:10 +02:00 |
|
Patrick Wachter
|
c5a8d2f857
|
refactor: update MistralLoader documentation and adjust parameters for signed URL retrieval
|
2025-04-01 20:14:34 +02:00 |
|
Patrick Wachter
|
93d7702e8c
|
refactor: move MistralLoader to a separate module and just use the requests package instead of mistralai
|
2025-04-01 20:14:34 +02:00 |
|
Patrick Wachter
|
1ac6879268
|
Add Mistral OCR integration and configuration support
|
2025-04-01 14:24:33 +02:00 |
|
Timothy Jaeryang Baek
|
391dd33da3
|
chore: format
|
2025-03-31 17:59:21 -07:00 |
|
Timothy Jaeryang Baek
|
3ba12e7a43
|
Merge pull request #12239 from Phlogi/dev-threads-on-hybrid
perf: parallelize hybrid search
|
2025-03-31 17:06:32 -07:00 |
|
Timothy Jaeryang Baek
|
cafc5413f5
|
refac
|
2025-03-31 14:13:27 -07:00 |
|
Phlogi
|
9c64310db5
|
Run hybrid_search in parallel
|
2025-03-31 16:43:37 +02:00 |
|
Timothy Jaeryang Baek
|
4b75966401
|
refac: embedding prefix var naming
|
2025-03-30 21:55:15 -07:00 |
|
Timothy Jaeryang Baek
|
433b5bddc1
|
Merge pull request #8594 from jayteaftw/main
feat: Support for instruct/prefixing embeddings
|
2025-03-30 21:54:44 -07:00 |
|
Timothy Jaeryang Baek
|
50b8dec3ac
|
fix/refac: hybrid search
|
2025-03-30 20:48:22 -07:00 |
|
Timothy Jaeryang Baek
|
ce0d82b55f
|
Merge pull request #12132 from Phlogi/dev-fetch-documents-once
Avoid multiple data fetching
|
2025-03-30 20:44:32 -07:00 |
|
Junaid Pinjari
|
e782e7d3a7
|
Fix: CSV loader encoding issue using autodetect_encoding=True
|
2025-03-29 13:14:53 +05:30 |
|
Phlogi
|
04bf9ddab2
|
Avoid multiple data fetching
|
2025-03-27 19:05:20 +01:00 |
|
Timothy Jaeryang Baek
|
4a79320253
|
chore: format
|
2025-03-27 01:40:28 -07:00 |
|
Timothy Jaeryang Baek
|
7490bc9100
|
Merge branch 'dev' into fix-db-order
|
2025-03-26 20:55:42 -07:00 |
|
Timothy Jaeryang Baek
|
9d834a8e90
|
Merge branch 'dev' into k_reranker
|
2025-03-26 20:50:31 -07:00 |
|
Marko Henning
|
7531b7dcaa
|
Satisfy github format check
|
2025-03-25 19:09:17 +01:00 |
|
Iván Baldo
|
115e46a6a2
|
Fix: Tika 3.1.0.0 sends a lot of blank lines which degrades the RAG results, strip them.
|
2025-03-25 14:53:14 -03:00 |
|
Marko Henning
|
94d9d3d590
|
Fix: Normalize all database distances to score in [0, 1]
|
2025-03-25 16:46:14 +01:00 |
|
Timothy Jaeryang Baek
|
38d524f6a0
|
chore: format
|
2025-03-24 11:35:32 -07:00 |
|
Jonathan Flower
|
bdd236fa3a
|
improved error handling for deleting collections that do not exist in chromadb
|
2025-03-22 09:59:06 -04:00 |
|
Timothy Jaeryang Baek
|
8aa6dade41
|
Merge pull request #11876 from mahenning/fix--rag-sorting
Fix: wrong citation order for chromadb, wrong order for hybrid search
|
2025-03-20 17:54:22 -07:00 |
|
Timothy Jaeryang Baek
|
9b20ef4922
|
refac
|
2025-03-20 14:01:47 -07:00 |
|
genjuro
|
07098c6352
|
perf: set shorter timeout for playwright and make it configurable
|
2025-03-20 15:28:09 +08:00 |
|
Marko Henning
|
5f48af5b91
|
Revert the ordering change with chromadb, not necessary with reranker results
|
2025-03-19 17:04:45 +01:00 |
|
Marko Henning
|
ec8fc727b8
|
Fix wrong order for chromadb
|
2025-03-19 16:06:10 +01:00 |
|
leilibj
|
3e8546135d
|
fix: correct incorrect usage of log.exception method
|
2025-03-19 13:04:34 +08:00 |
|
Marko Henning
|
5ab789e83e
|
Add documentation on chroma special case
|
2025-03-18 16:44:58 +01:00 |
|
Marko Henning
|
ba676b7ed6
|
Use k_reranker also for result merge, and add special sorting use case for ChromaDB
|
2025-03-18 16:25:24 +01:00 |
|
Marko Henning
|
f13948d805
|
Fixed typo
|
2025-03-18 12:14:59 +01:00 |
|
Marko Henning
|
c877b59cbc
|
Address edge case with k < k_reranker, sort results for cutting off
|
2025-03-18 11:31:17 +01:00 |
|
orenzhang
|
c761e4fd08
|
feat(trace): opentelemetry instrument
|
2025-03-10 22:27:31 +08:00 |
|
Fabio Polito
|
9d6743824e
|
fix: fix params DoclingLoader
|
2025-03-09 16:12:14 +00:00 |
|
Fabio Polito
|
0aa42615f9
|
Merge remote-tracking branch 'upstream/dev' into docling_context_extraction_engine
merge upstream
|
2025-03-08 18:52:51 +00:00 |
|
Timothy Jaeryang Baek
|
22b88f9593
|
Merge pull request #11324 from kela4/main
fix: opensearch vector db query structures, result mapping, filters, bulk query actions, knn_vector usage
|
2025-03-08 12:19:38 -04:00 |
|
Luke
|
7917128ed3
|
enh: enable configuration for tavily extract depth
|
2025-03-08 00:43:02 -05:00 |
|
Fabio Polito
|
e3eef58310
|
feat: merge with dev
|
2025-03-07 00:22:47 +00:00 |
|
Luke
|
987954c817
|
feat: Add Tavily extract web loader integration
|
2025-03-06 18:15:18 -05:00 |
|
Katharina
|
6cb0c0339a
|
fix: opensearch vector db query structures, result mapping, filters, bulk query actions, knn_vector usage
|
2025-03-06 23:49:54 +01:00 |
|
Fabio Polito
|
98857184ff
|
Merge remote-tracking branch 'upstream/dev' into docling_context_extraction_engine
merge with dev branch
|
2025-03-06 12:12:50 +00:00 |
|
Marko Henning
|
41a4cf7106
|
Added new k_reranker parameter
|
2025-03-06 10:47:57 +01:00 |
|
Timothy Jaeryang Baek
|
d4fca9dabf
|
chore: format
|
2025-03-05 19:17:41 -08:00 |
|
Fabio Polito
|
0716f96da8
|
style: change style in DoclingLoader
|
2025-03-05 23:15:55 +00:00 |
|
Fabio Polito
|
9aa407dbd2
|
feat: merge with main
|
2025-03-05 22:04:34 +00:00 |
|
ofek
|
a8f205213c
|
fixed es bugs
|
2025-03-05 23:19:56 +02:00 |
|
Fabio Polito
|
a44b35e99e
|
fix: fix DoclingLoader input params
|
2025-03-05 17:53:45 +00:00 |
|
Timothy Jaeryang Baek
|
7b442e4be0
|
Merge pull request #11141 from Youggls/dev
fix: correct parameter name for MilvusClient instantiation
|
2025-03-04 00:54:49 -08:00 |
|
Timothy Jaeryang Baek
|
39ea59edc8
|
chore: format
|
2025-03-04 00:32:27 -08:00 |
|
Perry Li
|
67ed61d022
|
fixbug: correct parameter name for MilvusClient instantiation
Replace incorrect parameter 'database=MILVUS_DB' with valid 'db_name=MILVUS_DB'
|
2025-03-04 16:02:19 +08:00 |
|
ofek
|
737dfd2763
|
added elasticsearch support
|
2025-03-03 23:39:42 +02:00 |
|
Timothy Jaeryang Baek
|
6471f12668
|
Merge pull request #11033 from dtaivpp/main
fix: Changed to use collection_name and fixed bulk indexing missing index.
|
2025-03-01 16:00:13 -08:00 |
|
David Tippett
|
f3c4c2b8e3
|
Changed to use collection name and fixed bulk indexing missing index.
|
2025-03-01 13:26:19 -05:00 |
|
Timothy Jaeryang Baek
|
d0ddb0637e
|
enh: web embed bypass embedding and retrieval support
|
2025-02-27 16:34:05 -08:00 |
|
Timothy Jaeryang Baek
|
1b56a8f3cb
|
Merge pull request #10864 from kurtdami/perplexity_integration
feat: add perplexity integration to web search
|
2025-02-27 13:51:03 -08:00 |
|
kurtdami
|
b061775932
|
feat: add perplexity integration to web search
|
2025-02-27 00:30:48 -08:00 |
|
Timothy Jaeryang Baek
|
ce7cf62a55
|
refac: dedup
|
2025-02-26 23:51:39 -08:00 |
|
Timothy Jaeryang Baek
|
ddb30589e3
|
chore: format
HIDE MODELS
|
2025-02-26 22:18:18 -08:00 |
|
Timothy Jaeryang Baek
|
57010901e6
|
enh: bypass embedding and retrieval
|
2025-02-26 15:42:19 -08:00 |
|
Timothy Jaeryang Baek
|
34aeaaf020
|
refac
|
2025-02-26 13:54:26 -08:00 |
|
Timothy Jaeryang Baek
|
46ac6f2b29
|
fix
|
2025-02-26 12:53:07 -08:00 |
|
Timothy Jaeryang Baek
|
33d3558ca9
|
Merge pull request #10817 from NovoNordisk-OpenSource/ivaroli/adding-json-as-supported-file-type
fix: Using the TextLoader instead of Tika for JSON files
|
2025-02-26 12:49:29 -08:00 |
|
Ívar Óli Sigurðsson
|
c5a09cdd21
|
adding a comma
|
2025-02-26 15:27:03 +01:00 |
|
Ívar Óli Sigurðsson
|
661711164a
|
Adding json as a known source for Tika
|
2025-02-26 15:11:21 +01:00 |
|
Timothy Jaeryang Baek
|
3be5e3129b
|
Merge pull request #10752 from NovoNordisk-OpenSource/yvedeng/standardize-logging
refactor: replace print statements with logging
|
2025-02-25 10:53:02 -08:00 |
|
Yifang Deng
|
0e5d5ecb81
|
refactor: replace print statements with logging for better error tracking
|
2025-02-25 15:53:55 +01:00 |
|
Timothy Jaeryang Baek
|
ab1b910d80
|
Merge pull request #10486 from Micca/feature/document_intelligence_support
Feat: Adding Support for Azure AI Document Intelligence for Content Extraction (Revised)
|
2025-02-21 10:56:18 -08:00 |
|
Timothy Jaeryang Baek
|
93d486d50e
|
revert: faulty dedup code
|
2025-02-20 11:02:45 -08:00 |
|
Timothy Jaeryang Baek
|
eeb00a5ca2
|
chore: format
|
2025-02-20 01:01:29 -08:00 |
|
Youggls
|
0fb3c08181
|
feat: Add Firecrawl web loader integration
|
2025-02-19 16:54:44 +08:00 |
|
Timothy Jaeryang Baek
|
c073b8b4ee
|
refac
|
2025-02-18 23:49:27 -08:00 |
|
Timothy Jaeryang Baek
|
5465cabd40
|
refac
|
2025-02-18 21:17:09 -08:00 |
|
Timothy Jaeryang Baek
|
81715f6553
|
enh: RAG full context mode
|
2025-02-18 21:14:58 -08:00 |
|
Timothy Jaeryang Baek
|
1bbecd46c8
|
Merge pull request #10052 from roryeckel/playwright
Support Playwright RAG Web Loader: Revised
|
2025-02-18 19:57:48 -08:00 |
|
Timothy Jaeryang Baek
|
4ef7aff663
|
refac
|
2025-02-18 19:35:22 -08:00 |
|
mikhail-khludnev
|
925bfe840b
|
dedupe results from multiple queries
|
2025-02-18 20:10:57 +03:00 |
|
Rory
|
10e0c81de9
|
Merge remote-tracking branch 'upstream/dev' into playwright
# Conflicts:
# backend/open_webui/retrieval/web/utils.py
# backend/open_webui/routers/retrieval.py
|
2025-02-17 21:53:39 -06:00 |
|
Rory
|
bc82f48ebf
|
refac: RAG_WEB_LOADER -> RAG_WEB_LOADER_ENGINE
|
2025-02-17 21:43:32 -06:00 |
|
Timothy Jaeryang Baek
|
ba6cde8a87
|
fix: include_domain does NOT exist
|
2025-02-17 19:20:49 -08:00 |
|
Timothy Jaeryang Baek
|
dbe5d1ca08
|
refac
|
2025-02-17 18:16:23 -08:00 |
|
Timothy Jaeryang Baek
|
ca0b7217d2
|
enh: full context web search
|
2025-02-17 18:14:26 -08:00 |
|
Rory
|
66c2acc08d
|
Merge branch 'dev' into playwright
|
2025-02-15 22:14:16 -06:00 |
|
Timothy Jaeryang Baek
|
b0ad5cd863
|
Merge pull request #10076 from crizCraig/local_date
fix: return local date from `getFormattedDate`
|
2025-02-15 20:10:56 -08:00 |
|
Timothy Jaeryang Baek
|
3d0c06ccee
|
refac: duckduckgo
|
2025-02-15 16:45:56 -08:00 |
|
Craig Quiter
|
e67eb89e05
|
style: black format
|
2025-02-15 10:53:16 -08:00 |
|