Timothy Jaeryang Baek
|
3ba12e7a43
|
Merge pull request #12239 from Phlogi/dev-threads-on-hybrid
perf: parallelize hybrid search
|
2025-03-31 17:06:32 -07:00 |
|
Timothy Jaeryang Baek
|
cafc5413f5
|
refac
|
2025-03-31 14:13:27 -07:00 |
|
Phlogi
|
9c64310db5
|
Run hybrid_search in parallel
|
2025-03-31 16:43:37 +02:00 |
|
Timothy Jaeryang Baek
|
4b75966401
|
refac: embedding prefix var naming
|
2025-03-30 21:55:15 -07:00 |
|
Timothy Jaeryang Baek
|
433b5bddc1
|
Merge pull request #8594 from jayteaftw/main
feat: Support for instruct/prefixing embeddings
|
2025-03-30 21:54:44 -07:00 |
|
Timothy Jaeryang Baek
|
50b8dec3ac
|
fix/refac: hybrid search
|
2025-03-30 20:48:22 -07:00 |
|
Timothy Jaeryang Baek
|
ce0d82b55f
|
Merge pull request #12132 from Phlogi/dev-fetch-documents-once
Avoid multiple data fetching
|
2025-03-30 20:44:32 -07:00 |
|
Junaid Pinjari
|
e782e7d3a7
|
Fix: CSV loader encoding issue using autodetect_encoding=True
|
2025-03-29 13:14:53 +05:30 |
|
Phlogi
|
04bf9ddab2
|
Avoid multiple data fetching
|
2025-03-27 19:05:20 +01:00 |
|
Timothy Jaeryang Baek
|
4a79320253
|
chore: format
|
2025-03-27 01:40:28 -07:00 |
|
Timothy Jaeryang Baek
|
7490bc9100
|
Merge branch 'dev' into fix-db-order
|
2025-03-26 20:55:42 -07:00 |
|
Timothy Jaeryang Baek
|
9d834a8e90
|
Merge branch 'dev' into k_reranker
|
2025-03-26 20:50:31 -07:00 |
|
Marko Henning
|
7531b7dcaa
|
Satisfy github format check
|
2025-03-25 19:09:17 +01:00 |
|
Iván Baldo
|
115e46a6a2
|
Fix: Tika 3.1.0.0 sends a lot of blank lines which degrades the RAG results, strip them.
|
2025-03-25 14:53:14 -03:00 |
|
Marko Henning
|
94d9d3d590
|
Fix: Normalize all database distances to score in [0, 1]
|
2025-03-25 16:46:14 +01:00 |
|
Timothy Jaeryang Baek
|
38d524f6a0
|
chore: format
|
2025-03-24 11:35:32 -07:00 |
|
Jonathan Flower
|
bdd236fa3a
|
improved error handling for deleting collections that do not exist in chromadb
|
2025-03-22 09:59:06 -04:00 |
|
Timothy Jaeryang Baek
|
8aa6dade41
|
Merge pull request #11876 from mahenning/fix--rag-sorting
Fix: wrong citation order for chromadb, wrong order for hybrid search
|
2025-03-20 17:54:22 -07:00 |
|
Timothy Jaeryang Baek
|
9b20ef4922
|
refac
|
2025-03-20 14:01:47 -07:00 |
|
genjuro
|
07098c6352
|
perf: set shorter timeout for playwright and make it configurable
|
2025-03-20 15:28:09 +08:00 |
|
Marko Henning
|
5f48af5b91
|
Revert the ordering change with chromadb, not necessary with reranker results
|
2025-03-19 17:04:45 +01:00 |
|
Marko Henning
|
ec8fc727b8
|
Fix wrong order for chromadb
|
2025-03-19 16:06:10 +01:00 |
|
leilibj
|
3e8546135d
|
fix: correct incorrect usage of log.exception method
|
2025-03-19 13:04:34 +08:00 |
|
Marko Henning
|
5ab789e83e
|
Add documentation on chroma special case
|
2025-03-18 16:44:58 +01:00 |
|
Marko Henning
|
ba676b7ed6
|
Use k_reranker also for result merge, and add special sorting use case for ChromaDB
|
2025-03-18 16:25:24 +01:00 |
|
Marko Henning
|
f13948d805
|
Fixed typo
|
2025-03-18 12:14:59 +01:00 |
|
Marko Henning
|
c877b59cbc
|
Address edge case with k < k_reranker, sort results for cutting off
|
2025-03-18 11:31:17 +01:00 |
|
orenzhang
|
c761e4fd08
|
feat(trace): opentelemetry instrument
|
2025-03-10 22:27:31 +08:00 |
|
Fabio Polito
|
9d6743824e
|
fix: fix params DoclingLoader
|
2025-03-09 16:12:14 +00:00 |
|
Fabio Polito
|
0aa42615f9
|
Merge remote-tracking branch 'upstream/dev' into docling_context_extraction_engine
merge upstream
|
2025-03-08 18:52:51 +00:00 |
|
Timothy Jaeryang Baek
|
22b88f9593
|
Merge pull request #11324 from kela4/main
fix: opensearch vector db query structures, result mapping, filters, bulk query actions, knn_vector usage
|
2025-03-08 12:19:38 -04:00 |
|
Luke
|
7917128ed3
|
enh: enable configuration for tavily extract depth
|
2025-03-08 00:43:02 -05:00 |
|
Fabio Polito
|
e3eef58310
|
feat: merge with dev
|
2025-03-07 00:22:47 +00:00 |
|
Luke
|
987954c817
|
feat: Add Tavily extract web loader integration
|
2025-03-06 18:15:18 -05:00 |
|
Katharina
|
6cb0c0339a
|
fix: opensearch vector db query structures, result mapping, filters, bulk query actions, knn_vector usage
|
2025-03-06 23:49:54 +01:00 |
|
Fabio Polito
|
98857184ff
|
Merge remote-tracking branch 'upstream/dev' into docling_context_extraction_engine
merge with dev branch
|
2025-03-06 12:12:50 +00:00 |
|
Marko Henning
|
41a4cf7106
|
Added new k_reranker parameter
|
2025-03-06 10:47:57 +01:00 |
|
Timothy Jaeryang Baek
|
d4fca9dabf
|
chore: format
|
2025-03-05 19:17:41 -08:00 |
|
Fabio Polito
|
0716f96da8
|
style: change style in DoclingLoader
|
2025-03-05 23:15:55 +00:00 |
|
Fabio Polito
|
9aa407dbd2
|
feat: merge with main
|
2025-03-05 22:04:34 +00:00 |
|
ofek
|
a8f205213c
|
fixed es bugs
|
2025-03-05 23:19:56 +02:00 |
|
Fabio Polito
|
a44b35e99e
|
fix: fix DoclingLoader input params
|
2025-03-05 17:53:45 +00:00 |
|
Timothy Jaeryang Baek
|
7b442e4be0
|
Merge pull request #11141 from Youggls/dev
fix: correct parameter name for MilvusClient instantiation
|
2025-03-04 00:54:49 -08:00 |
|
Timothy Jaeryang Baek
|
39ea59edc8
|
chore: format
|
2025-03-04 00:32:27 -08:00 |
|
Perry Li
|
67ed61d022
|
fixbug: correct parameter name for MilvusClient instantiation
Replace incorrect parameter 'database=MILVUS_DB' with valid 'db_name=MILVUS_DB'
|
2025-03-04 16:02:19 +08:00 |
|
ofek
|
737dfd2763
|
added elasticsearch support
|
2025-03-03 23:39:42 +02:00 |
|
Timothy Jaeryang Baek
|
6471f12668
|
Merge pull request #11033 from dtaivpp/main
fix: Changed to use collection_name and fixed bulk indexing missing index.
|
2025-03-01 16:00:13 -08:00 |
|
David Tippett
|
f3c4c2b8e3
|
Changed to use collection name and fixed bulk indexing missing index.
|
2025-03-01 13:26:19 -05:00 |
|
Timothy Jaeryang Baek
|
d0ddb0637e
|
enh: web embed bypass embedding and retrieval support
|
2025-02-27 16:34:05 -08:00 |
|
Timothy Jaeryang Baek
|
1b56a8f3cb
|
Merge pull request #10864 from kurtdami/perplexity_integration
feat: add perplexity integration to web search
|
2025-02-27 13:51:03 -08:00 |
|
kurtdami
|
b061775932
|
feat: add perplexity integration to web search
|
2025-02-27 00:30:48 -08:00 |
|
Timothy Jaeryang Baek
|
ce7cf62a55
|
refac: dedup
|
2025-02-26 23:51:39 -08:00 |
|
Timothy Jaeryang Baek
|
ddb30589e3
|
chore: format
HIDE MODELS
|
2025-02-26 22:18:18 -08:00 |
|
Timothy Jaeryang Baek
|
57010901e6
|
enh: bypass embedding and retrieval
|
2025-02-26 15:42:19 -08:00 |
|
Timothy Jaeryang Baek
|
34aeaaf020
|
refac
|
2025-02-26 13:54:26 -08:00 |
|
Timothy Jaeryang Baek
|
46ac6f2b29
|
fix
|
2025-02-26 12:53:07 -08:00 |
|
Timothy Jaeryang Baek
|
33d3558ca9
|
Merge pull request #10817 from NovoNordisk-OpenSource/ivaroli/adding-json-as-supported-file-type
fix: Using the TextLoader instead of Tika for JSON files
|
2025-02-26 12:49:29 -08:00 |
|
Ívar Óli Sigurðsson
|
c5a09cdd21
|
adding a comma
|
2025-02-26 15:27:03 +01:00 |
|
Ívar Óli Sigurðsson
|
661711164a
|
Adding json as a known source for Tika
|
2025-02-26 15:11:21 +01:00 |
|
Timothy Jaeryang Baek
|
3be5e3129b
|
Merge pull request #10752 from NovoNordisk-OpenSource/yvedeng/standardize-logging
refactor: replace print statements with logging
|
2025-02-25 10:53:02 -08:00 |
|
Yifang Deng
|
0e5d5ecb81
|
refactor: replace print statements with logging for better error tracking
|
2025-02-25 15:53:55 +01:00 |
|
Timothy Jaeryang Baek
|
ab1b910d80
|
Merge pull request #10486 from Micca/feature/document_intelligence_support
Feat: Adding Support for Azure AI Document Intelligence for Content Extraction (Revised)
|
2025-02-21 10:56:18 -08:00 |
|
Timothy Jaeryang Baek
|
93d486d50e
|
revert: faulty dedup code
|
2025-02-20 11:02:45 -08:00 |
|
Timothy Jaeryang Baek
|
eeb00a5ca2
|
chore: format
|
2025-02-20 01:01:29 -08:00 |
|
Youggls
|
0fb3c08181
|
feat: Add Firecrawl web loader integration
|
2025-02-19 16:54:44 +08:00 |
|
Timothy Jaeryang Baek
|
c073b8b4ee
|
refac
|
2025-02-18 23:49:27 -08:00 |
|
Timothy Jaeryang Baek
|
5465cabd40
|
refac
|
2025-02-18 21:17:09 -08:00 |
|
Timothy Jaeryang Baek
|
81715f6553
|
enh: RAG full context mode
|
2025-02-18 21:14:58 -08:00 |
|
Timothy Jaeryang Baek
|
1bbecd46c8
|
Merge pull request #10052 from roryeckel/playwright
Support Playwright RAG Web Loader: Revised
|
2025-02-18 19:57:48 -08:00 |
|
Timothy Jaeryang Baek
|
4ef7aff663
|
refac
|
2025-02-18 19:35:22 -08:00 |
|
mikhail-khludnev
|
925bfe840b
|
dedupe results from multiple queries
|
2025-02-18 20:10:57 +03:00 |
|
Rory
|
10e0c81de9
|
Merge remote-tracking branch 'upstream/dev' into playwright
# Conflicts:
# backend/open_webui/retrieval/web/utils.py
# backend/open_webui/routers/retrieval.py
|
2025-02-17 21:53:39 -06:00 |
|
Rory
|
bc82f48ebf
|
refac: RAG_WEB_LOADER -> RAG_WEB_LOADER_ENGINE
|
2025-02-17 21:43:32 -06:00 |
|
Timothy Jaeryang Baek
|
ba6cde8a87
|
fix: include_domain does NOT exist
|
2025-02-17 19:20:49 -08:00 |
|
Timothy Jaeryang Baek
|
dbe5d1ca08
|
refac
|
2025-02-17 18:16:23 -08:00 |
|
Timothy Jaeryang Baek
|
ca0b7217d2
|
enh: full context web search
|
2025-02-17 18:14:26 -08:00 |
|
Rory
|
66c2acc08d
|
Merge branch 'dev' into playwright
|
2025-02-15 22:14:16 -06:00 |
|
Timothy Jaeryang Baek
|
b0ad5cd863
|
Merge pull request #10076 from crizCraig/local_date
fix: return local date from `getFormattedDate`
|
2025-02-15 20:10:56 -08:00 |
|
Timothy Jaeryang Baek
|
3d0c06ccee
|
refac: duckduckgo
|
2025-02-15 16:45:56 -08:00 |
|
Craig Quiter
|
e67eb89e05
|
style: black format
|
2025-02-15 10:53:16 -08:00 |
|
Rory
|
8e9b00a017
|
Fix docstring
|
2025-02-14 22:48:15 -06:00 |
|
Rory
|
aa2b764d74
|
Finalize incomplete merge to update playwright branch
Introduced feature parity for trust_env
|
2025-02-14 22:32:45 -06:00 |
|
Rory
|
4da220c513
|
Merge remote-tracking branch 'upstream/dev' into playwright
# Conflicts:
# backend/open_webui/config.py
# backend/open_webui/main.py
# backend/open_webui/retrieval/web/utils.py
# backend/open_webui/routers/retrieval.py
# backend/open_webui/utils/middleware.py
# pyproject.toml
|
2025-02-14 20:48:22 -06:00 |
|
Guofeng Yi
|
b38acc8559
|
Merge branch 'dev' into feate-webloader-support-proxy
|
2025-02-15 09:50:02 +08:00 |
|
Timothy Jaeryang Baek
|
3e543691a4
|
Merge pull request #9988 from Yimi81/feat-support-async-load
feat: websearch support async docs load
|
2025-02-14 14:10:46 -08:00 |
|
LiuC0j
|
5ca39eb9fd
|
Update tavily.py
|
2025-02-14 14:56:01 +01:00 |
|
Fabio Polito
|
2419ef06a0
|
feat: docling support for document preprocessing
|
2025-02-14 12:08:03 +00:00 |
|
Yimi81
|
d3f71930f0
|
web loader support proxy
|
2025-02-14 07:15:09 +00:00 |
|
Yimi81
|
ceef600223
|
support async load for websearch
|
2025-02-14 07:05:10 +00:00 |
|
xring
|
27d395ba06
|
feat: add web search via SerpApi
|
2025-02-14 12:24:58 +08:00 |
|
Timothy Jaeryang Baek
|
5626426c31
|
chore: format
|
2025-02-12 23:28:57 -08:00 |
|
Rory
|
40d4db97e6
|
Merge remote-tracking branch 'upstream/dev' into playwright
|
2025-02-12 22:32:44 -06:00 |
|
Timothy Jaeryang Baek
|
a5bba20915
|
Merge pull request #9837 from silverriver/patch-1
feat Make Google PSE search return more than 10 google search results
|
2025-02-11 21:36:53 -08:00 |
|
Silver
|
7e08373ae5
|
Update google_pse.py to return results more than 10
|
2025-02-12 13:01:09 +08:00 |
|
Timothy Jaeryang Baek
|
8906a2e260
|
Merge pull request #9803 from BochaAI/main
add Bocha
|
2025-02-11 21:01:04 -08:00 |
|
luckyman-yan
|
31360fe991
|
add Bocha
|
2025-02-10 16:44:47 +08:00 |
|
Timothy Jaeryang Baek
|
60095598ec
|
chore: format
|
2025-02-09 22:20:47 -08:00 |
|
Rory
|
2c711d8365
|
Merge remote-tracking branch 'upstream/dev' into playwright
# Conflicts:
# backend/requirements.txt
|
2025-02-09 23:52:21 -06:00 |
|
Timothy Jaeryang Baek
|
d5a815b19c
|
Merge pull request #9693 from vinsdragonis/main
fix: Fixed error occurring when using OpenSearch as a vector db
|
2025-02-09 13:06:19 -08:00 |
|
Mazurek Michal
|
35f3824932
|
feat: Implement Document Intelligence as Content Extraction Engine
|
2025-02-07 13:44:47 +01:00 |
|