Classic298
00b3583dc2
fix: fix reindex not working due to unnecessary dupe check ( #20857 )
...
* Update retrieval.py
* Update knowledge.py
* Update retrieval.py
* Update knowledge.py
2026-01-21 18:36:08 -05:00
Timothy Jaeryang Baek
ecbdef732b
enh: PDF_LOADER_MODE
2026-01-21 23:51:36 +04:00
Classic298
182d5e8591
fix(db): release connection before embedding in process_files_batch ( #20576 )
...
Remove Depends(get_session) from POST /process/files/batch endpoint to prevent database connections from being held during batch embedding API calls (5-60+ seconds for large batches).
The save_docs_to_vector_db() function makes external embedding API calls. Post-embedding file updates (Files.update_file_by_id) manage their own short-lived sessions internally, releasing connections promptly.
2026-01-11 23:32:56 +04:00
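The idea in the commit above, holding no database connection while the slow embedding call runs, can be illustrated with a minimal sketch. Everything here is hypothetical: `short_session`, `embed`, and `process_batch` are stand-ins invented for illustration, not the project's actual functions.

```python
from contextlib import contextmanager

open_sessions = 0           # stand-in for connections checked out of a pool
seen_during_embedding = []  # pool usage observed while "embedding" runs


@contextmanager
def short_session():
    """Hypothetical short-lived DB session that releases on exit."""
    global open_sessions
    open_sessions += 1
    try:
        yield
    finally:
        open_sessions -= 1


def embed(texts):
    """Stand-in for a slow external embedding API call."""
    seen_during_embedding.append(open_sessions)
    return [[float(len(t))] for t in texts]


def process_batch(texts):
    with short_session():
        pass  # e.g. read file rows, then release the connection
    vectors = embed(texts)  # no session is held during the slow call
    with short_session():
        pass  # e.g. write per-file status updates
    return vectors


vectors = process_batch(["a", "bb"])
```

The point of the pattern is that the number of connections held during the external call is zero, so a long batch cannot exhaust the pool.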
G30
4b4743b497
feat: enforce permissions in backend ( #20471 )
...
* feat: enforce image generation permissions in backend
* feat: enforce web search permissions in backend
* feat: enforce audio (tts/stt) permissions in backend
2026-01-08 02:48:35 +04:00
Timothy Jaeryang Baek
1d08376860
refac
2026-01-05 18:55:44 +04:00
Timothy Jaeryang Baek
d3ab9f4b96
fix: failed hash in files
2026-01-05 18:21:00 +04:00
Classic298
614cb56420
feat: Add configurable DDGS backend selection with UI support ( #20366 )
...
* init
* Update WebSearch.svelte
* reorder
2026-01-05 03:05:56 +04:00
Timothy Jaeryang Baek
dc2c2f2295
refac
2026-01-03 19:48:37 +04:00
Timothy Jaeryang Baek
c324359580
feat: chunk min size target for md header splitter
...
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com >
2026-01-03 19:47:29 +04:00
Timothy Jaeryang Baek
f7f8a263b9
feat: JINA_API_BASE_URL
2026-01-01 02:17:47 +04:00
Timothy Jaeryang Baek
89ad1c68d1
enh: FIRECRAWL_TIMEOUT
2026-01-01 02:07:22 +04:00
Classic298
431632d530
fix: normalize local CrossEncoder reranking scores for relevance threshold ( #20228 )
...
* Update utils.py
* Update retrieval.py
* Update utils.py
* Update retrieval.py
* add env var
* rename to SENTENCE_TRANSFORMERS_CROSS_ENCODER_SIGMOID_ACTIVATION_FUNCTION
2025-12-31 15:48:31 -05:00
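As a rough illustration of why normalization matters for a relevance threshold: raw cross-encoder outputs are unbounded logits, so a threshold such as 0.5 only makes sense after mapping them into (0, 1). The sketch below assumes a plain sigmoid is the mapping, as the environment variable name suggests; `filter_by_relevance` and the sample scores are invented for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a logit into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def filter_by_relevance(scored_docs, threshold):
    """Keep (doc, score) pairs whose normalized score meets the threshold."""
    normalized = [(doc, sigmoid(score)) for doc, score in scored_docs]
    return [(doc, s) for doc, s in normalized if s >= threshold]


# Hypothetical raw reranker logits.
raw = [("a", 4.2), ("b", 0.0), ("c", -3.1)]
kept = filter_by_relevance(raw, threshold=0.5)
```

Without the sigmoid step, a negative logit and a threshold in the 0 to 1 range are not comparable, which is the kind of mismatch the commit subject describes.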
Classic298
201c38a08a
fix: prevent delete_entries_from_collection crash when file is None ( #20274 )
...
Add null check after Files.get_file_by_id() before accessing file.hash. Raises HTTP 404 instead of crashing with AttributeError when file doesn't exist.
2025-12-31 02:31:26 -05:00
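The null check described above can be sketched as follows. This is a self-contained illustration only: `HTTPError`, `FILES`, `get_file_by_id`, and `file_hash_or_404` are hypothetical stand-ins, not the real `Files.get_file_by_id()` or the framework's exception class.

```python
class HTTPError(Exception):
    """Hypothetical stand-in for a web framework's HTTP exception."""

    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


# Hypothetical in-memory file table.
FILES = {"f1": {"hash": "abc123"}}


def get_file_by_id(file_id):
    """Return the file record, or None when it does not exist."""
    return FILES.get(file_id)


def file_hash_or_404(file_id):
    file = get_file_by_id(file_id)
    if file is None:
        # Fail with a clear 404 instead of an AttributeError/TypeError later.
        raise HTTPError(404, "File not found")
    return file["hash"]
```

A caller then sees a well-defined error for a missing file rather than an unhandled crash.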
Classic298
46f867cda6
fix: prevent save_docs_to_vector_db crash on empty result.ids ( #20275 )
...
Add check that result.ids exists and has length > 0 before accessing result.ids[0]. Prevents IndexError when query returns empty results.
2025-12-31 02:31:05 -05:00
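A minimal sketch of the guard described above, assuming only that the query result exposes an `ids` sequence that may be missing or empty. `QueryResult` and `first_existing_id` are hypothetical names used for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class QueryResult:
    """Hypothetical stand-in for a vector DB query result."""
    ids: list = field(default_factory=list)


def first_existing_id(result):
    """Return result.ids[0] only when there is something to index into."""
    if result is not None and result.ids and len(result.ids) > 0:
        return result.ids[0]
    return None
```

Indexing `result.ids[0]` directly would raise `IndexError` on an empty list, which is the crash the commit message describes.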
Timothy Jaeryang Baek
08bf4670ec
refac
2025-12-30 19:38:45 +04:00
Timothy Jaeryang Baek
18a33a079b
refac
2025-12-30 19:33:30 +04:00
Timothy Jaeryang Baek
d3a682759f
enh: ENABLE_MARKDOWN_HEADER_TEXT_SPLITTER
2025-12-30 19:31:59 +04:00
Timothy Jaeryang Baek
b1d0f00d8c
refac/enh: db session sharing
2025-12-29 00:21:18 +04:00
Timothy Jaeryang Baek
c96549eaa7
refac
2025-12-21 18:08:36 +04:00
Classic298
4fd790f7dd
feat: Apply WEB_SEARCH_CONCURRENT_REQUESTS to all search engines using semaphore ( #20070 )
...
* sequential
* zero default
* fix
2025-12-21 07:18:00 -05:00
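Limiting concurrent requests with a semaphore, as the commit subject describes, might look like the sketch below. It uses `asyncio.Semaphore` from the standard library; `run_searches`, the fake `search` coroutine, and the requirement that `limit` be at least 1 are assumptions of this example, not the project's implementation.

```python
import asyncio


async def run_searches(queries, limit):
    """Run all queries, with at most `limit` (>= 1) in flight at once."""
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def search(query):
        nonlocal active, peak
        async with semaphore:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for a search engine request
            active -= 1
            return f"results for {query}"

    results = await asyncio.gather(*(search(q) for q in queries))
    return results, peak


results, peak = asyncio.run(run_searches(["a", "b", "c", "d", "e"], limit=2))
```

Because the limit is enforced in one place, the same wrapper can be applied regardless of which search engine sits behind `search`.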
Classic298
48ccb1e170
fix: consolidate psql cleanup logic and fix web add with cleanup ( #20072 )
...
* sequential
* consolidate logic and fix for web add
* Update WebSearch.svelte
* Update retrieval.py
* Update retrieval.py
* Update WebSearch.svelte
2025-12-21 07:14:29 -05:00
okamototk
37085ed42b
chore: update langchain 1.2.0 ( #19991 )
...
* chore: update langchain 1.2.0
* chore: format
2025-12-20 08:50:44 -05:00
Classic298
2e7c7d635d
fix: prevent ExternalReranker from blocking event loop during RAG queries ( #20049 )
...
* fix: prevent ExternalReranker from blocking event loop during RAG queries (#120 )
Co-authored-by: Tim Baek <tim@openwebui.com >
Co-authored-by: Claude <noreply@anthropic.com >
Fixes #19900
* Merge pull request open-webui#19030 from open-webui/dev (#122 )
Co-authored-by: Tim Baek <tim@openwebui.com >
Co-authored-by: Claude <noreply@anthropic.com >
Fixes #19900
---------
Co-authored-by: Tim Baek <tim@openwebui.com >
Co-authored-by: Claude <noreply@anthropic.com >
2025-12-20 08:43:40 -05:00
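The commit subject does not say how the blocking was removed. One common approach, shown here purely as an assumption, is to move the synchronous call onto a worker thread with `asyncio.to_thread` (Python 3.9+). `blocking_rerank` and the heartbeat task are invented for illustration.

```python
import asyncio
import time


def blocking_rerank(query, docs):
    """Stand-in for a synchronous HTTP call to an external reranker."""
    time.sleep(0.05)
    return sorted(docs, key=len)


async def rerank(query, docs):
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_rerank, query, docs)


async def main():
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.005)
            ticks += 1  # only advances if the loop is not blocked

    hb = asyncio.create_task(heartbeat())
    ranked = await rerank("q", ["ccc", "a", "bb"])
    hb.cancel()
    return ranked, ticks


ranked, ticks = asyncio.run(main())
```

If `blocking_rerank` were called directly inside the coroutine, the heartbeat would not tick at all while the request was in flight.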
Timothy Jaeryang Baek
afaa404fe4
enh: mineru api timeout
2025-12-20 17:39:33 +04:00
Classic298
823b9a6dd9
chore/perf: Remove old SRC level log env vars with no impact ( #20045 )
...
* Update openai.py
* Update env.py
* Merge pull request open-webui#19030 from open-webui/dev (#119 )
Co-authored-by: Tim Baek <tim@openwebui.com >
Co-authored-by: Claude <noreply@anthropic.com >
---------
Co-authored-by: Tim Baek <tim@openwebui.com >
Co-authored-by: Claude <noreply@anthropic.com >
2025-12-20 08:16:14 -05:00
Boris Bocquet
bc681f8258
feat: new environment variable SEARXNG_LANGUAGE, in the persistent config, which you can also edit in the Admin > Web Search panel if you choose SearXNG. It is used in the request to SearXNG as the search language (argument "language"). Before this feature, it was fixed to en-US. The default is now "all". ( #19909 )
2025-12-14 12:38:47 -05:00
Timothy Jaeryang Baek
b02397e460
feat: WEB_LOADER_TIMEOUT
2025-12-08 11:49:27 -05:00
Classic298
1779090bdb
fix: add missing env var parameter pass through for enable async embedding ( #19748 )
...
* Add enable_async parameter to embedding function
* Add enable_async parameter to RAG configuration
2025-12-04 14:59:09 -05:00
Henne
a7e614ca4c
feat: Adds document intelligence model configuration ( #19692 )
...
* Adds document intelligence model configuration
Enables the configuration of the Document Intelligence model to be used by the RAG pipeline.
This allows users to specify the model they want to use for document processing, providing flexibility and control over the extraction process.
* Added title to Document Intelligence Model Config
2025-12-02 14:41:09 -05:00
Timothy Jaeryang Baek
6ce9afd95d
refac
2025-12-02 09:21:03 -05:00
Timothy Jaeryang Baek
4370dee79e
fix: async save docs to vector db
2025-11-25 17:19:33 -05:00
Timothy Jaeryang Baek
8b2015a97b
refac
2025-11-25 16:28:06 -05:00
Timothy Jaeryang Baek
6235243b62
refac
2025-11-25 05:07:53 -05:00
Timothy Jaeryang Baek
488631db98
refac
2025-11-25 02:05:27 -05:00
Timothy Jaeryang Baek
2328dc284e
feat/enh: async embedding processing setting
...
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com >
2025-11-25 01:55:43 -05:00
Timothy Jaeryang Baek
9c19d0abd4
refac/breaking: docling params
2025-11-24 16:01:13 -05:00
Timothy Jaeryang Baek
48d1e67e79
chore: format
2025-11-23 20:15:52 -05:00
Classic298
902c6cfbea
perf: 50x performance improvement for external embeddings ( #19296 )
...
* Update utils.py (#77 )
Co-authored-by: Claude <noreply@anthropic.com >
* refactor: address code review feedback for embedding performance improvements (#92 )
Co-authored-by: Claude <noreply@anthropic.com >
* fix: prevent sentence transformers from blocking async event loop (#95 )
Co-authored-by: Claude <noreply@anthropic.com >
---------
Co-authored-by: Claude <noreply@anthropic.com >
2025-11-22 20:54:59 -05:00
Jacob Leksan
07ef295a77
feat: Adding file metadata to hybrid search ( #19095 )
...
* Added metadata to hybrid search
* And config and env plus refac
* consistency
---------
Co-authored-by: Tim Baek <tim@openwebui.com >
2025-11-18 15:29:07 -05:00
Timothy Jaeryang Baek
42071cb8e8
refac
2025-11-18 15:27:26 -05:00
Sang Lê
64747f7f79
Add Azure Search ( #19104 )
...
Co-authored-by: Tim Baek <tim@openwebui.com >
2025-11-13 19:12:34 -05:00
Classic298
ad17d35ac4
feat: Add custom API endpoint and user info headers for Perplexity Search ( #31 ) ( #19147 )
...
Co-authored-by: Claude <noreply@anthropic.com >
2025-11-12 22:53:54 -05:00
Timothy Jaeryang Baek
413fa27b18
refac
2025-11-09 21:09:59 -05:00
Timothy Jaeryang Baek
a65cc196a5
refac: batch file processing
...
Co-Authored-By: Sihyeon Jang <24850223+sihyeonn@users.noreply.github.com >
2025-11-09 21:06:21 -05:00
Timothy Jaeryang Baek
e69c2cf3f6
refac
2025-11-09 16:12:38 -05:00
Timothy Jaeryang Baek
25c7f101f2
enh: optionally add user headers external websearch
...
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com >
2025-11-09 16:09:29 -05:00
Timothy Jaeryang Baek
e2b9942648
feat: Optionally forward user headers to external document loader
...
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com >
2025-11-06 00:05:46 -05:00
Timothy Jaeryang Baek
415b93c7c3
enh: configurable mistral ocr base url
2025-11-05 23:25:51 -05:00
Timothy Jaeryang Baek
a4fd26b478
enh/fix: google pse referer header
2025-11-04 13:50:07 -05:00
palazski
288b323df8
feat: use MINERU_PARAMS json field for mineru settings
2025-10-15 22:59:59 +03:00