228 Commits

Author SHA1 Message Date
Timothy J. Baek
1163745a03 revert 2024-06-12 11:08:05 -07:00
Que Nguyen
305ec59d76
Set searxng language as 'auto' and enable safesearch (moderate).
Configure searxng with language param set to auto and add "safesearch": 1 (moderate) for safer web results.
2024-06-12 21:33:33 +07:00
Timothy J. Baek
c0ca447041 chore: format 2024-06-12 01:37:53 -07:00
Timothy Jaeryang Baek
5d3db15eca
Merge pull request #3049 from que-nguyen/dev
Refactor URL validation function
2024-06-12 01:36:34 -07:00
Timothy J. Baek
e8fc522eba chore: format 2024-06-12 00:18:22 -07:00
Que Nguyen
eb7bba81fe
Refactor URL validation function
- The check for private IP addresses often did not yield the expected results, especially with errors like: `[Errno -2] Name or service not known`.
- Removed the check for private IP addresses in the URL validation process.
- Simplified the `validate_url` function to focus on validating the URL format and checking the existence of the URL using a HEAD request.
2024-06-12 08:15:04 +07:00
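A minimal sketch of the simplified check described in eb7bba81fe: validate the URL format, then confirm the URL exists with a HEAD request. The helper names `has_valid_format` and `url_exists`, the timeout, and the injectable `exists` parameter are assumptions for illustration, not the project's actual code.

```python
import urllib.request
from urllib.parse import urlparse


def has_valid_format(url: str) -> bool:
    """Return True if the URL has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def url_exists(url: str, timeout: float = 5.0) -> bool:
    """Send a HEAD request and report whether the server answered without an error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def validate_url(url: str, exists=url_exists) -> bool:
    """Hypothetical stand-in for validate_url: format check first, then existence."""
    return has_valid_format(url) and exists(url)
```

Passing `exists` as a parameter keeps the format check testable without network access, for example `validate_url("https://example.com", exists=lambda u: True)`.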
Timothy Jaeryang Baek
d709038b5b
Merge pull request #3029 from Yash-1511/main
feat: add DuckDuckGo search functionality using duckduckgo_search library
2024-06-11 09:53:26 -07:00
Que Nguyen
3bec60b80c
Fixed the issue where a single URL error disrupts the data loading process in Web Search mode
To address the unresolved issue in the LangChain library where a single URL error disrupts the data loading process, the lazy_load method in the WebBaseLoader class has been modified. The enhanced method now handles exceptions appropriately, logging errors and continuing with the remaining URLs.
2024-06-11 22:06:14 +07:00
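The behaviour described in 3bec60b80c (log the error, continue with the remaining URLs) can be sketched as a generator. This is an illustration under assumptions: `load_all` and the `load_one` callable are hypothetical names and do not reproduce the WebBaseLoader interface.

```python
import logging
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
log = logging.getLogger(__name__)


def load_all(urls: Iterable[str], load_one: Callable[[str], T]) -> Iterator[T]:
    """Yield one loaded document per URL, skipping URLs whose load fails."""
    for url in urls:
        try:
            yield load_one(url)
        except Exception as exc:
            # A single failing URL is logged and skipped instead of aborting the run.
            log.error("Error loading %s: %s", url, exc)
```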
Yash-1511
83f9475584 feat: add DuckDuckGo search functionality using duckduckgo_search library 2024-06-11 19:49:08 +05:30
Timothy J. Baek
bd5a8567ef refac: tools & rag 2024-06-11 01:10:24 -07:00
Timothy J. Baek
644f0fe6c3 chore: version bump 2024-06-10 13:52:35 -07:00
teampen
14d33f0fcc Merge branch 'add-serply' into dev 2024-06-09 21:40:50 -04:00
teampen
efb4a710c8 adding Serply as an alternative web search 2024-06-09 20:44:34 -04:00
Timothy J. Baek
f2b9a5f5bf refac: rag 2024-06-09 03:01:25 -07:00
mindspawn
6f9148ac4c
Update main.py 2024-06-07 21:41:30 -07:00
mindspawn
4ecc1c06d3
Update main.py 2024-06-07 21:18:04 -07:00
Timothy J. Baek
0495f01acb feat: reset upload dir 2024-06-03 21:45:36 -07:00
Timothy J. Baek
61867c1545 Update searxng.py 2024-06-03 17:02:50 -07:00
Timothy J. Baek
4068a421bf fix 2024-06-03 17:00:35 -07:00
Timothy Jaeryang Baek
768941bded
Merge pull request #2785 from cheahjs/feat/openai-embeddings-batch
feat: add RAG_EMBEDDING_OPENAI_BATCH_SIZE to batch multiple embeddings
2024-06-03 13:50:14 -07:00
Jun Siang Cheah
7fefbb316d fix: add backwards compat with older searxng urls 2024-06-03 21:13:10 +01:00
Timothy Jaeryang Baek
92d9b38110
Merge branch 'dev' into feat/openai-embeddings-batch 2024-06-03 12:39:09 -07:00
Lyuboslav Petrov
08443b3c55
Revert log level to debug 2024-06-03 12:48:40 +03:00
Lyuboslav Petrov
7e761a69a7
FIX searxng URL construction using params for arg passing
Accept additional parameters such as language, time_range, and categories to tailor the search results.
Raise an exception if a request error occurs during the search process.
Use params argument to construct the query string
Sort by relevance
Expand docstring
2024-06-03 12:44:46 +03:00
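As a rough sketch of what 7e761a69a7 describes, the query string can be built from a params dictionary rather than by string concatenation, and results can be sorted by relevance afterwards. The function names, the default values, and the assumption that each result carries a numeric `score` field are illustrative only.

```python
from typing import Optional
from urllib.parse import urlencode


def build_searxng_url(
    base_url: str,
    query: str,
    language: str = "en-US",
    time_range: str = "",
    categories: Optional[list] = None,
) -> str:
    """Build a search URL from a params dict; empty values are left out."""
    params = {
        "q": query,
        "format": "json",
        "language": language,
        "time_range": time_range,
        "categories": ",".join(categories or ["general"]),
    }
    params = {key: value for key, value in params.items() if value}
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


def sort_by_score(results: list) -> list:
    """Order results from most to least relevant (assumes an optional 'score' key)."""
    return sorted(results, key=lambda item: item.get("score", 0), reverse=True)
```

Letting `urlencode` handle the escaping avoids malformed URLs when the query contains spaces or reserved characters.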
Jun Siang Cheah
0cb8163321 feat: add RAG_EMBEDDING_OPENAI_BATCH_SIZE to batch multiple embeddings 2024-06-02 15:34:31 +01:00
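One way to picture the batching introduced by RAG_EMBEDDING_OPENAI_BATCH_SIZE: split the texts into fixed-size groups and send one embedding request per group. The sketch below is an assumption-laden illustration; `batched`, `embed_all`, the `embed_batch` callable, and the default of 1 are hypothetical and not taken from the repository.

```python
import os
from typing import Callable, Iterator, List, Sequence

# Assumed default of 1 (one text per request) when the variable is unset.
BATCH_SIZE = int(os.environ.get("RAG_EMBEDDING_OPENAI_BATCH_SIZE", "1"))


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = BATCH_SIZE,
) -> List[List[float]]:
    """Embed all texts, calling embed_batch once per batch and keeping input order."""
    embeddings: List[List[float]] = []
    for batch in batched(texts, batch_size):
        embeddings.extend(embed_batch(batch))
    return embeddings
```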
Timothy J. Baek
a53796270f refac: web search config 2024-06-01 20:08:08 -07:00
Timothy J. Baek
fbdfb7e4fa refac: web search 2024-06-01 19:57:00 -07:00
Timothy J. Baek
999d2bc21b refac: web search 2024-06-01 19:52:12 -07:00
Timothy J. Baek
912a704fdc refac: web search settings 2024-06-01 19:40:48 -07:00
Timothy J. Baek
ea6b8984ab refac: web search 2024-06-01 19:03:56 -07:00
Timothy J. Baek
74a8deb19f refac 2024-05-27 14:25:36 -07:00
Timothy J. Baek
4685f523b6 refac 2024-05-27 12:48:08 -07:00
Jun Siang Cheah
276b7b90b8 Merge remote-tracking branch 'upstream/dev' into feat/backend-web-search 2024-05-26 11:31:23 +01:00
Timothy J. Baek
84bfebd05e fix 2024-05-26 01:17:57 -07:00
Jun Siang Cheah
224a578e6b Merge remote-tracking branch 'upstream/dev' into feat/backend-web-search 2024-05-20 19:53:23 +01:00
Jun Siang Cheah
eb509c460a Merge remote-tracking branch 'origin/dev' into feat/backend-web-search 2024-05-20 18:01:29 +01:00
Timothy J. Baek
322db31dc9 fix: rag 2024-05-20 07:22:43 -07:00
Timothy J. Baek
5376525777 refac 2024-05-19 06:51:32 -07:00
Timothy J. Baek
400bfa5a02 fix: rag config.json 2024-05-17 19:53:38 -07:00
Jun Siang Cheah
5e1c408937 Merge branch 'dev' into feat/backend-web-search 2024-05-14 14:03:23 +08:00
Jun Siang Cheah
9ed1a31575 fix: continue with failures when bulk loading urls with WebBaseLoader 2024-05-12 15:19:07 +08:00
Jun Siang Cheah
77928ae141 Merge branch 'dev' of https://github.com/open-webui/open-webui into feat/web-search-toggle 2024-05-11 23:51:37 +08:00
Jun Siang Cheah
2660a6e5b8 feat: prototype frontend web search integration 2024-05-11 23:44:34 +08:00
Jun Siang Cheah
fb8069123e feat: add WEB_SEARCH_RESULT_COUNT to control max number of results 2024-05-11 23:18:59 +08:00
Jun Siang Cheah
298e6848b3 feat: switch to config proxy, remove config_get/set 2024-05-10 15:03:24 +08:00
Jun Siang Cheah
058eb76568 feat: save UI config changes to config.json 2024-05-10 13:51:50 +08:00
Timothy J. Baek
06cbe337de feat: youtube loader language env var 2024-05-08 10:51:29 -07:00
Timothy J. Baek
d3822f782c feat: non-english youtube support 2024-05-08 10:47:05 -07:00
Aarni Koskela
61bb1f1dc8 fix: do not use hardware ID in document ID generation 2024-05-07 11:42:05 +03:00
Timothy Jaeryang Baek
635951b55c
Merge branch 'dev' into feat/backend-web-search 2024-05-06 16:26:44 -07:00
Timothy J. Baek
64ed0d1089 refac: include source name to citation 2024-05-06 16:16:26 -07:00
Timothy J. Baek
4c490132ba refac: styling 2024-05-06 16:16:26 -07:00
Jun Siang Cheah
0872bea790 feat: show RAG query results as citations 2024-05-06 16:14:10 -07:00
Timothy J. Baek
cecb87b8c2 feat: web_loader_ssl_verification setting 2024-05-06 14:50:55 -07:00
Timothy J. Baek
95f579cabe feat: rag ssl verification env var
Co-Authored-By: Tobias Steidle <tobias.steidle@softwaredev.de>
2024-05-06 13:12:08 -07:00
Jun Siang Cheah
8b3e370a6e fix: run formatter 2024-05-06 17:11:04 +08:00
Jun Siang Cheah
83f086ccdd fix: do not return raw search exception due to API keys in URLs 2024-05-06 17:09:04 +08:00
Jun Siang Cheah
99e4edd364 feat: add websearch endpoint to RAG API
fix: google PSE endpoint uses GET
fix: google PSE returns link, not url
fix: serper wrong field
2024-05-06 17:09:04 +08:00
Jun Siang Cheah
501ff7a98b feat: backend implementation of various search APIs 2024-05-06 12:28:41 +08:00
tabacoWang
fffd283b0c fix:
fix: Change the type from int to float
2024-05-02 13:45:19 +08:00
Timothy J. Baek
0595c04909 feat: youtube rag 2024-05-01 17:17:00 -07:00
Yanyutin753
c0bb32d768 📌 fixed a bug where RAG would not reply after not reading the file correctly 2024-04-30 13:51:30 +08:00
Timothy Jaeryang Baek
1afc49c1e4
Merge pull request #1862 from cheahjs/feat/filter-local-rag-fetch
feat: add ENABLE_LOCAL_WEB_FETCH to protect against SSRF attacks
2024-04-29 15:51:17 -07:00
Jun Siang Cheah
1c4e63f71e feat: add ENABLE_LOCAL_WEB_FETCH to protect against SSRF attacks 2024-04-29 20:55:17 +01:00
Steven Kreitzer
5b8fd14470 fix: various api rag results 2024-04-29 12:17:36 -05:00
Yanyutin753
b0245a7eff feat added environment variables and sync.yml 2024-04-28 06:54:26 +08:00
Timothy J. Baek
ce9a5d12e0 refac: rag pipeline 2024-04-27 15:38:50 -04:00
Timothy J. Baek
8f1563a7a5 fix: typo 2024-04-27 15:03:49 -04:00
Timothy J. Baek
9be56d68e0 refac: naming convention 2024-04-27 15:02:57 -04:00
Timothy J. Baek
cebf733b9d refac: naming convention 2024-04-26 14:41:39 -04:00
Steven Kreitzer
69822e4c25 fix: sort ranking hybrid 2024-04-26 07:56:41 -05:00
Steven Kreitzer
9755cd5baa feat: toggle hybrid search 2024-04-25 17:51:38 -05:00
Timothy J. Baek
984dbf13ab revert: original rag pipeline 2024-04-25 17:03:00 -04:00
Steven Kreitzer
1c1d2c254d fix: query collection api call 2024-04-25 13:38:18 -05:00
Steven Kreitzer
72090fab88 chore: update log line 2024-04-25 13:28:31 -05:00
Steven Kreitzer
c9c9660459 fix: address comment in pr #1687 2024-04-25 07:50:42 -05:00
Steven Kreitzer
c0259aad67 feat: hybrid search and reranking support 2024-04-24 07:55:10 -05:00
Steven Kreitzer
4e0b32b505 feat: hybrid search 2024-04-22 18:33:43 -05:00
Steven Kreitzer
f3e5700d49 feat: move to native sentence_transformer 2024-04-22 14:20:41 -05:00
Timothy J. Baek
713934edb6 refac 2024-04-20 15:21:52 -05:00
Timothy J. Baek
710850e442 refac: audio 2024-04-20 15:15:59 -05:00
Timothy J. Baek
741ed5dc4c fix 2024-04-14 19:56:33 -04:00
Timothy J. Baek
b1b72441bb feat: openai embeddings integration 2024-04-14 19:48:15 -04:00
Timothy J. Baek
b48e73fa43 feat: openai embeddings support 2024-04-14 19:15:39 -04:00
Timothy J. Baek
36ce157907 fix: integration 2024-04-14 18:47:45 -04:00
Timothy J. Baek
9cdb5bf9fe feat: frontend integration 2024-04-14 18:31:40 -04:00
Timothy J. Baek
2952e61167 feat: external embeddings support 2024-04-14 17:55:00 -04:00
Timothy Jaeryang Baek
b9cadff16b
Merge pull request #1419 from lainedfles/embedding-model-fix-and-manual-update
feat: improve embedding model update & resolve network dependency
2024-04-10 01:10:07 -07:00
Timothy J. Baek
582d11f191 refac: RAG_EMBEDDING_MODEL_PATH removed 2024-04-10 00:59:05 -07:00
Timothy J. Baek
cb2158a794 fix 2024-04-10 00:51:16 -07:00
Timothy J. Baek
abfcceecef refac 2024-04-10 00:46:09 -07:00
Timothy J. Baek
f4b87ecb23 refac 2024-04-10 00:33:45 -07:00
Steven Kreitzer
0bae789d39
fix: support batching chromadb 2024-04-09 10:13:29 -05:00
lainedfles
506a061387
Merge branch 'dev' into embedding-model-fix-and-manual-update 2024-04-08 14:57:54 -06:00
Jannik S
3b3d0cce1e
Merge branch 'dev' into dockerfile-optimisation 2024-04-08 09:15:00 +02:00
Timothy J. Baek
e61e1b079f fix: file upload issue 2024-04-04 17:38:59 -07:00
Self Denial
9f82f5abba Formatting... 2024-04-04 12:09:48 -06:00
Self Denial
075fbedb02 More format fixes 2024-04-04 12:07:42 -06:00
Self Denial
bcf79c8366 Format fixes 2024-04-04 12:02:48 -06:00
Self Denial
3b66aa55c0 Improve embedding model update & resolve network dependency
* Add config variable RAG_EMBEDDING_MODEL_AUTO_UPDATE to control update behavior
* Add RAG utils embedding_model_get_path() function to output the filesystem path in addition to update of the model using huggingface_hub
* Update and utilize existing RAG functions in main: get_embedding_model() & update_embedding_model()
* Add GUI setting to execute manual update process
2024-04-04 11:01:23 -06:00