Commit Graph

227 Commits

Author SHA1 Message Date
Que Nguyen
305ec59d76
Set searxng language as 'auto' and enable safesearch (moderate).
Configure searxng with language param set to auto and add "safesearch": 1 (moderate) for safer web results.
2024-06-12 21:33:33 +07:00
Timothy J. Baek
c0ca447041 chore: format 2024-06-12 01:37:53 -07:00
Timothy Jaeryang Baek
5d3db15eca
Merge pull request #3049 from que-nguyen/dev
Refactor URL validation function
2024-06-12 01:36:34 -07:00
Timothy J. Baek
e8fc522eba chore: format 2024-06-12 00:18:22 -07:00
Que Nguyen
eb7bba81fe
Refactor URL validation function
- The check for private IP addresses often did not yield the expected results, especially with errors like: `[Errno -2] Name or service not known`.
- Removed the check for private IP addresses in the URL validation process.
- Simplified the `validate_url` function to focus on validating the URL format and checking the existence of the URL using a HEAD request.
2024-06-12 08:15:04 +07:00
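A minimal sketch of the approach described in eb7bba81fe: check the URL format first, then confirm the URL exists with a HEAD request. This is not the project's code. The name `validate_url` comes from the commit message, while the helper `has_valid_format`, the `timeout` parameter, and the use of the standard library instead of a third-party HTTP client are assumptions made for illustration.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse


def has_valid_format(url: str) -> bool:
    """Return True if the URL has an http(s) scheme and a host part."""
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse raises ValueError on some malformed inputs (e.g. bad IPv6).
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def validate_url(url: str, timeout: float = 5.0) -> bool:
    """Hypothetical signature: format check, then a HEAD request."""
    if not has_valid_format(url):
        return False
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False
```

A malformed URL is rejected before any network traffic is attempted, so `validate_url("not a url")` returns `False` without a request being sent.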
Timothy Jaeryang Baek
d709038b5b
Merge pull request #3029 from Yash-1511/main
feat: add DuckDuckGo search functionality using duckduckgo_search library
2024-06-11 09:53:26 -07:00
Que Nguyen
3bec60b80c
Fixed the issue where a single URL error disrupts the data loading process in Web Search mode
To address the unresolved issue in the LangChain library where a single URL error disrupts the data loading process, the lazy_load method in the WebBaseLoader class has been modified. The enhanced method now handles exceptions appropriately, logging errors and continuing with the remaining URLs.
2024-06-11 22:06:14 +07:00
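The behaviour described in 3bec60b80c (log the failure, then continue with the remaining URLs) can be sketched without LangChain as a plain generator. The function name `lazy_load_tolerant` and the `fetch` callable are hypothetical stand-ins, not the actual `WebBaseLoader.lazy_load` override.

```python
import logging
from typing import Callable, Iterable, Iterator, Tuple

log = logging.getLogger(__name__)


def lazy_load_tolerant(
    urls: Iterable[str], fetch: Callable[[str], str]
) -> Iterator[Tuple[str, str]]:
    """Yield (url, content) pairs, skipping any URL whose fetch raises."""
    for url in urls:
        try:
            content = fetch(url)
        except Exception as exc:
            # One bad URL must not abort the whole batch: log it and move on.
            log.error("Error loading %s: %s", url, exc)
            continue
        yield url, content
```

Because the `try` wraps only the per-URL fetch, an exception affects a single iteration and the generator keeps producing results for the URLs that follow.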
Yash-1511
83f9475584 feat: add DuckDuckGo search functionality using duckduckgo_search library 2024-06-11 19:49:08 +05:30
Timothy J. Baek
bd5a8567ef refac: tools & rag 2024-06-11 01:10:24 -07:00
Timothy J. Baek
644f0fe6c3 chore: version bump 2024-06-10 13:52:35 -07:00
teampen
14d33f0fcc Merge branch 'add-serply' into dev 2024-06-09 21:40:50 -04:00
teampen
efb4a710c8 adding Serply as an alternative web search 2024-06-09 20:44:34 -04:00
Timothy J. Baek
f2b9a5f5bf refac: rag 2024-06-09 03:01:25 -07:00
mindspawn
6f9148ac4c
Update main.py 2024-06-07 21:41:30 -07:00
mindspawn
4ecc1c06d3
Update main.py 2024-06-07 21:18:04 -07:00
Timothy J. Baek
0495f01acb feat: reset upload dir 2024-06-03 21:45:36 -07:00
Timothy J. Baek
61867c1545 Update searxng.py 2024-06-03 17:02:50 -07:00
Timothy J. Baek
4068a421bf fix 2024-06-03 17:00:35 -07:00
Timothy Jaeryang Baek
768941bded
Merge pull request #2785 from cheahjs/feat/openai-embeddings-batch
feat: add RAG_EMBEDDING_OPENAI_BATCH_SIZE to batch multiple embeddings
2024-06-03 13:50:14 -07:00
Jun Siang Cheah
7fefbb316d fix: add backwards compat with older searxng urls 2024-06-03 21:13:10 +01:00
Timothy Jaeryang Baek
92d9b38110
Merge branch 'dev' into feat/openai-embeddings-batch 2024-06-03 12:39:09 -07:00
Lyuboslav Petrov
08443b3c55
Revert log level to debug 2024-06-03 12:48:40 +03:00
Lyuboslav Petrov
7e761a69a7
FIX searxng URL construction using params for arg passing
Accept additional parameters such as language, time_range, and categories to tailor the search results.
Raise an exception if a request error occurs during the search process.
Use params argument to construct the query string
Sort by relevance
Expand docstring
2024-06-03 12:44:46 +03:00
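A rough sketch of the idea in 7e761a69a7: pass the search options as a parameter mapping and let the library encode the query string, then order results by relevance. The function names, the default `language` value (taken from the later 305ec59d76 commit), and the assumption that each result carries a numeric `score` field are illustrative, not the project's implementation.

```python
from typing import Dict, List, Optional, Sequence
from urllib.parse import urlencode


def build_searxng_url(
    base_url: str,
    query: str,
    language: str = "auto",
    time_range: str = "",
    categories: Optional[Sequence[str]] = None,
) -> str:
    """Build a SearXNG search URL from a params mapping (hypothetical helper)."""
    params: Dict[str, str] = {"q": query, "format": "json", "language": language}
    if time_range:
        params["time_range"] = time_range
    if categories:
        params["categories"] = ",".join(categories)
    # urlencode escapes spaces, commas and other reserved characters.
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


def sort_by_relevance(results: List[dict]) -> List[dict]:
    """Highest score first; results without a score sort last (assumed field)."""
    return sorted(results, key=lambda r: r.get("score", 0.0), reverse=True)
```

Building the query string from a mapping avoids hand-concatenated URLs, where an unescaped space or ampersand in the query would silently change the request.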
Jun Siang Cheah
0cb8163321 feat: add RAG_EMBEDDING_OPENAI_BATCH_SIZE to batch multiple embeddings 2024-06-02 15:34:31 +01:00
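The batching that `RAG_EMBEDDING_OPENAI_BATCH_SIZE` controls can be illustrated with a small chunking helper. This is a sketch under the assumption that texts are sent to the embeddings endpoint one batch per request; `batched` is a hypothetical name and the request itself is omitted.

```python
from typing import Iterator, List, Sequence


def batched(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Split texts into consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])
```

With a batch size of 2, five texts produce three requests: two full batches and one batch holding the remaining text.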
Timothy J. Baek
a53796270f refac: web search config 2024-06-01 20:08:08 -07:00
Timothy J. Baek
fbdfb7e4fa refac: web search 2024-06-01 19:57:00 -07:00
Timothy J. Baek
999d2bc21b refac: web search 2024-06-01 19:52:12 -07:00
Timothy J. Baek
912a704fdc refac: web search settings 2024-06-01 19:40:48 -07:00
Timothy J. Baek
ea6b8984ab refac: web search 2024-06-01 19:03:56 -07:00
Timothy J. Baek
74a8deb19f refac 2024-05-27 14:25:36 -07:00
Timothy J. Baek
4685f523b6 refac 2024-05-27 12:48:08 -07:00
Jun Siang Cheah
276b7b90b8 Merge remote-tracking branch 'upstream/dev' into feat/backend-web-search 2024-05-26 11:31:23 +01:00
Timothy J. Baek
84bfebd05e fix 2024-05-26 01:17:57 -07:00
Jun Siang Cheah
224a578e6b Merge remote-tracking branch 'upstream/dev' into feat/backend-web-search 2024-05-20 19:53:23 +01:00
Jun Siang Cheah
eb509c460a Merge remote-tracking branch 'origin/dev' into feat/backend-web-search 2024-05-20 18:01:29 +01:00
Timothy J. Baek
322db31dc9 fix: rag 2024-05-20 07:22:43 -07:00
Timothy J. Baek
5376525777 refac 2024-05-19 06:51:32 -07:00
Timothy J. Baek
400bfa5a02 fix: rag config.json 2024-05-17 19:53:38 -07:00
Jun Siang Cheah
5e1c408937 Merge branch 'dev' into feat/backend-web-search 2024-05-14 14:03:23 +08:00
Jun Siang Cheah
9ed1a31575 fix: continue with failures when bulk loading urls with WebBaseLoader 2024-05-12 15:19:07 +08:00
Jun Siang Cheah
77928ae141 Merge branch 'dev' of https://github.com/open-webui/open-webui into feat/web-search-toggle 2024-05-11 23:51:37 +08:00
Jun Siang Cheah
2660a6e5b8 feat: prototype frontend web search integration 2024-05-11 23:44:34 +08:00
Jun Siang Cheah
fb8069123e feat: add WEB_SEARCH_RESULT_COUNT to control max number of results 2024-05-11 23:18:59 +08:00
Jun Siang Cheah
298e6848b3 feat: switch to config proxy, remove config_get/set 2024-05-10 15:03:24 +08:00
Jun Siang Cheah
058eb76568 feat: save UI config changes to config.json 2024-05-10 13:51:50 +08:00
Timothy J. Baek
06cbe337de feat: youtube loader language env var 2024-05-08 10:51:29 -07:00
Timothy J. Baek
d3822f782c feat: non-english youtube support 2024-05-08 10:47:05 -07:00
Aarni Koskela
61bb1f1dc8 fix: do not use hardware ID in document ID generation 2024-05-07 11:42:05 +03:00
Timothy Jaeryang Baek
635951b55c
Merge branch 'dev' into feat/backend-web-search 2024-05-06 16:26:44 -07:00
Timothy J. Baek
64ed0d1089 refac: include source name to citation 2024-05-06 16:16:26 -07:00
Timothy J. Baek
4c490132ba refac: styling 2024-05-06 16:16:26 -07:00
Jun Siang Cheah
0872bea790 feat: show RAG query results as citations 2024-05-06 16:14:10 -07:00
Timothy J. Baek
cecb87b8c2 feat: web_loader_ssl_verification setting 2024-05-06 14:50:55 -07:00
Timothy J. Baek
95f579cabe feat: rag ssl verification env var
Co-Authored-By: Tobias Steidle <tobias.steidle@softwaredev.de>
2024-05-06 13:12:08 -07:00
Jun Siang Cheah
8b3e370a6e fix: run formatter 2024-05-06 17:11:04 +08:00
Jun Siang Cheah
83f086ccdd fix: do not return raw search exception due to API keys in URLs 2024-05-06 17:09:04 +08:00
Jun Siang Cheah
99e4edd364 feat: add websearch endpoint to RAG API
fix: google PSE endpoint uses GET

fix: google PSE returns link, not url

fix: serper wrong field
2024-05-06 17:09:04 +08:00
Jun Siang Cheah
501ff7a98b feat: backend implementation of various search APIs 2024-05-06 12:28:41 +08:00
tabacoWang
fffd283b0c fix:
fix: Change the type from int to float
2024-05-02 13:45:19 +08:00
Timothy J. Baek
0595c04909 feat: youtube rag 2024-05-01 17:17:00 -07:00
Yanyutin753
c0bb32d768 📌 fixed a bug where RAG would not reply after not reading the file correctly 2024-04-30 13:51:30 +08:00
Timothy Jaeryang Baek
1afc49c1e4
Merge pull request #1862 from cheahjs/feat/filter-local-rag-fetch
feat: add ENABLE_LOCAL_WEB_FETCH to protect against SSRF attacks
2024-04-29 15:51:17 -07:00
Jun Siang Cheah
1c4e63f71e feat: add ENABLE_LOCAL_WEB_FETCH to protect against SSRF attacks 2024-04-29 20:55:17 +01:00
Steven Kreitzer
5b8fd14470 fix: various api rag results 2024-04-29 12:17:36 -05:00
Yanyutin753
b0245a7eff feat added environment variables and sync.yml 2024-04-28 06:54:26 +08:00
Timothy J. Baek
ce9a5d12e0 refac: rag pipeline 2024-04-27 15:38:50 -04:00
Timothy J. Baek
8f1563a7a5 fix: typo 2024-04-27 15:03:49 -04:00
Timothy J. Baek
9be56d68e0 refac: naming convention 2024-04-27 15:02:57 -04:00
Timothy J. Baek
cebf733b9d refac: naming convention 2024-04-26 14:41:39 -04:00
Steven Kreitzer
69822e4c25 fix: sort ranking hybrid 2024-04-26 07:56:41 -05:00
Steven Kreitzer
9755cd5baa feat: toggle hybrid search 2024-04-25 17:51:38 -05:00
Timothy J. Baek
984dbf13ab revert: original rag pipeline 2024-04-25 17:03:00 -04:00
Steven Kreitzer
1c1d2c254d fix: query collection api call 2024-04-25 13:38:18 -05:00
Steven Kreitzer
72090fab88 chore: update log line 2024-04-25 13:28:31 -05:00
Steven Kreitzer
c9c9660459 fix: address comment in pr #1687 2024-04-25 07:50:42 -05:00
Steven Kreitzer
c0259aad67 feat: hybrid search and reranking support 2024-04-24 07:55:10 -05:00
Steven Kreitzer
4e0b32b505 feat: hybrid search 2024-04-22 18:33:43 -05:00
Steven Kreitzer
f3e5700d49 feat: move to native sentence_transformer 2024-04-22 14:20:41 -05:00
Timothy J. Baek
713934edb6 refac 2024-04-20 15:21:52 -05:00
Timothy J. Baek
710850e442 refac: audio 2024-04-20 15:15:59 -05:00
Timothy J. Baek
741ed5dc4c fix 2024-04-14 19:56:33 -04:00
Timothy J. Baek
b1b72441bb feat: openai embeddings integration 2024-04-14 19:48:15 -04:00
Timothy J. Baek
b48e73fa43 feat: openai embeddings support 2024-04-14 19:15:39 -04:00
Timothy J. Baek
36ce157907 fix: integration 2024-04-14 18:47:45 -04:00
Timothy J. Baek
9cdb5bf9fe feat: frontend integration 2024-04-14 18:31:40 -04:00
Timothy J. Baek
2952e61167 feat: external embeddings support 2024-04-14 17:55:00 -04:00
Timothy Jaeryang Baek
b9cadff16b
Merge pull request #1419 from lainedfles/embedding-model-fix-and-manual-update
feat: improve embedding model update & resolve network dependency
2024-04-10 01:10:07 -07:00
Timothy J. Baek
582d11f191 refac: RAG_EMBEDDING_MODEL_PATH removed 2024-04-10 00:59:05 -07:00
Timothy J. Baek
cb2158a794 fix 2024-04-10 00:51:16 -07:00
Timothy J. Baek
abfcceecef refac 2024-04-10 00:46:09 -07:00
Timothy J. Baek
f4b87ecb23 refac 2024-04-10 00:33:45 -07:00
Steven Kreitzer
0bae789d39
fix: support batching chromadb 2024-04-09 10:13:29 -05:00
lainedfles
506a061387
Merge branch 'dev' into embedding-model-fix-and-manual-update 2024-04-08 14:57:54 -06:00
Jannik S
3b3d0cce1e
Merge branch 'dev' into dockerfile-optimisation 2024-04-08 09:15:00 +02:00
Timothy J. Baek
e61e1b079f fix: file upload issue 2024-04-04 17:38:59 -07:00
Self Denial
9f82f5abba Formatting... 2024-04-04 12:09:48 -06:00
Self Denial
075fbedb02 More format fixes 2024-04-04 12:07:42 -06:00
Self Denial
bcf79c8366 Format fixes 2024-04-04 12:02:48 -06:00
Self Denial
3b66aa55c0 Improve embedding model update & resolve network dependency
* Add config variable RAG_EMBEDDING_MODEL_AUTO_UPDATE to control update behavior
* Add RAG utils embedding_model_get_path() function to output the filesystem path in addition to update of the model using huggingface_hub
* Update and utilize existing RAG functions in main: get_embedding_model() & update_embedding_model()
* Add GUI setting to execute manual update process
2024-04-04 11:01:23 -06:00
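One way the path lookup described in 3b66aa55c0 could work is sketched below, assuming the auto-update setting maps onto the `local_files_only` argument of `huggingface_hub.snapshot_download`. The function name comes from the commit message. The signature, the injectable `download` parameter, and the fallback to the bare model name are assumptions for illustration.

```python
from typing import Callable, Optional


def embedding_model_get_path(
    model: str,
    auto_update: bool = False,
    download: Optional[Callable[..., str]] = None,
) -> str:
    """Return a local filesystem path for the model, updating it if allowed."""
    if download is None:
        # Imported lazily so the sketch can be exercised without the package.
        from huggingface_hub import snapshot_download

        download = snapshot_download
    try:
        # With auto_update off, only the local cache is consulted, so no
        # network access is needed at startup.
        return download(repo_id=model, local_files_only=not auto_update)
    except Exception:
        # Assumed fallback: hand the original name back to the caller.
        return model
```

Keeping the network call behind a single flag is what lets the same function serve both the startup path (offline) and the manual update triggered from the GUI.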
Mmx233
947c392f72
fix: manually check the docs' filename 2024-04-03 23:37:13 +08:00