Commit Graph

183 Commits

Author SHA1 Message Date
Pascal Lim
c386d0b1a5 sort and fix backend imports 2024-08-30 22:26:22 +02:00
Rajendra Kadam
7e1923fcfe Add searchapi as an alternative web search
Add config changes for SearchApi api key and engine

Add searchapi results json in testdata
2024-08-28 15:26:33 +05:30
Timothy Jaeryang Baek
9dade91ef5
Merge pull request #4917 from Yanyutin753/upload_files_limit
🤖 Limit the size and number of uploaded files
2024-08-27 17:09:52 +02:00
Timothy J. Baek
6a21a77ee9 refac 2024-08-27 17:05:24 +02:00
Timothy J. Baek
69c4687a53 refac 2024-08-27 15:53:29 +02:00
Timothy J. Baek
09cba5b87a refac: rm sub standard code 2024-08-27 15:51:40 +02:00
Timothy J. Baek
600409682e refac: do not change default behaviour 2024-08-27 15:30:57 +02:00
Timothy J. Baek
062649e483 refac: endpoints regarding db operations 2024-08-27 14:01:00 +02:00
Clivia
775478534a 👀 Fix Common users cannot upload files
💄Fix format

💄Fix format i18

 Feat paste upload files and make restrictions

 Feat paste upload files and make restrictions
2024-08-26 23:36:13 +08:00
Clivia
b6da4baa97 💄 Limit the size and number of uploaded files
💄 Limit the size and number of uploaded files
2024-08-26 23:36:13 +08:00
Craig Quiter
d2f10d50bf Allow seting CORS origin 2024-08-18 14:19:29 -07:00
Michael Poluektov
29f904db45 remove List imports 2024-08-14 13:46:31 +01:00
Michael Poluektov
038fc48ac0 replace == None with is None 2024-08-14 13:39:53 +01:00
Alexandre GODARD
7a8f8960c5
Update main.py
Fix typo in update_reranking_model
2024-08-13 17:51:25 +02:00
José Luis Di Biase
23c9122458 chore RAG: adding languages known extension for erlang, elixir, haskell and jsx/tsx
Signed-off-by: José Luis Di Biase <josx@interorganic.com.ar>
2024-07-18 17:48:39 -03:00
Timothy J. Baek
dbc352f01b refac: documents file handling 2024-07-15 13:05:38 +02:00
Timothy Jaeryang Baek
7e6c5193d6
Merge pull request #3688 from leobenkel/no-trace-when-success
fix: Remove the tracestack when the collection already exists
2024-07-07 09:00:23 -07:00
Leo Benkel
a73a9c7310 Remove the tracestack when the collection already exists 2024-07-06 23:20:41 +02:00
Timothy J. Baek
a392865615 refac 2024-07-01 17:11:09 -07:00
Timothy Jaeryang Baek
3c1ea24374
Merge pull request #3582 from nickovs/tika-document-text
feat: Support Tika for document text extraction
2024-07-01 17:07:40 -07:00
Nicko van Someren
7aa35a3757 Added HTML and Typescript UI components to support configration of text extraction engine.
Updated RAG /config and /config/update endpoints to support UI updates.

Fixed .dockerignore to prevent Python venv from being copied into Docker image.
2024-07-01 12:10:59 -06:00
Jun Siang Cheah
a48ac6a209 refac: lazily load sentence_transformers to reduce start up memory usage 2024-07-01 08:13:56 +08:00
Nicko van Someren
9cf622d981 Added support for using Apache Tika as a document loader.
Added persistent configuration options to configure use and location of Tika service.

Updated backend.apps.rag.main:get_loader() to make use of Tika document loader.
2024-06-30 15:49:15 -06:00
Timothy J. Baek
3f5f410453 refac 2024-06-27 11:29:59 -07:00
Timothy Jaeryang Baek
fd96c9c68d
Merge pull request #3380 from Yash-1511/main
feat: add jina_search as new websearch provider
2024-06-22 15:19:38 -07:00
Yash-1511
7c9fb9199e feat: add jina_search as new websearch provider 2024-06-22 20:06:15 +05:30
Timothy J. Baek
83986620ee refac 2024-06-18 14:15:08 -07:00
Timothy J. Baek
9e7b7a895e refac: file upload 2024-06-18 13:50:18 -07:00
Timothy J. Baek
b1d83fc42c chore: format 2024-06-17 14:32:23 -07:00
Que Nguyen
a3ac9ee774
Refactor main.py
Rename RAG_WEB_SEARCH_WHITE_LIST_DOMAINS to RAG_WEB_SEARCH_DOMAIN_FILTER_LIST
2024-06-17 14:31:44 +07:00
Que Nguyen
a02ba52de8
Merge branch 'dev' into searxng 2024-06-15 23:44:31 +07:00
Yash-1511
b9da72560a feat: add tavily web search in web search provider 2024-06-14 20:44:11 +05:30
Que Nguyen
7b5f434a07
Implement domain whitelisting for web search results 2024-06-13 07:14:48 +07:00
Timothy J. Baek
1163745a03 revert 2024-06-12 11:08:05 -07:00
Timothy J. Baek
c0ca447041 chore: format 2024-06-12 01:37:53 -07:00
Timothy Jaeryang Baek
5d3db15eca
Merge pull request #3049 from que-nguyen/dev
Refactor URL validation function
2024-06-12 01:36:34 -07:00
Timothy J. Baek
e8fc522eba chore: format 2024-06-12 00:18:22 -07:00
Que Nguyen
eb7bba81fe
Refactor URL validation function
- The check for private IP addresses often did not yield the expected results, especially with errors like: `[Errno -2] Name or service not known`.
- Removed the check for private IP addresses in the URL validation process.
- Simplified the `validate_url` function to focus on validating the URL format and checking the existence of the URL using a HEAD request.
2024-06-12 08:15:04 +07:00
Timothy Jaeryang Baek
d709038b5b
Merge pull request #3029 from Yash-1511/main
feat: add DuckDuckGo search functionality using duckduckgo_search library
2024-06-11 09:53:26 -07:00
Que Nguyen
3bec60b80c
Fixed the issue where a single URL error disrupts the data loading process in Web Search mode
To address the unresolved issue in the LangChain library where a single URL error disrupts the data loading process, the lazy_load method in the WebBaseLoader class has been modified. The enhanced method now handles exceptions appropriately, logging errors and continuing with the remaining URLs.
2024-06-11 22:06:14 +07:00
Yash-1511
83f9475584 feat: add DuckDuckGo search functionality using duckduckgo_search library 2024-06-11 19:49:08 +05:30
teampen
14d33f0fcc Merge branch 'add-serply' into dev 2024-06-09 21:40:50 -04:00
teampen
efb4a710c8 adding Serply as an alternative web search 2024-06-09 20:44:34 -04:00
mindspawn
6f9148ac4c
Update main.py 2024-06-07 21:41:30 -07:00
mindspawn
4ecc1c06d3
Update main.py 2024-06-07 21:18:04 -07:00
Timothy J. Baek
0495f01acb feat: reset upload dir 2024-06-03 21:45:36 -07:00
Jun Siang Cheah
0cb8163321 feat: add RAG_EMBEDDING_OPENAI_BATCH_SIZE to batch multiple embeddings 2024-06-02 15:34:31 +01:00
Timothy J. Baek
a53796270f refac: web search config 2024-06-01 20:08:08 -07:00
Timothy J. Baek
fbdfb7e4fa refac: web search 2024-06-01 19:57:00 -07:00
Timothy J. Baek
999d2bc21b refac: web search 2024-06-01 19:52:12 -07:00