Michael Poluektov
038fc48ac0
replace == None with is None
2024-08-14 13:39:53 +01:00
Alexandre GODARD
7a8f8960c5
Update main.py
Fix typo in update_reranking_model
2024-08-13 17:51:25 +02:00
José Luis Di Biase
23c9122458
chore RAG: add known language extensions for Erlang, Elixir, Haskell and JSX/TSX
Signed-off-by: José Luis Di Biase <josx@interorganic.com.ar>
2024-07-18 17:48:39 -03:00
Timothy J. Baek
dbc352f01b
refac: documents file handling
2024-07-15 13:05:38 +02:00
Timothy Jaeryang Baek
7e6c5193d6
Merge pull request #3688 from leobenkel/no-trace-when-success
fix: Remove the tracestack when the collection already exists
2024-07-07 09:00:23 -07:00
Leo Benkel
a73a9c7310
Remove the tracestack when the collection already exists
2024-07-06 23:20:41 +02:00
Timothy J. Baek
a392865615
refac
2024-07-01 17:11:09 -07:00
Timothy Jaeryang Baek
3c1ea24374
Merge pull request #3582 from nickovs/tika-document-text
feat: Support Tika for document text extraction
2024-07-01 17:07:40 -07:00
Nicko van Someren
7aa35a3757
Added HTML and TypeScript UI components to support configuration of text extraction engine.
Updated RAG /config and /config/update endpoints to support UI updates.
Fixed .dockerignore to prevent Python venv from being copied into Docker image.
2024-07-01 12:10:59 -06:00
Jun Siang Cheah
a48ac6a209
refac: lazily load sentence_transformers to reduce start up memory usage
2024-07-01 08:13:56 +08:00
Nicko van Someren
9cf622d981
Added support for using Apache Tika as a document loader.
Added persistent configuration options to configure use and location of Tika service.
Updated backend.apps.rag.main:get_loader() to make use of Tika document loader.
2024-06-30 15:49:15 -06:00
Timothy J. Baek
3f5f410453
refac
2024-06-27 11:29:59 -07:00
Timothy Jaeryang Baek
fd96c9c68d
Merge pull request #3380 from Yash-1511/main
feat: add jina_search as new websearch provider
2024-06-22 15:19:38 -07:00
Yash-1511
7c9fb9199e
feat: add jina_search as new websearch provider
2024-06-22 20:06:15 +05:30
Timothy J. Baek
83986620ee
refac
2024-06-18 14:15:08 -07:00
Timothy J. Baek
9e7b7a895e
refac: file upload
2024-06-18 13:50:18 -07:00
Timothy J. Baek
b1d83fc42c
chore: format
2024-06-17 14:32:23 -07:00
Que Nguyen
a3ac9ee774
Refactor main.py
Rename RAG_WEB_SEARCH_WHITE_LIST_DOMAINS to RAG_WEB_SEARCH_DOMAIN_FILTER_LIST
2024-06-17 14:31:44 +07:00
Que Nguyen
a02ba52de8
Merge branch 'dev' into searxng
2024-06-15 23:44:31 +07:00
Yash-1511
b9da72560a
feat: add tavily web search in web search provider
2024-06-14 20:44:11 +05:30
Que Nguyen
7b5f434a07
Implement domain whitelisting for web search results
2024-06-13 07:14:48 +07:00
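A minimal sketch of what domain whitelisting of search results can look like. The function name `filter_by_domain` and the matching rule (exact host or any subdomain of an allowed domain) are assumptions for illustration; the commit above does not state how matching is done.

```python
from urllib.parse import urlparse


def filter_by_domain(urls, allowed_domains):
    """Keep only URLs whose host is an allowed domain or a subdomain of one.

    Hypothetical helper: an empty allow-list is treated as "no filtering".
    """
    if not allowed_domains:
        return list(urls)
    allowed = [domain.lower() for domain in allowed_domains]
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # Match "example.com" itself and "docs.example.com", but not "notexample.com".
        if any(host == d or host.endswith("." + d) for d in allowed):
            kept.append(url)
    return kept
```

Matching on the parsed hostname rather than on the raw URL string avoids accepting a URL that merely contains an allowed domain somewhere in its path or query.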
Timothy J. Baek
1163745a03
revert
2024-06-12 11:08:05 -07:00
Timothy J. Baek
c0ca447041
chore: format
2024-06-12 01:37:53 -07:00
Timothy Jaeryang Baek
5d3db15eca
Merge pull request #3049 from que-nguyen/dev
Refactor URL validation function
2024-06-12 01:36:34 -07:00
Timothy J. Baek
e8fc522eba
chore: format
2024-06-12 00:18:22 -07:00
Que Nguyen
eb7bba81fe
Refactor URL validation function
- The check for private IP addresses often did not yield the expected results, especially with errors like: `[Errno -2] Name or service not known`.
- Removed the check for private IP addresses in the URL validation process.
- Simplified the `validate_url` function to focus on validating the URL format and checking the existence of the URL using a HEAD request.
2024-06-12 08:15:04 +07:00
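The simplified validation described in the commit above (format check, then a HEAD request) could be sketched as follows. This is an illustration using only the standard library, not the project's actual `validate_url`; the helper names `has_valid_format` and `url_exists`, the timeout, and the "status below 400" rule are assumptions.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse


def has_valid_format(url: str) -> bool:
    """Return True if the URL has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def url_exists(url: str, timeout: float = 5.0) -> bool:
    """Send a HEAD request; treat any status below 400 as 'exists'."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # Covers HTTP errors, DNS failures and timeouts.
        return False


def validate_url(url: str, exists=url_exists) -> bool:
    """Format check first, then the existence probe (injectable for tests)."""
    return has_valid_format(url) and exists(url)
```

Passing the existence probe as a parameter keeps the format check testable without network access.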
Timothy Jaeryang Baek
d709038b5b
Merge pull request #3029 from Yash-1511/main
feat: add DuckDuckGo search functionality using duckduckgo_search library
2024-06-11 09:53:26 -07:00
Que Nguyen
3bec60b80c
Fixed the issue where a single URL error disrupts the data loading process in Web Search mode
To address the unresolved issue in the LangChain library where a single URL error disrupts the data loading process, the lazy_load method in the WebBaseLoader class has been modified. The enhanced method now handles exceptions appropriately, logging errors and continuing with the remaining URLs.
2024-06-11 22:06:14 +07:00
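The behaviour described in the commit above (log the failing URL and continue with the rest) can be illustrated with a plain generator. This sketch does not use LangChain; `lazy_load_resilient` and the `fetch` callable are hypothetical stand-ins for the modified `lazy_load` method and its page-fetching step.

```python
import logging
from typing import Callable, Iterable, Iterator, Tuple

log = logging.getLogger(__name__)


def lazy_load_resilient(
    urls: Iterable[str], fetch: Callable[[str], str]
) -> Iterator[Tuple[str, str]]:
    """Yield (url, text) pairs lazily, skipping URLs whose fetch raises."""
    for url in urls:
        try:
            text = fetch(url)
        except Exception as exc:
            # One bad URL must not abort the whole load: log it and move on.
            log.error("Error loading %s: %s", url, exc)
            continue
        yield url, text
```

Because the `try` wraps only the per-URL fetch, an exception affects that single item and the generator still yields everything that loaded successfully.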
Yash-1511
83f9475584
feat: add DuckDuckGo search functionality using duckduckgo_search library
2024-06-11 19:49:08 +05:30
teampen
14d33f0fcc
Merge branch 'add-serply' into dev
2024-06-09 21:40:50 -04:00
teampen
efb4a710c8
adding Serply as an alternative web search
2024-06-09 20:44:34 -04:00
mindspawn
6f9148ac4c
Update main.py
2024-06-07 21:41:30 -07:00
mindspawn
4ecc1c06d3
Update main.py
2024-06-07 21:18:04 -07:00
Timothy J. Baek
0495f01acb
feat: reset upload dir
2024-06-03 21:45:36 -07:00
Jun Siang Cheah
0cb8163321
feat: add RAG_EMBEDDING_OPENAI_BATCH_SIZE to batch multiple embeddings
2024-06-02 15:34:31 +01:00
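As an illustration of what a batch-size setting such as `RAG_EMBEDDING_OPENAI_BATCH_SIZE` controls, the sketch below splits a list of texts into fixed-size batches and makes one embedding call per batch. The names `batched`, `embed_in_batches` and `embed_batch` are hypothetical; how the project actually consumes the setting is not shown in this log.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int,
) -> List[List[float]]:
    """Call embed_batch once per batch and return vectors in input order."""
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```

With a batch size of 1 this degenerates to one request per text; larger values trade fewer requests for larger payloads.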
Timothy J. Baek
a53796270f
refac: web search config
2024-06-01 20:08:08 -07:00
Timothy J. Baek
fbdfb7e4fa
refac: web search
2024-06-01 19:57:00 -07:00
Timothy J. Baek
999d2bc21b
refac: web search
2024-06-01 19:52:12 -07:00
Timothy J. Baek
912a704fdc
refac: web search settings
2024-06-01 19:40:48 -07:00
Timothy J. Baek
ea6b8984ab
refac: web search
2024-06-01 19:03:56 -07:00
Timothy J. Baek
74a8deb19f
refac
2024-05-27 14:25:36 -07:00
Timothy J. Baek
4685f523b6
refac
2024-05-27 12:48:08 -07:00
Jun Siang Cheah
276b7b90b8
Merge remote-tracking branch 'upstream/dev' into feat/backend-web-search
2024-05-26 11:31:23 +01:00
Timothy J. Baek
84bfebd05e
fix
2024-05-26 01:17:57 -07:00
Jun Siang Cheah
224a578e6b
Merge remote-tracking branch 'upstream/dev' into feat/backend-web-search
2024-05-20 19:53:23 +01:00
Jun Siang Cheah
eb509c460a
Merge remote-tracking branch 'origin/dev' into feat/backend-web-search
2024-05-20 18:01:29 +01:00
Timothy J. Baek
322db31dc9
fix: rag
2024-05-20 07:22:43 -07:00
Timothy J. Baek
5376525777
refac
2024-05-19 06:51:32 -07:00
Timothy J. Baek
400bfa5a02
fix: rag config.json
2024-05-17 19:53:38 -07:00
Jun Siang Cheah
5e1c408937
Merge branch 'dev' into feat/backend-web-search
2024-05-14 14:03:23 +08:00