Aarni Koskela
61bb1f1dc8
fix: do not use hardware ID in document ID generation
2024-05-07 11:42:05 +03:00
Timothy Jaeryang Baek
635951b55c
Merge branch 'dev' into feat/backend-web-search
2024-05-06 16:26:44 -07:00
Timothy J. Baek
64ed0d1089
refac: include source name to citation
2024-05-06 16:16:26 -07:00
Timothy J. Baek
4c490132ba
refac: styling
2024-05-06 16:16:26 -07:00
Jun Siang Cheah
0872bea790
feat: show RAG query results as citations
2024-05-06 16:14:10 -07:00
Timothy J. Baek
cecb87b8c2
feat: web_loader_ssl_verification setting
2024-05-06 14:50:55 -07:00
Timothy J. Baek
95f579cabe
feat: rag ssl verification env var
...
Co-Authored-By: Tobias Steidle <tobias.steidle@softwaredev.de>
2024-05-06 13:12:08 -07:00
Jun Siang Cheah
8b3e370a6e
fix: run formatter
2024-05-06 17:11:04 +08:00
Jun Siang Cheah
83f086ccdd
fix: do not return raw search exception due to API keys in URLs
2024-05-06 17:09:04 +08:00
Jun Siang Cheah
99e4edd364
feat: add websearch endpoint to RAG API
...
fix: google PSE endpoint uses GET
fix: google PSE returns link, not url
fix: serper wrong field
2024-05-06 17:09:04 +08:00
Jun Siang Cheah
501ff7a98b
feat: backend implementation of various search APIs
2024-05-06 12:28:41 +08:00
tabacoWang
fffd283b0c
fix:
...
fix: Change the type from int to float
2024-05-02 13:45:19 +08:00
Timothy J. Baek
0595c04909
feat: youtube rag
2024-05-01 17:17:00 -07:00
Yanyutin753
c0bb32d768
📌 fixed a bug where RAG would not reply after not reading the file correctly
2024-04-30 13:51:30 +08:00
Timothy Jaeryang Baek
1afc49c1e4
Merge pull request #1862 from cheahjs/feat/filter-local-rag-fetch
...
feat: add ENABLE_LOCAL_WEB_FETCH to protect against SSRF attacks
2024-04-29 15:51:17 -07:00
Jun Siang Cheah
1c4e63f71e
feat: add ENABLE_LOCAL_WEB_FETCH to protect against SSRF attacks
2024-04-29 20:55:17 +01:00
Steven Kreitzer
5b8fd14470
fix: various api rag results
2024-04-29 12:17:36 -05:00
Yanyutin753
b0245a7eff
✨ feat added environment variables and sync.yml
2024-04-28 06:54:26 +08:00
Timothy J. Baek
ce9a5d12e0
refac: rag pipeline
2024-04-27 15:38:50 -04:00
Timothy J. Baek
8f1563a7a5
fix: typo
2024-04-27 15:03:49 -04:00
Timothy J. Baek
9be56d68e0
refac: naming convention
2024-04-27 15:02:57 -04:00
Timothy J. Baek
cebf733b9d
refac: naming convention
2024-04-26 14:41:39 -04:00
Steven Kreitzer
69822e4c25
fix: sort ranking hybrid
2024-04-26 07:56:41 -05:00
Steven Kreitzer
9755cd5baa
feat: toggle hybrid search
2024-04-25 17:51:38 -05:00
Timothy J. Baek
984dbf13ab
revert: original rag pipeline
2024-04-25 17:03:00 -04:00
Steven Kreitzer
1c1d2c254d
fix: query collection api call
2024-04-25 13:38:18 -05:00
Steven Kreitzer
72090fab88
chore: update log line
2024-04-25 13:28:31 -05:00
Steven Kreitzer
c9c9660459
fix: address comment in pr #1687
2024-04-25 07:50:42 -05:00
Steven Kreitzer
c0259aad67
feat: hybrid search and reranking support
2024-04-24 07:55:10 -05:00
Steven Kreitzer
4e0b32b505
feat: hybrid search
2024-04-22 18:33:43 -05:00
Steven Kreitzer
f3e5700d49
feat: move to native sentence_transformer
2024-04-22 14:20:41 -05:00
Timothy J. Baek
713934edb6
refac
2024-04-20 15:21:52 -05:00
Timothy J. Baek
710850e442
refac: audio
2024-04-20 15:15:59 -05:00
Timothy J. Baek
741ed5dc4c
fix
2024-04-14 19:56:33 -04:00
Timothy J. Baek
b1b72441bb
feat: openai embeddings integration
2024-04-14 19:48:15 -04:00
Timothy J. Baek
b48e73fa43
feat: openai embeddings support
2024-04-14 19:15:39 -04:00
Timothy J. Baek
36ce157907
fix: integration
2024-04-14 18:47:45 -04:00
Timothy J. Baek
9cdb5bf9fe
feat: frontend integration
2024-04-14 18:31:40 -04:00
Timothy J. Baek
2952e61167
feat: external embeddings support
2024-04-14 17:55:00 -04:00
Timothy Jaeryang Baek
b9cadff16b
Merge pull request #1419 from lainedfles/embedding-model-fix-and-manual-update
...
feat: improve embedding model update & resolve network dependency
2024-04-10 01:10:07 -07:00
Timothy J. Baek
582d11f191
refac: RAG_EMBEDDING_MODEL_PATH removed
2024-04-10 00:59:05 -07:00
Timothy J. Baek
cb2158a794
fix
2024-04-10 00:51:16 -07:00
Timothy J. Baek
abfcceecef
refac
2024-04-10 00:46:09 -07:00
Timothy J. Baek
f4b87ecb23
refac
2024-04-10 00:33:45 -07:00
Steven Kreitzer
0bae789d39
fix: support batching chromadb
2024-04-09 10:13:29 -05:00
lainedfles
506a061387
Merge branch 'dev' into embedding-model-fix-and-manual-update
2024-04-08 14:57:54 -06:00
Jannik S
3b3d0cce1e
Merge branch 'dev' into dockerfile-optimisation
2024-04-08 09:15:00 +02:00
Timothy J. Baek
e61e1b079f
fix: file upload issue
2024-04-04 17:38:59 -07:00
Self Denial
9f82f5abba
Formatting...
2024-04-04 12:09:48 -06:00
Self Denial
075fbedb02
More format fixes
2024-04-04 12:07:42 -06:00
Self Denial
bcf79c8366
Format fixes
2024-04-04 12:02:48 -06:00
Self Denial
3b66aa55c0
Improve embedding model update & resolve network dependency
...
* Add config variable RAG_EMBEDDING_MODEL_AUTO_UPDATE to control update behavior
* Add RAG utils embedding_model_get_path() function to output the filesystem path in addition to update of the model using huggingface_hub
* Update and utilize existing RAG functions in main: get_embedding_model() & update_embedding_model()
* Add GUI setting to execute manual update process
2024-04-04 11:01:23 -06:00
Mmx233
947c392f72
fix: manually check the docs' filename
2024-04-03 23:37:13 +08:00
Jannik Streidl
9bcb37ea10
fixes and updates
2024-04-02 14:47:52 +02:00
Jannik S
099b1d066b
Revert "Merge Updates & Dockerfile improvements" (#3)
...
This reverts commit 9763d885be.
2024-04-02 11:28:04 +02:00
lainedfles
9763d885be
Merge Updates & Dockerfile improvements
2024-04-02 11:25:20 +02:00
Timothy J. Baek
5558514ff1
fix
2024-04-01 15:23:12 -07:00
KoreLogic Disclosures
6c96361402
Suggested mitigation for KL-CAN-2024-002.
2024-04-01 15:55:14 -05:00
Timothy J. Baek
a6c154d839
feat: rag context logging
2024-03-31 14:02:31 -07:00
Self Denial
144c9059a3
Improve logging. Move print() statements to appropriate log().
...
Add COMFYUI and WEBHOOK logging and associated environment variable
control. Add WEBHOOK payload & request debug logs.
2024-03-31 13:17:29 -06:00
Timothy J. Baek
3688955c77
fix: encoding issue
2024-03-25 23:50:52 -07:00
Timothy J. Baek
6307adfba1
feat: better error handling
2024-03-25 23:47:08 -07:00
Doug Danat
c91a5d8b1f
switch to using BeautifulSoup HTML loader so title is also captured
2024-03-25 11:26:18 +01:00
Doug Danat
784a6ec85e
include html langchain loader for RAG
2024-03-25 09:50:53 +01:00
Timothy Jaeryang Baek
371dfc1143
Merge branch 'dev' into debug_print
2024-03-24 18:04:03 -05:00
Timothy J. Baek
ff8a55a861
refac: rag api
2024-03-24 00:41:41 -07:00
Timothy J. Baek
7e0ea8f77d
feat: RAG text ingestion(store) api
2024-03-24 00:40:27 -07:00
Jannik Streidl
fdef2abdfb
cuda fix
2024-03-22 12:48:48 +01:00
Self Denial
e6dd0bfbe0
Migrate to python logging module with env var control.
2024-03-20 17:11:36 -06:00
Jannik Streidl
1f6739337b
docker improvements & changed universal device type env for different models used
2024-03-20 08:44:09 +01:00
Timothy J. Baek
91efd6cb63
fix: file upload encoding issue
2024-03-15 23:52:37 -07:00
Timothy J. Baek
072b499a50
fix: backslash rag content issue
2024-03-15 13:34:52 -07:00
Timothy J. Baek
8df6b137cb
fix: rag
2024-03-10 18:40:50 -07:00
Timothy J. Baek
98948814fd
feat: toggle pdf ocr
2024-03-10 13:32:34 -07:00
Timothy J. Baek
c49491e516
refac: rag to backend
2024-03-08 22:34:47 -08:00
Timothy J. Baek
7e5e2c42c9
refac: rag routes
2024-03-08 19:26:39 -08:00
Timothy J. Baek
b88c64f80e
fix: ocr issue
2024-03-06 17:54:42 -08:00
Timothy J. Baek
bb98c10abb
revert: ocr feature
2024-03-06 17:04:40 -08:00
Timothy Jaeryang Baek
8fb5f54751
Merge pull request #1050 from jannikstdl/rag-pdf-ocr
...
feat: added ocr functionality to the pdf loader
2024-03-06 00:45:33 -05:00
Jannik Streidl
089a63e0c6
feat: added ocr functionality to the pdf loader
2024-03-05 22:25:25 +01:00
Firat Birlik
6782e95c75
recreate rag collection is now optional and only used for web requests
2024-03-04 10:00:06 -06:00
Firat Birlik
5d4ff85228
recreate rag collection instead of falling back to stale version
2024-03-03 21:25:00 -06:00
Timothy J. Baek
47a05a47b4
feat: add rag top k value setting
2024-03-02 18:56:57 -08:00
Ased Mammad
b473ad574f
fix: RAG scan unsupported mimetype
...
This fixes an issue with RAG that stops loading documents as soon
as it reaches a file with unsupported mimetype.
2024-02-23 14:27:31 +03:30
Timothy J. Baek
7c127c35fc
feat: dynamic embedding model load
2024-02-19 11:05:45 -08:00
Jannik Streidl
acf999013b
storing vectordb in project cache folder + device types
2024-02-19 07:51:17 +01:00
Timothy J. Baek
0cb0358485
refac: more descriptive var names
2024-02-18 11:16:10 -08:00
Jannik S
4b88e7e44f
Merge branch 'main' into choose-embedding-model
2024-02-18 09:20:54 +01:00
Jannik Streidl
bc3dd34d8b
collection query fix
2024-02-18 09:17:43 +01:00
Timothy J. Baek
07b451995e
feat: reset rag template
2024-02-17 22:49:18 -08:00
Timothy J. Baek
5270efa9e5
feat: editable rag template
2024-02-17 22:41:03 -08:00
Timothy J. Baek
ccf08fb91e
feat: editable chunk params
2024-02-17 22:29:52 -08:00
Timothy J. Baek
a94e4161f7
fix: file content type issue
2024-02-17 21:31:46 -08:00
Timothy J. Baek
e07001e5f6
feat: rag folder scan support
2024-02-17 21:06:08 -08:00
Jannik Streidl
1846c1e80d
choose embedding model when using docker
2024-02-17 19:38:29 +01:00
Tim Farrell
08e8e922fd
Endpoint role-checking was redundantly applied but FastAPI provides a nice abstraction mechanic...so I applied it. There should be no logical changes in this code; only simpler, cleaner ways for doing the same thing.
2024-02-08 18:05:01 -06:00
Timothy J. Baek
683650ec00
feat: collection rag integration
2024-02-03 15:57:06 -08:00
Timothy J. Baek
00803c92f2
feat: doc tagging
2024-02-03 14:44:49 -08:00
Timothy J. Baek
50f7b20ac2
refac
2024-02-01 13:35:41 -08:00
Timothy J. Baek
28226a6f97
feat: web rag support
2024-01-26 22:17:28 -08:00
Timothy J. Baek
4e468dc58c
refac
2024-01-25 00:24:49 -08:00
Timothy Jaeryang Baek
fa5918ad13
Merge branch 'main' into main
2024-01-25 00:13:12 -08:00
Marclass
8bfda730d9
add excel document support
2024-01-23 14:03:22 -07:00
Timothy Jaeryang Baek
ca943d0795
Merge pull request #549 from Marclass/main
...
Bugfix: Fix toast error popup when front end can't figure out file type.
2024-01-22 23:13:53 -08:00
Timothy Jaeryang Baek
7054f02891
Merge pull request #466 from baumandm/feat/epub-support
...
feat: Add epub support
2024-01-22 23:12:46 -08:00
Marclass
7eea3ef313
copy list of file ext from backend to front end
2024-01-23 00:00:07 -07:00
Marclass
35ace57784
add rst document for RAG
2024-01-19 10:48:04 -07:00
Dave Bauman
f559068186
feat: Add epub support
2024-01-19 12:23:59 -05:00
Marclass
aa1d386042
Allow any file to be used for RAG.
...
Changed RAG parser to prefer file extensions over MIME content types. If the type of file is not recognized assume it's a text file.
2024-01-18 20:41:14 -07:00
Marclass
6070e6bcd1
add svelte type to RAG
2024-01-17 20:10:34 -07:00
Marclass
cf6b3fa48a
remove html type and add js/css
2024-01-17 00:34:22 -07:00
Marclass
43d8466677
feat: Add RAG support for various programming languages
...
Enables RAG for golang, python, java, sh, bat, powershell, cmd, js, css, c/c++/c#, sql, logs, ini, perl, r, dart, docker, env, php, haskell, lua, conf, plsql, ruby, db2, scala, bash, swift, vue, html, xml, and other arbitrary text files.
2024-01-17 00:09:47 -07:00
Timothy J. Baek
c1ec604f21
feat: rag md support
2024-01-09 15:24:53 -08:00
Timothy J. Baek
54c4e0761a
feat: documents file upload
2024-01-08 01:26:15 -08:00
Timothy J. Baek
57c050326c
feat: docx support
2024-01-07 13:56:01 -08:00
Timothy J. Baek
9a63376e55
feat: file upload error handling
2024-01-07 09:33:34 -08:00
Timothy J. Baek
b37b157638
feat: reset vectordb storage support
2024-01-07 09:15:45 -08:00
Timothy J. Baek
d4b2578f6e
feat: rag csv support
2024-01-07 09:05:52 -08:00
Timothy J. Baek
d6a1bf1406
refac: file upload
2024-01-07 09:00:30 -08:00
Timothy J. Baek
ffd0a5a2a0
Update main.py
2024-01-07 08:34:05 -08:00
Timothy J. Baek
c68bb3b950
docker: slim
2024-01-07 08:28:35 -08:00
Timothy J. Baek
464d0fb016
fix: update langchain.document_loaders
2024-01-07 02:49:13 -08:00
Timothy J. Baek
70d2571be1
feat: rag backend auth
2024-01-07 02:46:12 -08:00
Timothy J. Baek
142269374f
feat: vectordb query error handling
2024-01-07 01:59:00 -08:00
Timothy J. Baek
ad3d69be30
refac
2024-01-07 01:54:58 -08:00
Timothy J. Baek
9634e2da3e
feat: full integration
2024-01-07 01:40:36 -08:00
Timothy J. Baek
fef4725d56
feat: frontend file upload support
2024-01-07 00:57:10 -08:00
Timothy J. Baek
cd86c36953
feat: pdf data load
2024-01-06 23:40:51 -08:00
Timothy J. Baek
784b369cc9
feat: chromadb vector store api
2024-01-06 22:59:22 -08:00
Timothy J. Baek
b2c9f6dff8
feat: rag api endpoint
2024-01-06 22:07:20 -08:00