Commit Graph

19 Commits

Author SHA1 Message Date
Timothy Jaeryang Baek
ef787e4a79 Merge pull request #12486 from FabioPolito24/text-file-handling-docling (fix: text file handling with docling) 2025-04-05 09:55:51 -07:00
Fabio Polito
cd0a1b4852 fix: fix for text file handling with docling 2025-04-05 16:44:08 +00:00
Patrick Wachter
0ac00b9256 refactor: update import path for MistralLoader 2025-04-02 13:56:10 +02:00
Patrick Wachter
93d7702e8c refactor: move MistralLoader to a separate module and just use the requests package instead of mistralai 2025-04-01 20:14:34 +02:00
Patrick Wachter
1ac6879268 Add Mistral OCR integration and configuration support 2025-04-01 14:24:33 +02:00
Junaid Pinjari
e782e7d3a7 Fix: CSV loader encoding issue using autodetect_encoding=True 2025-03-29 13:14:53 +05:30
Iván Baldo
115e46a6a2 Fix: Tika 3.1.0.0 sends a lot of blank lines which degrades the RAG results, strip them. 2025-03-25 14:53:14 -03:00
Fabio Polito
9d6743824e fix: fix params DoclingLoader 2025-03-09 16:12:14 +00:00
Fabio Polito
0716f96da8 style: change style in DoclingLoader 2025-03-05 23:15:55 +00:00
Fabio Polito
9aa407dbd2 feat: merge with main 2025-03-05 22:04:34 +00:00
Fabio Polito
a44b35e99e fix: fix DoclingLoader input params 2025-03-05 17:53:45 +00:00
Timothy Jaeryang Baek
33d3558ca9 Merge pull request #10817 from NovoNordisk-OpenSource/ivaroli/adding-json-as-supported-file-type (fix: Using the TextLoader instead of Tika for JSON files) 2025-02-26 12:49:29 -08:00
Ívar Óli Sigurðsson
c5a09cdd21 adding a comma 2025-02-26 15:27:03 +01:00
Ívar Óli Sigurðsson
661711164a Adding json as a known source for Tika 2025-02-26 15:11:21 +01:00
Fabio Polito
2419ef06a0 feat: docling support for document preprocessing 2025-02-14 12:08:03 +00:00
Mazurek Michal
35f3824932 feat: Implement Document Intelligence as Content Extraction Engine 2025-02-07 13:44:47 +01:00
Timothy Jaeryang Baek
f341971eae fix 2024-12-15 23:41:17 -08:00
MooreDerek
4905c180a5 Only log file contents in debug 2024-12-16 15:58:26 +13:00
Timothy Jaeryang Baek
d3d161f723 wip 2024-12-10 00:54:13 -08:00