sree
f408b08965
minor bug fix for external document loader not working
2025-05-20 11:10:23 +05:30
Timothy Jaeryang Baek
8732b64b6b
feat: external document loader support
2025-05-14 22:28:40 +04:00
Timothy Jaeryang Baek
de70d0cb64
feat: docling do picture description support
2025-05-14 21:26:49 +04:00
Timothy Jaeryang Baek
6359cb55fe
chore: format
2025-05-07 02:01:03 +04:00
Tim Jaeryang Baek
ea07e242f5
Merge pull request #13528 from Classic298/dev
feat: Enhance YouTube Transcription Loader for multi-language support
2025-05-07 00:44:45 +04:00
Classic298
1dcbec71ec
Update youtube.py
2025-05-06 17:14:00 +02:00
Classic298
87dcbd198c
Update youtube.py
2025-05-06 17:11:03 +02:00
Classic298
d7927506f1
Update youtube.py
2025-05-06 17:06:21 +02:00
Classic298
f65dc715f9
Update youtube.py
2025-05-06 16:30:18 +02:00
Classic298
c69278c13c
Update youtube.py
2025-05-06 16:24:27 +02:00
Classic298
a129e0954e
Update youtube.py
2025-05-06 16:22:40 +02:00
Classic298
5e1cb76b93
Update youtube.py
2025-05-06 16:16:58 +02:00
Timothy Jaeryang Baek
e63b8b3879
refac
2025-05-06 00:46:32 +04:00
Timothy Jaeryang Baek
27da31dc83
fix: tikaloader extract images
2025-05-05 23:40:34 +04:00
Classic298
67a612fe24
Update youtube.py
2025-05-05 20:40:48 +02:00
Classic298
791dd24ace
Update youtube.py
2025-05-05 20:08:25 +02:00
Classic298
9cf3381381
Update youtube.py
2025-05-05 20:07:52 +02:00
Classic298
b0d74a59f1
Update youtube.py
2025-05-05 20:07:37 +02:00
Classic298
1a30b3746e
Update youtube.py
2025-05-05 20:03:00 +02:00
Classic298
0a3817ed86
Update youtube.py
2025-05-05 20:00:10 +02:00
Classic298
0a845db8ec
Update youtube.py
2025-05-05 19:57:21 +02:00
Classic298
7680ac2517
Update youtube.py
2025-05-05 19:57:06 +02:00
Athanasios Oikonomou
657162e96d
feat(ocr): add support for Docling OCR engine and language configuration
This commit adds support for configuring the OCR engine and language(s) for Docling.
Configuration can be set via the environment variables `DOCLING_OCR_ENGINE` and `DOCLING_OCR_LANG`, or through the UI.
Fixes #13133
2025-05-03 00:32:06 +03:00
Tim Jaeryang Baek
7d184c3a14
Merge pull request #13085 from ayan4m1/fix/tika-image-ocr
fix: pass extractInlineImages header to Tika if PDF_EXTRACT_IMAGES is true
2025-05-02 03:47:51 -07:00
ayan4m1
039dec6820
fix: pass header to Tika if PDF_EXTRACT_IMAGES is true
2025-04-20 17:36:40 +02:00
tth37
008fec80c1
fix: Update external search/loader method to POST
2025-04-14 18:17:27 +08:00
tth37
22f0365cef
format
2025-04-14 02:05:58 +08:00
tth37
839ba22c90
feat: Backend for Self-Hosted/External Web Search/Loader Engines
2025-04-14 01:49:05 +08:00
lucy
bc295546cd
fix #12678
2025-04-10 07:23:34 +02:00
Timothy Jaeryang Baek
ef787e4a79
Merge pull request #12486 from FabioPolito24/text-file-handling-docling
fix: text file handling with docling
2025-04-05 09:55:51 -07:00
Fabio Polito
cd0a1b4852
fix: fix for text file handling with docling
2025-04-05 16:44:08 +00:00
Patrick Wachter
0ac00b9256
refactor: update import path for MistralLoader
2025-04-02 13:56:10 +02:00
Patrick Wachter
c5a8d2f857
refactor: update MistralLoader documentation and adjust parameters for signed URL retrieval
2025-04-01 20:14:34 +02:00
Patrick Wachter
93d7702e8c
refactor: move MistralLoader to a separate module and just use the requests package instead of mistralai
2025-04-01 20:14:34 +02:00
Patrick Wachter
1ac6879268
Add Mistral OCR integration and configuration support
2025-04-01 14:24:33 +02:00
Junaid Pinjari
e782e7d3a7
Fix: CSV loader encoding issue using autodetect_encoding=True
2025-03-29 13:14:53 +05:30
Iván Baldo
115e46a6a2
Fix: Tika 3.1.0.0 sends a lot of blank lines which degrades the RAG results, strip them.
2025-03-25 14:53:14 -03:00
orenzhang
c761e4fd08
feat(trace): opentelemetry instrument
2025-03-10 22:27:31 +08:00
Fabio Polito
9d6743824e
fix: fix params DoclingLoader
2025-03-09 16:12:14 +00:00
Fabio Polito
0aa42615f9
Merge remote-tracking branch 'upstream/dev' into docling_context_extraction_engine
merge upstream
2025-03-08 18:52:51 +00:00
Luke
987954c817
feat: Add Tavily extract web loader integration
2025-03-06 18:15:18 -05:00
Fabio Polito
0716f96da8
style: change style in DoclingLoader
2025-03-05 23:15:55 +00:00
Fabio Polito
9aa407dbd2
feat: merge with main
2025-03-05 22:04:34 +00:00
Fabio Polito
a44b35e99e
fix: fix DoclingLoader input params
2025-03-05 17:53:45 +00:00
Timothy Jaeryang Baek
33d3558ca9
Merge pull request #10817 from NovoNordisk-OpenSource/ivaroli/adding-json-as-supported-file-type
fix: Using the TextLoader instead of Tika for JSON files
2025-02-26 12:49:29 -08:00
Ívar Óli Sigurðsson
c5a09cdd21
adding a comma
2025-02-26 15:27:03 +01:00
Ívar Óli Sigurðsson
661711164a
Adding json as a known source for Tika
2025-02-26 15:11:21 +01:00
Fabio Polito
2419ef06a0
feat: docling support for document preprocessing
2025-02-14 12:08:03 +00:00
Mazurek Michal
35f3824932
feat: Implement Document Intelligence as Content Extraction Engine
2025-02-07 13:44:47 +01:00
Timothy Jaeryang Baek
f341971eae
fix
2024-12-15 23:41:17 -08:00