60 Commits

Author SHA1 Message Date
IlyaMescheryakov1402
8ecb51f1db add models endpoint 2025-03-12 01:09:50 +03:00
IlyaMescheryakov1402
25e2940596 fix jsonresponse 2025-03-11 22:44:32 +03:00
IlyaMescheryakov1402
fedfcdadeb add getattr for process methods 2025-03-11 22:42:59 +03:00
IlyaMescheryakov1402
9bb0dbb182 fix imports 2025-03-11 11:45:52 +03:00
IlyaMescheryakov1402
9441ae8473 move engine init in separate class 2025-03-10 23:52:14 +03:00
IlyaMescheryakov1402
1c591f2d15 fix openai testing 2025-03-10 00:21:24 +03:00
IlyaMescheryakov1402
77e1f95dbd fix response import 2025-03-09 22:53:44 +03:00
IlyaMescheryakov1402
cadd48f672 add openai_serving and openai_serving_models 2025-03-09 15:12:05 +03:00
IlyaMescheryakov1402
428be76642 major vllm engine update 2025-03-09 01:46:05 +03:00
IlyaMescheryakov1402
32d72bcd1c add vllm example 2025-02-28 22:36:14 +03:00
IlyaMescheryakov1402
f51bf2e081 revert some old changes 2025-02-27 23:13:47 +03:00
IlyaMescheryakov1402
5b73bdf085 fix suffix and add router 2025-02-27 22:56:39 +03:00
IlyaMescheryakov1402
2685d2a0e5 Merge branch 'main' into feature/multimodel 2025-02-27 13:41:54 +03:00
clearml
9f51a9334f Fix torch import 2024-12-16 18:51:58 +02:00
clearml
aff27c62b8 Fix gRPC errors print stack traces and full verbose details. Add support for controlling error printouts using CLEARML_SERVING_AIO_RPC_IGNORE_ERRORS and CLEARML_SERVING_AIO_RPC_VERBOSE_ERRORS (pass a whitespace-separated list of error codes or error names) 2024-12-12 23:57:21 +02:00
IlyaMescheryakov1402
724c99c605 Add clearml_serving_inference restart on CUDA OOM (#75)
* initial commit

* add OOM handler for MIG profiles

---------

Co-authored-by: Meshcheryakov Ilya <i.meshcheryakov@mts.ai>
2024-07-07 15:54:08 +03:00
Meshcheryakov Ilya
4796d77ad7 fix shash processing 2024-05-30 15:52:06 +03:00
Meshcheryakov Ilya
b8f5d81636 initial commit 2024-05-30 00:30:30 +03:00
Meshcheryakov Ilya
64daef23ba initial commit 2024-05-29 21:18:39 +03:00
Meshcheryakov Ilya
6859920848 initial commit 2024-04-16 00:54:35 +03:00
allegroai
71c104c9df Add exception prints to serving session Task and inference Task, for better debugging capabilities
Add report instance ID when reporting back to the main serving session task
2024-03-01 13:13:48 +02:00
allegroai
8df521b949 Fix python < 3.10 support
Fix custom_async engine
Suppress warning
2024-02-27 09:45:32 +02:00
allegroai
3611a040f7 Fix requirements 2024-02-27 09:43:18 +02:00
allegroai
dc6fd46a46 Update requirements 2024-02-26 11:34:21 +02:00
allegroai
0f4122247d Fix ping serving session task to make sure everyone knows we are alive 2024-02-26 11:31:58 +02:00
allegroai
f2ba37c8d4 Fix version enabled endpoints on Triton engine were not called 2024-02-26 11:27:12 +02:00
allegroai
4ac13d5287 Fix requirements issue 2024-01-11 15:21:13 +02:00
allegroai
368a03dc70 Fix internal ValueError exception should return 422 (not 404 as before) 2024-01-06 17:55:49 +02:00
Jake Henning
c20bbd66b9 Fix Pillow vulnerability "libwebp: OOB write in BuildHuffmanTable" 2023-10-04 13:22:46 +03:00
allegroai
05cbfade2a Update requirements 2023-09-23 18:03:24 +03:00
allegroai
82ade1e24a Fix check triton config.pbtxt for missing values or colliding specifications (#62) 2023-09-23 17:42:57 +03:00
allegroai
e4c07c756a Add traceback for failing to load preprocess class (#57) 2023-09-23 17:35:21 +03:00
Amir Mousavi
115770547c Adds missing await (#55)
Co-authored-by: Amir Mousavi <amirh@collisure.com>
2023-05-08 12:46:52 +03:00
allegroai
aca8b4aa03 Upgrade to python 3.11 2023-04-12 23:38:56 +03:00
allegroai
78a03cc166 Register models on serving session 2023-04-12 23:34:49 +03:00
allegroai
31a4ebb965 Add CLEARML_GRPC_* environment variable support to configure grpc channel options (notice CLEARML_GRPC_var is converted into grpc.var when setting grpc channel, casing does not change) #49 2023-04-12 23:30:59 +03:00
Victor Sonck
a04d1bda03 Remove never-used but now deprecated np.int (#42) 2023-02-07 08:05:01 +02:00
allegroai
0c5d9820df Optimize containers 2022-10-08 02:22:32 +03:00
allegroai
395a547c04 Optimize async processing for increased speed 2022-10-08 02:12:04 +03:00
Aleksandar Ivanovski
d89d1370d8 [DEV] feature/bytes-payload | Add typing 2022-10-06 16:01:31 +02:00
Aleksandar Ivanovski
2aa91a3d43 [DEV] feature/bytes-payload | Handle keys when req is bytes 2022-10-06 15:13:36 +02:00
Aleksandar Ivanovski
09ed480bc2 [DEV] feature/bytes-payload | Add bytes as payload 2022-10-06 13:31:54 +02:00
allegroai
f4eed33f10 Add Triton support for variable length requests, adds support for HuggingFace Transformers
Add triton_grpc_compression=False (default) for grpc connection compression control
2022-09-02 23:41:54 +03:00
allegroai
c6c40c9a36 Add support for Preprocess class inside a module (i.e. __init__.py with subfolders) 2022-09-02 21:50:41 +03:00
allegroai
5beb077f51 Add support for update pre/post processing code to a live endpoint 2022-06-07 00:52:08 +03:00
allegroai
48f720ac91 Optimize request serving statistics reporting 2022-06-07 00:20:33 +03:00
allegroai
4a55c10366 Change default log level to warning UVICORN_LOG_LEVEL 2022-06-07 00:19:51 +03:00
allegroai
f7b21b38b1 Add CLEARML_EXTRA_PYTHON_PACKAGES for additional runtime package installation
Upgrade kafka to 3.1.1
2022-06-05 16:15:34 +03:00
allegroai
0e240101db Add pandas to the default serving container, update triton client package 2022-06-05 16:12:22 +03:00
allegroai
782cd5dfc8 pep8 2022-06-05 16:11:55 +03:00