30 Commits

Author SHA1 Message Date
IlyaMescheryakov1402
8ecb51f1db add models endpoint 2025-03-12 01:09:50 +03:00
IlyaMescheryakov1402
fedfcdadeb add getattr for process methods 2025-03-11 22:42:59 +03:00
IlyaMescheryakov1402
cadd48f672 add openai_serving and openai_serving_models 2025-03-09 15:12:05 +03:00
IlyaMescheryakov1402
32d72bcd1c add vllm example 2025-02-28 22:36:14 +03:00
IlyaMescheryakov1402
f51bf2e081 revert some old changes 2025-02-27 23:13:47 +03:00
IlyaMescheryakov1402
5b73bdf085 fix suffix and add router 2025-02-27 22:56:39 +03:00
IlyaMescheryakov1402
2685d2a0e5 Merge branch 'main' into feature/multimodel 2025-02-27 13:41:54 +03:00
clearml
9f51a9334f Fix torch import 2024-12-16 18:51:58 +02:00
IlyaMescheryakov1402
724c99c605 Add clearml_serving_inference restart on CUDA OOM (#75)
* initial commit

* add OOM handler for MIG profiles

---------

Co-authored-by: Meshcheryakov Ilya <i.meshcheryakov@mts.ai>
2024-07-07 15:54:08 +03:00
Meshcheryakov Ilya
b8f5d81636 initial commit 2024-05-30 00:30:30 +03:00
Meshcheryakov Ilya
64daef23ba initial commit 2024-05-29 21:18:39 +03:00
Meshcheryakov Ilya
6859920848 initial commit 2024-04-16 00:54:35 +03:00
allegroai
71c104c9df Add exception prints to serving session Task and inference Task, for better debugging capabilities
Add report instance ID when reporting back to the main serving session task
2024-03-01 13:13:48 +02:00
allegroai
0f4122247d Fix ping serving session task to make sure everyone knows we are alive 2024-02-26 11:31:58 +02:00
allegroai
82ade1e24a Fix check triton config.pbtxt for missing values or colliding specifications (#62) 2023-09-23 17:42:57 +03:00
allegroai
78a03cc166 Register models on serving session 2023-04-12 23:34:49 +03:00
allegroai
395a547c04 Optimize async processing for increased speed 2022-10-08 02:12:04 +03:00
Aleksandar Ivanovski
d89d1370d8 [DEV] feature/bytes-payload | Add typing 2022-10-06 16:01:31 +02:00
Aleksandar Ivanovski
2aa91a3d43 [DEV] feature/bytes-payload | Handle keys when req is bytes 2022-10-06 15:13:36 +02:00
allegroai
f4eed33f10 Add Triton support for variable length requests, adds support for HuggingFace Transformers
Add triton_grpc_compression=False (default) for grpc connection compression control
2022-09-02 23:41:54 +03:00
allegroai
5beb077f51 Add support for update pre/post processing code to a live endpoint 2022-06-07 00:52:08 +03:00
allegroai
48f720ac91 Optimize request serving statistics reporting 2022-06-07 00:20:33 +03:00
allegroai
f2e207e2f2 Add per endpoint-variable add/remove statistics logging 2022-06-05 16:11:17 +03:00
allegroai
8778f723e6 Add pre/post processing callback state dict, for safe per request state storage 2022-06-05 16:10:20 +03:00
Victor Sonck
e3a8ed95b5 Add task reload call that made statistics service not update correctly 2022-06-01 09:54:02 +02:00
allegroai
c3f3008868 pep8 2022-04-29 03:10:35 +03:00
allegroai
409fc156fd Add Preprocess.model_endpoint 2022-04-18 23:24:30 +03:00
allegroai
4355c1b1f4 Add model metric logging 2022-03-21 01:00:19 +02:00
allegroai
d684169367 Add model ensemble and model pipelines support 2022-03-09 04:02:03 +02:00
allegroai
b4cb27b27d ClearML-Serving v2 initial working commit 2022-03-06 01:25:56 +02:00