IlyaMescheryakov1402
|
8ecb51f1db
|
add models endpoint
|
2025-03-12 01:09:50 +03:00 |
|
IlyaMescheryakov1402
|
fedfcdadeb
|
add getattr for process methods
|
2025-03-11 22:42:59 +03:00 |
|
IlyaMescheryakov1402
|
cadd48f672
|
add openai_serving and openai_serving_models
|
2025-03-09 15:12:05 +03:00 |
|
IlyaMescheryakov1402
|
32d72bcd1c
|
add vllm example
|
2025-02-28 22:36:14 +03:00 |
|
IlyaMescheryakov1402
|
f51bf2e081
|
revert some old changes
|
2025-02-27 23:13:47 +03:00 |
|
IlyaMescheryakov1402
|
5b73bdf085
|
fix suffix and add router
|
2025-02-27 22:56:39 +03:00 |
|
IlyaMescheryakov1402
|
2685d2a0e5
|
Merge branch 'main' into feature/multimodel
|
2025-02-27 13:41:54 +03:00 |
|
clearml
|
9f51a9334f
|
Fix torch import
|
2024-12-16 18:51:58 +02:00 |
|
IlyaMescheryakov1402
|
724c99c605
|
Add clearml_serving_inference restart on CUDA OOM (#75)
* initial commit
* add OOM handler for MIG profiles
---------
Co-authored-by: Meshcheryakov Ilya <i.meshcheryakov@mts.ai>
|
2024-07-07 15:54:08 +03:00 |
|
Meshcheryakov Ilya
|
b8f5d81636
|
initial commit
|
2024-05-30 00:30:30 +03:00 |
|
Meshcheryakov Ilya
|
64daef23ba
|
initial commit
|
2024-05-29 21:18:39 +03:00 |
|
Meshcheryakov Ilya
|
6859920848
|
initial commit
|
2024-04-16 00:54:35 +03:00 |
|
allegroai
|
71c104c9df
|
Add exception prints to serving session Task and inference Task, for better debugging capabilities
Add report instance ID when reporting back to the main serving session task
|
2024-03-01 13:13:48 +02:00 |
|
allegroai
|
0f4122247d
|
Fix ping serving session task to make sure everyone knows we are alive
|
2024-02-26 11:31:58 +02:00 |
|
allegroai
|
82ade1e24a
|
Fix check triton config.pbtxt for missing values or colliding specifications (#62)
|
2023-09-23 17:42:57 +03:00 |
|
allegroai
|
78a03cc166
|
Register models on serving session
|
2023-04-12 23:34:49 +03:00 |
|
allegroai
|
395a547c04
|
Optimize async processing for increased speed
|
2022-10-08 02:12:04 +03:00 |
|
Aleksandar Ivanovski
|
d89d1370d8
|
[DEV] feature/bytes-payload | Add typing
|
2022-10-06 16:01:31 +02:00 |
|
Aleksandar Ivanovski
|
2aa91a3d43
|
[DEV] feature/bytes-payload | Handle keys when req is bytes
|
2022-10-06 15:13:36 +02:00 |
|
allegroai
|
f4eed33f10
|
Add Triton support for variable length requests, adds support for HuggingFace Transformers
Add triton_grpc_compression=False (default) for grpc connection compression control
|
2022-09-02 23:41:54 +03:00 |
|
allegroai
|
5beb077f51
|
Add support for update pre/post processing code to a live endpoint
|
2022-06-07 00:52:08 +03:00 |
|
allegroai
|
48f720ac91
|
Optimize request serving statistics reporting
|
2022-06-07 00:20:33 +03:00 |
|
allegroai
|
f2e207e2f2
|
Add per endpoint-variable add/remove statistics logging
|
2022-06-05 16:11:17 +03:00 |
|
allegroai
|
8778f723e6
|
Add pre/post processing callback state dict, for safe per request state storage
|
2022-06-05 16:10:20 +03:00 |
|
Victor Sonck
|
e3a8ed95b5
|
Add task reload call that made statistics service not update correctly
|
2022-06-01 09:54:02 +02:00 |
|
allegroai
|
c3f3008868
|
pep8
|
2022-04-29 03:10:35 +03:00 |
|
allegroai
|
409fc156fd
|
Add Preprocess.model_endpoint
|
2022-04-18 23:24:30 +03:00 |
|
allegroai
|
4355c1b1f4
|
Add model metric logging
|
2022-03-21 01:00:19 +02:00 |
|
allegroai
|
d684169367
|
Add model ensemble and model pipelines support
|
2022-03-09 04:02:03 +02:00 |
|
allegroai
|
b4cb27b27d
|
ClearML-Serving v2 initial working commit
|
2022-03-06 01:25:56 +02:00 |
|