IlyaMescheryakov1402
|
8ecb51f1db
|
add models endpoint
|
2025-03-12 01:09:50 +03:00 |
|
IlyaMescheryakov1402
|
25e2940596
|
fix jsonresponse
|
2025-03-11 22:44:32 +03:00 |
|
IlyaMescheryakov1402
|
fedfcdadeb
|
add getattr for process methods
|
2025-03-11 22:42:59 +03:00 |
|
IlyaMescheryakov1402
|
9bb0dbb182
|
fix imports
|
2025-03-11 11:45:52 +03:00 |
|
IlyaMescheryakov1402
|
9441ae8473
|
move engine init in separate class
|
2025-03-10 23:52:14 +03:00 |
|
IlyaMescheryakov1402
|
1c591f2d15
|
fix openai testing
|
2025-03-10 00:21:24 +03:00 |
|
IlyaMescheryakov1402
|
77e1f95dbd
|
fix response import
|
2025-03-09 22:53:44 +03:00 |
|
IlyaMescheryakov1402
|
cadd48f672
|
add openai_serving and openai_serving_models
|
2025-03-09 15:12:05 +03:00 |
|
IlyaMescheryakov1402
|
428be76642
|
major vllm engine update
|
2025-03-09 01:46:05 +03:00 |
|
IlyaMescheryakov1402
|
32d72bcd1c
|
add vllm example
|
2025-02-28 22:36:14 +03:00 |
|
IlyaMescheryakov1402
|
f51bf2e081
|
revert some old changes
|
2025-02-27 23:13:47 +03:00 |
|
IlyaMescheryakov1402
|
5b73bdf085
|
fix suffix and add router
|
2025-02-27 22:56:39 +03:00 |
|
IlyaMescheryakov1402
|
2685d2a0e5
|
Merge branch 'main' into feature/multimodel
|
2025-02-27 13:41:54 +03:00 |
|
clearml
|
9f51a9334f
|
Fix torch import
|
2024-12-16 18:51:58 +02:00 |
|
clearml
|
aff27c62b8
|
Fix gRPC errors printing stack traces and full verbose details. Add support for controlling error printouts using CLEARML_SERVING_AIO_RPC_IGNORE_ERRORS and CLEARML_SERVING_AIO_RPC_VERBOSE_ERRORS (pass a whitespace-separated list of error codes or error names)
|
2024-12-12 23:57:21 +02:00 |
|
IlyaMescheryakov1402
|
724c99c605
|
Add clearml_serving_inference restart on CUDA OOM (#75)
* initial commit
* add OOM handler for MIG profiles
---------
Co-authored-by: Meshcheryakov Ilya <i.meshcheryakov@mts.ai>
|
2024-07-07 15:54:08 +03:00 |
|
Meshcheryakov Ilya
|
4796d77ad7
|
fix shash processing
|
2024-05-30 15:52:06 +03:00 |
|
Meshcheryakov Ilya
|
b8f5d81636
|
initial commit
|
2024-05-30 00:30:30 +03:00 |
|
Meshcheryakov Ilya
|
64daef23ba
|
initial commit
|
2024-05-29 21:18:39 +03:00 |
|
Meshcheryakov Ilya
|
6859920848
|
initial commit
|
2024-04-16 00:54:35 +03:00 |
|
allegroai
|
71c104c9df
|
Add exception prints to serving session Task and inference Task, for better debugging capabilities
Add report instance ID when reporting back to the main serving session task
|
2024-03-01 13:13:48 +02:00 |
|
allegroai
|
8df521b949
|
Fix python < 3.10 support
Fix custom_async engine
Suppress warning
|
2024-02-27 09:45:32 +02:00 |
|
allegroai
|
3611a040f7
|
Fix requirements
|
2024-02-27 09:43:18 +02:00 |
|
allegroai
|
dc6fd46a46
|
Update requirements
|
2024-02-26 11:34:21 +02:00 |
|
allegroai
|
0f4122247d
|
Fix ping serving session task to make sure everyone knows we are alive
|
2024-02-26 11:31:58 +02:00 |
|
allegroai
|
f2ba37c8d4
|
Fix version-enabled endpoints on Triton engine not being called
|
2024-02-26 11:27:12 +02:00 |
|
allegroai
|
4ac13d5287
|
Fix requirements issue
|
2024-01-11 15:21:13 +02:00 |
|
allegroai
|
368a03dc70
|
Fix internal ValueError exception should return 422 (not 404 as before)
|
2024-01-06 17:55:49 +02:00 |
|
Jake Henning
|
c20bbd66b9
|
Fix Pillow vulnerability "libwebp: OOB write in BuildHuffmanTable"
|
2023-10-04 13:22:46 +03:00 |
|
allegroai
|
05cbfade2a
|
Update requirements
|
2023-09-23 18:03:24 +03:00 |
|
allegroai
|
82ade1e24a
|
Fix check triton config.pbtxt for missing values or colliding specifications (#62)
|
2023-09-23 17:42:57 +03:00 |
|
allegroai
|
e4c07c756a
|
Add traceback for failing to load preprocess class (#57)
|
2023-09-23 17:35:21 +03:00 |
|
Amir Mousavi
|
115770547c
|
Adds missing await (#55)
Co-authored-by: Amir Mousavi <amirh@collisure.com>
|
2023-05-08 12:46:52 +03:00 |
|
allegroai
|
aca8b4aa03
|
Upgrade to python 3.11
|
2023-04-12 23:38:56 +03:00 |
|
allegroai
|
78a03cc166
|
Register models on serving session
|
2023-04-12 23:34:49 +03:00 |
|
allegroai
|
31a4ebb965
|
Add CLEARML_GRPC_* environment variable support to configure grpc channel options (note that CLEARML_GRPC_var is converted into grpc.var when setting the grpc channel; casing does not change) #49
|
2023-04-12 23:30:59 +03:00 |
|
Victor Sonck
|
a04d1bda03
|
Remove never-used but now deprecated np.int (#42)
|
2023-02-07 08:05:01 +02:00 |
|
allegroai
|
0c5d9820df
|
Optimize containers
|
2022-10-08 02:22:32 +03:00 |
|
allegroai
|
395a547c04
|
Optimize async processing for increased speed
|
2022-10-08 02:12:04 +03:00 |
|
Aleksandar Ivanovski
|
d89d1370d8
|
[DEV] feature/bytes-payload | Add typing
|
2022-10-06 16:01:31 +02:00 |
|
Aleksandar Ivanovski
|
2aa91a3d43
|
[DEV] feature/bytes-payload | Handle keys when req is bytes
|
2022-10-06 15:13:36 +02:00 |
|
Aleksandar Ivanovski
|
09ed480bc2
|
[DEV] feature/bytes-payload | Add bytes as payload
|
2022-10-06 13:31:54 +02:00 |
|
allegroai
|
f4eed33f10
|
Add Triton support for variable length requests, adds support for HuggingFace Transformers
Add triton_grpc_compression=False (default) for grpc connection compression control
|
2022-09-02 23:41:54 +03:00 |
|
allegroai
|
c6c40c9a36
|
Add support for Preprocess class inside a module (i.e. __init__.py with subfolders)
|
2022-09-02 21:50:41 +03:00 |
|
allegroai
|
5beb077f51
|
Add support for updating pre/post processing code on a live endpoint
|
2022-06-07 00:52:08 +03:00 |
|
allegroai
|
48f720ac91
|
Optimize request serving statistics reporting
|
2022-06-07 00:20:33 +03:00 |
|
allegroai
|
4a55c10366
|
Change default log level to warning (UVICORN_LOG_LEVEL)
|
2022-06-07 00:19:51 +03:00 |
|
allegroai
|
f7b21b38b1
|
Add CLEARML_EXTRA_PYTHON_PACKAGES for additional runtime package installation
Upgrade kafka to 3.1.1
|
2022-06-05 16:15:34 +03:00 |
|
allegroai
|
0e240101db
|
Add pandas to the default serving container, update triton client package
|
2022-06-05 16:12:22 +03:00 |
|
allegroai
|
782cd5dfc8
|
pep8
|
2022-06-05 16:11:55 +03:00 |
|