clearml-serving/clearml_serving/serving
IlyaMescheryakov1402 724c99c605
Add clearml_serving_inference restart on CUDA OOM (#75)
* initial commit

* add OOM handler for MIG profiles

---------

Co-authored-by: Meshcheryakov Ilya <i.meshcheryakov@mts.ai>
2024-07-07 15:54:08 +03:00
__init__.py ClearML-Serving v2 initial working commit 2022-03-06 01:25:56 +02:00
Dockerfile Upgrade to python 3.11 2023-04-12 23:38:56 +03:00
endpoints.py Add Triton support for variable length requests, adds support for HuggingFace Transformers 2022-09-02 23:41:54 +03:00
entrypoint.sh Add clearml_serving_inference restart on CUDA OOM (#75) 2024-07-07 15:54:08 +03:00
init.py Add clearml_serving_inference restart on CUDA OOM (#75) 2024-07-07 15:54:08 +03:00
main.py Add clearml_serving_inference restart on CUDA OOM (#75) 2024-07-07 15:54:08 +03:00
model_request_processor.py Add clearml_serving_inference restart on CUDA OOM (#75) 2024-07-07 15:54:08 +03:00
preprocess_service.py Fix python < 3.10 support 2024-02-27 09:45:32 +02:00
requirements.txt Fix requirements 2024-02-27 09:43:18 +02:00
uvicorn_mp_entrypoint.py Optimize async processing for increased speed 2022-10-08 02:12:04 +03:00