---
title: Setup
---
This page describes how to set up and upgrade `clearml-serving`.
## Prerequisites

- ClearML Server: model repository, service health, control plane
- Kubernetes / single-instance machine: deploying containers
- CLI: configuration and model deployment interface
## Initial Setup

1. Set up your ClearML Server or use the free hosted service.

2. Connect the `clearml` SDK to the server (see instructions here).

3. Install the `clearml-serving` CLI:

   ```bash
   pip3 install clearml-serving
   ```

4. Create the Serving Service controller:

   ```bash
   clearml-serving create --name "serving example"
   ```

   This command prints the Serving Service UID:

   ```console
   New Serving Service created: id=aa11bb22aa11bb22
   ```

   Copy the Serving Service UID (e.g. `aa11bb22aa11bb22`), as you will need it in the next steps.

5. Clone the `clearml-serving` repository:

   ```bash
   git clone https://github.com/clearml/clearml-serving.git
   ```

6. Edit the environment variables file (`docker/example.env`) with your `clearml-server` API credentials and Serving Service UID. For example:

   ```bash
   cat docker/example.env
   ```

   ```
   CLEARML_WEB_HOST="https://app.clear.ml"
   CLEARML_API_HOST="https://api.clear.ml"
   CLEARML_FILES_HOST="https://files.clear.ml"
   CLEARML_API_ACCESS_KEY="<access_key_here>"
   CLEARML_API_SECRET_KEY="<secret_key_here>"
   CLEARML_SERVING_TASK_ID="<serving_service_id_here>"
   ```

7. Spin up the `clearml-serving` containers with `docker-compose` (or, if running on Kubernetes, use the Helm chart):

   ```bash
   cd docker && docker-compose --env-file example.env -f docker-compose.yml up
   ```

   If you need Triton support (Keras/PyTorch/ONNX etc.), use the Triton `docker-compose` file:

   ```bash
   cd docker && docker-compose --env-file example.env -f docker-compose-triton.yml up
   ```

   If running on a GPU instance with Triton support, use the Triton GPU `docker-compose` file:

   ```bash
   cd docker && docker-compose --env-file example.env -f docker-compose-triton-gpu.yml up
   ```
:::note
Any model registered with the Triton engine runs its pre/post-processing code in the inference service container, while the model inference itself is executed in the Triton engine container.
:::
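The environment file from step 6 can also be created from the shell. A minimal sketch, assuming the placeholder values below are replaced with your real API credentials and the Serving Service UID printed by `clearml-serving create`:

```shell
# Write docker/example.env with placeholder values; substitute your own
# credentials and Serving Service UID before spinning up the containers.
mkdir -p docker  # normally already present in the cloned repository
cat > docker/example.env <<'EOF'
CLEARML_WEB_HOST="https://app.clear.ml"
CLEARML_API_HOST="https://api.clear.ml"
CLEARML_FILES_HOST="https://files.clear.ml"
CLEARML_API_ACCESS_KEY="<access_key_here>"
CLEARML_API_SECRET_KEY="<secret_key_here>"
CLEARML_SERVING_TASK_ID="<serving_service_id_here>"
EOF
```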
## Advanced Setup - S3/GS/Azure Access (Optional)
To enable the inference containers to download models from S3, Google Cloud Storage (GS), or Azure, add the access credentials to your environment file (`example.env`) via the respective variables:

- `AWS_ACCESS_KEY_ID`
- `AWS_SECRET_ACCESS_KEY`
- `AWS_DEFAULT_REGION`
- `GOOGLE_APPLICATION_CREDENTIALS`
- `AZURE_STORAGE_ACCOUNT`
- `AZURE_STORAGE_KEY`
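As a sketch, these credentials can be appended to the env file from the shell. All values below are placeholders; add only the variables for the storage backends you actually use:

```shell
# Append cloud-storage credentials to the env file (placeholder values).
mkdir -p docker  # normally already present in the cloned repository
cat >> docker/example.env <<'EOF'
AWS_ACCESS_KEY_ID="<aws_access_key>"
AWS_SECRET_ACCESS_KEY="<aws_secret_key>"
AWS_DEFAULT_REGION="<aws_region>"
GOOGLE_APPLICATION_CREDENTIALS="<path_to_gcp_credentials_json>"
AZURE_STORAGE_ACCOUNT="<azure_account_name>"
AZURE_STORAGE_KEY="<azure_storage_key>"
EOF
```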
For further details, see Configuring Storage.
## Upgrading ClearML Serving
### Upgrading to v1.1

1. Shut down the serving containers (`docker-compose` or k8s).

2. Update the `clearml-serving` CLI:

   ```bash
   pip3 install -U clearml-serving
   ```

3. Re-add a single existing endpoint with `clearml-serving model add ...` (press "yes" when asked). This upgrades the `clearml-serving` session definitions.

4. Pull the latest serving containers (`docker-compose pull ...` or k8s).

5. Re-spin the serving containers (`docker-compose` or k8s).
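For a `docker-compose` deployment, the upgrade steps above can be outlined as a single shell sequence. This is a sketch, not a verbatim transcript: it assumes the default `docker-compose.yml` from the cloned repository, and the trailing `...` on the `model add` line stands for your own endpoint arguments.

```shell
# Sketch of the v1.1 upgrade sequence for a docker-compose deployment.
cd docker

# 1. Shut down the serving containers.
docker-compose --env-file example.env -f docker-compose.yml down

# 2. Update the clearml-serving CLI.
pip3 install -U clearml-serving

# 3. Re-add a single existing endpoint to upgrade the session definitions
#    (press "yes" when asked); the trailing arguments depend on your model.
clearml-serving model add ...

# 4. Pull the latest serving containers.
docker-compose --env-file example.env -f docker-compose.yml pull

# 5. Re-spin the serving containers.
docker-compose --env-file example.env -f docker-compose.yml up
```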
## Tutorial
For further details, see the ClearML Serving Tutorial.