---
title: Setup
---

The following page goes over how to set up and upgrade `clearml-serving`.

## Prerequisites

* ClearML-Server: model repository, service health, control plane
* Kubernetes / single-instance machine: deploying containers
* CLI: configuration and model deployment interface

## Initial Setup

1. Set up your ClearML Server or use the free hosted service.

2. Connect the `clearml` SDK to the server (see the ClearML SDK setup instructions).

3. Install the `clearml-serving` CLI:

   ```bash
   pip3 install clearml-serving
   ```

4. Create the Serving Service Controller:

   ```bash
   clearml-serving create --name "serving example"
   ```

   This command prints the Serving Service UID:

   ```
   New Serving Service created: id=aa11bb22aa11bb22
   ```

   Copy the Serving Service UID (e.g., `aa11bb22aa11bb22`), as you will need it in the next steps.

5. Clone the `clearml-serving` repository:

   ```bash
   git clone https://github.com/clearml/clearml-serving.git
   ```
6. Edit the environment variables file (`docker/example.env`) with your `clearml-server` API credentials and the Serving Service UID. For example:

   ```bash
   cat docker/example.env
   ```

   ```
   CLEARML_WEB_HOST="https://app.clear.ml"
   CLEARML_API_HOST="https://api.clear.ml"
   CLEARML_FILES_HOST="https://files.clear.ml"
   CLEARML_API_ACCESS_KEY="<access_key_here>"
   CLEARML_API_SECRET_KEY="<secret_key_here>"
   CLEARML_SERVING_TASK_ID="<serving_service_id_here>"
   ```
7. Spin up the `clearml-serving` containers with docker-compose (or, if running on Kubernetes, use the helm chart). A sanity-check sketch follows this list.

   ```bash
   cd docker && docker-compose --env-file example.env -f docker-compose.yml up
   ```

   If you need Triton support (Keras/PyTorch/ONNX etc.), use the Triton docker-compose file:

   ```bash
   cd docker && docker-compose --env-file example.env -f docker-compose-triton.yml up
   ```

   If running on a GPU instance with Triton support (Keras/PyTorch/ONNX etc.), use the Triton GPU docker-compose file:

   ```bash
   cd docker && docker-compose --env-file example.env -f docker-compose-triton-gpu.yml up
   ```
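Once the stack is up, you can sanity-check it directly. The following is an illustrative sketch, assuming the default compose configuration in which the inference container listens on port 8080 and serves registered models under `/serve/<endpoint>`; the endpoint name and JSON payload are placeholders for a model you have added with `clearml-serving model add`:

```bash
# Spin up the CPU stack detached so the terminal stays free
cd docker && docker-compose --env-file example.env -f docker-compose.yml up -d

# Confirm the serving containers are running
docker-compose --env-file example.env -f docker-compose.yml ps

# Query a registered endpoint (placeholder endpoint name and payload)
curl -X POST "http://127.0.0.1:8080/serve/<endpoint-name>" \
     -H "Content-Type: application/json" \
     -d '{"x0": 1, "x1": 2}'
```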

:::note
Any model registered with the Triton engine runs its pre/post-processing code in the Inference Service container, while the model inference itself is executed in the Triton Engine container.
:::
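For reference, registering a model with the Triton engine uses the same `model add` flow as the other engines, with additional tensor I/O definitions. The sketch below follows the pattern used in the clearml-serving examples; the serving service UID, model ID, endpoint name, and tensor shapes are placeholders:

```bash
clearml-serving --id <serving_service_uid> model add \
    --engine triton \
    --endpoint "test_model_pytorch" \
    --model-id <model_id> \
    --preprocess "preprocess.py" \
    --input-size 1 28 28 --input-name "INPUT__0" --input-type float32 \
    --output-size -1 10 --output-name "OUTPUT__0" --output-type float32
```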

## Advanced Setup - S3/GS/Azure Access (Optional)

To enable inference containers to download models from S3, Google Cloud Storage (GS), or Azure, add the appropriate access credentials as environment variables in your env file (`example.env`):

```
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
```

```
GOOGLE_APPLICATION_CREDENTIALS
```

```
AZURE_STORAGE_ACCOUNT
AZURE_STORAGE_KEY
```
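For example, S3 access could be granted by appending lines like these to `example.env` (the values are placeholders, not working credentials):

```bash
AWS_ACCESS_KEY_ID=<access_key_here>
AWS_SECRET_ACCESS_KEY=<secret_key_here>
AWS_DEFAULT_REGION=us-east-1
```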

For further details, see Configuring Storage.

## Upgrading ClearML Serving

### Upgrading to v1.1

1. Shut down the serving containers (docker-compose or k8s).

2. Update the `clearml-serving` CLI:

   ```bash
   pip3 install -U clearml-serving
   ```

3. Re-add a single existing endpoint with `clearml-serving model add ...` (answer "yes" when prompted). This upgrades the `clearml-serving` session definitions. See the combined sketch after this list.

4. Pull the latest serving containers (`docker-compose pull ...` or k8s).

5. Re-spin the serving containers (docker-compose or k8s).
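Put together, an upgrade on a docker-compose deployment might look like the following sketch (the `model add` arguments are placeholders for one of your existing endpoints; on Kubernetes, apply the equivalent steps to your helm deployment):

```bash
# 1. Shut down the running serving stack
cd docker && docker-compose --env-file example.env -f docker-compose.yml down

# 2. Update the CLI
pip3 install -U clearml-serving

# 3. Re-add one existing endpoint to upgrade the session definitions
#    (answer "yes" when prompted; engine, endpoint, and model ID are placeholders)
clearml-serving --id <serving_service_uid> model add \
    --engine sklearn --endpoint "test_model_sklearn" --model-id <model_id>

# 4. Pull the latest serving containers
docker-compose --env-file example.env -f docker-compose.yml pull

# 5. Re-spin the stack
docker-compose --env-file example.env -f docker-compose.yml up -d
```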

## Tutorial

For further details, see the ClearML Serving Tutorial.