
---
title: CLI
---

The `clearml-serving` utility is a CLI tool for model deployment and orchestration.

The following page provides a reference for `clearml-serving`'s CLI commands:

* [list](#list) - List running Serving Services
* [create](#create) - Create a new Serving Service
* [metrics](#metrics) - Configure inference metrics Service
* [config](#config) - Configure a new Serving Service
* [model](#model) - Configure model endpoints for a running Service

## Global Parameters

```bash
clearml-serving [-h] [--debug] [--id ID] {list,create,metrics,config,model}
```

|Name|Description|Optional|
|---|---|---|
|`--id`|Serving Service (Control plane) Task ID to configure (if not provided, the running control plane Task is detected automatically)|No|
|`--debug`|Print debug messages|Yes|

:::info Service ID
The Serving Service's ID (`--id`) is required to execute the `metrics`, `config`, and `model` commands.
:::
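For example, a subcommand can be pointed at a specific Serving Service like this (`<service_id>` is a placeholder for an actual control plane Task ID):

```shell
# List the model endpoints of a specific Serving Service
# (replace <service_id> with your control plane Task ID)
clearml-serving --id <service_id> model list
```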

## list

List running Serving Services.

```bash
clearml-serving list [-h]
```

## create

Create a new Serving Service.

```bash
clearml-serving create [-h] [--name NAME] [--tags TAGS [TAGS ...]] [--project PROJECT]
```

### Parameters

|Name|Description|Optional|
|---|---|---|
|`--name`|Serving service's name. Default: `Serving-Service`|No|
|`--project`|Serving service's project. Default: `DevOps`|No|
|`--tags`|Serving service's user tags. The serving service can be labeled, which can be useful for organization|Yes|
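A minimal example invocation (the name, project, and tag values below are illustrative, not required values):

```shell
# Create a serving service with a custom name, project, and tags
clearml-serving create --name "my-serving" --project "DevOps" --tags "prod" "pytorch"
```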

## metrics

Configure inference metrics Service.

```bash
clearml-serving metrics [-h] {add,remove,list}
```

### add

Add/modify metric for a specific endpoint.

```bash
clearml-serving metrics add [-h] --endpoint ENDPOINT [--log-freq LOG_FREQ]
                            [--variable-scalar VARIABLE_SCALAR [VARIABLE_SCALAR ...]]
                            [--variable-enum VARIABLE_ENUM [VARIABLE_ENUM ...]]
                            [--variable-value VARIABLE_VALUE [VARIABLE_VALUE ...]]
```

#### Parameters

|Name|Description|Optional|
|---|---|---|
|`--endpoint`|Metric endpoint name, including version (e.g. "model/1") or a prefix (e.g. "model/*"). Note: this will override any metrics previously logged for the endpoint|No|
|`--log-freq`|Logging request frequency, between 0.0 and 1.0. For example, 1.0 means all requests are logged, and 0.5 means half of the requests are logged. If not specified, the global logging frequency is used (see `config --metric-log-freq`)|Yes|
|`--variable-scalar`|Add a float (scalar) argument to the metric logger, `<name>=<histogram>`. Example with specific buckets: `"x1=0,0.2,0.4,0.6,0.8,1"`, or with min/max/num_buckets: `"x1=0.0/1.0/5"`|Yes|
|`--variable-enum`|Add an enum (string) argument to the metric logger, `<name>=<optional_values>`. Example: `"detect=cat,dog,sheep"`|Yes|
|`--variable-value`|Add a non-sampled scalar argument to the metric logger, `<name>`. Example: `"latency"`|Yes|
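Putting the parameters together, a hypothetical invocation (endpoint and variable names are placeholders taken from the examples above):

```shell
# Log all requests on endpoint "model/1": bucket the scalar "x1" into
# 5 buckets between 0.0 and 1.0, and track the enum variable "detect"
clearml-serving --id <service_id> metrics add --endpoint "model/1" \
    --log-freq 1.0 \
    --variable-scalar "x1=0.0/1.0/5" \
    --variable-enum "detect=cat,dog,sheep"
```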

### remove

Remove metric from a specific endpoint.

```bash
clearml-serving metrics remove [-h] [--endpoint ENDPOINT]
                               [--variable VARIABLE [VARIABLE ...]]
```

#### Parameters

|Name|Description|Optional|
|---|---|---|
|`--endpoint`|Metric endpoint name, including version (e.g. "model/1") or a prefix (e.g. "model/*")|No|
|`--variable`|Remove a (scalar/enum) argument from the metric logger, `<name>`. Example: `"x1"`|Yes|

### list

List metrics logged on all endpoints.

```bash
clearml-serving metrics list [-h]
```

## config

Configure a new Serving Service.

```bash
clearml-serving config [-h] [--base-serving-url BASE_SERVING_URL]
                       [--triton-grpc-server TRITON_GRPC_SERVER]
                       [--kafka-metric-server KAFKA_METRIC_SERVER]
                       [--metric-log-freq METRIC_LOG_FREQ]
```

### Parameters

|Name|Description|Optional|
|---|---|---|
|`--base-serving-url`|External base serving service URL. Example: `http://127.0.0.1:8080/serve`|Yes|
|`--triton-grpc-server`|External ClearML-Triton serving container gRPC address. Example: `127.0.0.1:9001`|Yes|
|`--kafka-metric-server`|External Kafka service URL. Example: `127.0.0.1:9092`|Yes|
|`--metric-log-freq`|Set default metric logging frequency, between 0.0 and 1.0. 1.0 means that 100% of all requests are logged|Yes|
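As an illustration, the addresses below reuse the example values from the table (replace them with your actual deployment endpoints):

```shell
# Point the service at local serving/Triton/Kafka endpoints and
# log 10% of requests by default
clearml-serving --id <service_id> config \
    --base-serving-url http://127.0.0.1:8080/serve \
    --triton-grpc-server 127.0.0.1:9001 \
    --kafka-metric-server 127.0.0.1:9092 \
    --metric-log-freq 0.1
```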

## model

Configure model endpoints for an already running Service.

```bash
clearml-serving model [-h] {list,remove,upload,canary,auto-update,add}
```

### list

List current models.

```bash
clearml-serving model list [-h]
```

### remove

Remove model by its endpoint name.

```bash
clearml-serving model remove [-h] [--endpoint ENDPOINT]
```

#### Parameter

|Name|Description|Optional|
|---|---|---|
|`--endpoint`|Model endpoint name|No|

### upload

Upload and register model files/folder.

```bash
clearml-serving model upload [-h] --name NAME [--tags TAGS [TAGS ...]] --project PROJECT
                             [--framework {scikit-learn,xgboost,lightgbm,tensorflow,pytorch}]
                             [--publish] [--path PATH] [--url URL]
                             [--destination DESTINATION]
```

#### Parameters

|Name|Description|Optional|
|---|---|---|
|`--name`|Specify the model name to be registered|No|
|`--tags`|Add tags to the newly created model|Yes|
|`--project`|Specify the project for the model to be registered in|No|
|`--framework`|Specify the model framework. Options are: "scikit-learn", "xgboost", "lightgbm", "tensorflow", "pytorch"|Yes|
|`--publish`|Publish the newly created model (change model state to "published", i.e. locked and ready to deploy)|Yes|
|`--path`|Specify a model file/folder to be uploaded and registered|Yes|
|`--url`|Specify an already uploaded model URL (e.g. `s3://bucket/model.bin`, `gs://bucket/model.bin`)|Yes|
|`--destination`|Specify the target destination for the model to be uploaded (e.g. `s3://bucket/folder/`, `gs://bucket/folder/`)|Yes|
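A hypothetical upload of a local model file (the name, project, and file path are placeholders):

```shell
# Upload a local scikit-learn model file, register it, and publish it
clearml-serving --id <service_id> model upload \
    --name "sklearn model" --project "serving examples" \
    --framework scikit-learn --publish \
    --path ./sklearn-model.pkl
```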

### canary

Add model Canary/A/B endpoint.

```bash
clearml-serving model canary [-h] [--endpoint ENDPOINT] [--weights WEIGHTS [WEIGHTS ...]]
                             [--input-endpoints INPUT_ENDPOINTS [INPUT_ENDPOINTS ...]]
                             [--input-endpoint-prefix INPUT_ENDPOINT_PREFIX]
```

#### Parameters

|Name|Description|Optional|
|---|---|---|
|`--endpoint`|Model canary serving endpoint name (e.g. `my_model/latest`)|Yes|
|`--weights`|Model canary weights, in an order matching the model endpoints (e.g. `0.2 0.8`)|Yes|
|`--input-endpoints`|Model endpoint prefixes, which can also include a version (e.g. `my_model`, `my_model/v1`)|Yes|
|`--input-endpoint-prefix`|Model endpoint prefix, ordered lexicographically or by version `<int>` (e.g. `my_model/1`, `my_model/v1`), where the first weight matches the last version|Yes|
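For instance, a canary split between two fixed endpoint versions could look like this (endpoint names and weights are illustrative; weights are matched to `--input-endpoints` by order):

```shell
# Route 10% of "mymodel/latest" traffic to version 2 and 90% to version 1
clearml-serving --id <service_id> model canary \
    --endpoint "mymodel/latest" \
    --weights 0.1 0.9 \
    --input-endpoints "mymodel/2" "mymodel/1"
```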

### auto-update

Add/Modify model auto-update service.

```bash
clearml-serving model auto-update [-h] [--endpoint ENDPOINT] --engine ENGINE
                                  [--max-versions MAX_VERSIONS] [--name NAME]
                                  [--tags TAGS [TAGS ...]] [--project PROJECT]
                                  [--published] [--preprocess PREPROCESS]
                                  [--input-size INPUT_SIZE [INPUT_SIZE ...]]
                                  [--input-type INPUT_TYPE] [--input-name INPUT_NAME]
                                  [--output-size OUTPUT_SIZE [OUTPUT_SIZE ...]]
                                  [--output-type OUTPUT_TYPE] [--output-name OUTPUT_NAME]
                                  [--aux-config AUX_CONFIG [AUX_CONFIG ...]]
```

#### Parameters

|Name|Description|Optional|
|---|---|---|
|`--endpoint`|Base model endpoint (must be unique)|No|
|`--engine`|Model endpoint serving engine (triton, sklearn, xgboost, lightgbm)|No|
|`--max-versions`|Max versions to store (and create endpoints for) for the model. The highest number is the latest version|Yes|
|`--name`|Specify model name to be selected and auto-updated (note: selection uses regexp; use `"^name$"` for an exact match)|Yes|
|`--tags`|Specify tags to be selected and auto-updated|Yes|
|`--project`|Specify model project to be selected and auto-updated|Yes|
|`--published`|Only select published models for auto-update|Yes|
|`--preprocess`|Specify pre/post-processing code to be used with the model (point to a local file/folder). This should hold for all the models|Yes|
|`--input-size`|Specify the model matrix input size [Rows x Columns x Channels, etc.]|Yes|
|`--input-type`|Specify the model matrix input type. Examples: uint8, float32, int16, float16, etc.|Yes|
|`--input-name`|Specify the model layer pushing input into. Example: layer_0|Yes|
|`--output-size`|Specify the model matrix output size [Rows x Columns x Channels, etc.]|Yes|
|`--output-type`|Specify the model matrix output type. Examples: uint8, float32, int16, float16, etc.|Yes|
|`--output-name`|Specify the model layer pulling results from. Example: layer_99|Yes|
|`--aux-config`|Specify additional engine-specific auxiliary configuration in the form of key=value. Example: `platform=onnxruntime_onnx response_cache.enable=true max_batch_size=8`. Note: you can also pass a full configuration file (e.g. Triton "config.pbtxt")|Yes|
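A hypothetical auto-update setup (the model name, endpoint, and preprocessing script path are placeholders):

```shell
# Auto-deploy up to 2 published versions of any model named exactly
# "train sklearn model" behind the "mymodel_auto" endpoint
clearml-serving --id <service_id> model auto-update \
    --engine sklearn --endpoint "mymodel_auto" \
    --preprocess ./preprocess.py \
    --name "^train sklearn model$" --published --max-versions 2
```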

### add

Add/Update model.

```bash
clearml-serving model add [-h] --engine ENGINE --endpoint ENDPOINT [--version VERSION]
                          [--model-id MODEL_ID] [--preprocess PREPROCESS]
                          [--input-size INPUT_SIZE [INPUT_SIZE ...]]
                          [--input-type INPUT_TYPE] [--input-name INPUT_NAME]
                          [--output-size OUTPUT_SIZE [OUTPUT_SIZE ...]]
                          [--output-type OUTPUT_TYPE] [--output-name OUTPUT_NAME]
                          [--aux-config AUX_CONFIG [AUX_CONFIG ...]] [--name NAME]
                          [--tags TAGS [TAGS ...]] [--project PROJECT] [--published]
```

#### Parameters

|Name|Description|Optional|
|---|---|---|
|`--engine`|Model endpoint serving engine (triton, sklearn, xgboost, lightgbm)|No|
|`--endpoint`|Base model endpoint (must be unique)|No|
|`--version`|Model endpoint version (default: None)|Yes|
|`--model-id`|Specify a model ID to be served|No|
|`--preprocess`|Specify pre/post-processing code to be used with the model (point to a local file/folder). This should hold for all the models|Yes|
|`--input-size`|Specify the model matrix input size [Rows x Columns x Channels, etc.]|Yes|
|`--input-type`|Specify the model matrix input type. Examples: uint8, float32, int16, float16, etc.|Yes|
|`--input-name`|Specify the model layer pushing input into. Example: layer_0|Yes|
|`--output-size`|Specify the model matrix output size [Rows x Columns x Channels, etc.]|Yes|
|`--output-type`|Specify the model matrix output type. Examples: uint8, float32, int16, float16, etc.|Yes|
|`--output-name`|Specify the model layer pulling results from. Example: layer_99|Yes|
|`--aux-config`|Specify additional engine-specific auxiliary configuration in the form of key=value. Example: `platform=onnxruntime_onnx response_cache.enable=true max_batch_size=8`. Note: you can also pass a full configuration file (e.g. Triton "config.pbtxt")|Yes|
|`--name`|Instead of specifying `--model-id`, select based on model name|Yes|
|`--tags`|Specify tags to be selected and auto-updated|Yes|
|`--project`|Instead of specifying `--model-id`, select based on model project|Yes|
|`--published`|Instead of specifying `--model-id`, select based on model published state|Yes|
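For example, an already-registered model could be attached to a new endpoint like this (the endpoint name and script path are placeholders, and `<model_id>` stands for an actual registered model ID):

```shell
# Serve a registered model (selected by its model ID) on a new
# sklearn endpoint, with a local preprocessing script
clearml-serving --id <service_id> model add \
    --engine sklearn --endpoint "test_model_sklearn" \
    --preprocess ./preprocess.py \
    --model-id <model_id>
```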