---
title: Introduction
---
`clearml-serving` is a command line utility for model deployment and orchestration.
It enables model deployment, including serving and preprocessing code, to a Kubernetes cluster or a custom
container-based solution.
## Features
* Easy to deploy & configure
  * Supports Machine Learning models (Scikit Learn, XGBoost, LightGBM)
  * Supports Deep Learning models (TensorFlow, PyTorch, ONNX)
  * Customizable REST API for serving (i.e. allows per-model pre/post-processing for easy integration)
* Flexible
  * On-line model deployment
  * On-line endpoint model/version deployment (i.e. no need to take the service down)
  * Per-model standalone preprocessing and postprocessing Python code
* Scalable
  * Multiple models per container
  * Multiple models per serving service
  * Multi-service support (fully separated serving services running independently)
  * Multi-cluster support
  * Out-of-the-box node autoscaling based on load/usage
* Efficient
  * Multi-container resource utilization
  * Support for CPU & GPU nodes
  * Auto-batching for DL models
* [Automatic deployment](clearml_serving_tutorial.md#automatic-model-deployment)
  * Automatic model upgrades with canary support
  * Programmable API for model deployment
* [Canary A/B deployment](clearml_serving_tutorial.md#canary-endpoint-setup) - online canary updates
* [Model Monitoring](clearml_serving_tutorial.md#model-monitoring-and-performance-metrics)
  * Usage metric reporting
  * Metric dashboard
  * Model performance metrics
  * Model performance dashboard
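
To make the canary idea above concrete, the following is a minimal, generic sketch of weighted traffic splitting between two model versions. It is not ClearML Serving's implementation: the function `pick_endpoint`, the endpoint names, and the 10/90 split are all hypothetical and chosen only for illustration. See the tutorial linked above for the actual canary setup.

```python
import random
from collections import Counter
from typing import Dict


def pick_endpoint(weights: Dict[str, float], rng: random.Random) -> str:
    """Pick one endpoint name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical endpoint names and traffic split: 10% canary, 90% stable.
canary_weights = {"my_model/2": 0.1, "my_model/1": 0.9}

rng = random.Random(0)  # seeded so repeated runs route the same way
counts = Counter(pick_endpoint(canary_weights, rng) for _ in range(10_000))
print(counts)
```

Raising the weight of the newer version step by step shifts traffic to it gradually, which is the usual motivation for a canary rollout: a faulty version only affects a small share of requests before it is caught.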
## Components
![ClearML Serving](https://github.com/allegroai/clearml-serving/raw/main/docs/design_diagram.png?raw=true)
* **CLI** - Secure configuration interface for on-line model upgrade/deployment on running Serving Services
* **Serving Service Task** - Control plane object storing the configuration of all the endpoints. Supports multiple
separate instances, deployed on multiple clusters.
* **Inference Services** - Inference containers that perform model serving pre/post-processing. They also support CPU
model inferencing.
* **Serving Engine Services** - Inference engine containers (e.g. Nvidia Triton, TorchServe, etc.) used by the Inference
Services for heavier model inference.
* **Statistics Service** - Single instance per Serving Service, collecting and broadcasting model serving & performance
statistics.
* **Time-series DB** - Statistics collection service used by the Statistics Service, e.g. Prometheus
* **Dashboards** - Customizable dashboard solution on top of the collected statistics, e.g. Grafana
![Grafana dashboard](../img/gif/clearml_serving_grafana_gif.gif)
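
The per-model pre/post-processing performed by the Inference Services can be pictured as a small Python class with one hook on each side of the model call. The sketch below is an assumption-laden illustration: the class name `Preprocess`, the method names, and their arguments are modeled on the pattern used in the `clearml-serving` examples and may not match the current interface, and the request fields `x0` and `x1` are hypothetical. Check the tutorial for the exact signatures.

```python
from typing import Any, Callable, Optional


class Preprocess(object):
    """Hypothetical per-model pre/post-processing hooks (signatures are assumptions)."""

    def preprocess(
        self,
        body: dict,
        state: dict,
        collect_custom_statistics_fn: Optional[Callable[[dict], None]] = None,
    ) -> Any:
        # Turn the JSON request body into the feature rows the model expects.
        return [[body.get("x0"), body.get("x1")]]

    def postprocess(
        self,
        data: Any,
        state: dict,
        collect_custom_statistics_fn: Optional[Callable[[dict], None]] = None,
    ) -> dict:
        # Wrap the raw model output in a JSON-serializable response.
        return {"y": list(data)}


# Local dry run with a stand-in for the model (no serving stack involved).
hooks = Preprocess()
state = {}
features = hooks.preprocess({"x0": 1, "x1": 2}, state)
fake_prediction = [sum(row) for row in features]  # placeholder "model"
response = hooks.postprocess(fake_prediction, state)
print(response)  # {'y': [3]}
```

Keeping these hooks in a standalone file per model is what lets each endpoint accept its own request format without changes to the serving containers themselves.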
## Next Steps
See ClearML Serving setup instructions [here](clearml_serving_setup.md). For further details, see the ClearML Serving
[Tutorial](clearml_serving_tutorial.md).