# Train and Deploy PyTorch model with Nvidia Triton Engine
## training the MNIST digit classifier model
Run the mock Python training code:
```bash
pip install -r examples/pytorch/requirements.txt
python examples/pytorch/train_pytorch_mnist.py
```
The output will be a model registered in the "serving examples" project under the name "train pytorch model".

*Notice:* Only TorchScript models are supported by the Triton server.
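
Because Triton only serves TorchScript, the training script has to export the trained network with `torch.jit`. Below is a minimal sketch of that export step; the CNN layers and output file name here are illustrative, the real architecture and save path are defined in `examples/pytorch/train_pytorch_mnist.py`:
```python
import torch
import torch.nn as nn

# Illustrative MNIST classifier; the actual architecture lives in
# examples/pytorch/train_pytorch_mnist.py
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 32, 3), nn.ReLU(),
            nn.Conv2d(32, 64, 3), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 12 * 12, 128), nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        return self.body(x)

model = Net().eval()
# Trace with a dummy batch matching the serving input shape (1, 28, 28)
example_input = torch.rand(1, 1, 28, 28)
traced = torch.jit.trace(model, example_input)
# Saving a TorchScript file is what makes the model Triton-compatible
traced.save("serving_model.pt")
```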
## setting up the serving service
Prerequisite: PyTorch models require Triton engine support. Use `docker-compose-triton.yml` / `docker-compose-triton-gpu.yml`, or, if running on Kubernetes, the matching helm chart.
1. Create the serving service: `clearml-serving create --name "serving example"` (write down the service ID)
2. Create the model endpoint (a sketch of the referenced `preprocess.py` appears after this list):
`clearml-serving --id <service_id> model add --engine triton --endpoint "test_model_pytorch" --preprocess "examples/pytorch/preprocess.py" --name "train pytorch model" --project "serving examples"
--input-size 1 28 28 --input-name "INPUT__0" --input-type float32
--output-size -1 10 --output-name "OUTPUT__0" --output-type float32`
Or auto-update:

`clearml-serving --id <service_id> model auto-update --engine triton --endpoint "test_model_pytorch_auto" --preprocess "examples/pytorch/preprocess.py" --name "train pytorch model" --project "serving examples" --max-versions 2
--input-size 1 28 28 --input-name "INPUT__0" --input-type float32
--output-size -1 10 --output-name "OUTPUT__0" --output-type float32`
Or add a canary endpoint:

`clearml-serving --id <service_id> model canary --endpoint "test_model_pytorch_auto" --weights 0.1 0.9 --input-endpoint-prefix test_model_pytorch_auto`
3. Make sure the `clearml-serving` `docker-compose-triton.yml` (or `docker-compose-triton-gpu.yml`) stack is running; it might take a minute or two to sync with the new endpoint.
4. Test the new endpoint (note that the first call triggers the model pulling, so it might take longer; from then on, inference is served from memory): `curl -X POST "http://127.0.0.1:8080/serve/test_model_pytorch" -H "accept: application/json" -H "Content-Type: application/json" -d '{"url": "https://camo.githubusercontent.com/8385ca52c9cba1f6e629eb938ab725ec8c9449f12db81f9a34e18208cd328ce9/687474703a2f2f706574722d6d6172656b2e636f6d2f77702d636f6e74656e742f75706c6f6164732f323031372f30372f6465636f6d707265737365642e6a7067"}'`
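
For reference, a minimal sketch of what a `Preprocess` class such as the one in `examples/pytorch/preprocess.py` could look like (the file shipped in the repository is authoritative, and the exact method signatures depend on your clearml-serving version). It turns the request's `url` field into the `(1, 28, 28)` float32 input declared as `INPUT__0`:
```python
from typing import Any

import numpy as np
import requests
from PIL import Image, ImageOps


class Preprocess(object):
    """Instantiated by clearml-serving; wraps every request to the endpoint."""

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # The request payload is e.g. {"url": "https://..."}; fetch the image
        image = Image.open(requests.get(body["url"], stream=True).raw)
        # Match the declared model input: 28x28 grayscale, shape (1, 28, 28)
        image = ImageOps.grayscale(image).resize((28, 28))
        return np.asarray(image, dtype=np.float32).reshape(1, 28, 28)

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # OUTPUT__0 holds ten class scores; report the most likely digit
        return {"digit": int(np.argmax(data)), "scores": np.array(data).flatten().tolist()}
```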
> **_Notice:_** You can also change the serving service while it is already running!
> This includes adding/removing endpoints, adding canary model routing, etc.
> By default, new endpoints/models will be automatically updated after one minute.
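
The `curl` call in step 4 can also be scripted. A small sketch using Python `requests` (the image URL here is a placeholder; any reachable image works):
```python
import requests

# Placeholder image URL; preprocess.py downscales whatever it gets to 28x28
payload = {"url": "https://example.com/handwritten_digit.jpg"}

response = requests.post(
    "http://127.0.0.1:8080/serve/test_model_pytorch",
    json=payload,
    timeout=60,  # the first call may be slow while the model is being pulled
)
response.raise_for_status()
print(response.json())
```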