diff --git a/docs/webapp/applications/apps_model_deployment.md b/docs/webapp/applications/apps_model_deployment.md index c9816b56..22c2e90f 100644 --- a/docs/webapp/applications/apps_model_deployment.md +++ b/docs/webapp/applications/apps_model_deployment.md @@ -1,25 +1,25 @@ --- -title: Model Deployment +title: vLLM Model Deployment --- :::important Enterprise Feature -The Model Deployment App is available under the ClearML Enterprise plan. +The vLLM Model Deployment App is available under the ClearML Enterprise plan. ::: -The Model Deployment application enables users to quickly deploy LLM models as networking services over a secure +The vLLM Model Deployment application enables users to quickly deploy LLM models as networking services over a secure endpoint. This application supports various model configurations and customizations to optimize performance and resource -usage. The Model Deployment application serves your model on a machine of your choice. Once an app instance is running, +usage. The vLLM Model Deployment application serves your model on a machine of your choice. Once an app instance is running, it serves your model through a secure, publicly accessible network endpoint. The app monitors endpoint activity and shuts down if the model remains inactive for a specified maximum idle time. :::info AI Application Gateway -The Model Deployment app makes use of the ClearML Traffic Router which implements a secure, authenticated +The vLLM Model Deployment app makes use of the ClearML Traffic Router which implements a secure, authenticated network endpoint for the model. If the ClearML AI application Gateway is not available, the model endpoint might not be accessible. 
::: -Once you start a Model Deployment instance, you can view the following information in its dashboard: +Once you start a vLLM Model Deployment instance, you can view the following information in its dashboard: * Status indicator * Active instance - App instance is running and is actively in use * Loading instance - App instance is setting up @@ -45,12 +45,13 @@ Once you start a Model Deployment instance, you can view the following informati * Console log - The console log shows the app instance's console output: setup progress, status changes, error messages, etc. -![Model Deployment App](../../img/apps_model_deployment.png#light-mode-only) -![Model Deployment App](../../img/apps_model_deployment_dark.png#dark-mode-only) +![vLLM Model Deployment App](../../img/apps_model_deployment.png#light-mode-only) +![vLLM Model Deployment App](../../img/apps_model_deployment_dark.png#dark-mode-only) -## Model Deployment Instance Configuration -When configuring a new Model Deployment instance, you can fill in the required parameters or reuse the +## vLLM Model Deployment Instance Configuration + +When configuring a new vLLM Model Deployment instance, you can fill in the required parameters or reuse the configuration of a previously launched instance. Launch an app instance with the configuration of a previously launched instance using one of the following options: @@ -68,8 +69,8 @@ to open the app's configuration form. * **Import Configuration** - Import an app instance configuration file. 
This will fill the instance launch form with the values from the file, which can be modified before launching the app instance * **Project name** - ClearML Project Name -* **Task name** - Name of ClearML Task for your Model Deployment app instance -* **Queue** - The [ClearML Queue](../../fundamentals/agents_and_queues.md#what-is-a-queue) to which the Model Deployment app +* **Task name** - Name of ClearML Task for your vLLM Model Deployment app instance +* **Queue** - The [ClearML Queue](../../fundamentals/agents_and_queues.md#what-is-a-queue) to which the vLLM Model Deployment app instance task will be enqueued (make sure an agent is assigned to that queue) * **Model** - A ClearML Model ID or a HuggingFace model name (e.g. `openai-community/gpt2`) * **Model Configuration** @@ -145,5 +146,6 @@ instance task will be enqueued (make sure an agent is assigned to that queue) * **Export Configuration** - Export the app instance configuration as a JSON file, which you can later import to create a new instance with the same configuration -![Model Deployment app form](../../img/apps_model_deployment_form.png#light-mode-only) -![Model Deployment app form](../../img/apps_model_deployment_form_dark.png#dark-mode-only) \ No newline at end of file +![vLLM Model Deployment app form](../../img/apps_model_deployment_form.png#light-mode-only) +![vLLM Model Deployment app form](../../img/apps_model_deployment_form_dark.png#dark-mode-only) + diff --git a/docs/webapp/applications/apps_overview.md b/docs/webapp/applications/apps_overview.md index 16eb6d64..2ca31fca 100644 --- a/docs/webapp/applications/apps_overview.md +++ b/docs/webapp/applications/apps_overview.md @@ -38,7 +38,7 @@ Applications for deploying user interfaces for models: ### Deploy Applications for deploying machine learning models as scalable, secure services: * [**Embedding Model Deployment**](apps_embed_model_deployment.md) - Deploy embedding models as networking services over a secure endpoint (available under 
ClearML Enterprise Plan) -* [**Model Deployment**](apps_model_deployment.md) - Deploy LLM models as networking services over a secure endpoint (available under ClearML Enterprise Plan) +* [**vLLM Model Deployment**](apps_model_deployment.md) - Deploy LLM models as networking services over a secure endpoint (available under ClearML Enterprise Plan) * [**llama.cpp**](apps_llama_deployment.md) - Deploy LLM models in GGUF format using [`llama.cpp`](https://github.com/ggerganov/llama.cpp) as networking services over a secure endpoint (available under ClearML Enterprise Plan) :::info Autoscalers
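Since the app described in this patch serves models with vLLM behind a secure gateway endpoint, a short sketch of what a client request could look like may help reviewers. This is an assumption-laden illustration, not part of the documented product: the gateway URL and auth scheme are placeholders, and it relies only on the fact that vLLM exposes an OpenAI-compatible `/v1/chat/completions` route; the model name reuses the `openai-community/gpt2` example from the configuration form.

```python
import json

# Placeholder values -- substitute your actual AI Application Gateway
# endpoint and credentials; these are NOT documented defaults.
BASE_URL = "https://app-gateway.example.com/v1"   # assumed gateway URL
MODEL = "openai-community/gpt2"                   # model name from the app form

def build_chat_request(prompt: str, model: str = MODEL, max_tokens: int = 64) -> dict:
    """Build an OpenAI-compatible /chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Serialize the payload; it would be POSTed to f"{BASE_URL}/chat/completions"
# with an Authorization header carrying your deployment's token.
body = json.dumps(build_chat_request("Say hello"))
print(body)
```

Any HTTP client (`requests`, `httpx`, `curl`) can send this body; the shape of the payload is the standard OpenAI chat-completions schema, which is what makes the vLLM-served endpoint drop-in compatible with existing OpenAI client code.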