---
title: Model Endpoints
---

The Model Endpoints table lists all live model endpoints (both active endpoints and those still being brought up), letting you view
endpoint details and monitor status over time. Any model you deploy through the [ClearML Deploy UI applications](applications/apps_overview.md#deploy)
is listed in the table.

View the model endpoints in table view <img src="/docs/latest/icons/ico-table-view.svg" alt="Table view" className="icon size-md space-sm" />
or in details view <img src="/docs/latest/icons/ico-split-view.svg" alt="Details view" className="icon size-md space-sm" />
using the buttons on the top left of the page. Use the table view to compare your endpoints across
columns of interest. Use the details view to inspect a selected endpoint's details. You can also open the details view by
double-clicking an endpoint in the table view.

Download the model endpoint table as a CSV file by clicking **Download** <img src="/docs/latest/icons/ico-download.svg" alt="Download" className="icon size-md space-sm" />.
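
Once downloaded, the CSV can be processed with ordinary tooling. The following is a minimal sketch, not an official ClearML utility: it assumes the exported header names mirror the table columns (for example `Endpoint` and `# Requests`), which may not match the real export, and the sample rows and URLs are made up for illustration.

```python
import csv
import io

# Hypothetical sample of an exported table. The header names mirror the
# Active tab columns; the real export may name or format them differently.
SAMPLE_CSV = """Endpoint,Model,URL,# Instances,# Requests
sentiment,bert-base,http://example.invalid/serve/sentiment,2,1200
ocr,trocr-small,http://example.invalid/serve/ocr,1,300
"""


def load_endpoints(csv_text):
    """Parse the exported CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def busiest_first(rows, column="# Requests"):
    """Sort endpoint rows by a numeric column, largest value first."""
    return sorted(rows, key=lambda row: int(row[column]), reverse=True)


rows = busiest_first(load_endpoints(SAMPLE_CSV))
for row in rows:
    print(row["Endpoint"], row["# Requests"])
```

To use this with a real export, read the downloaded file's text and pass it to `load_endpoints`, adjusting the column names to whatever the file actually contains.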

## Endpoint Tables

Active endpoints are displayed in the **Active** tab. The table provides the following information:
* Endpoint - Endpoint name
* Model - Model name
* URL - Endpoint URL
* \# Instances - Number of model instances
* Uptime - Longest duration that any of the model instances has been running
* \# Requests - Total number of requests to the endpoint
* Requests / MIN (avg) - Average request rate in the last minute
* Latency (avg) - Average endpoint response latency

*Image: Active endpoints*

The **Loading** tab shows endpoints that are being set up but are not yet active. The table provides the following
information:

* Instance ID - The model deployment application instance ID
* Model - Model name
* Uptime - Time since the endpoint started setting up
* Preprocess artifact - Preprocessing code used for the endpoint
* Input type - Model matrix input type (e.g. uint8, float32, int16, float16)
* Input size - Model matrix input size

*Image: Loading endpoints*

You can apply column filters by clicking <img src="/docs/latest/icons/ico-filter-off.svg" alt="Filter" className="icon size-md" />,
and sort endpoints by clicking <img src="/docs/latest/icons/ico-sort-off.svg" alt="Sort order" className="icon size-md space-sm" /> on the relevant column.
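
The **Input type** and **Input size** columns together describe the matrix an endpoint expects. As a rough illustration of how the two relate, the sketch below computes the raw byte size of a single input. It assumes the input size is a list of dimensions (the shape used here is a made-up example), and `input_payload_bytes` is a hypothetical helper, not part of ClearML.

```python
from math import prod

# Bytes per element for the input types named in the table.
BYTES_PER_ELEMENT = {"uint8": 1, "int16": 2, "float16": 2, "float32": 4}


def input_payload_bytes(input_type, input_size):
    """Return the raw byte size of one input matrix.

    input_size is assumed to be a sequence of dimensions, e.g. [1, 3, 224, 224].
    """
    if input_type not in BYTES_PER_ELEMENT:
        raise ValueError(f"unsupported input type: {input_type}")
    return prod(input_size) * BYTES_PER_ELEMENT[input_type]


print(input_payload_bytes("float32", [1, 3, 224, 224]))
```

The same shape sent as `uint8` would be a quarter of the size, which is one reason the input type is worth checking when comparing request latency across endpoints.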

## Active Endpoint Details
Clicking a model endpoint opens it in the details view.

### Details

The **Details** tab displays the model endpoint information:
* Endpoint name
* Endpoint URL
* Model name (click to go to the ClearML or HuggingFace model page)
* Uptime - Duration of the longest-running endpoint instance
* Preprocess artifact - Preprocessing code used for the endpoint
* Input type - Model matrix input type (e.g. uint8, float32, int16, float16)
* Input size - Model matrix input size
* Model instances list - Model instances servicing the endpoint. The following information is displayed for each instance:
    * Instance ID - Click the instance ID to go to the model endpoint's ClearML Application instance dashboard.
    * Uptime
    * Number of requests - Total count
    * Request rate - Average requests per minute
    * CPU Count
    * GPU Count
    * Latency - Average request latency in the last minute

*Image: Endpoints details*
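
The endpoint-level uptime is defined above as the longest uptime among the endpoint's instances. The sketch below illustrates that relationship with a hypothetical `Instance` record. The field names are illustrative, and treating the endpoint's request total as the sum of the per-instance counts is an assumption, not something the page states.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical per-instance record; field names are illustrative."""
    instance_id: str
    uptime_seconds: int
    requests: int


def endpoint_uptime(instances):
    """Endpoint uptime: the longest uptime among its instances."""
    return max(i.uptime_seconds for i in instances)


def endpoint_requests(instances):
    """Assumed here: endpoint total is the sum of per-instance request counts."""
    return sum(i.requests for i in instances)


instances = [
    Instance("a1", uptime_seconds=7200, requests=900),
    Instance("b2", uptime_seconds=1800, requests=300),
]
print(endpoint_uptime(instances), endpoint_requests(instances))
```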

### Monitor
The **Monitor** tab displays graphs of the endpoint's operational metrics and resource usage over time:
* Total number of requests
* Average requests/min
* Average latency
* Machine utilization metrics:
    * CPU and GPU Usage
    * Memory Usage
    * Video Memory Usage
    * Network Usage

*Image: Endpoints monitor*

Control the graphs' time span through the menu at the top right corner of the tab.
Click <img src="/docs/latest/icons/ico-show.svg" alt="Eye Show" className="icon size-md space-sm" /> to control which
plots to display.

For example, to display specific plots, click **HIDE ALL**, and then click <img src="/docs/latest/icons/ico-show.svg" alt="Eye Show" className="icon size-md space-sm" />
on each plot you want to view.