Update application docs (#582)

pollfly 2023-06-05 10:40:29 +03:00 committed by GitHub
parent abaa807403
commit 1161985782
7 changed files with 74 additions and 16 deletions

Binary image file changed (35 KiB → 40 KiB, not shown).

Binary image file changed (55 KiB → 44 KiB, not shown).

Binary image file added (54 KiB, not shown).


@@ -100,7 +100,22 @@ The autoscaler dashboard shows:
* Number of current running instances
* Console: The application log, containing everything printed to stdout and stderr, appears in the console log. The log
shows polling results of the autoscaler's associated queues, including the number of tasks enqueued, and updates EC2
instances being spun up/down.
instances being spun up/down.
:::tip Console Debugging
To make the autoscaler console log show additional debug information, change an active app instance's log level to DEBUG:
1. Go to the app instance task's page > **CONFIGURATION** tab > **USER PROPERTIES** section
1. Hover over the section > Click `Edit` > Click `+ADD PARAMETER`
1. Input `log_level` as the key and `DEBUG` as the value of the new parameter.
![Autoscaler debugging](../../img/webapp_autoscaler_debug_log.png)
The console's log level will update in the autoscaler's next iteration. A programmatic way to set this property is sketched after this list.
:::
* Instance log files - Click to access the app instance's logs. This takes you to the app instance task's ARTIFACTS tab,
which lists the app instance's logs. In a log's `File Path` field, click <img src="/docs/latest/icons/ico-download-json.svg" alt="Download" className="icon size-sm space-sm" />
to download the complete log.
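As an alternative to the UI steps in the Console Debugging tip above, the same user property can be set with the ClearML SDK. This is a minimal sketch, assuming the `clearml` package is installed and configured; the task ID below is a placeholder for the autoscaler app instance's actual task ID:

```python
from clearml import Task

# Placeholder ID - copy the real one from the autoscaler app instance's task page
AUTOSCALER_TASK_ID = "aabbccddeeff00112233445566778899"

# Fetch the running autoscaler app instance task
task = Task.get_task(task_id=AUTOSCALER_TASK_ID)

# Add (or overwrite) the `log_level` user property; the console switches to
# DEBUG output on the autoscaler's next iteration
task.set_user_properties(log_level="DEBUG")
```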
:::tip EMBEDDING CLEARML VISUALIZATION


@@ -6,23 +6,35 @@ title: Project Dashboard
The ClearML Project Dashboard App is available under the ClearML Pro plan
:::
The Project Dashboard Application provides an overview of a project's progress. It presents an aggregated view of a
chosen metric over the project's iterations, as well as project GPU and worker usage. It also supports alerts/warnings
on failed Tasks via Slack integration.
The Project Dashboard Application provides an overview of a project or workspace's progress. It presents an aggregated
view of task status and a chosen metric over time, as well as project GPU and worker usage. It also supports alerts/warnings
on completed/failed Tasks via Slack integration.
## Project Dashboard Instance Configuration
* **Import Configuration** - Import an app instance configuration file. This will fill the configuration wizard with the
values from the file, which can be modified before launching the app instance
* **Monitored Project** - Name of the ClearML project to monitor
* **Monitored Metric**
* Monitored Metric - Title - Metric title to track
* Monitored Metric - Series - Metric series (variant) to track
* Monitored Metric - Trend - Choose whether to track the monitored metric's highest or lowest values
* **Dashboard Title** - Name of the project dashboard instance, which will appear in the instance list
* **Failed Task Slack Monitor** (optional)
* API Token - Slack workspace access token
* Channel Name - Slack channel to which task failure alerts will be posted
* Fail Iteration Threshold - Minimum number of iterations to trigger Slack alerts about task failure (failed tasks that do not meet the threshold will be ignored)
* **Monitoring** - Select what the app instance should monitor. The options are:
* Project - Monitor a specific project. You can select an option to also monitor the specified project's subprojects
* Entire workspace - Monitor all projects in your workspace
:::caution
If your workspace or specified project contains a large number of experiments, the dashboard could take a while to update
:::
* **Monitored Metric** - Specify a metric for the app instance to monitor. The dashboard will present an aggregated view
of the chosen metric over time (see the sketch after this list for how a task reports a metric title and series).
* Monitored Metric - Title - Metric title to track
* Monitored Metric - Series - Metric series (variant) to track
* Monitored Metric - Trend - Choose whether to track the monitored metric's highest or lowest values
* Slack Notification (optional) - Set up Slack integration for notifications of task failure. Select the
`Alert on completed experiments` option under `Additional options` to also set up alerts for task completion.
* API Token - Slack workspace access token
* Channel Name - Slack channel to which task failure alerts will be posted
* Alert Iteration Threshold - Minimum number of task iterations to trigger Slack alerts (tasks that fail prior to the threshold will be ignored)
* **Additional options**
* Track manual (non-agent-run) experiments as well - Select to also include in the dashboard experiments that were not executed by an agent
* Alert on completed experiments - Select to include completed tasks in alerts: in the dashboard's Task Alerts section and in Slack Alerts.
* **Export Configuration** - Export the app instance configuration as a JSON file, which you can later import to create
a new instance with the same configuration.
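The metric title and series correspond to how the monitored experiments report their scalars. As a rough illustration (the project, task, and metric names below are hypothetical), a task reporting the following scalar would be monitored with `validation` as the title and `accuracy` as the series:

```python
from clearml import Task

# Hypothetical project and task names - use the project you monitor
task = Task.init(project_name="examples", task_name="training")

# "title" and "series" map to the app's Monitored Metric - Title
# and Monitored Metric - Series fields
task.get_logger().report_scalar(
    title="validation",
    series="accuracy",
    value=0.91,
    iteration=100,
)
```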
@@ -38,7 +50,10 @@ Once a project dashboard instance is launched, its dashboard displays the following:
* Metric Monitoring - An aggregated view of the values of a metric over time
* Project's Active Workers - Number of workers currently executing experiments in the monitored project
* Workers Table - List of active workers
* Failed Experiments - Failed experiments and their time of failure summary
* Task Alerts
* Failed tasks - Summary of failed experiments and their time of failure
* Completed tasks - Summary of completed experiments and their time of completion
:::tip EMBEDDING CLEARML VISUALIZATION
You can embed plots from the app instance dashboard into [ClearML Reports](../webapp_reports.md). These visualizations


@@ -26,6 +26,7 @@ For more information about how autoscalers work, see [Autoscalers Overview](../.
* **GCP Configuration**
* GCP Project ID - Project used for spinning up VM instances
* GCP Zone - The GCP zone where the VM instances will be spun up. See [Regions and zones](https://cloud.google.com/compute/docs/regions-zones)
* GCP Subnetwork - The GCP subnetwork where the instances will be spun up. GCP setting will be `projects/{project-id}/regions/{region}/subnetworks/{subnetwork}`
* GCP Credentials - Credentials with which the autoscaler can access your GCP account for spinning VM instances
up/down. See [Generating GCP Credentials](#generating-gcp-credentials).
* **Git Configuration** - Git credentials with which the ClearML Agents running on your VM instances will access your
@@ -87,7 +88,22 @@ The autoscaler dashboard shows:
* Number of current running instances
* Console: The application log, containing everything printed to stdout and stderr, appears in the console log. The log
shows polling results of the autoscaler's associated queues, including the number of tasks enqueued, and updates VM
instances being spun up/down.
instances being spun up/down
:::tip Console Debugging
To make the autoscaler console log show additional debug information, change an active app instance's log level to DEBUG:
1. Go to the app instance task's page > **CONFIGURATION** tab > **USER PROPERTIES** section
1. Hover over the section > Click `Edit` > Click `+ADD PARAMETER`
1. Input `log_level` as the key and `DEBUG` as the value of the new parameter.
![Autoscaler debugging](../../img/webapp_autoscaler_debug_log.png)
The console's log level will update in the autoscaler's next iteration. The console log itself can also be retrieved programmatically (see the sketch after this list).
:::
* Instance log files - Click to access the app instance's logs. This takes you to the app instance task's ARTIFACTS tab,
which lists the app instance's logs. In a log's `File Path` field, click <img src="/docs/latest/icons/ico-download-json.svg" alt="Download" className="icon size-sm space-sm" />
to download the complete log.
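If you prefer to inspect the autoscaler's console log outside the UI, recent versions of the ClearML SDK can fetch a task's reported console output. A minimal sketch, assuming the `clearml` package is installed and configured; the task ID below is a placeholder for the app instance's actual task ID:

```python
from clearml import Task

# Placeholder ID - copy the real one from the autoscaler app instance's task page
AUTOSCALER_TASK_ID = "aabbccddeeff00112233445566778899"

task = Task.get_task(task_id=AUTOSCALER_TASK_ID)

# Retrieve the most recent console output reports (stdout/stderr)
for report in task.get_reported_console_output(number_of_reports=5):
    print(report)
```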
:::tip EMBEDDING CLEARML VISUALIZATION
You can embed plots from the app instance dashboard into [ClearML Reports](../webapp_reports.md). These visualizations


@@ -30,13 +30,14 @@ For more information about how autoscalers work, see [Autoscalers Overview](../.
machines of this specification
* Cloud Machine Limit - Maximum number of concurrent machines to launch
* **Idle Time Limit** (optional) - Maximum time in minutes that a cloud machine can be idle before it is spun down
* **Default Docker Image** (optional) - Default Docker image in which the ClearML Agent will run. Provide a Docker stored
* **Default Docker Image** - Default Docker image in which the ClearML Agent will run. Provide a Docker image stored
in a Docker artifactory so instances can automatically fetch it (a per-task override is sketched after the wizard image below)
* **Git Configuration** - Git credentials with which the ClearML Agents running on your cloud instances will access your repositories to retrieve the code for their jobs
* Git User
* Git Password / Personal Access Token
* **Cloud Storage Access** (optional) - Access credentials to cloud storage service. Provides ClearML Tasks running on cloud
machines access to your storage
* Additional ClearML Configuration (optional) - A ClearML configuration file to be used by the ClearML Agent when executing your experiments
![GPU Compute wizard](../../img/apps_gpu_compute_wizard.png)
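The default Docker image is typically used by tasks that do not specify a container of their own. As a rough sketch of how an individual experiment can override it (the project, task, and image names are only examples, and the call assumes a recent `clearml` SDK):

```python
from clearml import Task

# Hypothetical project and task names
task = Task.init(project_name="examples", task_name="gpu_training")

# Run this task in its own container instead of the default Docker image
# configured in the app instance (image name is only an example)
task.set_base_docker(docker_image="nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04")
```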
@@ -63,6 +64,17 @@ The GPU Compute dashboard shows:
* Instance History - Number of running cloud instances over time
* Console - The log shows updates of cloud instances being spun up/down.
:::tip Console Debugging
To make the autoscaler console log show additional debug information, change an active app instance's log level to DEBUG:
1. Go to the app instance task's page > **CONFIGURATION** tab > **USER PROPERTIES** section
1. Hover over the section > Click `Edit` > Click `+ADD PARAMETER`
1. Input `log_level` as the key and `DEBUG` as the value of the new parameter.
![Autoscaler debugging](../../img/webapp_autoscaler_debug_log.png)
The console's log level will update in the autoscaler's next iteration.
:::
:::tip EMBEDDING CLEARML VISUALIZATION
You can embed plots from the app instance dashboard into [ClearML Reports](../webapp_reports.md). These visualizations
are updated live as the app instance(s) updates. The Enterprise Plan and Hosted Service support embedding resources in