mirror of https://github.com/clearml/clearml-docs, synced 2025-02-07 05:20:07 +00:00
Update application docs (#582)
parent abaa807403
commit 1161985782
Binary file not shown. Size before: 35 KiB; size after: 40 KiB.
Binary file not shown. Size before: 55 KiB; size after: 44 KiB.
BIN docs/img/webapp_autoscaler_debug_log.png Normal file
Binary file not shown. Size after: 54 KiB.
@@ -100,7 +100,22 @@ The autoscaler dashboard shows:
 * Number of current running instances
 * Console: the application log containing everything printed to stdout and stderr appears in the console log. The log
   shows polling results of the autoscaler’s associated queues, including the number of tasks enqueued, and updates EC2
-  instances being spun up/down.
+  instances being spun up/down.
+
+:::tip Console Debugging
+To make the autoscaler console log show additional debug information, change an active app instance’s log level to DEBUG:
+1. Go to the app instance task’s page > **CONFIGURATION** tab > **USER PROPERTIES** section
+1. Hover over the section > Click `Edit` > Click `+ADD PARAMETER`
+1. Input `log_level` as the key and `DEBUG` as the value of the new parameter.
+
+![Autoscaler debugging](../../img/webapp_autoscaler_debug_log.png)
+
+The console’s log level will update in the autoscaler's next iteration.
+:::
+
+* Instance log files - Click to access the app instance's logs. This takes you to the app instance task's ARTIFACTS tab,
+  which lists the app instance’s logs. In a log’s `File Path` field, click <img src="/docs/latest/icons/ico-download-json.svg" alt="Download" className="icon size-sm space-sm" />
+  to download the complete log.
 
 
 :::tip EMBEDDING CLEARML VISUALIZATION
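The Console Debugging steps above add a `log_level` user property to the app instance task. The following is a minimal sketch of setting the same property from the ClearML Python SDK with `Task.set_user_properties`. It assumes the autoscaler reads a property set through the SDK the same way as one added in the **USER PROPERTIES** section, which the text above does not confirm. The helper name `enable_debug_logging` and the task ID placeholder are hypothetical.

```python
def enable_debug_logging(task, level="DEBUG"):
    """Set the `log_level` user property on an app instance task.

    `task` is expected to be a clearml.Task (or any object exposing
    `set_user_properties`). Assumption: a property set through the SDK is
    treated like one added in the USER PROPERTIES section of the UI.
    """
    return task.set_user_properties(log_level=level)


# Usage (requires the clearml package and access to your ClearML server):
#   from clearml import Task
#   app_task = Task.get_task(task_id="<app-instance-task-id>")  # placeholder ID
#   enable_debug_logging(app_task)
```

As with the UI steps, the new level would only be expected to take effect in the autoscaler's next iteration.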
@@ -6,23 +6,35 @@ title: Project Dashboard
 The ClearML Project Dashboard App is available under the ClearML Pro plan
 :::
 
-The Project Dashboard Application provides an overview of a project's progress. It presents an aggregated view of a
-chosen metric over the project's iterations, as well as project GPU and worker usage. It also supports alerts/warnings
-on failed Tasks via Slack integration.
+The Project Dashboard Application provides an overview of a project or workspace’s progress. It presents an aggregated
+view of task status and a chosen metric over time, as well as project GPU and worker usage. It also supports alerts/warnings
+on completed/failed Tasks via Slack integration.
 
 ## Project Dashboard Instance Configuration
 * **Import Configuration** - Import an app instance configuration file. This will fill the configuration wizard with the
   values from the file, which can be modified before launching the app instance
-* **Monitored Project** - Name of the ClearML project to monitor
-* **Monitored Metric**
-  * Monitored Metric - Title - Metric title to track
-  * Monitored Metric - Series - Metric series (variant) to track
-  * Monitored Metric - Trend - Choose whether to track the monitored metric's highest or lowest values
 * **Dashboard Title** - Name of the project dashboard instance, which will appear in the instance list
-* **Failed Task Slack Monitor** (optional)
-  * API Token - Slack workspace access token
-  * Channel Name - Slack channel to which task failure alerts will be posted
-  * Fail Iteration Threshold - Minimum number of iterations to trigger Slack alerts about task failure (failed tasks that do not meet the threshold will be ignored)
+* **Monitoring** - Select what the app instance should monitor. The options are:
+  * Project - Monitor a specific project. You can select an option to also monitor the specified project’s subprojects
+  * Entire workspace - Monitor all projects in your workspace
+
+:::caution
+If your workspace or specified project contains a large number of experiments, the dashboard could take a while to update
+:::
+
+* **Monitored Metric** - Specify a metric for the app instance to monitor. The dashboard will present an aggregated view
+  of the chosen metric over time.
+  * Monitored Metric - Title - Metric title to track
+  * Monitored Metric - Series - Metric series (variant) to track
+  * Monitored Metric - Trend - Choose whether to track the monitored metric's highest or lowest values
+* Slack Notification (optional) - Set up Slack integration for notifications of task failure. Select the
+  `Alert on completed experiments` under `Additional options` to set up alerts for task completions.
+  * API Token - Slack workspace access token
+  * Channel Name - Slack channel to which task failure alerts will be posted
+  * Alert Iteration Threshold - Minimum number of task iterations to trigger Slack alerts (tasks that fail prior to the threshold will be ignored)
+* **Additional options**
+  * Track manual (non agent-run) experiments as well - Select to include in the dashboard experiments that were not executed by an agent
+  * Alert on completed experiments - Select to include completed tasks in alerts: in the dashboard’s Task Alerts section and in Slack Alerts.
 * **Export Configuration** - Export the app instance configuration as a JSON file, which you can later import to create
   a new instance with the same configuration.
 
@@ -38,7 +50,10 @@ Once a project dashboard instance is launched, its dashboard displays the follow
 * Metric Monitoring - An aggregated view of the values of a metric over time
 * Project’s Active Workers - Number of workers currently executing experiments in the monitored project
 * Workers Table - List of active workers
-* Failed Experiments - Failed experiments and their time of failure summary
+* Task Alerts
+  * Failed tasks - Failed experiments and their time of failure summary
+  * Completed tasks - Completed experiments and their time of completion summary
+
 
 :::tip EMBEDDING CLEARML VISUALIZATION
 You can embed plots from the app instance dashboard into [ClearML Reports](../webapp_reports.md). These visualizations
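The alert options in the Project Dashboard configuration above (`Alert Iteration Threshold` and `Alert on completed experiments`) can be read as a simple filter over tasks. The sketch below is one interpretation, not the app's actual implementation: `TaskSummary` and `should_alert` are hypothetical names, and it assumes the iteration threshold applies to completed tasks as well as failed ones.

```python
from dataclasses import dataclass


@dataclass
class TaskSummary:
    """Hypothetical, simplified view of a task; not a ClearML SDK type."""
    name: str
    status: str      # "failed" or "completed" in this sketch
    iterations: int  # last reported iteration


def should_alert(task, iteration_threshold, alert_on_completed=False):
    """Return True if the task would raise an alert under the assumed rules."""
    # Tasks that stop before the threshold are ignored.
    if task.iterations < iteration_threshold:
        return False
    if task.status == "failed":
        return True
    # Completed tasks alert only when the additional option is selected.
    return task.status == "completed" and alert_on_completed


tasks = [
    TaskSummary("early-crash", "failed", 3),
    TaskSummary("late-crash", "failed", 120),
    TaskSummary("finished", "completed", 500),
]
alerts = [t.name for t in tasks if should_alert(t, iteration_threshold=10)]
print(alerts)  # ['late-crash']
```

With `alert_on_completed=True`, the `finished` task would be included as well.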
@@ -26,6 +26,7 @@ For more information about how autoscalers work, see [Autoscalers Overview](../.
 * **GCP Configuration**
   * GCP Project ID - Project used for spinning up VM instances
   * GCP Zone - The GCP zone where the VM instances will be spun up. See [Regions and zones](https://cloud.google.com/compute/docs/regions-zones)
+  * GCP Subnetwork - The GCP subnetwork where the instances will be spun up. GCP setting will be `projects/{project-id}/regions/{region}/subnetworks/{subnetwork}`
   * GCP Credentials - Credentials with which the autoscaler can access your GCP account for spinning VM instances
     up/down. See [Generating GCP Credentials](#generating-gcp-credentials).
 * **Git Configuration** - Git credentials with which the ClearML Agents running on your VM instances will access your
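The GCP Subnetwork line above quotes the path template `projects/{project-id}/regions/{region}/subnetworks/{subnetwork}`. As a small illustration of how that template expands, here is a sketch with hypothetical helper names (`region_from_zone`, `gcp_subnetwork_path`) and made-up example values. It relies on GCP zone names being a region name followed by a zone suffix (for example `us-central1-a` in region `us-central1`). Whether the wizard expects the full path or only the subnetwork name is not stated in the text.

```python
def region_from_zone(zone):
    """Derive the region from a GCP zone name, e.g. 'us-central1-a' -> 'us-central1'."""
    region, sep, suffix = zone.rpartition("-")
    if not sep or not region or not suffix:
        raise ValueError(f"unexpected zone name: {zone!r}")
    return region


def gcp_subnetwork_path(project_id, region, subnetwork):
    """Fill in the subnetwork path template quoted in the configuration text."""
    parts = (("project_id", project_id), ("region", region), ("subnetwork", subnetwork))
    for name, value in parts:
        if not value or "/" in value:
            raise ValueError(f"{name} must be a non-empty name without '/'")
    return f"projects/{project_id}/regions/{region}/subnetworks/{subnetwork}"


# Example values are placeholders, not real resources.
path = gcp_subnetwork_path("my-project", region_from_zone("us-central1-a"), "my-subnet")
print(path)  # projects/my-project/regions/us-central1/subnetworks/my-subnet
```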
@@ -87,7 +88,22 @@ The autoscaler dashboard shows:
 * Number of current running instances
 * Console: the application log containing everything printed to stdout and stderr appears in the console log. The log
   shows polling results of the autoscaler’s associated queues, including the number of tasks enqueued, and updates VM
-  instances being spun up/down.
+  instances being spun up/down
+
+:::tip Console Debugging
+To make the autoscaler console log show additional debug information, change an active app instance’s log level to DEBUG:
+1. Go to the app instance task’s page > **CONFIGURATION** tab > **USER PROPERTIES** section
+1. Hover over the section > Click `Edit` > Click `+ADD PARAMETER`
+1. Input `log_level` as the key and `DEBUG` as the value of the new parameter.
+
+![Autoscaler debugging](../../img/webapp_autoscaler_debug_log.png)
+
+The console’s log level will update in the autoscaler's next iteration.
+:::
+
+* Instance log files - Click to access the app instance's logs. This takes you to the app instance task's ARTIFACTS tab,
+  which lists the app instance’s logs. In a log’s `File Path` field, click <img src="/docs/latest/icons/ico-download-json.svg" alt="Download" className="icon size-sm space-sm" />
+  to download the complete log.
 
 :::tip EMBEDDING CLEARML VISUALIZATION
 You can embed plots from the app instance dashboard into [ClearML Reports](../webapp_reports.md). These visualizations
@@ -30,13 +30,14 @@ For more information about how autoscalers work, see [Autoscalers Overview](../.
 machines of this specification
 * Cloud Machine Limit - Maximum number of concurrent machines to launch
 * **Idle Time Limit** (optional) - Maximum time in minutes that a cloud machine can be idle before it is spun down
-* **Default Docker Image** (optional) - Default Docker image in which the ClearML Agent will run. Provide a Docker stored
+* **Default Docker Image** - Default Docker image in which the ClearML Agent will run. Provide a Docker stored
   in a Docker artifactory so instances can automatically fetch it
 * **Git Configuration** - Git credentials with which the ClearML Agents running on your cloud instances will access your repositories to retrieve the code for their jobs
   * Git User
   * Git Password / Personal Access Token
 * **Cloud Storage Access** (optional) - Access credentials to cloud storage service. Provides ClearML Tasks running on cloud
   machines access to your storage
+* Additional ClearML Configuration (optional) - A ClearML configuration file to use by the ClearML Agent when executing your experiments
 
 ![GPU Compute wizard](../../img/apps_gpu_compute_wizard.png)
 
@@ -63,6 +64,17 @@ The GPU Compute dashboard shows:
 * Instance History - Number of running cloud instances over time
 * Console - The log shows updates of cloud instances being spun up/down.
 
+:::tip Console Debugging
+To make the autoscaler console log show additional debug information, change an active app instance’s log level to DEBUG:
+1. Go to the app instance task’s page > **CONFIGURATION** tab > **USER PROPERTIES** section
+1. Hover over the section > Click `Edit` > Click `+ADD PARAMETER`
+1. Input `log_level` as the key and `DEBUG` as the value of the new parameter.
+
+![Autoscaler debugging](../../img/webapp_autoscaler_debug_log.png)
+
+The console’s log level will update in the autoscaler's next iteration.
+:::
+
 :::tip EMBEDDING CLEARML VISUALIZATION
 You can embed plots from the app instance dashboard into [ClearML Reports](../webapp_reports.md). These visualizations
 are updated live as the app instance(s) updates. The Enterprise Plan and Hosted Service support embedding resources in