Update docs (#854)
@ -71,7 +71,7 @@ for column customization options.
|
||||
|
||||

|
||||
|
||||
The dataset version's frames can be filtered by multiple criteria. The resulting frames can be exported as a JSON file.
|
||||
The dataset version's frames can be filtered by multiple criteria. The resulting frames can be [exported as a JSON file](#exporting-frames).
|
||||
|
||||
To view the details of a specific frame, click on its preview, which will open the [Frame Viewer](webapp_datasets_frames.md#frame-viewer).
|
||||
|
||||
@ -174,6 +174,20 @@ Lucene queries can also be used in ROI label filters and frame rules.
|
||||
|
||||
</Collapsible>
|
||||
|
||||
### Sorting Frames
|
||||
|
||||
Sort the dataset version’s frames by any of the following attributes:
|
||||
* ID
|
||||
* Last update time
|
||||
* Dimensions (height)
|
||||
* Timestamp
|
||||
* Context ID
|
||||
* Metadata key - Click `+ Metadata Key` and select the desired key for sorting
|
||||
|
||||
Click <img src="/docs/latest/icons/ico-sort.svg" alt="Sort order" className="icon size-md space-sm" /> to toggle between ascending and descending sort orders.
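As an illustration of the sort options above, the following minimal sketch orders a list of frame records in plain Python. The record layout (`id`, `height`, `timestamp`, `context_id`, `meta`) and the `sort_frames` helper are hypothetical and are not part of the ClearML SDK. The sketch only mirrors what the sort control does, including the ascending/descending toggle and sorting by a metadata key.

```python
from typing import Optional

# Hypothetical frame records; the field names are illustrative only.
frames = [
    {"id": "f3", "height": 720, "timestamp": 30, "context_id": "video_a", "meta": {"location": "paris"}},
    {"id": "f1", "height": 1080, "timestamp": 10, "context_id": "video_b", "meta": {"location": "london"}},
    {"id": "f2", "height": 480, "timestamp": 20, "context_id": "video_a", "meta": {}},
]


def sort_frames(records: list, attribute: str = "id", descending: bool = False,
                metadata_key: Optional[str] = None) -> list:
    """Sort by a top-level attribute, or by a metadata key when one is given."""

    def sort_key(record):
        if metadata_key is not None:
            value = record.get("meta", {}).get(metadata_key)
        else:
            value = record.get(attribute)
        # Pair each value with a "missing" flag so None is never compared
        # with a real value; missing values come last in ascending order.
        return (value is None, value)

    return sorted(records, key=sort_key, reverse=descending)


print([f["id"] for f in sort_frames(frames, "height", descending=True)])
print([f["id"] for f in sort_frames(frames, metadata_key="location")])
```

How the WebApp orders frames that lack the selected metadata key is not stated here, so placing them last is an assumption of this sketch.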
|
||||
|
||||

|
||||
|
||||
### Exporting Frames
|
||||
|
||||
To export (download) the filtered frames as a JSON file, click <img src="/docs/latest/icons/ico-bars-menu.svg" alt="Menu" className="icon size-md space-sm" /> > **EXPORT FRAMES**.
|
||||
@ -185,12 +199,51 @@ frame browser configuration settings.
|
||||

|
||||
|
||||
#### Grouping Previews
|
||||
FrameGroups or SingleFrames can share the same `context_id` (URL). For example, users can set the same `context_id`
|
||||
to multiple FrameGroups that represent frames in a single video.
|
||||
|
||||
Use the **Grouping** menu to select one of the following options:
|
||||
* Split Preview - Show separate previews for each individual FrameGroup, regardless of shared context.
|
||||
* Group by URL - Show a single preview for all FrameGroups with the same context
|
||||
Use the **Grouping** menu to set how to display frames that share a common property:
|
||||
* **Split Preview** - Show a separate preview for each individual FrameGroup
|
||||
* **Group by URL** - Show a single preview for all FrameGroups with the same context ID. For example, users can set the
|
||||
same `context_id` to multiple FrameGroups that represent frames in a single video.
|
||||
* **Sample by Property** - Specify a frame or ROI property whose value to group frames by and set the number of frames
|
||||
to preview for each group. For example, in the image below, frames are grouped by ROI labels. Each group displays six
|
||||
samples of frames that contain an ROI with the same label.
|
||||
|
||||

|
||||
|
||||
**To sample by property:**
|
||||
1. In the **Grouping** menu, click **Sample by Property**
|
||||
1. In the **Sample by Property** modal, input the following:
|
||||
* Select the Property type:
|
||||
* ROI - Properties associated with the frame ROIs (e.g. ROI label names, IDs, confidence, etc.)
|
||||
* Frame - Properties associated with the frames (e.g. update time, metadata, timestamp, etc.)
|
||||
* Property name - Property whose value to group the frames by
|
||||
* Sample size - Number of frames to preview for each group
|
||||
* ROI match query (*For grouping by ROI property only*) - A Lucene query to filter which of a frame's ROIs
|
||||
to use in grouping by their properties. For example, in a Hyper-Dataset where ROIs have object labels and type labels,
|
||||
view a sample of frames with different types of the same object by grouping frames according to `label.keyword`
|
||||
with a match query for the object of interest.
|
||||
|
||||

|
||||
|
||||
The image below shows a sample of 3 frames which have ROIs of each type (`pedestrian`, `rider`, `sitting`) of `person`.
|
||||
|
||||

|
||||
:::note Property N/A group
|
||||
If some frames have no value for the group-by property, a sample of them will be provided as a final
|
||||
group. If you sample according to an ROI property, this group will NOT include frames that have no ROIs at all.
|
||||
:::
|
||||
1. Click **Save**
|
||||
|
||||
Once saved, whenever you select the **Sample by Property** option in the **Grouping** menu, the frames will be grouped
|
||||
according to the previously configured setting.
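The grouping behavior described above can be sketched in plain Python. This is a simplified model under stated assumptions, not ClearML code: frames are hypothetical dictionaries with an `rois` list, the ROI match query is replaced by a Python predicate (`roi_match`) instead of a Lucene query, and frames whose ROIs all fail the match are assumed to be left out.

```python
from collections import defaultdict

NA_GROUP = "N/A"  # group for frames whose matching ROIs lack the property


def sample_by_roi_property(frames, property_name, sample_size, roi_match=None):
    """Group frame IDs by an ROI property value, keeping `sample_size` per group."""
    groups = defaultdict(list)
    for frame in frames:
        rois = frame.get("rois", [])
        if roi_match is not None:
            rois = [roi for roi in rois if roi_match(roi)]
        if not rois:
            # Frames with no (matching) ROIs are left out entirely,
            # including from the N/A group.
            continue
        # A frame joins each group at most once, even with repeated values.
        values = {roi.get(property_name, NA_GROUP) for roi in rois}
        for value in values:
            if len(groups[value]) < sample_size:
                groups[value].append(frame["id"])
    # Present the N/A group, if any, as the final group.
    ordered = {key: ids for key, ids in groups.items() if key != NA_GROUP}
    if NA_GROUP in groups:
        ordered[NA_GROUP] = groups[NA_GROUP]
    return ordered


# Hypothetical data: ROIs carry an object name and an optional type.
frames = [
    {"id": "f1", "rois": [{"object": "person", "type": "pedestrian"}]},
    {"id": "f2", "rois": [{"object": "person", "type": "rider"}, {"object": "car", "type": "sedan"}]},
    {"id": "f3", "rois": [{"object": "person", "type": "pedestrian"}]},
    {"id": "f4", "rois": [{"object": "person"}]},
    {"id": "f5", "rois": []},
]
result = sample_by_roi_property(
    frames, "type", sample_size=1, roi_match=lambda roi: roi["object"] == "person"
)
print(result)
```

With a sample size of 1, each `person` type keeps one frame, `f4` lands in the final N/A group, and `f5` (no ROIs) is not sampled at all.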
|
||||
|
||||
**To modify the grouping property:**
|
||||
1. Hover over **Sample by Property**
|
||||
1. Click <img src="/docs/latest/icons/ico-edit.svg" alt="Edit pencil" className="icon size-md space-sm" />
|
||||
1. Modify the **Sample by Property** configuration
|
||||
1. Click **Save**
|
||||
|
||||
|
||||
|
||||
#### Preview Source
|
||||
When using multi-source FrameGroups, users can choose which of the FrameGroups' sources will be displayed as the preview.
|
||||
@ -204,11 +257,34 @@ If a FrameGroup doesn't have the selected preview source, the preview displays t
|
||||
|
||||
## Statistics
|
||||
|
||||
The **Statistics** tab displays a dataset version's label usage stats.
|
||||
* Dataset total count - number of annotations, annotated frames, and total frames
|
||||
* Each label is listed along with the number of times it was used in the version
|
||||
* The pie chart visualizes these stats. Hover over a chart slice and its associated label and usage
|
||||
percentage will appear at the center of the chart.
|
||||
The **Statistics** tab allows exploring frame and ROI property distribution across a Hyper-Dataset version:
|
||||
1. Query the frames to include in the statistics calculations under **Filter by label**. Use [simple](#simple-frame-filtering)
|
||||
or [advanced](#advanced-frame-filtering) frame filters. If no filter is applied, all frames in the dataset version will
|
||||
be included in the calculation.
|
||||
1. Select the property whose distribution should be calculated
|
||||
* Select the property **Type**:
|
||||
* **ROI** - Frame ROI properties (e.g. ROI label, ID, confidence, etc.). This will calculate the distribution of
|
||||
the specified property across all ROIs in the version's frames.
|
||||
* **Frame** - Frame properties (e.g. update time, metadata keys, timestamp, etc.)
|
||||
* Input the **Property** key (e.g. `meta.location`)
|
||||
* If **ROI** property was selected, you can also limit the scope of ROIs included in the calculation with the
|
||||
**Count ROIs matching** filter: Input a Lucene query to specify which ROIs to count
|
||||
1. Click **Apply** to calculate the statistics
|
||||
|
||||
For example, calculating the distribution for the `label` ROI property, specifying `rois.confidence: 1` for ROI matching
|
||||
will show the label distribution across only ROIs with a confidence level of 1.
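To make the calculation concrete, here is a minimal sketch of an ROI property distribution in plain Python. The frame and ROI dictionaries are hypothetical, and the `roi_match` predicate stands in for the **Count ROIs matching** Lucene query. It is not how the WebApp computes statistics, only an illustration of the counting logic.

```python
from collections import Counter


def roi_property_distribution(frames, property_name, roi_match=None):
    """Count ROI property values and return {value: (count, percent)}."""
    counts = Counter()
    for frame in frames:
        for roi in frame.get("rois", []):
            if property_name not in roi:
                continue  # ROIs without the property are not counted
            if roi_match is None or roi_match(roi):
                counts[roi[property_name]] += 1
    total = sum(counts.values())
    return {value: (n, round(100 * n / total, 1)) for value, n in counts.items()}


# Hypothetical frames with labeled ROIs.
frames = [
    {"id": "f1", "rois": [{"label": "car", "confidence": 1.0}, {"label": "person", "confidence": 0.6}]},
    {"id": "f2", "rois": [{"label": "car", "confidence": 1.0}, {"label": "car", "confidence": 0.9}]},
    {"id": "f3", "rois": [{"label": "person", "confidence": 1.0}]},
]

all_rois = roi_property_distribution(frames, "label")
confident = roi_property_distribution(frames, "label", lambda roi: roi["confidence"] == 1)
print(all_rois)
print(confident)
```

Restricting the count to ROIs with a confidence of 1 changes both the counts and the percentages, which is the effect the ROI matching filter has on the chart.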
|
||||
|
||||

|
||||
|
||||
By default, the ROI label distribution across the entire Hyper-Dataset version is shown.
|
||||
The tab displays the following information:
|
||||
* Object counts:
|
||||
* Number of annotations matching specification
|
||||
* Number of annotated frames in the current frame filter selection
|
||||
* Total number of frames in the current frame filter selection
|
||||
* Each property is listed along with its number of occurrences in the current frame filter selection
|
||||
* The pie chart visualizes this distribution. Hover over a chart segment and its associated property and count will
|
||||
appear in a tooltip and its usage percentage will appear at the center of the chart.
|
||||
|
||||

|
||||
|
||||
|
BIN
docs/img/hyperdatasets/dataset_frame_sorting.png
Normal file
After Width: | Height: | Size: 1.4 MiB |
BIN
docs/img/hyperdatasets/dataset_sample_by_roi_property.png
Normal file
After Width: | Height: | Size: 1.1 MiB |
Before Width: | Height: | Size: 260 KiB After Width: | Height: | Size: 88 KiB |
BIN
docs/img/hyperdatasets/dataset_version_statistics_roi.png
Normal file
After Width: | Height: | Size: 7.1 KiB |
BIN
docs/img/hyperdatasets/roi_match_query.png
Normal file
After Width: | Height: | Size: 384 KiB |
BIN
docs/img/hyperdatasets/sample_by_property_modal.png
Normal file
After Width: | Height: | Size: 25 KiB |
BIN
docs/img/resource_configuration.png
Normal file
After Width: | Height: | Size: 463 KiB |
BIN
docs/img/resource_configuration_pool_card.png
Normal file
After Width: | Height: | Size: 11 KiB |
BIN
docs/img/resource_configuration_profile_card.png
Normal file
After Width: | Height: | Size: 10 KiB |
BIN
docs/img/resource_example_policy.png
Normal file
After Width: | Height: | Size: 39 KiB |
BIN
docs/img/resource_example_policy_priority.png
Normal file
After Width: | Height: | Size: 18 KiB |
BIN
docs/img/resource_example_pool_card.png
Normal file
After Width: | Height: | Size: 25 KiB |
BIN
docs/img/resource_example_pool_priority.png
Normal file
After Width: | Height: | Size: 12 KiB |
BIN
docs/img/resource_example_pools.png
Normal file
After Width: | Height: | Size: 53 KiB |
BIN
docs/img/resource_example_profile.png
Normal file
After Width: | Height: | Size: 28 KiB |
BIN
docs/img/resource_example_profile_pool_links.png
Normal file
After Width: | Height: | Size: 216 KiB |
BIN
docs/img/resource_example_profile_priority.png
Normal file
After Width: | Height: | Size: 217 KiB |
BIN
docs/img/resource_policies_policy_card.png
Normal file
After Width: | Height: | Size: 8.6 KiB |
BIN
docs/img/resource_policies_profile_card_admin.png
Normal file
After Width: | Height: | Size: 8.7 KiB |
BIN
docs/img/resource_policies_profile_card_non_admin.png
Normal file
After Width: | Height: | Size: 9.3 KiB |
BIN
docs/img/resource_policies_remove_profile.png
Normal file
After Width: | Height: | Size: 9.1 KiB |
128
docs/webapp/resource_policies.md
Normal file
@ -0,0 +1,128 @@
|
||||
---
|
||||
title: Resource Policies
|
||||
---
|
||||
|
||||
:::note ENTERPRISE FEATURE
|
||||
This feature is available under the ClearML Enterprise plan
|
||||
:::
|
||||
|
||||
|
||||
Resource policies let administrators define user group resource quotas and reservations to enable workload prioritization
|
||||
across available resources.
|
||||
|
||||
Administrators make the allocated resources available to users through designated execution queues, each matching a
|
||||
specific resource consumption profile (i.e. the amount of resources allocated to jobs run through the queue).
|
||||
|
||||
Workspace administrators can use the resource policy manager to create, modify or delete resource policies:
|
||||
* Set resource reservation and limits for user groups
|
||||
|
||||
* Connect resource profiles to a policy, making them available to its user group via ClearML queues
|
||||
* Non-administrator users can see the resource policies currently applied to them.
|
||||
|
||||
## Create a Policy
|
||||
|
||||
**To create a policy:**
|
||||
1. Click `+`
|
||||
1. In the **Create Resource Policy** modal, fill in the following:
|
||||
* Name - Resource policy name. This name will appear on the Policies list
|
||||
* Reservation - The number of resources guaranteed to be available for the policy’s users
|
||||
* Limit - The maximum amount of resources that jobs run through this policy’s queues can concurrently use.
|
||||
* User Group - The [User groups](webapp_profile.md#user-groups) to which the policy applies
|
||||
* Description - Optional free-form text for additional descriptive information
|
||||
1. Click **Add**
|
||||
|
||||
Once the policy is defined, you can connect profiles to it (Resource profiles are defined in the [Resource Configuration](webapp_profile.md#resource-configuration)
|
||||
settings page, available to administrators). Resource profiles serve as an interface for resource policies to provide
|
||||
users with access to the available resource pools based on their job resource requirements (i.e. a job running through a
|
||||
profile is allocated the profile’s defined amount of resources).
|
||||
|
||||
**To connect a resource profile to a policy:**
|
||||
1. In the policy’s details panel, click **Edit**
|
||||
1. Click **Connect Profile**
|
||||
1. In the **Connect Profile** modal, input the following information:
|
||||
* Queue name - The name for the ClearML queue the policy’s users will use to enqueue jobs using this resource
|
||||
profile. Jobs enqueued to this queue will be allocated the number of resources defined for its profile
|
||||
* Profile - Select the resource profile
|
||||
1. Click **Connect**
|
||||
|
||||
:::note Available Profiles
|
||||
Only profiles that are part of the currently provisioned [resource configuration](webapp_profile.md#resource-configuration)
|
||||
are available for selection (Profiles that are part of a configuration that has been saved but not yet provisioned
|
||||
will not appear in the list).
|
||||
|
||||
Profiles whose resource requirement exceeds the policy's resource limit will appear in the list but are not available
|
||||
for selection.
|
||||
:::
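The availability rules in the note above can be summarized in a short sketch. The `Profile` class and `profile_choices` function are hypothetical names used only for illustration. The sketch assumes a profile is listed when it belongs to the provisioned configuration and is selectable when its per-job resource requirement does not exceed the policy's limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    resources: float   # resources allocated to each job run through the profile
    provisioned: bool  # part of the currently provisioned configuration?


def profile_choices(profiles, policy_limit):
    """Return (name, selectable) pairs for the Connect Profile list."""
    return [
        (profile.name, profile.resources <= policy_limit)
        for profile in profiles
        if profile.provisioned  # saved-but-unprovisioned profiles are not listed
    ]


profiles = [
    Profile("0.5xGPU", 0.5, provisioned=True),
    Profile("4xGPU", 4, provisioned=True),
    Profile("8xGPU", 8, provisioned=True),
    Profile("16xGPU", 16, provisioned=False),
]
print(profile_choices(profiles, policy_limit=4))
```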
|
||||
|
||||
## Policy Details
|
||||
The policy details panel displays:
|
||||
* Policy quota and reservation
|
||||
* Resource profiles associated with the policy
|
||||
* Queues the policy makes available
|
||||
* Number of current jobs in each profile (pending or running)
|
||||
|
||||
The top card displays the policy information:
|
||||
* Policy name
|
||||
* Current usage - The number of resources currently in use (i.e. by currently running jobs)
|
||||
* Reserved resources
|
||||
* Resource limit
|
||||
* User group that the policy applies to - click to show list of users in the group
|
||||
|
||||

|
||||
|
||||
The cards below the policy card display the profiles that are connected to the policy:
|
||||
* Resource profile name
|
||||
* <img src="/docs/latest/icons/ico-resource-number.svg" alt="Number of resources" className="icon size-md space-sm" /> - Number
|
||||
of resources consumed by each job enqueued through this profile's queue
|
||||
* <img src="/docs/latest/icons/ico-queued-jobs.svg" alt="Queued jobs" className="icon size-md space-sm" /> - Currently queued jobs
|
||||
* <img src="/docs/latest/icons/ico-running-jobs.svg" alt="Running jobs" className="icon size-md space-sm" /> - Currently running jobs
|
||||
|
||||

|
||||
|
||||
Administrators can also see each resource profile’s resource pool links listed in order of routing priority.
|
||||
|
||||

|
||||
|
||||
The arrow connecting the policy card with a profile card is labeled with the name of the queue the policy’s users should
|
||||
use to run tasks through that resource profile.
|
||||
|
||||
## Modify Policy
|
||||
|
||||
To modify a resource policy, click **Edit** to open the details panel in editor mode.
|
||||
|
||||
### To Modify Policy Parameters
|
||||
|
||||
1. On the resource policy card, click <img src="/docs/latest/icons/ico-bars-menu.svg" alt="Menu" className="icon size-md space-sm" /> **> Edit**
|
||||
1. In the Edit Resource Policy modal, you can modify the policy’s name, number of reserved resources, resource limit,
|
||||
and description
|
||||
1. Click **Save**
|
||||
|
||||
### To Add a Resource Profile to a Policy
|
||||
1. Click **Connect Profile**
|
||||
1. In the **Connect Profile** modal, input the following information:
|
||||
* Queue name - The name for the ClearML queue the policy’s users will use to enqueue jobs using this resource
|
||||
profile. Jobs enqueued to this queue will be allocated the number of resources defined for its profile
|
||||
* Profile - Select the resource profile. Note that you will only be able to connect profiles that have not already
|
||||
been connected to the policy
|
||||
1. Click **Connect**
|
||||
|
||||
### To Remove a Resource Profile
|
||||
|
||||
**To remove a resource profile:** On the relevant resource profile box, click `X`.
|
||||
|
||||

|
||||
|
||||
Removing a profile from a policy will also delete the queue which made this profile available to the policy’s users.
|
||||
Any tasks enqueued on this queue will be set to `draft` status.
|
||||
|
||||
Click **Exit** to close editor mode.
|
||||
|
||||
## Delete Policy
|
||||
|
||||
**To delete a resource policy:**
|
||||
1. Click **Edit** to open the details panel in editor mode
|
||||
1. On the resource policy box, click <img src="/docs/latest/icons/ico-bars-menu.svg" alt="Menu" className="icon size-md space-sm" />
|
||||
1. Click **Delete**
|
||||
|
||||
Deleting a policy also deletes its queues (i.e. the queues to access the resource profiles). Additionally, any pending
|
||||
tasks will be dequeued.
|
@ -29,6 +29,60 @@ The downloaded data consists of the currently displayed table columns.
|
||||
|
||||

|
||||
|
||||
## Creating Experiments
|
||||
|
||||
You can create experiments by:
|
||||
* Running code instrumented with ClearML (see [Task Creation](../clearml_sdk/task_sdk.md#task-creation))
|
||||
* [Cloning an existing experiment](webapp_exp_reproducing.md)
|
||||
* Through the UI: Input the experiment's details, including its source code and Python requirements, and then
|
||||
run it through a [ClearML Queue](../fundamentals/agents_and_queues.md#what-is-a-queue) or save it as a *draft*.
|
||||
|
||||
To create an experiment through the UI:
|
||||
1. Click `+ New Experiment`
|
||||
1. In the `Create Experiment` modal, input the following information:
|
||||
* **Code**
|
||||
* Experiment name
|
||||
* Git
|
||||
* Repository URL
|
||||
* Version specification - one of the following:
|
||||
* Tag
|
||||
* Branch
|
||||
* Commit ID
|
||||
* Execution Entry Point
|
||||
* Working Directory
|
||||
* One of the following:
|
||||
* Script name
|
||||
* Module (see [python module specification](https://docs.python.org/3/using/cmdline.html#cmdoption-m))
|
||||
* Add `Task.init` call - If selected, [`Task.init()`](../references/sdk/task.md#taskinit) call is added to the
|
||||
entry point. Select if it is not already called within your code
|
||||
* **Arguments** (*optional*) - Add [hyperparameter](../fundamentals/hyperparameters.md) values.
|
||||
* **Environment** (*optional*) - Set up the experiment’s python execution environment using either of the following
|
||||
options:
|
||||
* Use Poetry specification - Requires specifying a docker image for the experiment to be executed in.
|
||||
* Manually specify the python environment configuration:
|
||||
* Python binary - The python executable to use
|
||||
* Preinstalled venv - A specific existing virtual environment to use. Requires specifying a docker image for the
|
||||
experiment to be executed in.
|
||||
* Python package specification:
|
||||
* Skip - Assume system packages are available. Requires specifying a docker image for the experiment to be
|
||||
executed in.
|
||||
* Use an existing `requirements.txt` file
|
||||
* Explicitly specify the required packages
|
||||
* **Docker** (*optional*) - Specify Docker container configuration for executing the experiment
|
||||
* Image - Docker image to use for running the experiment
|
||||
* Arguments - Add Docker arguments as a single string
|
||||
* Startup Script - Add a bash script to be executed inside the Docker before setting up the experiment's environment
|
||||
* **Run**
|
||||
* Queue - [ClearML Queue](../fundamentals/agents_and_queues.md#what-is-a-queue) where the experiment should be
|
||||
enqueued for execution
|
||||
* Output Destination - A URI where experiment outputs should be stored (ClearML file server by default).
|
||||
1. Once you have input all the information, click one of the following options:
|
||||
* Save as Draft - Save the experiment as a new draft task.
|
||||
* Run - Enqueue the experiment for execution in the queue specified in the **Run** tab
|
||||
|
||||
Once you have completed the experiment creation wizard, the experiment will be saved in your current project (where
|
||||
you clicked `+ New Experiment`). See what you can do with your experiment in [Experiment Actions](#experiment-actions).
|
||||
|
||||
## Experiments Table Columns
|
||||
|
||||
The experiments table default and customizable columns are described in the following table.
|
||||
|
@ -22,6 +22,8 @@ The Settings page consists of the following sections:
|
||||
* [Users & Groups](#users--groups) - Manage the users that have access to a workspace
|
||||
* [Access Rules](#access-rules) (ClearML Enterprise Server) - Manage per-resource access privileges
|
||||
* [Identity Providers](#identity-providers) (ClearML Enterprise Server) - Manage server identity providers
|
||||
* [Resource Configuration](#resource-configuration) (ClearML Enterprise Server) - Define the available resources and the way in which they
|
||||
will be allocated to different workloads
|
||||
* [Usage & Billing](#usage--billing) (ClearML Hosted Service) - View current usage information and billing details
|
||||
|
||||
## Profile
|
||||
@ -39,13 +41,16 @@ The profile tab presents user information.
|
||||
Under **USER PREFERENCES**, users can set a few web UI options:
|
||||
* **Show Hidden Projects** - Show ClearML infrastructure projects alongside your own projects. Disabled by default. When
|
||||
enabled, these projects are labeled with <img src="/docs/latest/icons/ico-ghost.svg" alt="Hidden project" className="icon size-md space-sm" />.
|
||||
* **Don't show ClearML Examples** - Hide the preloaded ClearML example content (project, pipeline, dataset, etc.)
|
||||
* **HiDPI browser scale override** - Adjust scaling on High-DPI monitors to improve the web UI experience.
|
||||
Enabled by default.
|
||||
* **Don't show ClearML examples** - Hide the preloaded ClearML example content (project, pipeline, dataset, etc.).
|
||||
* **Disable HiDPI browser scale override** - ClearML dynamically sets the browser scaling factor for an optimal page layout.
|
||||
Disable for default desktop scale.
|
||||
* **Don't show pro tips periodically** - Stop showing ClearML usage tips on login. Disabled by default.
|
||||
* **Block running user's scripts in the browser** - Block any user and 3rd party scripts from running anywhere in the
|
||||
WebApp. Note that if enabled, the WebApp will not display debug samples, [Hyper-Dataset frame previews](../hyperdatasets/previews.md),
|
||||
and embedded resources in [reports](webapp_reports.md).
|
||||
* **Hide specific container arguments** - Specify which Docker environment variable values should be hidden in logs.
|
||||
When printed, the variable values are replaced with `********`. By default, `CLEARML_API_SECRET_KEY`, `CLEARML_AGENT_GIT_PASS`,
|
||||
`AWS_SECRET_ACCESS_KEY`, and `AZURE_STORAGE_KEY` values are redacted.
|
||||
`AWS_SECRET_ACCESS_KEY`, and `AZURE_STORAGE_KEY` values are redacted. To modify the hidden container argument list, click **Edit**.
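The masking behavior can be illustrated with a small sketch. It assumes the variables appear in log lines as `NAME=value` pairs, which is an assumption about the log format made for this example. The `redact` helper is hypothetical and is not a ClearML function.

```python
import re

# Default hidden variables, as listed above.
DEFAULT_HIDDEN = (
    "CLEARML_API_SECRET_KEY",
    "CLEARML_AGENT_GIT_PASS",
    "AWS_SECRET_ACCESS_KEY",
    "AZURE_STORAGE_KEY",
)


def redact(line, hidden=DEFAULT_HIDDEN):
    """Replace the values of hidden NAME=value pairs with asterisks."""
    if not hidden:
        return line  # nothing to hide
    names = "|".join(re.escape(name) for name in hidden)
    pattern = re.compile(rf"\b({names})=(\S+)")
    return pattern.sub(r"\1=********", line)


print(redact("docker run -e AWS_SECRET_ACCESS_KEY=abc123 -e FOO=bar image"))
```

Passing a different `hidden` tuple corresponds to editing the hidden container argument list.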
|
||||
|
||||
:::info Self-hosted ClearML Server
|
||||
The self-hosted ClearML Server has an additional option to enable sharing anonymous telemetry data with the ClearML
|
||||
@ -574,6 +579,206 @@ Hover over a connection in the table to **Edit** or **Delete** it.
|
||||
|
||||

|
||||
|
||||
## Resource Configuration
|
||||
|
||||
Administrators can define [Resource Policies](../webapp/resource_policies.md) to implement resource quotas and
|
||||
reservations for different user groups to prioritize workload usage across available resources.
|
||||
|
||||
Under the **Resource Configuration** section, administrators define the available resources and the way in which they
|
||||
will be allocated to different workloads.
|
||||
|
||||

|
||||
|
||||
The Resource Configuration settings page shows the [currently provisioned](#applying-resource-configuration) configuration:
|
||||
the defined resource pools, resource profiles, and the resource allocation architecture.
|
||||
|
||||
### Resource Pools
|
||||
A resource pool is an aggregation of resources available for use, such as a Kubernetes cluster or a GPU superpod.
|
||||
Administrators specify the total number of resources available in each pool. The resource policy manager ensures
|
||||
workload assignment up to the available number of resources.
|
||||
|
||||
Administrators control the execution priority within a pool across the resource profiles making use of it (e.g. if jobs
|
||||
of profile A and jobs of profile B currently need to run in a pool, allocate resources for profile A jobs first or vice
|
||||
versa).
|
||||
|
||||
The resource pool cards are displayed on the top of the Resource Configuration settings page. Each card displays the
|
||||
following information:
|
||||
|
||||

|
||||
|
||||
* Pool name
|
||||
* Number of resources currently in use out of the total available resources
|
||||
* Execution Priority - List of [linked profiles](#connecting-profiles-to-pools) in order of execution priority.
|
||||
|
||||
### Resource Profiles
|
||||
Resource profiles represent the resource consumption requirements of jobs, such as the number of GPUs needed. They are
|
||||
the interface that administrators use to provide users with access to the available resource pools based on their job
|
||||
resource requirements via [Resource Policies](../webapp/resource_policies.md).
|
||||
|
||||
Administrators can control the resource pool allocation precedence within a profile (e.g. only run jobs on `pool B` if
|
||||
`pool A` cannot currently satisfy the profile's resource requirements).
|
||||
|
||||
Administrators can control the queuing priority within a profile across resource policies making use of it (e.g. if the
|
||||
R&D team and DevOps team both have pending jobs - run the R&D team's jobs first or vice versa).
|
||||
|
||||
The resource profile cards are displayed on the bottom of the Resource Configuration settings page. Each card displays
|
||||
the following information:
|
||||
|
||||

|
||||
|
||||
* Profile name
|
||||
* <img src="/docs/latest/icons/ico-resource-number.svg" alt="Number of resources" className="icon size-md space-sm" /> - Number
|
||||
of resources allocated to jobs in this profile
|
||||
* List of [pool links](#connecting-profiles-to-pools)
|
||||
* <img src="/docs/latest/icons/ico-queued-jobs.svg" alt="Queued jobs" className="icon size-md space-sm" /> - Number of currently pending jobs
|
||||
* <img src="/docs/latest/icons/ico-running-jobs.svg" alt="Running jobs" className="icon size-md space-sm" /> - Number of currently running jobs
|
||||
* Number of resource policies. Click to open resource policy list and to order queuing priority.
|
||||
|
||||
### Example Workflow
|
||||
|
||||
You have GPUs spread across a local H100 and additional bare metal servers, as well as on AWS (managed
|
||||
by an autoscaler). Assume that currently most of your resources are already assigned to jobs, and only 16 resources are available: 8 in the
|
||||
H100 resource pool and 8 in the Bare Metal pool:
|
||||
|
||||

|
||||
|
||||
Teams' jobs have varying resource requirements of 0.5, 2, 4, and 8 GPUs. Resource profiles are defined to reflect these:
|
||||
|
||||

|
||||
|
||||
The different jobs will be routed to different resource pools by connecting the profiles to the resource pools. Jobs
|
||||
enqueued through the profiles will be run in the pools where there are available resources in order of their priority.
|
||||
For example, the H100 pool will run jobs with the following precedence: 2 GPU jobs first, then 4 GPU ones, then 8 GPU,
|
||||
and lastly 0.5 GPU.
|
||||
|
||||

|
||||
|
||||
Resource policies are implemented for two teams:
|
||||
* Dev team
|
||||
* Research Team
|
||||
|
||||
Each team has a resource policy configured with 8 reserved resources and a 16 resource limit. Both teams make use of the
|
||||
4xGPU profile (i.e. each job running through this profile requires 4 resources).
|
||||
|
||||

|
||||
|
||||
The Dev team is prioritized over the Research team by placing it higher in the Resource Profile's Policies Priority list:
|
||||
|
||||

|
||||
|
||||
Both the Dev team and the Research team enqueue four 4-resource jobs each: Dev team jobs will be allocated resources
|
||||
first. The `4xGPU` resource profile is connected to two resource pools: `Bare Metal Low END GPUs` (with the
|
||||
`4 GPU Low End` link) and `H100 Half a Superpod` (with the `4 GPU H100` link).
|
||||
|
||||

|
||||
|
||||
Resources are assigned from the `Bare Metal` pool first (precedence set on the resource profile card):
|
||||
|
||||

|
||||
|
||||
If the first pool cannot currently satisfy the profile’s resource requirements, resources are assigned from the next
|
||||
listed pool. Let's look at the first pool in the image below. Notice that the pool has 8 available resources, therefore
|
||||
it can run two 4-resource jobs.
|
||||
|
||||
<div class="max-w-50">
|
||||
|
||||

|
||||
|
||||
</div>
|
||||
|
||||
Since the Bare Metal pool does not have any more available resources, additional jobs will be assigned resources from
|
||||
the next pool that the Resource Profile is connected to. The H100 pool has 8 available resources. There are still 2 jobs
|
||||
pending from the Dev team requiring 8 resources in total, and 4 jobs from the Research team requiring 16 resources in
|
||||
total. In order to honor the Research team’s resource reservation, its first two jobs will be assigned the required 8
|
||||
resources from the H100 pool.
|
||||
|
||||
With all available resources assigned, 2 jobs from each team will remain pending until some of the currently
|
||||
running jobs finish and resources become available.
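The outcome of this example can be reproduced with a toy two-pass model: first honor each policy's reservation in policy priority order, then hand out whatever is left up to each policy's limit. This is a simplified sketch written for this walkthrough, with hypothetical class and function names. It is not the actual ClearML scheduling algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    name: str
    available: int


@dataclass
class Policy:
    name: str
    reserved: int
    limit: int
    pending: int  # number of queued jobs
    used: int = 0
    placements: list = field(default_factory=list)


def place(policy, pools, job_size, cap):
    """Start the policy's pending jobs while its usage stays within `cap`."""
    while policy.pending and policy.used + job_size <= cap:
        # Pools are tried in the profile's precedence order.
        pool = next((p for p in pools if p.available >= job_size), None)
        if pool is None:
            return  # no pool can fit another job right now
        pool.available -= job_size
        policy.used += job_size
        policy.pending -= 1
        policy.placements.append(pool.name)


def allocate(policies, pools, job_size):
    for policy in policies:  # pass 1: reservations, in policy priority order
        place(policy, pools, job_size, policy.reserved)
    for policy in policies:  # pass 2: leftovers, up to each policy's limit
        place(policy, pools, job_size, policy.limit)


pools = [Pool("Bare Metal", 8), Pool("H100", 8)]
dev = Policy("Dev", reserved=8, limit=16, pending=4)
research = Policy("Research", reserved=8, limit=16, pending=4)
allocate([dev, research], pools, job_size=4)
print(dev.placements, dev.pending)
print(research.placements, research.pending)
```

Under these assumptions, the Dev team's first two jobs run in the Bare Metal pool, the Research team's first two jobs run in the H100 pool, and two jobs from each team stay pending.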
|
||||
|
||||
### Applying Resource Configuration
|
||||
Administrators can globally activate/deactivate resource policy management. To enable the currently provisioned
|
||||
configuration, click on the `Enable resource management` toggle. Enabling resource management will service the policy
|
||||
queues according to the provisioned resource profile and pool assignments. Disabling the resource management will stop
|
||||
serving the policy queues. Tasks on these queues will remain pending until resource policy management is re-enabled.
|
||||
|
||||
Administrators can add, edit, delete, and connect resource pools and profiles in the Resource Configuration settings
|
||||
page.
|
||||
|
||||
To make any change (create, delete, or modify a component) to the resource configuration, follow these steps:
|
||||
1. Click **Open Editor** to go into Editing mode
|
||||
1. After making the desired changes you have the following options:
|
||||
* **Save** - Save the changes you made. These changes will not be applied until you click **Provision**
|
||||
* **Provision** - Apply the saved changes to the resource configuration
|
||||
* **Reset Configuration** - Set the editor to the currently provisioned values. This will delete any unprovisioned
|
||||
changes (both saved and unsaved)
|
||||
1. Click **Exit** to leave Editor mode. The page will show the provisioned configuration. Unprovisioned saved changes will
|
||||
still be available in Editor mode.
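The relationship between unsaved edits, saved changes, and the provisioned configuration can be pictured as three copies of the same settings. The `ConfigEditor` class below is a hypothetical toy model for illustration only and does not reflect how the server stores the configuration.

```python
import copy


class ConfigEditor:
    """Toy model of the three configuration states: draft, saved, provisioned."""

    def __init__(self, provisioned):
        self.provisioned = copy.deepcopy(provisioned)
        self.saved = copy.deepcopy(provisioned)
        self.draft = copy.deepcopy(provisioned)

    def save(self):
        # Keep the edits, but do not apply them yet.
        self.saved = copy.deepcopy(self.draft)

    def provision(self):
        # Apply the saved changes.
        self.provisioned = copy.deepcopy(self.saved)

    def reset(self):
        # Discard all unprovisioned changes, saved and unsaved.
        self.saved = copy.deepcopy(self.provisioned)
        self.draft = copy.deepcopy(self.provisioned)


editor = ConfigEditor({"pools": {"H100": 8}})
editor.draft["pools"]["Bare Metal"] = 8
editor.save()
print(editor.provisioned)  # unchanged until provisioned
editor.provision()
print(editor.provisioned)
```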
|
||||
|
||||
#### Resource Pool
|
||||
|
||||
**To create a resource pool:**
|
||||
1. Click **+ Add Pool**
|
||||
1. In the **Create Pool** modal, input:
|
||||
* Name - The resource pool’s name. This will appear in the Pool’s information card in the Resource Configuration settings page
|
||||
* Number of Resources - Number of resources available in this pool
|
||||
* Description - Optional free-form text for additional descriptive information
|
||||
1. Click **Create**
|
||||
|
||||
**To modify a resource pool:**
|
||||
1. Click <img src="/docs/latest/icons/ico-bars-menu.svg" alt="Menu" className="icon size-md space-sm" /> on the relevant
|
||||
resource pool card **>** click **Edit**
|
||||
1. In the **Edit Pool** modal, change the pool’s name, number of resources, or description
|
||||
1. Click **Save**
|
||||
|
||||
You can also change the Execution Priority of the [linked resource profiles](#connecting-profiles-to-pools). Click and
|
||||
drag the profile connection anchor <img src="/docs/latest/icons/ico-resource-anchor.svg" alt="Resource anchor" className="icon size-md space-sm" />
|
||||
to change its position in the order of priority.
|
||||
|
||||
#### Resource Profile
|
||||
**To create a resource profile:**
|
||||
1. Click **+ Add Profile**
|
||||
1. In the **Create Profile** modal, input:
|
||||
* Name - The resource profile’s name. This will appear in the profile’s information card in the Resource Configuration settings page
|
||||
* Resource Allotment - Number of resources allocated to each job running in this profile
|
||||
1. Click **Create**
|
||||
|
||||
**To modify a resource profile:**
|
||||
1. Click <img src="/docs/latest/icons/ico-bars-menu.svg" alt="Menu" className="icon size-md space-sm" /> on the relevant
|
||||
resource profile card > click **Edit**
|
||||
1. In the **Edit Profile** modal, change the profile's name or resource allotment
|
||||
1. Click **Save**
|
||||
|
||||
To control which pool's resources will be assigned first, click and drag the pool connection anchor <img src="/docs/latest/icons/ico-resource-anchor.svg" alt="connection anchor" className="icon size-md space-sm" />
|
||||
to change its position in order of priority.
|
||||
|
||||
You can also change the Execution Priority of the resource policies making use of this profile. Open the policy list,
|
||||
then click the policy anchor <img src="/docs/latest/icons/ico-drag-vertical.svg" alt="policy anchor" className="icon size-md space-sm" />
|
||||
and drag the policy to change its position in order of priority.
|
||||
|
||||
**To delete a resource profile:**
|
||||
1. Click <img src="/docs/latest/icons/ico-bars-menu.svg" alt="Menu" className="icon size-md space-sm" /> on the relevant resource profile card
|
||||
1. Click **Delete**
|
||||
|
||||
#### Connecting Profiles to Pools
|
||||
Connect a resource profile to a resource pool to allow jobs assigned to the profile to make use of the pool’s resources.
|
||||
|
||||
**To connect a profile to a pool:**
|
||||
1. Click **Open Editor**
|
||||
1. Drag the <img src="/docs/latest/icons/ico-profile-link.svg" alt="Profile-pool link" className="icon size-md space-sm" />
|
||||
of the relevant profile to the resource pool you want to connect the profile to. This opens the **Connect Profile** modal
|
||||
1. In the **Connect Profile** modal, input a name for this connection. This connection name will appear on the profile
|
||||
card
|
||||
|
||||
The settings page will show a line linking the profile and the pool cards. The linked profile appears on the pool card,
|
||||
showing its place in the order of execution. To change the profile's priority placement, drag its connection anchor <img src="/docs/latest/icons/ico-resource-anchor.svg" alt="connection anchor" className="icon size-md space-sm" />
|
||||
to a new position.
|
||||
|
||||
**To disconnect a profile from a pool:**
|
||||
1. Click **Open Editor**
|
||||
1. On the relevant profile card, hover over the connection name and click `X`
|
||||
|
||||
Jobs assigned to this resource profile will no longer be able to utilize the pool’s resources.
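
As a rough mental model of how pools, profiles, and their connections relate, consider the sketch below. It is not the ClearML implementation or API: the `Pool` and `Profile` classes and the `connect`, `disconnect`, and `move_profile` helpers are hypothetical, and placing a newly connected profile last in the pool's priority order is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    name: str
    num_resources: int
    description: str = ""
    # Names of connected profiles, highest execution priority first
    profiles: list = field(default_factory=list)


@dataclass
class Profile:
    name: str
    resource_allotment: int
    # Connection name -> pool name
    connections: dict = field(default_factory=dict)


def connect(profile, pool, connection_name):
    """Link a profile to a pool under a named connection."""
    profile.connections[connection_name] = pool.name
    pool.profiles.append(profile.name)  # assumption: new links get lowest priority


def disconnect(profile, pool, connection_name):
    """Remove the link; the profile's jobs can no longer use the pool."""
    del profile.connections[connection_name]
    pool.profiles.remove(profile.name)


def move_profile(pool, profile_name, new_index):
    """Model dragging a profile's connection anchor to a new priority position."""
    pool.profiles.remove(profile_name)
    pool.profiles.insert(new_index, profile_name)


gpu_pool = Pool("gpu-pool", num_resources=8)
small = Profile("small", resource_allotment=1)
large = Profile("large", resource_allotment=4)

connect(small, gpu_pool, "small-on-gpu")
connect(large, gpu_pool, "large-on-gpu")
move_profile(gpu_pool, "large", 0)      # give "large" top priority
disconnect(small, gpu_pool, "small-on-gpu")
print(gpu_pool.profiles)
```

The ordered list on the pool stands in for the order of execution shown on the pool card, and the named entries on the profile stand in for the connection names shown on the profile card.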
|
||||
|
||||
## Usage & Billing
|
||||
|
||||
The **USAGE & BILLING** section displays your ClearML workspace usage information, including:
|
||||
|
@ -15,6 +15,8 @@ consumption as needed–-with no code (available under the ClearML Pro plan)
|
||||
* Monitor queue utilization
|
||||
* Reorder, move, and remove experiments from queues
|
||||
* Monitor all of your available and in-use compute resources (available in the ClearML Enterprise plan. See [Orchestration Dashboard](webapp_orchestration_dash.md))
|
||||
* Set user group resource quotas and reservations to enable workload prioritization across available resources (available
|
||||
in the ClearML Enterprise plan. See [Resource Policies](resource_policies.md))
|
||||
|
||||
## Autoscalers
|
||||
|
||||
|
@ -130,8 +130,13 @@ module.exports = {
|
||||
]
|
||||
},
|
||||
'webapp/webapp_reports',
|
||||
{
|
||||
'Orchestration': [
|
||||
'webapp/webapp_workers_queues',
|
||||
'webapp/webapp_orchestration_dash',
|
||||
'webapp/resource_policies'
|
||||
]
|
||||
},
|
||||
{
|
||||
'ClearML Applications': [
|
||||
'webapp/applications/apps_overview',
|
||||
|
3
static/icons/ico-drag-vertical.svg
Normal file
@ -0,0 +1,3 @@
|
||||
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
|
||||
<path d="M3,13h2v2h-2v-2ZM3,11h2v-2h-2v2ZM11,15h2v-2h-2v2ZM9,13h-2v2h2v-2ZM13,9h-2v2h2v-2ZM9,9h-2v2h2v-2ZM15,11h2v-2h-2v2ZM19,9v2h2v-2h-2ZM19,15h2v-2h-2v2ZM15,15h2v-2h-2v2ZM11.97,4.82l2.24,2.24,1.41-1.41-3.66-3.65-3.6,3.6,1.41,1.41,2.19-2.19ZM12.03,19.18l-2.24-2.24-1.41,1.41,3.66,3.65,3.6-3.6-1.41-1.41-2.19,2.19Z" fill="#8492c2"/>
|
||||
</svg>
|
After Width: | Height: | Size: 406 B |
4
static/icons/ico-profile-link.svg
Normal file
@ -0,0 +1,4 @@
|
||||
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
|
||||
<circle cx="12" cy="12" r="12" fill="#6eeaf7"/>
|
||||
<path d="M8,8.3h3.2v1.6h-3.2c-1.33,0-2.4,1.07-2.4,2.4s1.07,2.4,2.4,2.4h3.2v1.6h-3.2c-2.21,0-4-1.79-4-4s1.79-4,4-4M16,8.3c2.21,0,4,1.79,4,4h-1.6c0-1.33-1.07-2.4-2.4-2.4h-3.2v-1.6h3.2M8.8,11.5h6.4v1.6h-6.4v-1.6M16,12.3h1.6v2.4h2.4v1.6h-2.4v2.4h-1.6v-2.4h-2.4v-1.6h2.4v-2.4Z" fill="#000" />
|
||||
</svg>
|
After Width: | Height: | Size: 407 B |
3
static/icons/ico-queued-jobs.svg
Normal file
@ -0,0 +1,3 @@
|
||||
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
|
||||
<path d="M16,8.75l-2-2V6h4v.75Zm5.41-.16L18,12l3.41,3.41A2,2,0,0,1,22,16.83V20a2,2,0,0,1-2,2H12a2,2,0,0,1-2-2V16.83a2,2,0,0,1,.59-1.42L14,12,10.59,8.59A2,2,0,0,1,10,7.17V4a2,2,0,0,1,2-2h8a2,2,0,0,1,2,2V7.17A2,2,0,0,1,21.41,8.59ZM16,12.5l-4,4V20h2V18h4v2h2V16.5ZM20,4H12V7.5l4,4,4-4ZM6,12,3,15H2V9H3Zm5.12,0L8,15H7V9H8.12Z" fill="#8492c2"/>
|
||||
</svg>
|
After Width: | Height: | Size: 410 B |
3
static/icons/ico-resource-anchor.svg
Normal file
@ -0,0 +1,3 @@
|
||||
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
|
||||
<path d="M9,7h2v2h-2v-2ZM9,5h2v-2h-2v2ZM13,21h2v-2h-2v2ZM9,17h2v-2h-2v2ZM9,13h2v-2h-2v2ZM13,13h2v-2h-2v2ZM13,9h2v-2h-2v2ZM13,17h2v-2h-2v2ZM13,5h2v-2h-2v2ZM9,21h2v-2h-2v2ZM5.65,8.37l-3.65,3.66,3.6,3.6,1.41-1.41-2.19-2.19,2.24-2.24-1.41-1.41ZM18.4,8.37l-1.41,1.41,2.19,2.19-2.24,2.24,1.41,1.41,3.65-3.66-3.6-3.6Z" fill="#8492c2"/>
|
||||
</svg>
|
After Width: | Height: | Size: 398 B |
3
static/icons/ico-resource-number.svg
Normal file
@ -0,0 +1,3 @@
|
||||
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
|
||||
<path d="M16.02,9v6c0,.55-.45,1-1,1h-6c-.55,0-1-.45-1-1v-6c0-.55.45-1,1-1h6c.55,0,1,.45,1,1ZM20.02,13v2h2c.55,0,1,.45,1,1s-.45,1-1,1h-2v.25c0,1.52-1.23,2.75-2.75,2.75h-.25v2c0,.55-.45,1-1,1s-1-.45-1-1v-2h-2v2c0,.55-.45,1-1,1s-1-.45-1-1v-2h-2v2c0,.55-.45,1-1,1s-1-.45-1-1v-2h-.25c-1.52,0-2.75-1.23-2.75-2.75v-.25h-2c-.55,0-1-.45-1-1s.45-1,1-1h2v-2h-2c-.55,0-1-.45-1-1s.45-1,1-1h2v-2h-2c-.55,0-1-.45-1-1s.45-1,1-1h2v-.25c0-1.52,1.23-2.75,2.75-2.75h.25v-2c0-.55.45-1,1-1s1,.45,1,1v2h2v-2c0-.55.45-1,1-1s1,.45,1,1v2h2v-2c0-.55.45-1,1-1s1,.45,1,1v2h.25c1.52,0,2.75,1.23,2.75,2.75v.25h2c.55,0,1,.45,1,1s-.45,1-1,1h-2v2h2c.55,0,1,.45,1,1s-.45,1-1,1h-2ZM18.02,6.75c0-.41-.34-.75-.75-.75H6.77c-.41,0-.75.34-.75.75v10.5c0,.41.34.75.75.75h10.5c.41,0,.75-.34.75-.75V6.75Z" fill="#8492c2" />
|
||||
</svg>
|
After Width: | Height: | Size: 848 B |
4
static/icons/ico-running-jobs.svg
Normal file
@ -0,0 +1,4 @@
|
||||
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
|
||||
<circle cx="12" cy="12" r="9" fill="none" stroke-dasharray="0 0 0 3" stroke-linecap="round" stroke-width="2" stroke="#8492c2"/>
|
||||
<path d="M14.78,12.8l-4.22,3.01c-.44.32-1.06.22-1.38-.23-.12-.17-.18-.37-.18-.57v-6.03h0c0-.54.44-.99.99-.99.21,0,.41.07.58.19h0l4.22,3.01c.44.32.54.93.22,1.38-.06.09-.14.16-.22.22h0Z" fill="#8492c2" />
|
||||
</svg>
|
After Width: | Height: | Size: 402 B |
3
static/icons/ico-sort.svg
Normal file
@ -0,0 +1,3 @@
|
||||
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
|
||||
<path d="M16,9h-3v-2h3v2ZM13,15v2h9v-2h-9ZM19,11h-6v2h6v-2ZM7,3h-2v14h-3l4,4,4-4h-3V3Z" fill="#8492c2" />
|
||||
</svg>
|
After Width: | Height: | Size: 175 B |