---
title: Workers and Queues
---
With the **Workers and Queues** page, users can:

* Monitor resources (CPU and GPU, memory, video memory, and network usage) used by the experiments / Tasks that workers
  execute
* View workers and the queues they listen to
* Create and rename queues; delete empty queues; monitor queue utilization
* Reorder, move, and remove experiments from queues

## Workers

Use the **WORKERS** tab to track worker activity and monitor worker utilization.
The page shows a worker activity graph and a worker details table. The graph time span can be controlled through the menu
at its top right corner. Hover over any plot point to see its data. By default, the **WORKER UTILIZATION** graph displays the
number of active and total workers over time.

The worker table shows the currently available workers and their current execution information:

* Current running experiment
* Current execution time
* Training iteration

Clicking on a worker opens the worker's details panel and replaces the graph with that worker's resource utilization
information. The resource metric being monitored can be selected through the menu at the graph's top left corner:
* CPU and GPU Usage
* Memory Usage
* Video Memory Usage
* Network Usage

The worker's details panel includes the following two tabs:
* **INFO** - worker information:
  * Worker Name
  * Update time - The last time the worker reported data
  * Current Experiment - The experiment currently being executed by the worker
  * Experiment Runtime - How long the currently executing experiment has been running
  * Experiment iteration - The last reported training iteration for the experiment
* **QUEUES** - Information about the queues that the worker is assigned to:
  * Queue - The name of the Queue
  * Next experiment - The next experiment available in this queue
  * In Queue - The number of experiments currently enqueued

![Worker management](../img/agents_queues_resource_management.png)
## Queues
Use the **QUEUES** tab to manage queues and monitor their statistics. The page shows graphs of the average experiment
wait time and the number of queued experiments, and a queue details table. Hover over any plot point to view its data.
By default, the graphs display the overall information of all queues.

The queue table shows the following queue information:
* Queue - Queue name
* Workers - Number of workers servicing the queue
* Next Experiment - The next experiment available in this queue
* Last Updated - The last time queue contents were modified
* In Queue - Number of experiments currently enqueued in the queue

To create a new queue, click **+ NEW QUEUE** (top left).

Hover over a queue and click <img src="/docs/latest/icons/ico-copy-to-clipboard.svg" alt="Copy" className="icon size-md space-sm" />
to copy the queue's ID.

![image](../img/4100.png)

Right-click on a queue or hover and click its action button <img src="/docs/latest/icons/ico-dots-v-menu.svg" alt="Dot menu" className="icon size-md space-sm" />
to access queue actions:

![Queue context menu](../img/webapp_workers_queues_context.png)

* Delete - Delete the queue. Any pending tasks will be dequeued.
* Rename - Change the queue's name
* Clear - Remove all pending tasks from the queue
* Custom action - The ClearML Enterprise Server provides a mechanism to define your own custom actions, which will
  appear in the context menu. See [Custom UI Context Menu Actions](../deploying_clearml/clearml_server_config.md#custom-ui-context-menu-actions).

Clicking on a queue opens the queue's details panel and replaces the graphs with that queue's statistics.

The queue's details panel includes the following two tabs:
* **EXPERIMENTS** - A list of experiments in the queue. You can reorder and remove enqueued experiments. See
  [Controlling Queue Contents](#controlling-queue-contents).
* **WORKERS** - Information about the workers assigned to the queue:
  * Name - Worker name
  * IP - Worker's IP
  * Currently Executing - The experiment currently being executed by the worker

### Controlling Queue Contents
Click on an experiment's menu button <img src="/docs/latest/icons/ico-dots-v-menu.svg" alt="Dot menu" className="icon size-md space-sm" />
in the **EXPERIMENTS** tab to reorganize your queue:

![Queue experiment's menu](../img/workers_queues_experiment_actions.png)

* Move a task to the top or bottom of the queue
* Move the task to a different queue
* Dequeue the task

You can also reorder experiments in a queue by dragging an experiment to a new position in the queue.
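
To make the effect of these actions concrete, the following is a minimal, self-contained sketch that models a queue as an ordered list of task IDs. It is only an illustration and not the ClearML API: the `QueueModel` class and its method names are hypothetical, and it assumes that a task moved to a different queue is placed at the end of that queue.

```python
class QueueModel:
    """Toy in-memory model of a queue (hypothetical, not part of ClearML)."""

    def __init__(self, name, tasks=()):
        self.name = name
        # Index 0 is the next experiment a worker would pull.
        self.tasks = list(tasks)

    def move_to_top(self, task_id):
        self.tasks.remove(task_id)
        self.tasks.insert(0, task_id)

    def move_to_bottom(self, task_id):
        self.tasks.remove(task_id)
        self.tasks.append(task_id)

    def move_to_position(self, task_id, index):
        # Models dragging an experiment to a new position in the queue.
        self.tasks.remove(task_id)
        self.tasks.insert(index, task_id)

    def move_to_queue(self, task_id, other):
        # Assumption: the task lands at the end of the target queue.
        self.tasks.remove(task_id)
        other.tasks.append(task_id)

    def dequeue(self, task_id):
        self.tasks.remove(task_id)


default = QueueModel("default", ["a", "b", "c", "d"])
gpu = QueueModel("gpu")

default.move_to_top("c")          # default: c, a, b, d
default.move_to_bottom("a")       # default: c, b, d, a
default.move_to_position("d", 0)  # default: d, c, b, a
default.move_to_queue("b", gpu)   # default: d, c, a; gpu: b
default.dequeue("c")              # default: d, a
```

In this model, only the order of `tasks` changes when reordering, which mirrors how reordering in the **EXPERIMENTS** tab changes which experiment is next in the queue.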