---
title: ClearML Agent
---
<div class="vid" >
<iframe style={{position: 'absolute', top: '0', left: '0', bottom: '0', right: '0', width: '100%', height: '100%'}}
src="https://www.youtube.com/embed/MX3BrXnaULs"
title="YouTube video player"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; fullscreen"
allowfullscreen>
</iframe>
</div>
<br/>

**ClearML Agent** is a virtual environment and execution manager for DL / ML solutions on GPU machines. It integrates with the **ClearML Python Package** and ClearML Server to provide a full AI cluster solution. <br/>
It focuses on:
- Reproducing tasks, including their complete environments.
- Scaling workflows on multiple target machines.

ClearML Agent executes a task or other workflow by reproducing the state of the code from the original machine
on a remote machine.

![ClearML Agent flow diagram](img/clearml_agent_flow_diagram.png)

The preceding diagram demonstrates a typical flow where an agent executes a task:
1. Enqueue a task for execution on the queue.
1. The agent pulls the task from the queue.
1. The agent launches a container in which to run the task's code.
1. The task's execution environment is set up:
   1. Execute any custom setup script configured.
   1. Install any required system packages.
   1. Clone the code from a git repository.
   1. Apply any uncommitted changes recorded.
   1. Set up the Python environment and required packages.
1. The task's script/code is executed.

:::note Python Version
ClearML Agent uses the Python version available in the environment or container in which it executes the code. It does not
install Python, so make sure to use a container or environment with the version you need.
:::
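
Because the agent does not install Python, a task script can fail fast when the container or environment provides an interpreter that is too old. The following is a minimal sketch using only the standard library; `REQUIRED_PYTHON`, `python_is_supported()`, and `ensure_supported_python()` are hypothetical names chosen for this example and are not part of ClearML.

```python
import sys

# Hypothetical minimum version for this example; use the version your code needs.
REQUIRED_PYTHON = (3, 8)


def python_is_supported(current=None, required=REQUIRED_PYTHON):
    """Return True when the interpreter version meets the required minimum."""
    current = tuple(current or sys.version_info[:2])
    return current >= tuple(required)


def ensure_supported_python(required=REQUIRED_PYTHON):
    """Raise a clear error if the running interpreter is older than required."""
    if not python_is_supported(required=required):
        found = ".".join(map(str, sys.version_info[:3]))
        needed = ".".join(map(str, required))
        raise RuntimeError(
            f"Python {needed}+ is required, but the environment provides {found}"
        )


# Call this at the top of the task's entry-point script, before heavy imports.
ensure_supported_python()
```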
While the agent is running, it continuously reports system metrics to the ClearML Server (these can be monitored in the
[**Orchestration**](webapp/webapp_workers_queues.md) page).

Once ClearML Agent is running on a target machine, you can reproduce tasks and execute
automated workflows in one (or both) of the following ways:
* Programmatically (using [`Task.enqueue()`](references/sdk/task.md#taskenqueue) or [`Task.execute_remotely()`](references/sdk/task.md#execute_remotely))
* Through the ClearML Web UI (without working directly with code), by cloning tasks and enqueuing them to the
queue that a ClearML Agent is servicing.
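
The programmatic option above can be sketched as follows. This is an illustrative example, not a prescribed workflow: it assumes the `clearml` package is installed and configured, that a queue named `default` exists and is serviced by an agent, and that the task ID, project name, and task name are supplied by you. The helper names `clone_and_enqueue()` and `run_current_script_remotely()` are hypothetical. `clearml` is imported inside the functions so that the module can be loaded on machines where it is not installed.

```python
def clone_and_enqueue(template_task_id, queue_name="default", new_name=None):
    """Clone an existing task and enqueue the clone for an agent to execute."""
    from clearml import Task  # imported lazily; requires the clearml package

    template = Task.get_task(task_id=template_task_id)
    cloned = Task.clone(source_task=template, name=new_name)
    Task.enqueue(task=cloned, queue_name=queue_name)
    return cloned.id


def run_current_script_remotely(project_name, task_name, queue_name="default"):
    """Register the running script as a task, then hand it off to an agent."""
    from clearml import Task  # imported lazily; requires the clearml package

    task = Task.init(project_name=project_name, task_name=task_name)
    # Local execution stops here and the task is enqueued; an agent servicing
    # the queue then re-runs the script from the start on the remote machine.
    task.execute_remotely(queue_name=queue_name, exit_process=True)
    return task
```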

The agent facilitates [overriding task execution details](webapp/webapp_exp_tuning.md) through the UI without
code modification. When you modify a cloned task's configuration, the ClearML Agent that executes it overrides the
original values:
* Modified package requirements will have the task script run with updated packages
* Modified recorded command line arguments will have the ClearML Agent inject the new values in their stead
* Code-level configuration instrumented with [`Task.connect()`](references/sdk/task.md#connect) will be overridden by modified hyperparameters
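
As an illustration of the last point, the sketch below connects a dictionary of defaults with `Task.connect()`. It assumes the `clearml` package is installed and configured; the project name, task name, parameter names, and the `train()` stand-in are placeholders invented for this example. When the script runs locally, the defaults are used and recorded; when an agent executes a clone whose hyperparameters were edited in the UI, the connected dictionary is expected to carry the edited values instead.

```python
def train(params):
    """Stand-in for real training code; returns the settings it would use."""
    return {"lr": params["learning_rate"], "epochs": params["epochs"]}


def main():
    from clearml import Task  # imported lazily; requires the clearml package

    task = Task.init(project_name="examples", task_name="connect demo")
    # Defaults used when the script runs locally.
    params = {"learning_rate": 0.01, "epochs": 10}
    # connect() registers the dictionary as the task's hyperparameters.
    # Under an agent, values modified in the cloned task replace these defaults.
    params = task.connect(params)
    print(train(params))

# Call main() from the script's entry point.
```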

ClearML Agent can be deployed in various setups to suit different workflows and infrastructure needs:
* [Bare Metal](clearml_agent/clearml_agent_deployment.md#spinning-up-an-agent)
* [Kubernetes](clearml_agent/clearml_agent_deployment.md#kubernetes)
* [Slurm](clearml_agent/clearml_agent_deployment.md#slurm)
* [Google Colab](guides/ide/google_colab.md)
## References
For more information, see the following:
* [ClearML Agent CLI](clearml_agent/clearml_agent_ref.md) for a reference of `clearml-agent`'s CLI commands.
* [ClearML Agent Environment Variables](clearml_agent/clearml_agent_env_var.md) for a list of environment variables
  to configure ClearML Agent.
* [Agent Section](configs/clearml_conf.md#agent-section) for a list of options to configure the ClearML Agent in the
  `clearml.conf` file.