---
title: Manual Deployment
---

## Spinning Up an Agent
You can spin up an agent on any machine: on-prem and/or cloud instance. When spinning up an agent, you assign it to
service one or more queues. Utilize the machine by enqueuing tasks to a queue that the agent is servicing, and the agent will
pull and execute the tasks.

:::tip cross-platform execution
ClearML Agent is platform-agnostic. When using the ClearML Agent to execute tasks cross-platform, set
platform-specific environment variables before launching the agent.

For example, to run an agent on an ARM device, set the core type environment variable before spinning up the agent:

```bash
export OPENBLAS_CORETYPE=ARMV8
clearml-agent daemon --queue <queue_name>
```
:::

### Executing an Agent
To execute an agent that listens to a queue, run:

```bash
clearml-agent daemon --queue <queue_name>
```

### Executing in Background
To execute an agent in the background, run:
```bash
clearml-agent daemon --queue <execution_queue_to_pull_from> --detached
```
### Stopping Agents
To stop an agent running in the background, run:
```bash
clearml-agent daemon <arguments> --stop
```

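As an illustration, the sketch below builds the start command and the stop command from the same queue argument, so that `<arguments>` stay identical between the two. It assumes that `--stop` identifies the running agent by its other arguments. The `agent_cmd` helper and the `default` queue name are hypothetical, and the helper only prints the commands instead of running them.

```bash
# Hypothetical helper: print a clearml-agent daemon command for a queue,
# followed by a mode flag (--detached to start, --stop to stop).
agent_cmd() {
    queue="$1"
    mode="$2"
    printf 'clearml-agent daemon --queue %s %s\n' "$queue" "$mode"
}

# Dry run: show the start command and the matching stop command.
agent_cmd default --detached
agent_cmd default --stop
```

To run a printed command, paste it into the terminal or pipe the helper's output to `sh`.
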
### Allocating Resources
To specify GPUs associated with the agent, add the `--gpus` flag.

:::info Docker Mode
Make sure to include the `--docker` flag, as GPU management through the agent is only supported in [Docker Mode](clearml_agent_execution_env.md#docker-mode).
:::

To execute multiple agents on the same machine (usually assigning a different GPU to each agent), run:
```bash
clearml-agent daemon --gpus 0 --queue default --docker
clearml-agent daemon --gpus 1 --queue default --docker
```
To allocate more than one GPU to an agent, provide a comma-separated list of GPUs:
```bash
clearml-agent daemon --gpus 0,1 --queue dual_gpu --docker
```

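On machines with several GPUs, the one-agent-per-GPU pattern above can be generated with a loop. The following is a minimal sketch: the `per_gpu_agent_cmds` helper is hypothetical, the GPU count is passed in by hand rather than detected, and `--detached` is added on the assumption that each agent should run in the background. The helper only prints the commands.

```bash
# Hypothetical helper: print one Docker-mode agent command per GPU index.
# Arguments: queue name, then the number of GPUs on the machine.
per_gpu_agent_cmds() {
    queue="$1"
    gpu_count="$2"
    gpu=0
    while [ "$gpu" -lt "$gpu_count" ]; do
        printf 'clearml-agent daemon --gpus %s --queue %s --docker --detached\n' "$gpu" "$queue"
        gpu=$((gpu + 1))
    done
}

# Dry run for a machine assumed to have 2 GPUs.
per_gpu_agent_cmds default 2
```
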
### Queue Prioritization
A single agent can listen to multiple queues. Queue priority is set by the order in which the queues are listed.

```bash
clearml-agent daemon --queue high_q low_q
```
With this command, the agent first tries to pull a Task from the `high_q` queue, and only if that queue is empty does it try
to pull from the `low_q` queue.

To make sure an agent pulls from all queues equally, add the `--order-fairness` flag.
```bash
clearml-agent daemon --queue group_a group_b --order-fairness
```
With this flag, the agent pulls from the `group_a` queue, then from `group_b`, then back to `group_a`, and so on. This ensures
that neither `group_a` nor `group_b` can starve the other of resources.

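The difference between the two modes can be seen in a toy model. The sketch below is not the agent's implementation; it only simulates the pull order described above for two queues, labeled `A` (listed first) and `B` (listed second). The `pull_order` function and the mode names `priority` and `fairness` are hypothetical.

```bash
# Toy model (not the agent's implementation): print the order in which
# tasks would be pulled from two queues, A (listed first) and B.
# Arguments: mode ("priority" or "fairness"), tasks in A, tasks in B.
pull_order() {
    mode="$1"
    a="$2"
    b="$3"
    turn=a   # fairness mode only: the queue to try first on the next pull
    out=""
    while [ "$a" -gt 0 ] || [ "$b" -gt 0 ]; do
        pick=a
        if [ "$mode" = "fairness" ] && [ "$turn" = "b" ]; then pick=b; fi
        # Fall back to the other queue when the preferred one is empty.
        if [ "$pick" = "a" ] && [ "$a" -eq 0 ]; then pick=b; fi
        if [ "$pick" = "b" ] && [ "$b" -eq 0 ]; then pick=a; fi
        if [ "$pick" = "a" ]; then
            a=$((a - 1)); turn=b; out="$out${out:+ }A"
        else
            b=$((b - 1)); turn=a; out="$out${out:+ }B"
        fi
    done
    printf '%s\n' "$out"
}

pull_order priority 2 2   # prints: A A B B
pull_order fairness 2 2   # prints: A B A B
```

In the `priority` run, queue `B` is served only once `A` is empty. In the `fairness` run, the two queues alternate.
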
### SSH Access
By default, ClearML Agent maps the host's `~/.ssh` into the container's `/root/.ssh` directory (configurable,
see [clearml.conf](../configs/clearml_conf.md#docker_internal_mounts)).

If you want to use existing auth sockets with ssh-agent, you can verify your host ssh-agent is working correctly with:

```commandline
echo $SSH_AUTH_SOCK
```

You should see a path to a temporary file, something like this:

```console
/tmp/ssh-<random>/agent.<random>
```

Then run your `clearml-agent` in Docker mode. The agent automatically detects the `SSH_AUTH_SOCK` environment variable
and mounts the socket into any container it spins up.

You can also explicitly set the `SSH_AUTH_SOCK` environment variable when executing an agent. The command below
executes an agent in Docker mode and assigns it to service a queue. The agent will have access to
the SSH socket provided in the environment variable.

```bash
SSH_AUTH_SOCK=<file_socket> clearml-agent daemon --gpus <your config> --queue <your queue name> --docker
```

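Printing `SSH_AUTH_SOCK` shows only that the variable is set, not that the path is a live socket. As a minimal sketch, the check below also tests that the path exists and is a socket file, using the standard `-S` test. The `ssh_agent_socket_ok` function name is hypothetical and is not part of ClearML Agent.

```bash
# Return 0 if SSH_AUTH_SOCK is set and points to an existing socket file.
ssh_agent_socket_ok() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

if ssh_agent_socket_ok; then
    echo "ssh-agent socket found: $SSH_AUTH_SOCK"
else
    echo "SSH_AUTH_SOCK is unset or not a socket; start ssh-agent first" >&2
fi
```
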
## Google Colab

ClearML Agent can run on a [Google Colab](https://colab.research.google.com/) instance. This lets users leverage
compute resources provided by Google Colab and send tasks for execution on it.

Check out [this tutorial](../guides/ide/google_colab.md) on how to run a ClearML Agent on Google Colab!

## Explicit Task Execution

ClearML Agent can also execute specific tasks directly, without listening to a queue.

### Execute a Task without Queue

Execute a Task with a `clearml-agent` worker without a queue.
```bash
clearml-agent execute --id <task-id>
```
### Clone a Task and Execute the Cloned Task

Clone the specified Task and execute the cloned Task with a `clearml-agent` worker without a queue.
```bash
clearml-agent execute --id <task-id> --clone
```

### Execute Task inside a Docker

Execute a Task with a `clearml-agent` worker using a Docker container without a queue.
```bash
clearml-agent execute --id <task-id> --docker
```

## Debugging

Run a `clearml-agent` daemon in foreground mode, sending all output to the console.
```bash
clearml-agent daemon --queue default --foreground
```