clearml-docs/clearml_agent_deployment_bare_metal.md at 567af28632e76c9063bfa05b985665dd0c4a494c

mirror of https://github.com/clearml/clearml-docs synced 2025-02-25 05:24:39 +00:00

Noam Wasersprung 567af28632

Restructure docs for platform components and use case clarity (#1048 )

2025-02-23 17:33:55 +02:00

4.5 KiB


title
Manual Deployment

Spinning Up an Agent

You can spin up an agent on any machine, whether on-prem or a cloud instance. When spinning up an agent, you assign it to service one or more queues. To utilize the machine, enqueue tasks to a queue that the agent is servicing; the agent will pull and execute them.

:::tip cross-platform execution
ClearML Agent is platform-agnostic. When using ClearML Agent to execute tasks cross-platform, set platform-specific environment variables before launching the agent.

For example, to run an agent on an ARM device, set the core type environment variable before spinning up the agent:

export OPENBLAS_CORETYPE=ARMV8
clearml-agent daemon --queue <queue_name>

:::
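The ARM example in the tip above can be folded into a small launch script that sets the variable only where it applies. This is a minimal sketch: the `coretype_for_arch` helper and the `QUEUE_NAME` variable are hypothetical names, and treating `aarch64`/`arm64` (as reported by `uname -m`) as the ARM case is an assumption. Only the `ARMV8` value comes from the text above.

```shell
#!/bin/sh
# Map a machine architecture string (as printed by `uname -m`) to an
# OPENBLAS_CORETYPE value. Assumption: aarch64/arm64 identify an ARM device.
coretype_for_arch() {
    case "$1" in
        aarch64|arm64) echo "ARMV8" ;;
        *)             echo "" ;;
    esac
}

# Export the variable only when the current machine maps to a core type.
coretype=$(coretype_for_arch "$(uname -m)")
if [ -n "$coretype" ]; then
    export OPENBLAS_CORETYPE="$coretype"
fi

# Print the launch command instead of running it; drop `echo` to start the agent.
# QUEUE_NAME is a hypothetical variable that falls back to "default".
echo clearml-agent daemon --queue "${QUEUE_NAME:-default}"
```

On other architectures the script leaves the environment untouched and launches the agent as usual.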

Executing an Agent

To execute an agent that listens to a queue, run:

clearml-agent daemon --queue <queue_name>

Executing in Background

To execute an agent in the background, run:

clearml-agent daemon --queue <execution_queue_to_pull_from> --detached

Stopping Agents

To stop an agent running in the background, run:

clearml-agent daemon <arguments> --stop
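One way to keep the background start and stop commands in sync is to define the daemon arguments once and reuse them. The sketch below assumes that `<arguments>` in the stop command should match the arguments the agent was started with. The `AGENT_ARGS` variable and both helper functions are hypothetical, and the queue name is a placeholder.

```shell
#!/bin/sh
# Keep the daemon arguments in one variable so that the start and stop
# commands cannot drift apart. The queue name is a placeholder.
AGENT_ARGS="--queue default"

# Build the command that starts the agent in the background.
start_agent_cmd() { echo "clearml-agent daemon $AGENT_ARGS --detached"; }

# Build the command that stops that same agent.
stop_agent_cmd()  { echo "clearml-agent daemon $AGENT_ARGS --stop"; }

# Both functions only print the command; pipe the output to `sh` to run it.
start_agent_cmd
stop_agent_cmd
```

For example, `start_agent_cmd | sh` would launch the agent, and `stop_agent_cmd | sh` would later stop it.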

Allocating Resources

To specify GPUs associated with the agent, add the --gpus flag.

:::info Docker Mode
Make sure to include the --docker flag, as GPU management through the agent is only supported in Docker Mode.
:::

To execute multiple agents on the same machine (usually assigning a different GPU to each agent), run:

clearml-agent daemon --gpus 0 --queue default --docker
clearml-agent daemon --gpus 1 --queue default --docker

To allocate more than one GPU to an agent, provide a comma-separated list of GPUs:

clearml-agent daemon --gpus 0,1 --queue dual_gpu --docker
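When a machine has several GPUs, the one-agent-per-GPU pattern above can be generated in a loop rather than typed by hand. The following is a sketch only: `per_gpu_agent_cmds` is a hypothetical helper, the GPU count and queue name are supplied by the caller, and `--detached` is added on the assumption that each agent should return control to the shell.

```shell
#!/bin/sh
# Print one daemon command per GPU index 0..(count-1), all serving one queue.
# $1 = number of GPUs, $2 = queue name (both supplied by the caller).
per_gpu_agent_cmds() {
    count=$1
    queue=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "clearml-agent daemon --gpus $i --queue $queue --docker --detached"
        i=$((i + 1))
    done
}

# Example: two GPUs, both agents pulling from the "default" queue.
per_gpu_agent_cmds 2 default
```

The function only prints the commands, so they can be reviewed first. Piping the output to `sh` would execute them.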

Queue Prioritization

A single agent can listen to multiple queues. Priority is determined by the order in which the queues are listed:

clearml-agent daemon --queue high_q low_q

The agent first tries to pull a Task from the high_q queue, and pulls from the low_q queue only if high_q is empty.

To make sure an agent pulls from all queues equally, add the --order-fairness flag:

clearml-agent daemon --queue group_a group_b --order-fairness

The agent then pulls from the group_a queue, then from group_b, then back to group_a, and so on. This ensures that neither group_a nor group_b can starve the other of resources.
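The difference between the two pull orders can be illustrated with a toy model. This is not the agent's implementation, only a sketch of the behavior described above, under the assumption that no new tasks arrive while the queues are drained. Each queue is modeled as a space-separated list of task names, and all function names are hypothetical.

```shell
#!/bin/sh
# Toy model of pull order; NOT the agent's implementation.
# Each queue is a space-separated list of pending task names.

# Default behavior: drain the first queue before touching the second.
pull_by_priority() {
    for task in $1 $2; do
        echo "$task"
    done
}

# Print a space-separated list without its first word.
rest_of() {
    case "$1" in
        *" "*) echo "${1#* }" ;;
        *)     echo "" ;;
    esac
}

# With --order-fairness: alternate between the queues, one task at a time.
pull_with_fairness() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        if [ -n "$a" ]; then echo "${a%% *}"; a=$(rest_of "$a"); fi
        if [ -n "$b" ]; then echo "${b%% *}"; b=$(rest_of "$b"); fi
    done
}

pull_by_priority   "a1 a2 a3" "b1 b2"   # a1 a2 a3 b1 b2 (one per line)
pull_with_fairness "a1 a2 a3" "b1 b2"   # a1 b1 a2 b2 a3 (one per line)
```

In the first call every task from the first queue runs before any task from the second. In the second call the two queues take turns until one runs dry.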

SSH Access

By default, ClearML Agent maps the host's ~/.ssh into the container's /root/.ssh directory (configurable, see clearml.conf).

If you want to use existing auth sockets with ssh-agent, you can verify your host ssh-agent is working correctly with:

echo $SSH_AUTH_SOCK

You should see a path to a temporary file, something like this:

/tmp/ssh-<random>/agent.<random>

Then run your clearml-agent in Docker mode. It will automatically detect the SSH_AUTH_SOCK environment variable and mount the socket into any container it spins up.

You can also explicitly set the SSH_AUTH_SOCK environment variable when executing an agent. The command below will execute an agent in Docker mode and assign it to service a queue. The agent will have access to the SSH socket provided in the environment variable.

SSH_AUTH_SOCK=<file_socket> clearml-agent daemon --gpus <your config> --queue <your queue name> --docker
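Before launching the agent, it may help to confirm that SSH_AUTH_SOCK actually points to a socket, since an unset or stale value would leave containers without SSH access. The check below is a sketch: `is_usable_auth_sock` is a hypothetical helper, the queue name is a placeholder, and the test only verifies that the path is a Unix socket, not that ssh-agent holds any keys.

```shell
#!/bin/sh
# Return 0 if $1 is a non-empty path that points to a Unix socket.
is_usable_auth_sock() {
    [ -n "$1" ] && [ -S "$1" ]
}

if is_usable_auth_sock "${SSH_AUTH_SOCK:-}"; then
    # Printed rather than executed; drop `echo` to start the agent.
    echo clearml-agent daemon --queue default --docker
else
    echo "SSH_AUTH_SOCK is unset or not a socket; start ssh-agent first" >&2
fi
```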

Google Colab

ClearML Agent can run on a Google Colab instance. This lets users leverage the compute resources provided by Google Colab by sending tasks to it for execution.

Check out this tutorial on how to run a ClearML Agent on Google Colab!

Explicit Task Execution

ClearML Agent can also execute specific tasks directly, without listening to a queue.

Execute a Task without Queue

Execute a Task with a clearml-agent worker without a queue.

clearml-agent execute --id <task-id>

Clone a Task and Execute the Cloned Task

Clone the specified Task and execute the cloned Task with a clearml-agent worker without a queue.

clearml-agent execute --id <task-id> --clone

Execute a Task inside a Docker Container

Execute a Task with a clearml-agent worker using a Docker container without a queue.

clearml-agent execute --id <task-id> --docker

Debugging

Run a clearml-agent daemon in foreground mode, sending all output to the console.

clearml-agent daemon --queue default --foreground
