Add clarification about Services Mode (#126)

pollfly 2021-12-01 16:18:55 +02:00 committed by GitHub
parent 8e789f6024
commit 3af0edb147
3 changed files with 25 additions and 20 deletions

@@ -558,25 +558,24 @@ will pull a Task from the `opportunistic` queue and allocate up to 4 GPUs based
being used by other agents).
## Services Mode
The ClearML Agent Services Mode executes an Agent that can execute multiple Tasks. This is useful for Tasks that are mostly
idling, such as periodic cleanup services, or a [pipeline controller](references/sdk/automation_controller_pipelinecontroller.md).
Launch a service Task like any other Task, by enqueuing it to the appropriate queue.
:::note
The default `clearml-server` configuration already runs a single `clearml-agent` in services mode that listens to the “services” queue.
:::
ClearML Agent supports a **Services Mode** where, as soon as a task is launched off of its queue, the agent moves on to the
next task without waiting for the previous one to complete. This mode is intended for running resource-sparse tasks that
are usually idling, such as periodic cleanup services or a [pipeline controller](references/sdk/automation_controller_pipelinecontroller.md).
To run a `clearml-agent` in services mode, run:
```bash
clearml-agent daemon --services-mode --queue services --create-queue --docker <docker_name> --cpu-only
```
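When the agent is launched from a script rather than typed by hand, the same command line can be assembled programmatically. The following is a minimal sketch, not part of ClearML: the helper name `services_agent_command` and the image name `ubuntu:20.04` are hypothetical, and only the flags already shown in the command above are used.

```python
import shlex


def services_agent_command(docker_image, queue="services", create_queue=True, cpu_only=True):
    """Build the argument list for a clearml-agent daemon in services mode.

    Hypothetical helper for illustration; it only assembles the flags
    that appear in the documented command.
    """
    if not docker_image:
        # Services mode currently only supports Docker mode, so an image is required.
        raise ValueError("services mode requires a Docker image")
    cmd = ["clearml-agent", "daemon", "--services-mode", "--queue", queue]
    if create_queue:
        cmd.append("--create-queue")
    cmd += ["--docker", docker_image]
    if cpu_only:
        cmd.append("--cpu-only")
    return cmd


if __name__ == "__main__":
    # "ubuntu:20.04" is only a placeholder image name.
    print(shlex.join(services_agent_command("ubuntu:20.04")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues if the image or queue name contains unusual characters.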
:::note
`services-mode` currently only supports Docker mode. Each service spins on its own Docker image.
:::note Notes
* `services-mode` currently only supports Docker mode. Each service spins on its own Docker image.
* The default `clearml-server` configuration already runs a single `clearml-agent` in services mode that listens to the
`services` queue.
:::
Launch a service task like any other task, by enqueuing it to the appropriate queue.
:::warning
Do not enqueue training or inference Tasks into the services queue. They will put an unnecessary load on the server.
Do not enqueue training or inference tasks into the services queue. They will put an unnecessary load on the server.
:::
### Setting Server Credentials

@@ -87,19 +87,25 @@ The Agent has three running modes:
- Conda Environment Mode: Similar to the Virtual Environment mode, except that instead of using pip alone, it uses a
combination of conda install and pip. Note that this mode is quite brittle due to the Conda package version support table.
## Services Agent & Queue
## Services Mode
The ClearML Agent, in its default setup, spins a single Task per Agent. It's possible to run multiple agents on the same machine,
but each one will execute a single Task at a time.<br/>
This setup makes sense for compute-heavy Tasks that might take some time to complete.
Some tasks, mainly control (Like a pipeline controller) or services (Like an archive cleanup service) are mostly idling, and only implement a thin control logic.<br/>
In its default mode, a ClearML Agent executes a single task at a time, since training tasks typically require all resources
available to them. Some tasks are mostly idling and require less computation power, such as controller tasks (e.g.
a pipeline controller) or service tasks (e.g. cleanup service).
This is where `services-mode` comes into play. An agent running in services-mode will spin multiple tasks at the same time, and each Task will register itself as a sub-agent (visible in the workers Tab in the UI).
Some examples for suitable tasks are:
This is where `services-mode` comes into play. An agent running in `services-mode` will let multiple tasks execute
in parallel (each task will register itself as a sub-agent, visible in the [Workers & Queues](../webapp/webapp_workers_queues.md) tab in the UI).
This mode is intended for running maintenance tasks. Some suitable tasks include:
- [Pipeline controller](../guides/pipeline/pipeline_controller.md) - Implementing the pipeline scheduling and logic
- [Hyper-Parameter Optimization](../guides/optimization/hyper-parameter-optimization/examples_hyperparam_opt.md) - Implementing an active selection of experiments
- [Control Service](../guides/services/aws_autoscaler.md) - AWS Autoscaler for example
- [External services](../guides/services/slack_alerts.md) - Such as Slack integration alert service
By default, [ClearML Server](../deploying_clearml/clearml_server.md) comes with an Agent running on the machine that runs it. It also comes with a Services queue.
:::warning
Do not enqueue training or inference tasks into the services queue. They will put an unnecessary load on the server.
:::
By default, the open source [ClearML Server](../deploying_clearml/clearml_server.md) runs a single clearml-agent in
services mode that listens to the `services` queue.
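The difference between the default mode and services mode can be made concrete with a toy timeline model. This is only an illustrative sketch, not ClearML code: the function `schedule`, its `launch_overhead` parameter, and the task durations are all hypothetical, and the model assumes a single agent pulling tasks from its queue in order.

```python
def schedule(durations, services_mode, launch_overhead=0.0):
    """Return (start, end) times for tasks pulled in order by one agent.

    Toy model only: in the default mode the agent waits for each task to
    finish before pulling the next one; in services mode it moves on as
    soon as the task has been launched.
    """
    timeline = []
    clock = 0.0  # time at which the agent is free to pull the next task
    for duration in durations:
        start = clock + launch_overhead
        end = start + duration
        timeline.append((start, end))
        # Services mode: agent is free right after launch.
        # Default mode: agent is busy until the task completes.
        clock = start if services_mode else end
    return timeline


if __name__ == "__main__":
    tasks = [5, 3, 2]  # hypothetical task durations
    print("default :", schedule(tasks, services_mode=False))
    print("services:", schedule(tasks, services_mode=True))
```

In this model the default mode finishes the last task at the sum of all durations, while services mode finishes at roughly the longest single duration. That is why the mode suits mostly idle tasks, and why compute-heavy tasks running side by side would compete for the same machine.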

@@ -53,7 +53,7 @@ The pipeline control logic is processed in a background thread.
:::note
We recommend enqueuing Pipeline Controller Tasks into a
[services](agents_and_queues.md#services-agent--queue) queue
[services](../clearml_agent.md#services-mode) queue.
:::
Callback functions can be specified to be called in the steps of a `PipelineController` object.