Add clarification about Services Mode (#126)

pollfly 2021-12-01 16:18:55 +02:00 committed by GitHub
parent 8e789f6024
commit 3af0edb147
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 25 additions and 20 deletions

@@ -558,25 +558,24 @@ will pull a Task from the `opportunistic` queue and allocate up to 4 GPUs based
 being used by other agents).
 ## Services Mode
-The ClearML Agent Services Mode executes an Agent that can execute multiple Tasks. This is useful for Tasks that are mostly
-idling, such as periodic cleanup services, or a [pipeline controller](references/sdk/automation_controller_pipelinecontroller.md).
-Launch a service Task like any other Task, by enqueuing it to the appropriate queue.
-:::note
-The default `clearml-server` configuration already runs a single `clearml-agent` in services mode that listens to the “services” queue.
-:::
+ClearML Agent supports a **Services Mode** where, as soon as a task is launched off of its queue, the agent moves on to the
+next task without waiting for the previous one to complete. This mode is intended for running resource-sparse tasks that
+are usually idling, such as periodic cleanup services or a [pipeline controller](references/sdk/automation_controller_pipelinecontroller.md).
 To run a `clearml-agent` in services mode, run:
 ```bash
 clearml-agent daemon --services-mode --queue services --create-queue --docker <docker_name> --cpu-only
 ```
-:::note
-`services-mode` currently only supports Docker mode. Each service spins on its own Docker image.
+:::note Notes
+* `services-mode` currently only supports Docker mode. Each service spins on its own Docker image.
+* The default `clearml-server` configuration already runs a single `clearml-agent` in services mode that listens to the
+`services` queue.
 :::
+Launch a service task like any other task, by enqueuing it to the appropriate queue.
 :::warning
-Do not enqueue training or inference Tasks into the services queue. They will put an unnecessary load on the server.
+Do not enqueue training or inference tasks into the services queue. They will put an unnecessary load on the server.
 :::
 ### Setting Server Credentials

@@ -87,19 +87,25 @@ The Agent has three running modes:
 - Conda Environment Mode: Similar to the Virtual Environment mode, only instead of using pip, it uses conda install and
 pip combination. Notice this mode is quite brittle due to the Conda package version support table.
-## Services Agent & Queue
-The ClearML Agent, in its default setup, spins a single Task per Agent. It's possible to run multiple agents on the same machine,
-but each one will execute a single Task at a time.<br/>
-This setup makes sense compute-heavy Tasks that might take some time to complete.
-Some tasks, mainly control (Like a pipeline controller) or services (Like an archive cleanup service) are mostly idling, and only implement a thin control logic.<br/>
-This is where the `services-modes` comes into play. An agent running in services-mode will spin multiple tasks at the same time, each Task will register itself as a sub-agent (visible in the workers Tab in the UI).
-Some examples for suitable tasks are:
+## Services Mode
+In its default mode, a ClearML Agent executes a single task at a time, since training tasks typically require all resources
+available to them. Some tasks are mostly idling and require less computation power, such as controller tasks (e.g.
+a pipeline controller) or service tasks (e.g. cleanup service).
+This is where the `services-modes` comes into play. An agent running in `services-mode` will let multiple tasks execute
+in parallel (each task will register itself as a sub-agent, visible in the [Workers & Queues](../webapp/webapp_workers_queues.md) tab in the UI).
+This mode is intended for running maintenance tasks. Some suitable tasks include:
 - [Pipeline controller](../guides/pipeline/pipeline_controller.md) - Implementing the pipeline scheduling and logic
 - [Hyper-Parameter Optimization](../guides/optimization/hyper-parameter-optimization/examples_hyperparam_opt.md) - Implementing an active selection of experiments
 - [Control Service](../guides/services/aws_autoscaler.md) - AWS Autoscaler for example
 - [External services](../guides/services/slack_alerts.md) - Such as Slack integration alert service
-By default, [ClearML Server](../deploying_clearml/clearml_server.md) comes with an Agent running on the machine that runs it. It also comes with a Services queue.
+:::warning
+Do not enqueue training or inference tasks into the services queue. They will put an unnecessary load on the server.
+:::
+By default, the open source [ClearML Server](../deploying_clearml/clearml_server.md) runs a single clearml-agent in
+services mode that listens to the `services` queue.

@@ -53,7 +53,7 @@ The pipeline control logic is processed in a background thread.
 :::note
 We recommend enqueuing Pipeline Controller Tasks into a
-[services](agents_and_queues.md#services-agent--queue) queue
+[services](../clearml_agent.md#services-mode) queue.
 :::
 Callback functions can be specified to be called in the steps of a `PipelineController` object.