---
title: Best Practices
---

This section explains the reasoning behind ClearML's design and how that design maps onto AI workflows.

While ClearML was designed to fit into any workflow, the practices described below bring many advantages, from organizing your workflow
to preparing it to scale in the long term.

:::important
The following is only an opinion. ClearML is designed to accommodate any workflow, whether it conforms to our way or not!
:::
## Develop Locally

**Work on a machine that is easily manageable!**

During the early stages of model development, while code is still being modified heavily, this is the usual setup we'd expect data scientists to use:

- **Local development machine**, usually a laptop (and usually using only CPU) with a fraction of the dataset for faster
  iterations. Use a local machine for writing, training, and debugging pipeline code.
- **Workstation with a GPU**, usually with a limited amount of memory for small batch sizes. Use this workstation to train
  the model and to verify that both the model you chose and the training procedure make sense. It can also be used to provide initial models for testing.

These setups might be folded into each other, and that's great! If every researcher has their own GPU machine, that's awesome!

The goal of this phase is to get your code, dataset, and environment set up, so you can start digging to find the best model!

- [ClearML SDK](../../clearml_sdk/clearml_sdk.md) should be integrated into your code (check out [Getting Started](ds_first_steps.md)).
  This helps you visualize results and track progress.
- [ClearML Agent](../../clearml_agent.md) helps you move your work to other machines without the hassle of rebuilding the environment every time,
  while also providing a simple queue interface that lets you line up experiments to be executed one by one
  (great for ensuring that the GPUs are churning through the weekend).
- [ClearML Session](../../apps/clearml_session.md) helps you develop on remote machines, just as you would on your local laptop!
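
Integrating the SDK is usually a two-line change at the top of your script. A minimal sketch (the project and task names below are placeholders, and this assumes a configured ClearML server or the hosted service):

```python
from clearml import Task

# One call at the start of the script enables automatic logging of metrics,
# parameters, git info, uncommitted changes, and the Python environment.
# "examples" and "my experiment" are placeholder names.
task = Task.init(project_name="examples", task_name="my experiment")
```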

## Train Remotely

In this phase, you scale your training efforts and try to find the code / parameter / data combination that
yields the best-performing model for your task!

- The real training (usually) should **not** be executed on your development machine.
- Training sessions should be launched and monitored from a web UI.
- You should be able to continue coding while experiments are being executed, without interrupting them.
- Stop optimizing your code because your machine struggles; run it on a beefier machine (cloud / on-prem).

Visualization and comparison dashboards help you keep your sanity! At this stage you usually have a docker container with all the binaries
that you need.

- [ClearML SDK](../../clearml_sdk/clearml_sdk.md) ensures that all the metrics, parameters, and models are automatically logged and can later be
  accessed, [compared](../../webapp/webapp_exp_comparing.md), and [tracked](../../webapp/webapp_exp_track_visual.md).
- [ClearML Agent](../../clearml_agent.md) does the heavy lifting. It reproduces the execution environment, clones your code,
  applies code patches, manages parameters (including overriding them on the fly), executes the code, and queues multiple tasks.
  It can even [build](../../clearml_agent/clearml_agent_docker.md#exporting-a-task-into-a-standalone-docker-container) the docker container for you!
- [ClearML Pipelines](../../pipelines/pipelines.md) ensure that steps run in the same order,
  programmatically chaining tasks together, while giving an overview of the execution pipeline's status.
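
A common agent pattern is to start a run locally and hand it off to a queue. A minimal sketch (the `default` queue name is an assumption; use a queue you have actually created):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="remote training")

# Stop the local process here and enqueue this task for an agent to pick up;
# the agent reproduces the environment, clones the code, and runs it remotely.
task.execute_remotely(queue_name="default")

# Anything below this line executes on the remote machine.
```

On the training machine, a worker is launched with `clearml-agent daemon --queue default` (add `--docker` to execute inside a container).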

**Your entire environment should magically be able to run on any machine, without you working hard.**

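
The pipeline chaining described above can be sketched with `PipelineController`, which reuses tasks already logged in the platform (all project, task, and queue names below are placeholders):

```python
from clearml import PipelineController

# Chain two existing tasks into a pipeline; each step clones its base task.
pipe = PipelineController(name="my pipeline", project="examples", version="1.0.0")
pipe.add_step(name="prepare", base_task_project="examples", base_task_name="prepare data")
pipe.add_step(name="train", parents=["prepare"],
              base_task_project="examples", base_task_name="remote training")

# Launch the pipeline logic on a queue; steps are dispatched as they become ready.
pipe.start(queue="default")
```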
## Track EVERYTHING

Track everything! From obscure parameters to weird metrics, it's impossible to know what will end up
improving your results later on!

- Make sure experiments are reproducible! ClearML logs code, parameters, and environment in a single, easily searchable place.
- Development is not linear. Configuration / parameters should not be stored in your git repository, as
  they are temporary and constantly changing. They still need to be logged, because who knows, one day...
- Uncommitted changes to your code should be stored for later forensics, in case that magic number actually saved the day. Not every line change should be committed.
- Mark potentially good experiments and make them the new baseline for comparison.
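
Keeping parameters out of git while still logging them can be as simple as connecting a plain dictionary; a minimal sketch (names and values are placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="track everything")

# task.connect() logs the defaults below; when the task is cloned and edited
# in the UI, the edited values override these defaults at runtime.
params = {"learning_rate": 0.001, "batch_size": 32, "magic_number": 42}
params = task.connect(params)

# Uncommitted code changes are captured automatically at Task.init() time,
# so that "magic number" survives even if it was never committed.
```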
## Visibility Matters

While you can track experiments with one tool and pipeline them with another, having
everything under the same roof has its benefits!

Being able to track experiment progress, compare experiments, and, based on that, send experiments for execution on remote
machines (which also build the environment themselves) has tremendous benefits in terms of visibility and ease of integration.

Having visibility into your pipeline, while reusing experiments already defined in the platform,
gives users a clearer picture of the pipeline's status
and makes it easier to start using pipelines earlier in the process by simplifying the chaining of tasks.
Managing datasets with the same tools and APIs that manage the experiments also lowers the barrier of entry into
experiment and data provenance.
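
Those same APIs cover data versioning; a minimal sketch of creating and consuming a versioned dataset (project, dataset names, and paths are placeholders):

```python
from clearml import Dataset

# Create a new dataset version, add local files, and upload it.
dataset = Dataset.create(dataset_project="examples", dataset_name="my dataset")
dataset.add_files(path="data/")
dataset.upload()
dataset.finalize()

# Elsewhere (e.g., inside a training task), fetch a cached local copy.
local_path = Dataset.get(
    dataset_project="examples", dataset_name="my dataset"
).get_local_copy()
```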