clearml-docs/docs/clearml_sdk/task_sdk.md
2022-05-22 10:27:30 +03:00

597 lines
24 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: Task
---
The following page provides an overview of the basic Pythonic interface to ClearML Tasks.
## Task Creation
[`Task.init`](../references/sdk/task.md) is the main method used to create tasks in ClearML. It will create a task, and
populate it with:
* A link to the running git repository (including commit ID and local uncommitted changes)
* Python packages used (i.e. directly imported Python packages, and the versions available on the machine)
* Argparse arguments (default and specific to the current execution)
* Reports to Tensorboard & Matplotlib and model checkpoints.
:::note
ClearML object (e.g. task, project) names are required to be at least 3 characters long
:::
```python
from clearml import Task
task = Task.init(
project_name='example', # project name of at least 3 characters
task_name='task template', # task name of at least 3 characters
task_type=None,
tags=None,
reuse_last_task_id=True,
continue_last_task=False,
output_uri=None,
auto_connect_arg_parser=True,
auto_connect_frameworks=True,
auto_resource_monitoring=True,
auto_connect_streams=True,
)
```
Once a task is created, the task object can be accessed from anywhere in the code by calling [`Task.current_task`](../references/sdk/task.md#taskcurrent_task).
If multiple tasks need to be created in the same process (for example, for logging multiple manual runs),
make sure to close a task, before initializing a new one. To close a task simply call [`Task.close`](../references/sdk/task.md#close)
(see example [here](../guides/advanced/multiple_tasks_single_process.md)).
When initializing a task, its project needs to be specified. If the project entered does not exist, it will be created on-the-fly.
Projects can be divided into sub-projects, just like folders are broken into sub-folders.
For example:
```python
Task.init(project_name='main_project/sub_project', task_name='test')
```
Nesting projects works on multiple levels. For example: `project_name=main_project/sub_project/sub_sub_project`
### Automatic Logging
After invoking `Task.init` in a script, ClearML starts its automagical logging, which includes the following elements:
* **Hyperparameters** - ClearML logs the following types of hyperparameters:
* Command Line Parsing - ClearML captures any command line parameters passed when invoking code that uses standard python packages, including:
* [click](https://click.palletsprojects.com) (see code example [here](https://github.com/allegroai/clearml/blob/master/examples/frameworks/click/click_multi_cmd.py)).
* argparse (see argparse logging example [here](../guides/reporting/hyper_parameters.md).)
* [Python Fire](https://github.com/google/python-fire) - see code examples [here](https://github.com/allegroai/clearml/tree/master/examples/frameworks/fire).
* [LightningCLI](https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_cli.html) - see code example [here](https://github.com/allegroai/clearml/blob/master/examples/frameworks/jsonargparse/pytorch_lightning_cli.py).
* TensorFlow Definitions (`absl-py`)
* [Hydra](https://github.com/facebookresearch/hydra) - the Omegaconf which holds all the configuration files, as well as overridden values.
* **Models** - ClearML automatically logs and updates the models and all snapshot paths saved with the following frameworks:
* Tensorflow (see [code example](../guides/frameworks/tensorflow/tensorflow_mnist.md))
* Keras (see [code example](../guides/frameworks/keras/keras_tensorboard.md))
* Pytorch (see [code example](../guides/frameworks/pytorch/pytorch_mnist.md))
* scikit-learn (only using joblib) (see [code example](../guides/frameworks/scikit-learn/sklearn_joblib_example.md))
* XGBoost (only using joblib) (see [code example](../guides/frameworks/xgboost/xgboost_sample.md))
* FastAI (see [code example](../guides/frameworks/fastai/fastai_with_tensorboard.md))
* MegEngine (see [code example](../guides/frameworks/megengine/megengine_mnist.md))
* CatBoost (see [code example](../guides/frameworks/catboost/catboost.md))
* **Metrics, scalars, plots, debug images** reported through supported frameworks, including:
* Matplotlib (see [code example](../guides/frameworks/matplotlib/matplotlib_example.md))
* Tensorboard (see [code example](../guides/frameworks/pytorch/pytorch_tensorboardx.md))
* TensorboardX (see [code example](../guides/frameworks/tensorboardx/tensorboardx.md))
* **Execution details** including:
* Git information
* Uncommitted code modifications - In cases where no git repository is detected (e.g. when a single python script is
executed outside a git repository, or when running from a Jupyter Notebook), ClearML logs the contents
of the executed script
* Python environment
* Execution [configuration](../webapp/webapp_exp_track_visual.md#configuration)
To control a task's framework logging, use the `auto_connect_framworks` parameter of the [`Task.init`](../references/sdk/task.md#taskinit)
method. Turn off all automatic logging by setting the parameter to `False`. For finer grained control of logged frameworks,
input a dictionary, with framework-boolean pairs.
For example:
```python
auto_connect_frameworks={
'matplotlib': True, 'tensorflow': False, 'tensorboard': False, 'pytorch': True,
'xgboost': False, 'scikit': True, 'fastai': True, 'lightgbm': False,
'hydra': True, 'detect_repository': True, 'tfdefines': True, 'joblib': True,
'megengine': True, 'jsonargparse': True, 'catboost': True
}
```
### Task Reuse
Every `Task.init` call will create a new task for the current execution.
In order to mitigate the clutter that a multitude of debugging tasks might create, a task will be reused if:
* The last time it was executed (on this machine) was under 72 hours ago (configurable, see
`sdk.development.task_reuse_time_window_in_hours` in the [`sdk.development` section](../configs/clearml_conf.md#sdkdevelopment) of
the ClearML configuration reference)
* The previous task execution did not have any artifacts / models
It's possible to always create a new task by passing `reuse_last_task_id=False`.
See full `Task.init` reference [here](../references/sdk/task.md#taskinit).
### Empty Task Creation
A task can also be created without the need to execute the code itself.
Unlike the runtime detections, all the environment and configuration details need to be provided explicitly.
For example:
```python
task = Task.create(
project_name='example',
task_name='task template',
repo='https://github.com/allegroai/clearml.git',
branch='master',
script='examples/reporting/html_reporting.py',
working_directory='.',
docker=None,
)
```
See full `Task.create` reference [here](../references/sdk/task.md#taskcreate).
### Tracking Task Progress
Track a tasks progress by setting the task progress property using the [`Task.set_progress`](../references/sdk/task.md#set_progress) method.
Set a tasks progress to a numeric value between 0 - 100. Access the tasks current progress, using the
[`Task.get_progress`](../references/sdk/task.md#get_progress) method.
```python
task = Task.init(project_name="examples", task_name="Track experiment progress")
task.set_progress(0)
# task doing stuff
task.set_progress(50)
print(task.get_progress())
# task doing more stuff
task.set_progress(100)
```
While the task is running, the WebApp will show the tasks progress indication in the experiment table, next to the
tasks status. If a task failed or was aborted, you can view how much progress it had made.
<div class="max-w-50">
![Experiment table progress indication](../img/fundamentals_task_progress.png)
</div>
Additionally, you can view a tasks progress in its [INFO](../webapp/webapp_exp_track_visual.md#general-information) tab
in the WebApp.
## Accessing Tasks
A task can be identified by its project and name, and by a unique identifier (UUID string). The name and project of
a task can be changed after an experiment has been executed, but its ID can't be changed.
Programmatically, task objects can be retrieved by querying the system based on either the task ID or a project and name
combination using the [`Task.get_task`](../references/sdk/task.md#taskget_task) class method. If a project / name
combination is used, and multiple tasks have the exact same name, the function will return the *last modified task*.
For example:
* Accessing a task object with a task ID:
```python
a_task = Task.get_task(task_id='123456deadbeef')
```
* Accessing a task with a project and name:
```python
a_task = Task.get_task(project_name='examples', task_name='artifacts')
```
Once a task object is obtained, it's possible to query the state of the task, reported scalars, etc.
The task's outputs, such as artifacts and models, can also be retrieved.
## Querying / Searching Tasks
Searching and filtering tasks can be done via the [web UI](../webapp/webapp_overview.md) and programmatically.
Input search parameters into the [`Task.get_tasks`](../references/sdk/task.md#taskget_tasks) method, which returns a
list of task objects that match the search.
For example:
```python
task_list = Task.get_tasks(
task_ids=None, # type Optional[Sequence[str]]
project_name=None, # Optional[str]
task_name=None, # Optional[str]
task_filter=None # Optional[Dict]
)
```
It's possible to also filter tasks by passing filtering rules to `task_filter`.
For example:
```python
task_filter={
# only tasks with tag `included_tag` and without tag `excluded_tag`
'tags': ['included_tag', '-excluded_tag'],
# filter out archived tasks
'system_tags': ['-archived'],
# only completed & published tasks
'status': ['completed', 'published'],
# only training type tasks
'type': ['training'],
# match text in task comment or task name
'search_text': 'reg_exp_text'
}
```
## Cloning & Executing Tasks
Once a task object is created, it can be copied (cloned). [`Task.clone`](../references/sdk/task.md#taskclone) returns
a copy of the original task (`source_task`). By default, the cloned task is added to the same project as the original,
and it's called "Clone Of ORIGINAL_NAME", but the name / project / comment (description) of the cloned task can be directly overridden.
```python
task = Task.init(project_name='examples', task_name='original task',)
cloned = Task.clone(
source_task=task, # type: Optional[Union[Task, str]]
# override default name
name='newly created task', # type: Optional[str]
comment=None, # type: Optional[str]
# insert cloned task into a different project
project=None, # type: Optional[str]
)
```
A newly cloned task has a [draft](../fundamentals/task.md#task-states) status, so it's modifiable.
Once a task is modified, launch it by pushing it into an execution queue with the [Task.enqueue](../references/sdk/task.md#taskenqueue)
class method. Then a [ClearML Agent](../clearml_agent.md) assigned to the queue will pull the task from the queue and execute
it.
```python
Task.enqueue(
task=task, # type: Union[Task, str]
queue_name='default', # type: Optional[str]
queue_id=None # type: Optional[str]
)
```
See enqueue [example](https://github.com/allegroai/clearml/blob/master/examples/automation/programmatic_orchestration.py).
## Advanced Flows
### Remote Execution
A compelling workflow is:
1. Run code on a development machine for a few iterations, or just set up the environment.
1. Move the execution to a beefier remote machine for the actual training.
Use the [Task.execute_remotely](../references/sdk/task.md#execute_remotely) method to implement this workflow. This method
stops the current manual execution, and then re-runs it on a remote machine.
For example:
```python
task.execute_remotely(
queue_name='default', # type: Optional[str]
clone=False, # type: bool
exit_process=True # type: bool
)
```
Once the method is called on the machine, it stops the local process and enqueues the current task into the `default`
queue. From there, an agent can pull and launch it.
See the [Remote Execution](../guides/advanced/execute_remotely.md) example.
#### Remote Function Execution
A specific function can also be launched on a remote machine with the [`Task.create_function_task`](../references/sdk/task.md#create_function_task)
method.
For example:
```python
def run_me_remotely(some_argument):
print(some_argument)
a_func_task = task.create_function_task(
func=run_me_remotely, # type: Callable
func_name='func_id_run_me_remotely', # type:Optional[str]
task_name='a func task', # type:Optional[str]
# everything below will be passed directly to our function as arguments
some_argument=123
)
```
Arguments passed to the function will be automatically logged under the `Function` section in the Hyperparameters tab.
Like any other arguments, they can be changed from the UI or programmatically.
:::note Function Task Creation
Function tasks must be created from within a regular task, created by calling `Task.init`
:::
### Offline Mode
You can work with tasks in Offline Mode, in which all the data and logs that the Task captures are stored in a local
folder, which can later be uploaded to the [ClearML Server](../deploying_clearml/clearml_server.md).
Before initializing a Task, use the [Task.set_offline](../references/sdk/task.md#taskset_offline) class method and set
the `offline_mode` argument to `True`. The method returns the Task ID and a path to the session folder.
```python
from clearml import Task
# Use the set_offline class method before initializing a Task
Task.set_offline(offline_mode=True)
# Initialize a Task
task = Task.init(project_name="examples", task_name="my_task")
# Rest of code is executed. All data is logged locally and not onto the server
```
All the information captured by the Task is saved locally. Once the task script finishes execution, it's zipped.
Upload the execution data that the Task captured offline to the ClearML Server using one of the following:
* [`clearml-task`](../apps/clearml_task.md) CLI
```bash
clearml-task --import-offline-session "path/to/session/.clearml/cache/offline/b786845decb14eecadf2be24affc7418.zip"
```
Pass the path to the zip folder containing the session with the `--import-offline-session` parameter
* [Task.import_offline_session](../references/sdk/task.md#taskimport_offline_session) class method
```python
from clearml import Task
Task.import_offline_session(session_folder_zip="path/to/session/.clearml/cache/offline/b786845decb14eecadf2be24affc7418.zip")
```
In the `session_folder_zip` argument, insert the path to the zip folder containing the session.
Both options will upload the Task's full execution details and outputs and return a link to the Task's results page on
the ClearML Server.
## Artifacts
Artifacts are the output files created by a task. ClearML uploads and logs these products so they can later be easily
accessed, modified, and used.
### Logging Artifacts
To log an artifact in a task, use the [`upload_artifact`](../references/sdk/task.md#upload_artifact) method.
For example:
* Upload a local file containing the preprocessing results of the data:
```python
task.upload_artifact(name='data', artifact_object='/path/to/preprocess_data.csv')
```
* Upload an entire folder with all its content by passing the folder, which will be zipped and uploaded as a single
zip file:
```python
task.upload_artifact(name='folder', artifact_object='/path/to/folder')
```
* Serialize and upload a Python object. ClearML automatically chooses the file format based on the objects type, or you
can explicitly specify the format as follows:
* dict - `.json` (default), `.yaml`
* pandas.DataFrame - `.csv.gz` (default), `.parquet`, `.feather`, `.pickle`
* numpy.ndarray - `.npz` (default), `.csv.gz`
* PIL.Image - Any PIL-supported extensions (default `.png`)
For example:
```python
person_dict = {'name': 'Erik', 'age': 30}
# upload as JSON artifact
task.upload_artifact(name='person dictionary json', artifact_object=person_dict)
# upload as YAML artifact
task.upload_artifact(
name='person dictionary yaml',
artifact_object=person_dict,
extension_name="yaml"
)
```
See more details in the [Artifacts Reporting example](../guides/reporting/artifacts.md) and in the [SDK reference](../references/sdk/task.md#upload_artifact).
### Using Artifacts
A task's artifacts are accessed through the tasks *artifact* property which lists the artifacts locations.
The artifacts can subsequently be retrieved from their respective locations by using:
* `get_local_copy()`- Downloads the artifact and caches it for later use, returning the path to the cached copy.
* `get()` - Returns a Python object constructed from the downloaded artifact file.
The code below demonstrates how to access a file artifact using the previously generated preprocessed data:
```python
# get instance of task that created artifact, using task ID
preprocess_task = Task.get_task(task_id='the_preprocessing_task_id')
# access artifact
local_csv = preprocess_task.artifacts['data'].get_local_copy()
```
See more details in the [Using Artifacts example](https://github.com/allegroai/clearml/blob/master/examples/reporting/using_artifacts_example.py).
## Models
The following is an overview of working with models through a `Task` object. It is also possible to work directly with model
objects (see [Models (SDK)](model_sdk.md)).
### Logging Models Manually
To manually log a model in a task, create an instance of the [OutputModel](../references/sdk/model_outputmodel.md) class.
An OutputModel object is always registered as an output model of the task it is constructed from.
For example:
```python
from clearml import OutputModel, Task
# Instantiate a Task
task = Task.init(project_name="myProject", task_name="myTask")
# Instantiate an OutputModel with a task object argument
output_model = OutputModel(task=task, framework="PyTorch")
```
### Updating Models Manually
The snapshots of manually uploaded models aren't automatically captured. To update a task's model, use the
[Task.update_output_model](../references/sdk/task.md#update_output_model) method:
```python
task.update_output_model(model_path='path/to/model')
```
It's possible to modify the following parameters:
* Model location
* Model name
* Model description
* Iteration number
* Model tags
Models can also be manually updated independently, without any task. See [OutputModel.update_weights](../references/sdk/model_outputmodel.md#update_weights).
### Using Models
Accessing a tasks previously trained model is quite similar to accessing task artifacts. A task's models are accessed
through the tasks models property which lists the input models and output model snapshots locations.
The models can subsequently be retrieved from their respective locations by using `get_local_copy()` which downloads the
model and caches it for later use, returning the path to the cached copy (if using Tensorflow, the snapshots are stored
in a folder, so the `local_weights_path` will point to a folder containing the requested snapshot).
```python
prev_task = Task.get_task(task_id='the_training_task')
last_snapshot = prev_task.models['output'][-1]
local_weights_path = last_snapshot.get_local_copy()
```
Notice that if one of the frameworks loads an existing weights file, the running task will automatically update its
"Input Model", pointing directly to the original training task's model. This makes it easy to get the full lineage of
every trained and used model in our system!
Models loaded by the ML framework appear under the "Input Models" section, under the Artifacts tab in the ClearML UI.
### Setting Upload Destination
ClearML automatically captures the storage location of Models created by frameworks such as TF, Pytorch, and scikit-learn.
By default, it stores the local path they are saved at.
To automatically store all created models by a specific experiment, modify the `Task.init` function as such:
```python
task = Task.init(
project_name='examples',
task_name='storing model',
output_uri='s3://my_models/'
)
```
To automatically store all models created by any experiment at a specific location, edit the `clearml.conf` (see
[ClearML Configuration Reference](../configs/clearml_conf.md#sdkdevelopment)) and set `sdk.developmenmt.default_output_uri`
to the desired storage (see [Storage](../integrations/storage.md)). This is especially helpful when
using [clearml-agent](../clearml_agent.md) to execute code.
## Configuration
### Manual Hyperparameter Logging
#### Setting Parameters
To define parameters manually use the [`Task.set_parameters`](../references/sdk/task.md#set_parameters) method to specify
name-value pairs in a parameter dictionary.
Parameters can be designated into sections: specify a parameters section by prefixing its name, delimited with a slash
(i.e. `section_name/parameter_name:value`). `General` is the default section.
Call the [`set_parameter`](../references/sdk/task.md#set_parameter) method to set a single parameter.
```python
task = Task.init(project_name='examples', task_name='parameters')
# override parameters with provided dictionary
task.set_parameters({'Args/epochs':7, 'lr': 0.5})
# setting a single parameter
task.set_parameter(name='decay',value=0.001)
```
:::warning Overwriting Parameters
The `set_parameters` method replaces any existing hyperparameters in the task.
:::
#### Adding Parameters
To update the parameters in a task, use the [`Task.set_parameters_as_dict`](../references/sdk/task.md#set_parameters_as_dict)
method. Arguments and values are input as a dictionary. Like in `set_parameters` above, the parameter's section can
be specified.
```python
task = Task.task_get(task_id='123456789')
# add parameters
task.set_parameters_as_dict({'my_args/lr':0.3, 'epochs':10})
```
### Accessing Parameters
To access all task parameters, use the [`Task.get_parameters`](../references/sdk/task.md#get_parameters) method. This
method returns a flattened dictionary of the `'section/parameter': 'value'` pairs.
```python
task = Task.get_task(project_name='examples', task_name='parameters')
# will print a flattened dictionary of the 'section/parameter': 'value' pairs
print(task.get_parameters())
```
Access a specific parameter with the [`Task.get_parameter`](../references/sdk/task.md#get_parameter) method specifying
the parameter name.
### Tracking Python Objects
ClearML can track Python objects (such as dictionaries and custom classes) as they evolve in your code, and log them to
your tasks configuration using the [`Task.connect`](../references/sdk/task.md#connect) method. Once objects are connected
to a task, ClearML automatically logs all object elements (e.g. class members, dictionary key-values pairs).
```python
class person:
def __init__(self, name, age):
self.name = name
self.age = age
me = person('Erik', 5)
params_dictionary = {'epochs': 3, 'lr': 0.4}
task = Task.init(project_name='examples',task_name='argparser')
task.connect(me)
task.connect(params_dictionary)
```
![Task parameters](../img/fundamentals_task_config_hyperparams.png)
### Configuration Objects
Configuration objects more elaborate than a key-value dictionary (such as nested dictionaries or configuration files),
can be logged to a task using the [`Task.connect_configuration`](../references/sdk/task.md#connect_configuration) method.
This method saves configuration objects as blobs (i.e. ClearML is not aware of their internal structure).
```python
# connect a configuration dictionary
model_config_dict = {
'value': 13.37, 'dict': {'sub_value': 'string'}, 'list_of_ints': [1, 2, 3, 4],
}
model_config_dict = task.connect_configuration(
name='dictionary', configuration=model_config_dict
)
# connect a configuration file
config_file_yaml = task.connect_configuration(
name="yaml file", configuration='path/to/configuration/file.yaml'
)
```
![Task configuration objects](../img/fundamentals_task_config_object.png)
### User Properties
A tasks user properties do not impact task execution so you can add / modify the properties at any stage. Add user
properties to a task with the [Task.set_user_properties](../references/sdk/task.md#set_user_properties) method.
```python
task.set_user_properties(
{"name": "backbone", "description": "network type", "value": "great"}
)
```
The above example sets the "backbone" property in a task.
![Task user properties](../img/fundamentals_task_config_properties.png)
## SDK Reference
For detailed information, see the complete [Task SDK reference page](../references/sdk/task.md)