Update docs folder (#1384)
# `clearml-task` - Execute ANY python code on a remote machine
Using only your command line and __zero__ additional lines of code, you can easily integrate the ClearML platform
into your experiment. With the `clearml-task` command, you can create a [Task](https://clear.ml/docs/latest/docs/fundamentals/task)
from **any python script or repository and launch it on a remote machine**.

The remote execution is fully monitored. All outputs - including console / tensorboard / matplotlib -
are logged in real time into the ClearML UI.

For more information, see the [ClearML Documentation](https://clear.ml/docs/latest/docs/apps/clearml_task/).

## What does it do?

With the `clearml-task` command, you specify the details of your experiment, including:

* Project and task name
* Repository / commit / branch
* [Queue](https://clear.ml/docs/latest/docs/fundamentals/agents_and_queues#what-is-a-queue) name
* Optional: the base docker image to be used as the underlying environment
* Optional: alternative python requirements, in case `requirements.txt` is not found inside the repository.

Then `clearml-task` does the rest of the heavy lifting. It creates a new experiment (Task) on your `clearml-server`
according to your specifications, and then enqueues the experiment in the selected execution queue.

While the Task is executed on the remote machine (by an available `clearml-agent`), all console outputs
are logged in real time, alongside your TensorBoard and matplotlib outputs. During and after the Task execution, you can
track and visualize the results in the ClearML Web UI.

### Use-cases for `clearml-task` remote execution

- You have off-the-shelf code, and you want to launch it on a remote machine with a specific resource (e.g., GPU)
- You want to run hyper-parameter optimization on a codebase that is not yet connected to `clearml`
- You want to create a pipeline from an assortment of scripts, and you need to create Tasks for those scripts
- Sometimes you just want to run some code on a remote machine, whether on an on-prem cluster or in the cloud...

### Prerequisites

- A single python script, or an up-to-date repository containing the codebase.
- `clearml` installed. `clearml` also has an in-code [Task](https://clear.ml/docs/latest/docs/fundamentals/task)
  integration, but that requires two lines of code (shown below for reference).
- `clearml-agent` running on at least one machine (to execute the experiment).
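
For reference, these are the two in-code integration lines mentioned above (the project and task names are placeholders):

```python
from clearml import Task

task = Task.init(project_name='examples', task_name='my experiment')
```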

## Tutorial

### Launching a job from a repository

You will launch this [script](https://github.com/allegroai/events/blob/master/webinar-0620/keras_mnist.py)
on a remote machine, using the following command-line options:

1. Give the experiment a name and select a project, for example: `--project keras_examples --name remote_test`. If the project
   doesn't exist, a new project will be created with the selected name.
2. Select the repository with your code, for example: `--repo https://github.com/allegroai/events.git`. You can specify a
   branch and/or commit using `--branch <branch_name> --commit <commit_id>`. If you do not specify a
   branch / commit, the latest commit from the master branch is used by default.
3. Specify which script in the repository needs to be run, for example: `--script /webinar-0620/keras_mnist.py`.
   By default, the execution working directory will be the root of the repository. If you need to change it,
   add `--cwd <folder>`.
4. If you need to pass arguments to your script, use `--args`, followed by the arguments.
   The argument names should match the script's argparse arguments, but without the '--' prefix: instead
   of --key=value, use `--args key=value`, for example `--args batch_size=64 epochs=1` (see the script-side sketch after the command below).
5. Select the queue for your Task's execution, for example: `--queue default`. If a queue isn't chosen, the Task
   will not be executed; it will be left in [draft mode](https://clear.ml/docs/latest/docs/fundamentals/task#task-states),
   and you can enqueue and execute the Task at a later point.
6. Add required packages. If your repo has a `requirements.txt` file, you don't need to do anything; `clearml-task`
   will automatically find the file and put it in your Task. If your repo does __not__ have a requirements file and
   there are packages that are necessary for the execution of your code, use `--packages <package_name>`, for example:
   `--packages "keras" "tensorflow>2.2"`.

``` bash
clearml-task --project keras_examples --name remote_test --repo https://github.com/allegroai/events.git \
  --script /webinar-0620/keras_mnist.py --args batch_size=64 epochs=1 --queue default
```
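
The `--args` values above are matched by name to the script's own argparse arguments. Below is a minimal sketch of what the script side might look like; the exact argument names and defaults used in `keras_mnist.py` are assumptions for illustration only:

```python
from argparse import ArgumentParser

# argparse arguments that `--args batch_size=64 epochs=1` would override
# (names and defaults here are illustrative, not taken from the actual script)
parser = ArgumentParser()
parser.add_argument('--batch_size', type=int, default=128, help='training batch size')
parser.add_argument('--epochs', type=int, default=10, help='number of training epochs')
args = parser.parse_args()

print('training for {} epochs with batch size {}'.format(args.epochs, args.batch_size))
```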
### Launching a job from a local script

You will be launching a single local script file (no git repo needed) on a remote machine:

1. Give the experiment a name and select a project (`--project examples --name remote_test`).
2. Select the script file on your machine, `--script /path/to/my/script.py`.
3. If you require specific packages to run your code, you can specify them manually with `--packages "package_name" "package_name2"`,
   for example: `--packages "keras" "tensorflow>2.2"`,
   or you can pass a requirements file with `--requirements /path/to/my/requirements.txt`.
4. If you need to pass arguments, as in the repo case, add `--args key=value` and make sure the key names match
   the script's argparse arguments (`--args batch_size=64 epochs=1`).
5. If you have a docker container with the entire environment your script should run inside,
   add e.g. `--docker nvcr.io/nvidia/pytorch:20.11-py3`.
6. Select the queue for your Task's execution, for example: `--queue dual_gpu`. If a queue isn't chosen, the Task
   will not be executed; it will be left in [draft mode](https://clear.ml/docs/latest/docs/fundamentals/task#task-states),
   and you can enqueue and execute it at a later point (see the sketch after the command below).

``` bash
clearml-task --project examples --name remote_test --script /path/to/my/script.py \
  --packages "keras" "tensorflow>2.2" --args epochs=1 batch_size=64 \
  --queue dual_gpu
```
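
If you left out `--queue`, the Task stays in draft mode until you enqueue it, either from the Web UI or programmatically. A minimal sketch of the programmatic route, assuming you substitute the task ID printed by `clearml-task`:

```python
from clearml import Task

# fetch the draft Task created by clearml-task (the ID below is a placeholder)
draft_task = Task.get_task(task_id='<created_task_id>')

# push it into an execution queue so an available clearml-agent picks it up
Task.enqueue(draft_task, queue_name='dual_gpu')
```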

### CLI options

``` bash
clearml-task --help
```

``` console
ClearML launch - launch any codebase on remote machine running clearml-agent

optional arguments:
  -h, --help            show this help message and exit
  --version             Display the clearml-task utility version
  --project PROJECT     Required: set the project name for the task. If
                        --base-task-id is used, this argument is optional.
  --name NAME           Required: select a name for the remote task
  --repo REPO           remote URL for the repository to use. Example: --repo
                        https://github.com/allegroai/clearml.git
  --branch BRANCH       Select specific repository branch/tag (implies the
                        latest commit from the branch)
  --commit COMMIT       Select specific commit id to use (default: latest
                        commit, or when used with local repository matching
                        the local commit id)
  --folder FOLDER       Remotely execute the code in the local folder. Notice!
                        It assumes a git repository already exists. Current
                        state of the repo (commit id and uncommitted changes)
                        is logged and will be replicated on the remote machine
  --script SCRIPT       Specify the entry point script for the remote
                        execution. When used in tandem with --repo the script
                        should be a relative path inside the repository, for
                        example: --script source/train.py. When used with
                        --folder it supports a direct path to a file inside
                        the local repository itself, for example: --script
                        ~/project/source/train.py
  --cwd CWD             Working directory to launch the script from. Default:
                        repository root folder. Relative to repo root or local
                        folder
  --args [ARGS [ARGS ...]]
                        Arguments to pass to the remote execution, list of
                        <argument>=<value> strings. Currently only argparse
                        arguments are supported. Example: --args lr=0.003
                        batch_size=64
  --queue QUEUE         Select the queue to launch the task. If not provided a
                        Task will be created but it will not be launched.
  --requirements REQUIREMENTS
                        Specify requirements.txt file to install when setting
                        the session. If not provided, the requirements.txt
                        from the repository will be used.
  --packages [PACKAGES [PACKAGES ...]]
                        Manually specify a list of required packages. Example:
                        --packages "tqdm>=2.1" "scikit-learn"
  --docker DOCKER       Select the docker image to use in the remote session
  --docker_args DOCKER_ARGS
                        Add docker arguments, pass a single string
  --docker_bash_setup_script DOCKER_BASH_SETUP_SCRIPT
                        Add bash script to be executed inside the docker
                        before setting up the Task's environment
  --output-uri OUTPUT_URI
                        Optional: set the Task `output_uri` (automatically
                        upload model destination)
  --task-type TASK_TYPE
                        Set the Task type, optional values: training, testing,
                        inference, data_processing, application, monitor,
                        controller, optimizer, service, qc, custom
  --skip-task-init      If set, the Task.init() call is not added to the entry
                        point, and is assumed to be called within the script.
                        Default: add the Task.init() call to the entry point
                        script
  --base-task-id BASE_TASK_ID
                        Use a pre-existing task in the system, instead of a
                        local repo/script. Essentially clones an existing task
                        and overrides arguments/requirements.
```

# ClearML introducing Dataset management!

## Decoupling Data from Code - The Dataset Paradigm

Simplify data management with ClearML: create, version, and access datasets from anywhere, ensuring traceability and reproducibility.

<a href="https://app.clear.ml"><img src="https://github.com/allegroai/clearml/blob/master/docs/dataset_screenshots.gif?raw=true" width="80%"></a>

### The ultimate goal of `clearml-data` is to transform datasets into configuration parameters

Just like any other argument, the dataset argument should retrieve a full local copy of the
dataset to be used by the experiment.
This means datasets can be efficiently retrieved by any machine in a reproducible way.
Together, this creates a full version-control solution for all your data
that is both machine and environment agnostic.

### Design Goals: Simple / Agnostic / File-based / Efficient

## Key Concepts:

1) **Dataset** is a **collection of files**: e.g. a folder with all the subdirectories and files included in the dataset
2) **Differential storage**: efficient storage / network usage
3) **Flexible**: supports addition / removal / merge of files and datasets
4) **Descriptive, transparent & searchable**: supports projects, names, descriptions, tags and searchable fields
5) **Simple interface** (CLI and programmatic)
6) **Accessible**: get a copy of the dataset files from anywhere, on any machine

### Workflow:

#### Simple dataset creation with CLI:

- Create a dataset
``` bash
clearml-data create --project <my_project> --name <my_dataset_name>
```
- Add local files to the dataset
``` bash
clearml-data add --files ~/datasets/best_dataset/
```
- Close dataset and upload files (Optional: specify storage `--storage` `s3://bucket`, `gs://`, `azure://` or `/mnt/shared/`)
``` bash
clearml-data close --id <dataset_id>
```

#### Integrating datasets into your code:

```python
from argparse import ArgumentParser
from clearml import Dataset, Task

# add a command line interface, so it is easy to use
parser = ArgumentParser()
parser.add_argument('--dataset', default='aayyzz', type=str, help='Dataset ID to train on')
args = parser.parse_args()

# create a task, so that later we can override the argparse arguments from the UI
task = Task.init(project_name='examples', task_name='dataset demo')

# get a local copy of the dataset
dataset_folder = Dataset.get(dataset_id=args.dataset).get_local_copy()

# go over the files in `dataset_folder` and train your model
```
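
Because the dataset ID is exposed as a regular argparse argument, the same script can be launched remotely with `clearml-task` and pointed at a different dataset from the command line, e.g. `--args dataset=<other_dataset_id>` (see the `clearml-task` section above).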

#### Create dataset from code

Creating datasets from code is especially helpful when some preprocessing is done on the raw data and we want to save the
preprocessing code as well as the dataset in a single Task.

```python
from clearml import Dataset

# Preprocessing code here

dataset = Dataset.create(dataset_name='dataset name', dataset_project='dataset project')
dataset.add_files('/path_to_data')
dataset.upload()
dataset.finalize()
```
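
The finalized dataset is identified by its ID, which other tasks use to fetch it. A small follow-up to the block above (same `dataset` object; how you pass the ID along is up to you):

```python
# the dataset ID identifies this version; pass it to the training script above,
# e.g. as its --dataset argument
print('dataset id: {}'.format(dataset.id))
```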

#### Modifying a dataset with CLI:

- Create a new dataset (specify the parent dataset id)
```bash
clearml-data create --name <improved_dataset> --parents <existing_dataset_id>
```
- Get a mutable copy of the current dataset
```bash
clearml-data get --id <created_dataset_id> --copy ~/datasets/working_dataset
```
- Change / add / remove files from the dataset folder
```bash
vim ~/datasets/working_dataset/everything.csv
```

#### Folder sync mode

Folder sync mode updates a dataset according to the content changes of a folder.<br/>
This is useful when there is a single source of truth, either a local or a network folder, that gets updated periodically.
When using `clearml-data sync` and specifying a parent dataset, the folder changes will be reflected in a new dataset version.
This saves the time of manually updating (adding / removing) files.

- Sync local changes
``` bash
clearml-data sync --id <created_dataset_id> --folder ~/datasets/working_dataset
```
- Upload files (Optional: specify storage `--storage` `s3://bucket`, `gs://`, `azure://`, `/mnt/shared/`)
``` bash
clearml-data upload --id <created_dataset_id>
```
- Close dataset
``` bash
clearml-data close --id <created_dataset_id>
```

#### Command Line Interface Summary:

- **`search`** Search a dataset based on project / name / description / tag etc.
- **`list`** List the file directory content of a dataset (no need to download a copy of the dataset)
- **`verify`** Verify a local copy of a dataset (verify the dataset files' SHA2 hash)
- **`create`** Create a new dataset (supports extending/inheriting multiple parents)
- **`delete`** Delete a dataset
- **`add`** Add local files to a dataset
- **`sync`** Sync a dataset with a local folder (the source of truth being the local folder)
- **`remove`** Remove files from a dataset (no need to download a copy of the dataset)
- **`get`** Get a local copy of the dataset (either readonly --link, or writable --copy)
- **`upload`** Upload the dataset (use --storage to specify a storage target such as S3/GS/Azure/Folder, default: file server)

#### Under the hood (how it all works):

Each dataset instance stores the collection of files added/modified relative to the previous version (its parent).

When requesting a copy of the dataset, all parent datasets in the graph are downloaded and merged into a new folder
with all the changes introduced along the dataset DAG.

Implementation details:

The dataset differential snapshot is stored in a single zip file, for efficiency in storage and network
bandwidth. A local cache is built into the process, making sure datasets are downloaded only once.
The dataset stores the SHA2 hash of every file it contains.
To speed up dataset fetching, only file sizes are verified automatically;
the SHA2 hashes are verified only on the user's request.

The design supports multiple parents per dataset, essentially merging all parents based on order.
To improve storage and speed for deep dataset DAGs, dataset squashing was introduced. A user can squash
a dataset, merging down all the changes introduced along the DAG and creating a new, flat version with no parent datasets.
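
A minimal sketch of squashing from python, assuming the `Dataset.squash` call is available in your `clearml` version and using placeholder dataset IDs:

```python
from clearml import Dataset

# merge the changes of the listed dataset versions into a single, flat dataset
# with no parents (the IDs below are placeholders)
squashed = Dataset.squash(
    dataset_name='my_dataset - squashed',
    dataset_ids=['<dataset_id_a>', '<dataset_id_b>'],
)
print(squashed.id)
```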

### Datasets UI:

A dataset is represented as a special `Task` in the system. <br>
It is of type `data-processing`, with a special tag `dataset`.

- The full log (calls / CLI) of the dataset creation process can be found in the "Execution" section.
- A listing of the dataset differential snapshot, a summary of the files added / modified / removed, and details of the files
  in the differential snapshot (location / size / hash) are available in the Artifacts section.
- The full dataset listing (all files included) is available in the Configuration section under `Dataset Content`.
  This allows you to quickly compare the contents of two datasets and visually see the difference.
- The dataset genealogy DAG and the change-set summary table are visualized in Results / Plots.

For more information, see the [ClearML Documentation](https://clear.ml/docs/latest/docs/clearml_data/).