Add Hyper-Datasets

This commit is contained in:
allegroai 2021-06-21 01:00:16 +03:00
parent 31b3b52cac
commit 81bcabcb10
76 changed files with 3319 additions and 168 deletions

View File

@ -18,6 +18,7 @@ to a remote machine, and executing the code as follows:
working directory and entry point stored in the experiment. It executes with logging and monitoring.
1. While the Task is executing, and anytime after, track the experiment and visualize results in the **ClearML Web UI**.
Continue using **ClearML Agent** once it is running on a target machine. Reproduce experiments and execute
automated workflows in one (or both) of the following ways:
* Programmatically
@ -514,3 +515,165 @@ To do that, set these environment variables on the ClearML Server machine with
CLEARML_API_ACCESS_KEY
CLEARML_API_SECRET_KEY
```
## Google Colab
ClearML Agent can run on a [Google Colab](https://colab.research.google.com/) instance. This lets users leverage
the compute resources provided by Google Colab and send experiments to it for execution. <br/>
Check out [this tutorial](guides/ide/google_colab.md) on how to run a ClearML Agent on Google Colab!
## Dynamic GPU Allocation
:::important
Available with the ClearML Enterprise offering
:::
The ClearML Enterprise server supports dynamic allocation of GPUs based on queue properties.
Agents can spin multiple Tasks from different queues based on the number of GPUs the queue
needs.
`dynamic-gpus` enables dynamic allocation of GPUs based on queue properties.
To configure the number of GPUs for a queue, use the `--queue` flag and specify the queue name and number of GPUs:
```console
clearml-agent daemon --dynamic-gpus --queue dual_gpus=2 single_gpu=1
```
### Example
Let's say there are three queues on a server, named:
* `dual_gpu`
* `quad_gpu`
* `opportunistic`
An agent can be spun on multiple GPUs (e.g. 8 GPUs, `--gpus 0-7`), and then attached to multiple
queues that are configured to run with a certain amount of resources:
```console
clearml-agent daemon --dynamic-gpus --queue quad_gpu=4 dual_gpu=2
```
The agent can now spin multiple Tasks from the different queues based on the number of GPUs configured to the queue.
The agent will pick a Task from the `quad_gpu` queue, use GPUs 0-3, and spin it up. Then it will pick a Task from the `dual_gpu`
queue, look for available GPUs again, and spin it up on GPUs 4-5.
Another option for allocating GPUs:
```console
clearml-agent daemon --dynamic-gpus --queue dual_gpu=2 opportunistic=1-4
```
Notice that a minimum and maximum value of GPUs was specified for the `opportunistic` queue. This means the agent
will pull a Task from the `opportunistic` queue and allocate up to 4 GPUs based on availability (i.e. GPUs not currently
being used by other agents).
## Scheduling working hours
:::important
Available with the ClearML Enterprise offering
:::
The Agent scheduler enables scheduling working hours for each Agent. During working hours, a worker will actively poll
queues for Tasks, fetch and execute them. Outside working hours, a worker will be idle.
Schedule workers by:
* Setting configuration file options
* Running `clearml-agent` from the command line (overrides configuration file options)
Override worker schedules by:
* Setting runtime properties to force a worker on or off
* Tagging a queue on or off
### Running clearml-agent with a schedule (command line)
Set a schedule for a worker from the command line when running `clearml-agent`. Two properties enable setting working hours:
:::warning
Use only one of these properties
:::
* `uptime` - Time span during which a worker will actively poll a queue(s) for Tasks, and execute them. Outside this
time span, the worker will be idle.
* `downtime` - Time span during which a worker will be idle. Outside this time span, the worker will actively poll and
execute Tasks.
Define `uptime` or `downtime` as `"<hours> <days>"`, where:
* `<hours>` - A span of hours (`00-23`) or a single hour. A single hour defines a span from that hour to midnight.
* `<days>` - A span of days (`SUN-SAT`) or a single day.
Use `-` for a span, and `,` to separate individual values. To span before midnight to after midnight, use two spans.
For example:
* `"20-23 SUN"` - 8 PM to 11 PM on Sundays.
* `"20-23 SUN,TUE"` - 8 PM to 11 PM on Sundays and Tuesdays.
* `"20-23 SUN-TUE"` - 8 PM to 11 PM on Sundays, Mondays, and Tuesdays.
* `"20 SUN"` - 8 PM to midnight on Sundays.
* `"20-00,00-08 SUN"` - 8 PM to midnight and midnight to 8 AM on Sundays
* `"20-00 SUN", "00-08 MON"` - 8 PM on Sundays to 8 AM on Mondays (spans from before midnight to after midnight).
### Setting worker schedules in the configuration file
Set a schedule for a worker using configuration file options. The options are:
:::warning
Only use one of these properties
:::
* ``agent.uptime``
* ``agent.downtime``
Use the same time span format for days and hours as is used in the command line.
For example, set a worker's schedule from 5 PM to 8 PM on Sunday through Tuesday, and 1 PM to 10 PM on Wednesday.
```
agent.uptime: ["17-20 SUN-TUE", "13-22 WED"]
```
### Overriding worker schedules using runtime properties
Runtime properties override the command line uptime / downtime properties. The runtime properties are:
:::warning
Use only one of these properties
:::
* `force:on` - Pull and execute Tasks until the property expires.
* `force:off` - Prevent pulling and execution of Tasks until the property expires.
Currently, these runtime properties can only be set using a ClearML REST API call to the `workers.set_runtime_properties`
endpoint, as follows:
* The body of the request must contain the `worker-id` and the runtime property to add.
* An expiry date is optional. Use the format `"expiry":<time>`. For example, `"expiry":86400` will set an expiry of 24 hours.
* To delete the property, set the expiry date to zero, `"expiry":0`.
For example, to force a worker on for 24 hours:
```console
curl --user <key>:<secret> --header "Content-Type: application/json" --data '{"worker":"<worker_id>","runtime_properties":[{"key": "force", "value": "on", "expiry": 86400}]}' http://<api-server-hostname-or-ip>:8008/workers.set_runtime_properties
```
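Following the same request structure, the property can later be removed (as noted above) by setting its expiry to zero, for example (a sketch mirroring the request above):
```console
curl --user <key>:<secret> --header "Content-Type: application/json" --data '{"worker":"<worker_id>","runtime_properties":[{"key": "force", "value": "on", "expiry": 0}]}' http://<api-server-hostname-or-ip>:8008/workers.set_runtime_properties
```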
### Overriding worker schedules using queue tags
Queue tags override command line and runtime properties. The queue tags are the following:
:::warning
Use only one of these properties
:::
* ``force_workers:on`` - Any worker listening to the queue will keep pulling Tasks from the queue.
* ``force_workers:off`` - Prevent all workers listening to the queue from pulling Tasks from the queue.
Currently, you can set queue tags using a ClearML REST API call to the ``queues.update`` endpoint, or the
APIClient. The body of the call must contain the ``queue-id`` and the tags to add.
For example, force workers on for a queue using the APIClient:
```python
from clearml.backend_api.session.client import APIClient
client = APIClient()
client.queues.update(queue="<queue_id>", tags=["force_workers:on"])
```
Or, force workers on for a queue using the REST API:
```console
curl --user <key>:<secret> --header "Content-Type: application/json" --data '{"queue":"<queue_id>","tags":["force_workers:on"]}' http://<api-server-hostname-or-ip>:8008/queues.update
```

View File

@ -155,25 +155,25 @@ The following sections contain lists of AMI Image IDs, per region, for each rele
### Latest version
#### v1.0.2
#### v1.0.0
* **eu-north-1** : ami-04db2c396ce47ba78
* **ap-south-1** : ami-0052405573bfec551
* **eu-west-3** : ami-03f632c63f93adf8a
* **eu-west-2** : ami-0a4224787c2f50fe9
* **eu-west-1** : ami-0d1275951594ad1cc
* **ap-northeast-3** : ami-0475a797c30290331
* **ap-northeast-2** : ami-02cb3189a0678d09a
* **ap-northeast-1** : ami-06dd488152487b1f1
* **sa-east-1** : ami-09ae7b95817dc3a71
* **ca-central-1** : ami-03db73a5a7de428f0
* **ap-southeast-1** : ami-0fa16c18439f9085e
* **ap-southeast-2** : ami-01edf47969e2515dd
* **eu-central-1** : ami-0c41a118281233368
* **us-east-2** : ami-0d8203795c3f6a861
* **us-west-1** : ami-0e5eb94cd23c094d5
* **us-west-2** : ami-032afac2f83d1b8f3
* **us-east-1** : ami-0d255a19090ed388d
* **eu-north-1** : ami-0d6b1781328f44b21
* **ap-south-1** : ami-03d18434eb00ba0d4
* **eu-west-3** : ami-0ca027ed4205e7d67
* **eu-west-2** : ami-04304fe1639f8324f
* **eu-west-1** : ami-06260010b2e24b438
* **ap-northeast-3** : ami-0d16f3c2176cf8639
* **ap-northeast-2** : ami-0a3a2e08cec3e2709
* **ap-northeast-1** : ami-04c2c71b7bcecf6af
* **sa-east-1** : ami-00c86a9d8b5b87239
* **ca-central-1** : ami-0889a860b58dd8d88
* **ap-southeast-1** : ami-0a9ac9925ab98a270
* **ap-southeast-2** : ami-01735e0de7b1a13f2
* **eu-central-1** : ami-0b93523a0f9ec5e2b
* **us-east-2** : ami-0fa34e08b01eadb96
* **us-west-1** : ami-0a8cb65f6856dd561
* **us-west-2** : ami-0eb1b443c591054fe
* **us-east-1** : ami-07ed6a6bbb63799cc
## Next Step

View File

@ -3,31 +3,54 @@ title: Agent & Queue
---
Two major components of MLOps are experiment reproducibility and the ability to scale work to multiple machines. ClearML Agent,
coupled with execution queues, addresses both these needs.
The ClearML Agent is the base for **Automation** in ClearML and can be leveraged to build automated pipelines, launch custom services
(e.g. a [monitor and alert service](https://github.com/allegroai/clearml/tree/master/examples/services/monitoring)) and more.
## What does a ClearML Agent do?
An agent (also referred to as a Worker) allows users to execute code on any machine it's installed on, thus facilitating the
scaling of data science work beyond one's own machine.
The agent takes care of deploying the code to the target machine as well as setting up the entire execution environment:
from installing required packages to setting environment variables,
all leading to executing the code (supporting both virtual environments and flexible docker container configurations).
The Agent also supports overriding parameter values on-the-fly without code modification, thus enabling no-code experimentation (this is also the foundation on which
ClearML [Hyper Parameter Optimization](hpo.md) is implemented).
An agent can be associated with specific GPUs, enabling workload distribution. For example, on a machine with 8 GPUs you can allocate several GPUs to an agent and use the rest for a different workload
(even through another agent).
## What is a Queue?
A ClearML queue is an ordered list of Tasks scheduled for execution.
A queue can be serviced by one or multiple ClearML agents.
Agents servicing a queue pull the queued tasks in order and execute them.
A ClearML Agent can service multiple queues in either of the following modes:
* Strict priority: The agent services the higher priority queue before servicing lower priority ones.
* Round robin: The agent pulls a single task from a queue then moves to service the next queue.
## Agent and Queue workflow
![image](../img/clearml_agent_flow_diagram.png)
The diagram above demonstrates a typical flow where an agent executes a task:
1. Enqueue a task for execution on the queue.
1. The agent pulls the task from the queue.
1. The agent launches a docker container in which to run the task's code.
1. The task's execution environment is set up:
1. Execute any custom setup script configured.
1. Install any required system packages.
1. Clone the code from a git repository.
1. Apply any uncommitted changes recorded.
1. Set up the python environment and required packages.
1. The task's script/code is executed.
While the agent is running, it continuously reports system metrics to the ClearML Server (these can be monitored in the **Workers and Queues** page).
## Resource management
Installing an Agent on machines allows monitoring the machines' status (GPU / CPU / Memory / Network / Disk IO).
@ -41,7 +64,7 @@ You can organize your queues according to resource usage. Say you have a single-
"single-gpu-queue" and assign the machine's agent, as well as other single-GPU agents to that queue. This way you will know
that Tasks assigned to that queue will be executed by a single GPU machine.
While the agents are up and running in your machines, you can access these resources from any machine by enqueueing a
Task to one of your queues, according to the amount of resources you want to allocate to the Task.
With queues and ClearML Agent, you can easily add and remove machines from the cluster, and you can

View File

@ -2,18 +2,18 @@
title: Artifacts & Models
---
ClearML allows easy storage of experiments' output products as **artifacts** that can later be accessed easily
and used, through the web UI or programmatically.
A few examples of artifacts are:
* Model snapshot / weights file
* Data preprocessing
* Feature representation of data
* and more!
## Artifacts
### Logging Artifacts
To log any type of artifact to a Task, use the `upload_artifact()` method. For example:
* Upload a local file containing the preprocessing results of the data.
```python
@ -24,7 +24,7 @@ task.upload_artifact(name='data', artifact_object='/path/to/preprocess_data.csv'
```python
task.upload_artifact(name='folder', artifact_object='/path/to/folder/')
```
* Upload an instance of an object, Numpy / Pandas / PIL (converted to npz / csv.gz / jpg formats accordingly). If the
object type is unknown, it is pickled and uploaded.
```python
person_dict = {'name': 'Erik', 'age': 30}
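# upload the dictionary as an artifact; since it is a plain Python object, ClearML pickles it
# (the artifact name below is illustrative)
task.upload_artifact(name='person dictionary', artifact_object=person_dict)
```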
@ -41,9 +41,9 @@ Tasks).
1. Retrieve all the Task's artifacts with the `artifacts` property, which is essentially a dictionary,
where the key is the artifact name, and the value is the artifact itself.
1. Access a specific artifact using one of the following methods:
- Access files by calling `get_local_copy()`, which caches the files for later use and returns a path to the cached
file
- Access object artifacts by using the `get()` method that returns the Python object.
The code below demonstrates how to access a file artifact using the previously generated preprocessed data:
```python
@ -64,13 +64,15 @@ See more details in the using artifacts [example](https://github.com/allegroai/c
- Python objects (pickled)
## Models
Models are a special kind of artifact and, unlike regular artifacts, which can only be accessed with the creating Task's ID,
Models are entities with their own unique ID that can be accessed directly or via the creating task.
This property makes Models a standalone entry that can be used as an artifactory interface.
### Automatic Model Logging
When models are saved using certain frameworks (for instance, by calling the `torch.save()` method), ClearML automatically
logs the models and all snapshot paths.
![image](../img/fundamentals_artifacts_logging_models.png)
@ -79,6 +81,47 @@ See model storage examples, [TF](https://github.com/allegroai/clearml/blob/maste
[Keras](https://github.com/allegroai/clearml/blob/master/examples/frameworks/keras/keras_tensorboard.py),
[Scikit-Learn](https://github.com/allegroai/clearml/blob/master/examples/frameworks/scikit-learn/sklearn_joblib_example.py).
### Manual Model Logging
To manually log a model, create an instance of OutputModel class:
```python
from clearml import OutputModel, Task
# Instantiate a Task
task = Task.init(project_name="myProject", task_name="myTask")
# Instantiate an OutputModel, with a Task object argument
output_model = OutputModel(task=task, framework="PyTorch")
```
The OutputModel object is always connected to a Task object as it's instantiated with a Task object as an argument.
It is, therefore, automatically registered as the Task's output model.
The snapshots of manually uploaded models aren't automatically captured, but there are two methods
to update an output model.
#### Updating Via Task Object
Using the [Task.update_output_model](../references/sdk/task.md#update_output_model) method:
```python
task.update_output_model(model_path='path/to/model')
```
It's possible to modify the following parameters:
* Weights file / folder - Uploads the files specified with the `model_path`.
If remote storage is provided (S3 / GS / HTTPS, etc.), it saves the URL.
* Model Metadata - Model name, description, iteration number of model, and tags.
#### Updating Via Model Object
Using the [OutputModel.update_weights](../references/sdk/model_outputmodel.md#update_weights) method:
```python
output_model.update_weights()
```
* Specify either the name of a locally stored weights file to upload (`weights_filename`), or the URI of a storage destination
for model weight upload (`registered_uri`).
* Model Metadata - Model description and iteration number.
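For example, a minimal sketch using the `weights_filename` parameter mentioned above (the path is illustrative):
```python
output_model.update_weights(weights_filename='/path/to/model_weights.pt')
```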
### Using Models
Loading a previously trained model is quite similar to loading artifacts.
@ -90,7 +133,7 @@ local_weights_path = last_snapshot.get_local_copy()
```
1. Get the instance of the Task that created the original weights files
2. Query the Task on its output models (a list of snapshots)
3. Get the latest snapshot (if using Tensorflow, the snapshots are stored in a folder, so the `local_weights_path` will point to a folder containing the requested snapshot).
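A minimal sketch of these three steps (the project and task names are illustrative, and the last output model is assumed to hold the desired snapshot):
```python
from clearml import Task

# 1. Get the Task that created the original weights files (illustrative names)
prev_task = Task.get_task(project_name='examples', task_name='train model')

# 2. Query the Task for its output models (a list of snapshots)
snapshots = prev_task.models['output']

# 3. Get the latest snapshot and download a cached local copy of the weights
last_snapshot = snapshots[-1]
local_weights_path = last_snapshot.get_local_copy()
```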
Notice that when one of the frameworks loads the weights file, the running Task will automatically update, with
"Input Model" pointing directly to the original training Task's model. With this feature, it's easy to get a full genealogy
@ -109,7 +152,7 @@ task = Task.init(project_name='examples', task_name='storing model', output_uri=
```
To automatically store all created models from all experiments in a certain storage medium, edit the `clearml.conf` (see
[ClearML Configuration Reference](../configs/clearml_conf#sdkdevelopment)) and set `sdk.development.default_output_uri` to the desired
storage (see [Storage](../integrations/storage.md)).
This is especially helpful when using [clearml-agent](../clearml_agent.md) to execute code.
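For example, a sketch of the relevant `clearml.conf` entry (the bucket URI is illustrative):
```
sdk {
    development {
        default_output_uri: "s3://my-models-bucket/checkpoints"
    }
}
```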

View File

@ -1,5 +1,5 @@
---
title: Hyperparameter Optimization
---
## What is HyperParameter Optimization?
@ -9,11 +9,14 @@ performing models can be complicated. Manually adjusting hyperparameters over th
slow and tedious. Luckily, **hyperparameter optimization** can be automated and boosted using **ClearML**'s
`HyperParameterOptimizer` class.
## ClearML's HyperParameter Optimization
ClearML provides the `HyperParameterOptimizer` class, which takes care of the entire optimization process for users
with a simple interface.
The `HyperParameterOptimizer` class does the following:
* Clones a base experiment that needs to be optimized.
* Changes arguments based on the specified optimizer strategy.
* Tries to minimize / maximize defined objectives.
@ -21,7 +24,7 @@ The `HyperParameterOptimizer` class contains **ClearML**'s hyperparameter optimization modules. Its modular design enables
using different optimizers, including existing software frameworks, enabling simple, accurate, and fast hyperparameter
optimization.
### Supported Optimizers
* **Optuna** - `automation.optuna.optuna.OptimizerOptuna`. Optuna is the default optimizer in ClearML. It makes use of
different samplers such as grid search, random, Bayesian, and evolutionary algorithms.
@ -35,6 +38,13 @@ optimization.
* **Full grid** sampling strategy of every hyperparameter combination - Grid search: `automation.optimization.GridSearch`.
* **Custom** - `automation.optimization.SearchStrategy` - Use a custom class that inherits from the ClearML automation base strategy class.
## How Does it Work?
**ClearML**'s approach to hyperparameter optimization is scalable, easy to set up and to manage, and it makes it easy to
compare results.
### Workflow
Make use of **ClearML**'s hyperparameter optimization capabilities by:
* Initializing an Optimizer Task, which will record and monitor arguments, execution details, results, and more.
* Instantiating a `HyperParameterOptimizer`, where the following is specified:
@ -43,11 +53,10 @@ Make use of **ClearML**'s hyperparameter optimization capabilities by:
* Metric to optimize
* Optimizer class (optimization strategy) where the optimization configuration and resources budget are defined
* And more.
* Enqueuing the Task to be executed by a ClearML Agent (or multiple agents) in a remote machine.
* Monitoring the optimization process and viewing the summarized results in the **ClearML web UI**
![image](../img/fundamentals_hpo_summary.png)
## Defining a hyperparameter optimization search example
@ -100,19 +109,8 @@ compare results.
max_iteration_per_job=150000,
)
```
1. Make sure an agent or multiple agents are listening to the queue defined above (`execution_queue='default'`). See [ClearML Agent](../clearml_agent.md).
1. Start the hyperparameter optimization process:
```python
optimizer.set_report_period(1) # setting the time gap between two consecutive reports
optimizer.start()
optimizer.wait() # wait until process is done
optimizer.stop() # make sure background optimization stopped
```
1. Take a look at the summarized results of the optimization in the **Web UI**, in the optimizer Task's experiment page.
There is also the option to look at the results of a specific experiment, or to [compare](../webapp/webapp_exp_comparing.md)
the results of several experiments.
Check out the [Hyperparameter Optimization](../guides/optimization/hyper-parameter-optimization) tutorial for a step-by-step guide.
For further information about the `HyperParameterOptimizer` arguments, see the [Automation module reference](../references/sdk/hpo_optimization_hyperparameteroptimizer.md).

View File

@ -318,19 +318,45 @@ Like any other arguments, they can be changed from the UI or programmatically.
Function Tasks must be created from within a regular Task, created by calling `Task.init()`
:::
## Task Lifecycle
ClearML Tasks are created in one of the following ways:
* Manually running code that is instrumented with the ClearML SDK and invokes `Task.init()`.
* Cloning an existing task.
* Creating a task via CLI using [clearml-task](../apps/clearml_task.md).
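For example, the first of these methods boils down to a single call at the top of the script (project and task names are illustrative):
```python
from clearml import Task

# creates a ClearML Task and starts automatic logging of code, environment, and outputs
task = Task.init(project_name='examples', task_name='my experiment')
```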
### Logging Task Information
![image](../img/clearml_logging_diagram.png)
The above diagram describes how execution information is recorded when running code instrumented with ClearML:
1. Once a ClearML Task is initialized, ClearML automatically logs the complete environment information
including:
* Source code
* Python environment
* Configuration parameters.
1. As the execution progresses, any outputs produced are recorded including:
* Console logs
* Metrics and graphs
* Models and other artifacts
1. Once the script terminates, the Task will change its status to either `Completed`, `Failed`, or `Aborted`.
All information logged can be viewed in the [task details UI](../webapp/webapp_exp_track_visual.md).
### Cloning Tasks
![image](../img/clearml_task_life_cycle_diagram.png)
The above diagram demonstrates how a previously run task can be used as a baseline for experimentation:
1. A previously run task is cloned, creating a new task, in *draft* mode.
The new task retains all of the source task's configuration. The original task's outputs are not carried over.
1. The new task's configuration is modified to reflect the desired parameters for the new execution.
1. The new task is enqueued for execution.
1. A `clearml-agent` servicing the queue pulls the new task and executes it (where ClearML again logs all of the execution outputs).
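A sketch of the same flow done programmatically (assuming the `Task.clone` and `Task.enqueue` class methods; the IDs, parameter name, and queue name are illustrative):
```python
from clearml import Task

# 1. Clone a previously run task; the clone is created in draft mode
template = Task.get_task(task_id='<source_task_id>')
cloned = Task.clone(source_task=template, name='cloned experiment')

# 2. Modify the new task's configuration (e.g. override a hyperparameter)
cloned.set_parameters({'General/learning_rate': 0.001})

# 3. Enqueue the new task; an agent servicing the queue will pull and execute it
Task.enqueue(task=cloned, queue_name='default')
```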
### Task states
The state of a Task represents its stage in the Task lifecycle. It indicates whether the Task is read-write (editable) or
read-only. For each state, a state transition indicates which actions can be performed on an experiment, and the new state

View File

@ -67,4 +67,11 @@ improving our results later on!
## Visibility Matters
While it's possible to track experiments with one tool, and pipeline them with another, we believe that having
everything under the same roof has its benefits! <br/>
Being able to track experiment progress, compare experiments, and, based on that, send experiments for execution on remote
machines (which also build the environment themselves) has tremendous benefits in terms of visibility and ease of integration.<br/>
Having visibility into your pipeline, while using experiments already defined in the platform,
gives users a clearer picture of the pipeline's status
and makes it easier to start using pipelines earlier in the process by simplifying the chaining of tasks.<br/>
Managing datasets with the same tools and APIs that manage the experiments also lowers the barrier of entry into
experiment and data provenance.

View File

@ -0,0 +1,60 @@
---
title: ClearML Agent on Google Colab
---
[Google Colab](https://colab.research.google.com) is a common development environment for data scientists. It offers a convenient IDE as well as
compute provided by Google.<br/>
Users can transform a Google Colab instance into an available resource in ClearML using [ClearML Agent](../../clearml_agent.md).
In this tutorial, we will go over how to create a ClearML worker node in a Google Colab notebook. Once the worker is up
and running, users can send Tasks to be executed on Google Colab's hardware.
## Prerequisites
* Be signed up for ClearML (or have a server deployed).
* Have a Google account to access Google Colab
## Steps
1. Open up [this Google Colab notebook](https://colab.research.google.com/github/pollfly/clearml/blob/master/examples/clearml_agent/clearml_colab_agent.ipynb).
1. Run the first cell, which installs all the necessary packages:
```
!pip install git+https://github.com/allegroai/clearml
!pip install clearml-agent
```
1. Run the second cell, which exports this environment variable:
```
! export MPLBACKEND=TkAg
```
This environment variable makes Matplotlib work in headless mode, so it won't output graphs to the screen.
1. Create new credentials.
Go to your **profile** in the [ClearML WebApp](https://app.community.clear.ml). Under the **WORKSPACES** section,
go to **App Credentials**, click **+ Create new credentials**, and copy the information that pops up.
1. Set the credentials.
In the third cell, enter your own credentials:
```python
from clearml import Task
Task.set_credentials(api_host="https://api.community.clear.ml",
web_host="https://app.community.clear.ml",
files_host="https://files.community.clear.ml",
key='6ZHX9UQMYL874A1NE8',
secret='=2h6#%@Y&m*tC!VLEXq&JI7QhZPKuJfbaYD4!uUk(t7=9ENv'
)
```
1. In the fourth cell, launch a `clearml-agent` that will listen to the `default` queue:
```
!clearml-agent daemon --queue default
```
For additional options for running `clearml-agent`, see the [clearml-agent reference](../../references/clearml_agent_ref.md).
After cell 4 is executed, the worker should now appear in the [**Workers & Queues**](../../webapp/webapp_workers_queues.md)
page of your server. Clone experiments and enqueue them to your heart's content! The `clearml-agent` will fetch
experiments and execute them using the Google Colab hardware.

View File

@ -2,23 +2,21 @@
title: Integration for PyCharm
---
The **ClearML PyCharm plugin** enables syncing a local execution configuration to a remote executor machine:
* Sync local repository information to a remote debug machine.
* Multiple users can use the same resource for execution without compromising private credentials.
* Run the [ClearML Agent](../../clearml_agent.md) on default VMs/Containers.
## Installation
**To install the ClearML PyCharm plugin, do the following:**
1. Download the latest plugin version from the [Releases page](https://github.com/allegroai/clearml-pycharm-plugin/releases).
1. Install the plugin in PyCharm from local disk:
![image](../../img/ide_pycharm_plugin_from_disk.png)
## Optional: ClearML configuration parameters
@ -29,17 +27,13 @@ the settings in the ClearML configuration file.
**To set ClearML configuration parameters:**
1. In PyCharm, open **Settings** **>** **Tools** **>** **ClearML**.
1. Configure your ClearML server information:
1. API server (for example: ``http://localhost:8008``)
1. Web server (for example: ``http://localhost:8080``)
1. File server (for example: ``http://localhost:8081``)
1. Add **ClearML** user credentials key/secret.
![image](../../img/ide_pycharm_config_params.png)

View File

@ -0,0 +1,56 @@
---
title: Annotations
---
With **ClearML Enterprise**, annotations can be applied to video and image frames. [Frames](single_frames.md) support
two types of annotations: **Frame objects** and **Frame labels**.
Annotation Tasks can be used to efficiently organize the annotation of frames in Dataset versions (see
[Annotations Task Page](webapp/webapp_annotator.md)).
For information about how to view, create, and manage annotations using the WebApp, see [Annotating Images and Videos](#annotating-images-and-video).
## Frame objects
Frame objects are labeled Regions of Interest (ROIs), which can be bounded by polygons (including rectangles), ellipses,
or key points. These ROIs are useful for object detection, classification, or semantic segmentation.
Frame objects can include ROI labels, confidence levels, and masks for semantic segmentation. In **ClearML Enterprise**,
one or more labels and sources dictionaries can be associated with an ROI (although multiple source ROIs are not frequently used).
## Frame labels
Frame labels are applied to an entire frame, not a region in a frame.
## Usage
### Adding a frame object
To add a frame object annotation to a SingleFrame, use the [`SingleFrame.add_annotation`](google.com) method.
```python
# a bounding box labeled "test" at x=10,y=10 with width of 30px and height of 20px
frame.add_annotation(box2d_xywh=(10, 10, 30, 20), labels=['test'])
```
The argument `box2d_xywh` specifies the coordinates of the annotation's bounding box, and the argument `labels` specifies
a list of labels for the annotation.
When adding an annotation there are a few options for entering the annotation's boundaries, including:
* `poly2d_xy` - A list of floating point (x, y) coordinates for a single polygon, or a list of floating point lists for a
complex polygon
* `ellipse2d_xyrrt` - A list consisting of cx, cy, rx, ry, and theta for an ellipse
* And more! See [`SingleFrame.add_annotation`](google.com) for further options.
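For instance, a sketch using the polygon option listed above (coordinates and label are illustrative):
```python
# a triangular polygon ROI labeled "roof", given as a flat list of (x, y) points
frame.add_annotation(poly2d_xy=[10, 10, 60, 10, 35, 50], labels=['roof'])
```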
### Adding a Frame label
Adding a frame label is similar to creating a frame object, except that coordinates don't need to be specified, since
the whole frame is being referenced.
Use the [`SingleFrame.add_annotation`](google.com) method, but use only the `labels` parameter.
```python
# labels for the whole frame
frame.add_annotation(labels=['frame level label one','frame level label two'])
```

View File

@ -0,0 +1,40 @@
---
title: Custom Metadata
---
Metadata can be customized as needed using **meta** dictionaries:
* As a top-level key, for metadata applying to the entire frame
* In `rois` dictionaries, for metadata applying to individual ROIs.
## Usage
### Adding Frame metadata
When instantiating a Frame, metadata that applies for the entire frame can be
added as an argument.
```python
from allegroai import SingleFrame
# create a frame with metadata
frame = SingleFrame(
source='https://allegro-datasets.s3.amazonaws.com/tutorials/000012.jpg',
preview_uri='https://allegro-datasets.s3.amazonaws.com/tutorials/000012.jpg',
# insert metadata dictionary
metadata={'alive':'yes'},
)
# add metadata to the frame
frame.metadata['dangerous'] = 'no'
```
### Adding ROI metadata
Metadata can be added to individual ROIs when adding an annotation to a `frame`, using the [`add_annotation`](google.com)
method.
```python
frame.add_annotation(box2d_xywh=(10, 10, 30, 20), labels=['tiger'],
# insert metadata dictionary
metadata={'dangerous':'yes'})
```

View File

@ -0,0 +1,305 @@
---
title: Datasets and Dataset Versions
---
ClearML Enterprise's **Datasets** and **Dataset versions** provide the internal data structure
and functionality for the following purposes:
* Connecting source data to the **ClearML Enterprise** platform
* Using **ClearML Enterprise**'s GIT-like [Dataset versioning](#dataset-versioning)
* Integrating the powerful features of [Dataviews](dataviews.md) with an experiment
* [Annotating](webapp/webapp_datasets_frames.md#annotations) images and videos
Datasets consist of versions with SingleFrames and / or FrameGroups. Each Dataset can contain multiple versions, where
each version can have multiple children that inherit their parent's SingleFrames and / or FrameGroups. This inheritance
includes the frame metadata and data connecting the source data to the ClearML Enterprise platform, as well as the other
metadata and data.
These parent-child version relationships can be represented as version trees with a root-level parent. A Dataset
can contain one or more trees.
## Dataset version state
Dataset versions can have either **Draft** or **Published** status.
A **Draft** version is editable, so frames can be added to, deleted from, and / or modified in the Dataset.
A **Published** version is read-only, which ensures reproducible experiments and preserves a version of a Dataset.
Child versions can only be created from *Published* versions. To create a child of a *Draft* Dataset version,
it must be published first.
## Example Datasets
**ClearML Enterprise** provides Example Datasets, available in the **ClearML Enterprise** platform, with frames already built,
and ready for your experimentation. Find these example Datasets in the **ClearML Enterprise** WebApp (UI). They appear
with an "Example" banner in the WebApp (UI).
## Usage
### Creating Datasets
Use the [Dataset.create](google.com) method to create a Dataset. It will contain an empty version named `Current`.
```python
from allegroai import Dataset
myDataset = Dataset.create(dataset_name='myDataset')
```
Or, use the [DatasetVersion.create_new_dataset](google.com) method.
```python
from allegroai import DatasetVersion
myDataset = DatasetVersion.create_new_dataset(dataset_name='myDataset Two')
```
To raise a `ValueError` exception if the Dataset exists, specify the `raise_if_exists` parameter as `True`.
* With `Dataset.create`
```python
try:
myDataset = Dataset.create(dataset_name='myDataset One', raise_if_exists=True)
except ValueError:
print('Dataset exists.')
```
* Or with `DatasetVersion.create_new_dataset`
```python
try:
myDataset = DatasetVersion.create_new_dataset(dataset_name='myDataset Two', raise_if_exists=True)
except ValueError:
print('Dataset exists.')
```
Additionally, create a Dataset with tags and a description.
```python
myDataset = DatasetVersion.create_new_dataset(dataset_name='myDataset',
tags=['One Tag', 'Another Tag', 'And one more tag'],
description='some description text')
```
### Accessing current Dataset
To get the current Dataset, use the `DatasetVersion.get_current` method.
```python
myDataset = DatasetVersion.get_current(dataset_name='myDataset')
```
### Deleting Datasets
Use the `Dataset.delete` method to delete a Dataset.
Delete an empty Dataset (no versions).
```python
Dataset.delete(dataset_name='MyDataset', delete_all_versions=False, force=False)
```
Delete a Dataset containing only versions whose status is *Draft*.
```python
Dataset.delete(dataset_name='MyDataset', delete_all_versions=True, force=False)
```
Delete a Dataset even if it contains versions whose status is *Published*.
```python
Dataset.delete(dataset_name='MyDataset', delete_all_versions=True, force=True)
```
## Dataset Versioning
Dataset versioning refers to the group of **ClearML Enterprise** SDK and WebApp (UI) features for creating,
modifying, and deleting Dataset versions.
**ClearML Enterprise** supports simple and sophisticated Dataset versioning, including **simple version structures** and
**advanced version structures**.
In a **simple version structure**, a parent can have one and only one child, and the last child in the Dataset versions
tree must be a *Draft*. This simple structure allows working with a single set of versions of a Dataset. Create children
and publish versions to preserve data history. Each version whose status is *Published* in a simple version structure is
referred to as a **snapshot**.
In an **advanced version structure**, at least one parent has more than one child (this can include more than one parent
version at the root level), or the last child in the Dataset versions tree is *Published*.
Creating a version in a simple version structure may convert it to an advanced structure. This happens when creating
a Dataset version that yields a parent with two children, or when publishing the last child version.
## Versioning Usage
Manage Dataset versioning using the [DatasetVersion](google.com) class in the ClearML Enterprise SDK.
### Creating snapshots
If the Dataset contains only one version whose status is *Draft*, snapshots of the current version can be created.
When creating a snapshot, the current version becomes the snapshot (it keeps the same version ID),
and the newly created version (with its new version ID) becomes the current version.
To create a snapshot, use the [DatasetVersion.create_snapshot](google.com) method.
#### Snapshot naming
In the simple version structure, ClearML Enterprise supports two methods for snapshot naming:
* **Timestamp naming** - If only the Dataset name or ID is provided, the snapshot is named `snapshot` with a timestamp
appended.
The timestamp format is ISO 8601 (`YYYY-MM-DDTHH:mm:ss.SSSSSS`). For example, `snapshot 2020-03-26T16:55:38.441671`.
**Example:**
```python
from allegroai import DatasetVersion
myDataset = DatasetVersion.create_snapshot(dataset_name='MyDataset')
```
After the statement above runs, the previous current version keeps its existing version ID, and it becomes a
snapshot named `snapshot` with a timestamp appended. The newly created version with a new version ID becomes
the current version, and its name is `Current`.
* **User-specified snapshot naming** - If the `publish_name` parameter is provided, it will be the name of the snapshot.
**Example:**
```python
myDataset = DatasetVersion.create_snapshot(dataset_name='MyDataset', publish_name='NewSnapshotName')
```
After the above statement runs, the previous current version keeps its existing version ID and becomes a snapshot named
`NewSnapshotName`.
The newly created version (with a new version ID) becomes the current version, and its name is `Current`.
#### Current version naming
In the simple version structure, ClearML Enterprise supports two methods for current version naming:
* **Default naming** - If the `child_name` parameter is not provided, `Current` is the current version name.
* **User-specified current version naming** - If the `child_name` parameter is provided, that child name becomes the current
version name.
For example, after the following statement runs, the previous current version keeps its existing version ID and becomes
a snapshot named `snapshot` with the timestamp appended.
The newly created version (with a new version ID) is the current version, and its name is `NewCurrentVersionName`.
```python
myDataset = DatasetVersion.create_snapshot(dataset_name='MyDataset',
child_name='NewCurrentVersionName')
```
#### Adding metadata and comments
Add a metadata dictionary and / or comment to a snapshot.
For example:
```python
myDataset = DatasetVersion.create_snapshot(dataset_name='MyDataset',
child_metadata={'abc':'1234','def':'5678'},
child_comment='some text comment')
```
### Creating child versions
Create a new version from any version whose status is *Published*.
To create a new version, call the [DatasetVersion.create_version](google.com) method, and
provide:
* Either the Dataset name or ID
* The parent version name or ID from which the child inherits frames
* The new version's name.
For example, create a new version named `NewChildVersion` from the existing version `PublishedVersion`,
where the new version inherits the frames of the existing version. If `NewChildVersion` already exists,
it is returned.
```python
myVersion = DatasetVersion.create_version(dataset_name='MyDataset',
parent_version_names=['PublishedVersion'],
version_name='NewChildVersion')
```
To raise a ValueError exception if `NewChildVersion` exists, set `raise_if_exists` to `True`.
```python
myVersion = DatasetVersion.create_version(dataset_name='MyDataset',
parent_version_names=['PublishedVersion'],
version_name='NewChildVersion',
raise_if_exists=True)
```
### Creating root-level parent versions
Create a new version at the root-level. This is a version without a parent, and it contains no frames.
```python
myDataset = DatasetVersion.create_version(dataset_name='MyDataset',
version_name='NewRootVersion')
```
### Getting versions
To get a version or versions, use the [DatasetVersion.get_version](google.com) and [DatasetVersion.get_versions](google.com)
methods, respectively.
**Getting a list of all versions**
```python
myDatasetversion = DatasetVersion.get_versions(dataset_name='MyDataset')
```
**Getting a list of all _published_ versions**
```python
myDatasetversion = DatasetVersion.get_versions(dataset_name='MyDataset',
only_published=True)
```
**Getting a list of all _draft_ versions**
```python
myDatasetversion = DatasetVersion.get_versions(dataset_name='MyDataset',
only_draft=True)
```
**Getting the current version**
If more than one version exists, ClearML Enterprise outputs a warning.
```python
myDatasetversion = DatasetVersion.get_version(dataset_name='MyDataset')
```
**Getting a specific version**
```python
myDatasetversion = DatasetVersion.get_version(dataset_name='MyDataset',
version_name='VersionName')
```
### Deleting versions
Delete versions whose status is *Draft* using the [Dataset.delete_version](google.com) method.
```python
from allegroai import Dataset
myDataset = Dataset.get(dataset_name='MyDataset')
myDataset.delete_version(version_name='VersionToDelete')
```
### Publishing versions
Publish (make read-only) versions whose status is *Draft* using the [DatasetVersion.publish_version](google.com) method. This includes the current version, if the Dataset is in
the simple version structure.
```python
myVersion = DatasetVersion.get_version(dataset_name='MyDataset',
version_name='VersionToPublish')
myVersion.publish_version()
```

View File

@ -0,0 +1,389 @@
---
title: Dataviews
---
Dataviews is a powerful and easy-to-use **ClearML Enterprise** feature for creating and managing local views of remote
Datasets. Dataviews can use sophisticated queries to input data from a subset of a Dataset
or combinations of Datasets.
Dataviews support:
* Filtering by ROI labels, frame metadata, and data sources
* Data debiasing to adjust for imbalanced data
* ROI label mapping (label translation)
* Class label enumeration
* Controls for the frame iteration, such as sequential or random iteration, limited or infinite iteration, and reproducibility.
Dataviews are lazy and optimize processing. When an experiment script runs in a local environment, Dataview pointers
are initialized. If the experiment is cloned or extended, and that newly cloned or extended experiment is tuned and run,
only changed pointers are initialized. The pointers that did not change are reused.
## Filtering
A Dataview filters experiment input data, using one or more frame filters. A frame filter defines the criteria for the
selection of SingleFrames iterated by a Dataview.
A frame filter contains the following criteria:
* Dataset version - Choose whether the filter applies to one version or all versions of a Dataset.
* Any combination of the following rules:
* ROI rule - Include or exclude frames containing at least one ROI with any combination of labels in the Dataset version.
Optionally, limit the number of matching ROIs (instances) per frame, and / or limit the confidence level of the label.
For example: include frames containing two to four ROIs labeled `cat` and `dog`, with a confidence level from `0.8` to `1.0`.
* Frame rule - Filter by frame metadata key-value pairs, or ROI labels.
For example: if some frames contain the metadata
key `dangerous` with values of `yes` or `no`, filter `(meta.dangerous:'yes')`.
* Source rule - Filter by frame `source` dictionary key-value pairs.
For example: filter by source ID `(source.id:00)`.
* A ratio (weight) allowing you to debias input data and adjust an imbalance in the SingleFrames iterated by the Dataview (optional).
Use combinations of these frame filters to build sophisticated queries.
## Debiasing input data
Apply debiasing to each frame filter to adjust for an imbalance in input data. Ratios (weights) enable setting the proportion
of frames that are inputted, according to any of the criteria in a frame filter, including ROI labels, frame metadata,
and sources, as well as each Dataset version compared with the others.
For example, data may contain five times the number of frames labeled `daylight` as those labeled `nighttime`, but
you want to input the same number of both. To debias the data, create two frame filters, one for `daylight` with a ratio
of `1`, and the other for `nighttime` with a ratio of `5`. The Dataview will iterate approximately an equal number of
SingleFrames for each.
## ROI Label mapping (label translation)
ROI label mapping (label translation) applies to the new model. For example, apply mapping to:
* Combine different labels under another more generic label.
* Consolidate disparate datasets containing different names for the ROI.
* Hide labeled objects from the training process.
## Class label enumeration
Define class labels for the new model and assign integers to each in order to maintain data conformity across multiple
codebases and datasets. It is important to set enumeration values for all labels of importance.
## Data augmentation
On-the-fly data augmentation is applied to SingleFrames, transforming images without creating new data. Apply data augmentation
in steps, where each step is composed of a method, an operation, and a strength as follows:
* **Affine** augmentation method - Transform an image's geometric shape to another position on a 2-dimensional plane.
Use any of the following operations:
* Rotate
* Reflect-horiz - Flip images horizontally
* Reflect-vert - Flip images vertically
* Scale
* Shear - Skew
* No operation - Randomly select SingleFrames that are not transformed (skipped). If the experiment runs again, and
the random seed in [iteration control](#iteration-control) is unchanged, the same SingleFrames are not augmented.
* **Pixel** augmentation method - Transform images by modifying pixel values while retaining shape and perspective.
Use any of the following operations:
* Blur - Gaussian smoothing
* Noise - **ClearML Enterprise**'s own noise augmentation consisting of:
* **high** noise - like snow on analog televisions with a weak TV signal
* **low** noise - like a low resolution image magnified in localized areas on the image
* Recolor - using an internal RGB lookup-table
* No operation - Randomly select SingleFrames that are not transformed (skipped). If the experiment runs again, and
the random seed in [iteration control](#iteration-control) is unchanged, the same SingleFrames are not augmented.
* Strength - A number applied to adjust the degree of transformation. The recommended strengths are the following:
* 0.0 - No effect
* 0.5 - Low (weak)
* 1.0 - Medium (recommended)
* 2.0 - High (strong)
## Iteration control
The input data **iteration control** settings determine the order, number, timing, and reproducibility of the Dataview iterating
SingleFrames. Depending upon the combination of iteration control settings, not all SingleFrames may be iterated, and some
may repeat. The settings include the following:
* Order - Order of the SingleFrames returned by the iteration, which can be either:
* Sequential - Iterate SingleFrames in sorted order by context ID and timestamp.
* Random - Iterate SingleFrames randomly using a random seed that can be set (see Random Seed below).
* Repetition - The repetition of SingleFrames that, in conjunction with the order, determines whether all SingleFrames
are returned, and whether any may repeat. The repetition settings and their impact on iteration are the following:
* Use Each Frame Once - All SingleFrames are iterated. If the order is sequential, then no SingleFrames repeat. If
the order is random, then some SingleFrames may repeat.
* Limit Frames - The maximum number of SingleFrames to iterate, unless the actual number of SingleFrames is fewer than
the maximum, then the actual number of SingleFrames are iterated. If the order is sequential, then no SingleFrames
repeat. If the order is random, then some SingleFrames may repeat.
* Infinite Iterations - Iterate SingleFrames until the experiment is manually terminated. If the order is sequential,
then all SingleFrames are iterated (unless the experiment is manually terminated before all iterate) and SingleFrames
repeat. If the order is random, then all SingleFrames may not be iterated, and some SingleFrames may repeat.
* Random Seed - If the experiment is rerun and the seed remains unchanged, the SingleFrames iteration is the same.
* Clip Length - For video data sources, the number of sequential SingleFrames from a clip to iterate.
## Usage
### Creating Dataviews
Use the [`allegroai.DataView`](google.com) class to create a DataView object. Instantiate DataView objects, specifying
iteration settings and additional iteration parameters that control query iterations.
```python
from allegroai import DataView, IterationOrder
# Create a DataView object that iterates randomly until terminated by the user
myDataView = DataView(iteration_order=IterationOrder.random, iteration_infinite=True)
```
### Adding queries
To add a query to a DataView, use the [`DataView.add_query`](google.com) method and specify Dataset versions,
ROI and / or frame queries, and other criteria.
The `dataset_name` and `version_name` arguments specify the Dataset Version. The `roi_query` and `frame_query` arguments
specify the queries.
* `roi_query` can be assigned ROI labels by label name or Lucene queries.
* `frame_query` must be assigned a Lucene query.
Multiple queries can be added to the same or different Dataset versions, each query with the same or different ROI
and / or frame queries.
#### ROI queries:
* ROI query for a single label
This example is an ROI query filtering for frames containing at least one ROI with the label `cat`.
```python
# Create a Dataview object for an iterator that randomly returns frames according to queries
myDataView = DataView(iteration_order=IterationOrder.random, iteration_infinite=True)
# Add a query for a Dataset version
myDataView.add_query(dataset_name='myDataset',
version_name='myVersion', roi_query='cat')
```
* ROI query for one label OR another
This example is an ROI query filtering for frames containing at least one ROI with the label `cat` OR `dog`:
```python
# Add a query for a Dataset version
myDataView.add_query(dataset_name='myDataset', version_name='myVersion',
roi_query='cat')
myDataView.add_query(dataset_name='myDataset', version_name='myVersion',
roi_query='dog')
```
* ROI query for one label AND another label
This example is an ROI query filtering for frames containing at least one ROI with the label `Car` AND `partly_occluded`.
```python
# Add a query for a Dataset version
myDataView.add_query(dataset_name='myDataset', version_name='training',
roi_query=['Car','partly_occluded'])
```
* ROI query for one label AND NOT another (Lucene query).
This example is an ROI query filtering for frames containing at least one ROI with the label `Car` AND NOT the label
`partly_occluded`.
```python
# Add a query for a Dataset version
# Use a Lucene Query
# "label" is a key in the rois dictionary of a frame
# In this Lucene Query, specify two values for the label key and use a Logical AND NOT
myDataView.add_query(dataset_name='myDataset', version_name='training',
roi_query='label.keyword:\"Car\" AND NOT label.keyword:\"partly_occluded\"')
```
#### Querying multiple Datasets and versions
This example demonstrates an ROI query filtering for frames containing the ROI labels `car`, `truck`, or `bicycle`
from two versions of one Dataset, and one version of another Dataset.
```python
# Add queries:
# The 1st Dataset version
myDataView.add_query(dataset_name='dataset_1',
version_name='version_1',
roi_query='label.keyword:\"car\" OR label.keyword:\"truck\" OR '
'label.keyword:\"bicycle\"')
# The 1st Dataset, but a different version
myDataView.add_query(dataset_name='dataset_1',
version_name='version_2',
roi_query='label.keyword:\"car\" OR label.keyword:\"truck\" OR '
'label.keyword:\"bicycle\"')
# A 2nd Dataset (version)
myDataView.add_query(dataset_name='dataset_2',
version_name='some_version',
roi_query='label.keyword:\"car\" OR label.keyword:\"truck\" OR '
'label.keyword:\"bicycle\"')
```
#### Frame queries
Use frame queries to filter frames by ROI labels and / or frame metadata key-value pairs that a frame must include or
exclude for the DataView to return the frame.
**Frame queries** match frame meta key-value pairs, ROI labels, or both.
They use the same logical OR, AND, and AND NOT matching as ROI queries.
This example demonstrates a frame query filtering for frames containing the meta key `city` value of `bremen`.
```python
# Add a frame query for frames with the meta key "city" value of "bremen"
myDataView.add_query(dataset_name='myDataset',
version_name='version',
frame_query='meta.city:"bremen"')
```
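Frame queries can also combine multiple conditions using the same Lucene logic. The following sketch assumes a hypothetical
`weather` metadata key in addition to `city`:
```python
# Add a frame query that combines two metadata conditions with a logical AND
# ("weather" is a hypothetical metadata key used for illustration)
myDataView.add_query(dataset_name='myDataset',
                     version_name='version',
                     frame_query='meta.city:"bremen" AND meta.weather:"rain"')
```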
### Controlling query iteration
Use [`DataView.set_iteration_parameters`](google.com) to manage the order, number, timing, and reproducibility of frames
for training.
#### Iterate frames infinitely
This example demonstrates creating a Dataview and setting its parameters to iterate infinitely until the script is
manually terminated.
```python
# Create a Dataview object for an iterator that returns frames
myDataView = DataView()
# Set iteration parameters (overrides parameters set when constructing the DataView object)
myDataView.set_iteration_parameters(order=IterationOrder.random, infinite=True)
```
#### Iterate all frames matching the query
This example demonstrates creating a DataView and setting its parameters to iterate and return all frames matching a query.
```python
# Create a Dataview object for an iterator for frames
myDataView = DataView(iteration_order=IterationOrder.random, iteration_infinite=True)
# Set iteration parameters (overrides parameters set when constructing the DataView object)
myDataView.set_iteration_parameters(
order=IterationOrder.random, infinite=False)
# Add a query for a Dataset version
myDataView.add_query(dataset_name='myDataset',
version_name='myVersion', roi_query='cat')
```
#### Iterate a maximum number of frames
This example demonstrates creating a DataView and setting its parameters to iterate a specific number of frames. If the
Dataset version contains fewer than that number of frames matching the query, then fewer are returned by the iterator.
```python
# Create a Dataview object for an iterator for frames
myDataView = DataView(iteration_order=IterationOrder.random, iteration_infinite=True)
# Set iteration parameters (overrides parameters set when constructing the DataView object)
myDataView.set_iteration_parameters(
order=IterationOrder.random, infinite=False,
maximum_number_of_frames=5000)
```
### Debiasing input data
Debias input data using the `weight` argument of the [DataView.add_query](google.com) method. This is the same
[DataView.add_query](google.com) method used to specify Dataset versions, ROI queries, and frame queries.
This example adjusts an imbalance in the input data to improve training for `Car` ROIs that are also `largely_occluded`
(obstructed). For every frame containing at least one ROI labeled `Car`, approximately five frames containing at least
one ROI labeled with both `Car` and `largely_occluded` will be input.
```python
myDataView = DataView(iteration_order=IterationOrder.random, iteration_infinite=True)
myDataView.add_query(dataset_name='myDataset', version_name='training',
roi_query='Car', weight = 1)
myDataView.add_query(dataset_name='myDataset', version_name='training',
roi_query='label.keyword:\"Car\" AND label.keyword:\"largely_occluded\"', weight = 5)
```
### Mapping ROI Labels
ROI label translation (label mapping) enables combining labels for training, combining disparate datasets, and hiding
certain labels for training.
This example demonstrates consolidating two disparate Datasets. Two Dataset versions use `car` (lower case "c"), but a
third uses `Car` (upper case "C").
The example maps `Car` (upper case "C") to `car` (lower case "c").
```python
# Create a Dataview object for an iterator that randomly returns frames according to queries
myDataView = DataView(iteration_order=IterationOrder.random, iteration_infinite=True)
# The 1st Dataset (version) - "car" with lowercase "c"
myDataView.add_query(dataset_name='myDataset', version_name='myVersion', roi_query='car')
# The 2nd Dataset (version) - "car" with lowercase "c"
myDataView.add_query(dataset_name='dataset_2', version_name='aVersion',
roi_query='car')
# A 3rd Dataset (version) - "Car" with uppercase "C"
myDataView.add_query(dataset_name='dataset_3', version_name='training',
roi_query='Car')
# Use a mapping rule to translate "Car" (uppercase) to "car" (lowercase)
myDataView.add_mapping_rule(dataset_name='dataset_3',
version_name='training',
from_labels=['Car'],
to_label='car')
```
### Setting Label Enumeration Values
Set label enumeration values to maintain data conformity across multiple codebases and datasets.
It is important to set enumeration values for all labels of importance; labels that are not assigned a value default to `-1`.
To assign enumeration values, use the [`DataView.set_labels`](google.com) method, which sets a mapping of ROI label (string)
to integer value in a DataView object.
If certain ROI labels are [mapped](#mapping-roi-labels) from certain labels **to** other labels,
then use the labels you map **to** when setting enumeration values.
For example, if the labels `truck`, `van`, and `car` are mapped **to** `vehicle`, then set the enumeration for `vehicle` (see the sketch after the following example).
```python
# Create a Dataview object for an iterator that randomly returns frames according to queries
myDataView = DataView(iteration_order=IterationOrder.random, iteration_infinite=True)
# Add a query for a Dataset version
myDataView.add_query(dataset_name='myDataset', version_name='myVersion',
roi_query='cat')
myDataView.add_query(dataset_name='myDataset', version_name='myVersion',
roi_query='dog')
myDataView.add_query(dataset_name='myDataset', version_name='myVersion',
roi_query='bird')
myDataView.add_query(dataset_name='myDataset', version_name='myVersion',
roi_query='sheep')
myDataView.add_query(dataset_name='myDataset', version_name='myVersion',
roi_query='cow')
# Set the enumeration label values
myDataView.set_labels({"cat": 1, "dog": 2, "bird": 3, "sheep": 4, "cow": 5, "ignore": -1,})
```
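Following the mapping example above, if `truck`, `van`, and `car` are mapped **to** `vehicle`, the enumeration is set on
`vehicle`. A minimal sketch, reusing the `add_mapping_rule` and `set_labels` calls shown on this page:
```python
# Map several source labels to a single target label
myDataView.add_mapping_rule(dataset_name='myDataset',
                            version_name='myVersion',
                            from_labels=['truck', 'van', 'car'],
                            to_label='vehicle')
# Set the enumeration value on the label that was mapped *to*
myDataView.set_labels({"vehicle": 1, "ignore": -1})
```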

View File

@ -0,0 +1,118 @@
---
title: FrameGroups
---
ClearML Enterprise provides **FrameGroups**, an easy-to-use type of frame that supports multiple sources.
Add a list of SingleFrames to a FrameGroup, and then register FrameGroups in a Dataset version.
[View and edit](webapp/webapp_datasets_frames.md) FrameGroups and the SingleFrames they contain
in the ClearML Enterprise WebApp (UI).
A SingleFrame is composed of metadata for raw data that is harvested at a specific point in time for a
specific spatial area, as well as additional metadata such as annotations and masks. Therefore, a **FrameGroup** combines
more than one set of raw data and its additional metadata for the same point in time.
For example, use FrameGroups for the following:
* Multiple cameras on an autonomous car - A FrameGroup composed of SingleFrames for each camera.
* Multiple sensors on a machine detecting defects - A FrameGroup composed of SingleFrames for each sensor.
## Usage
### Creating a FrameGroup
A FrameGroup behaves like a dictionary of SingleFrames. Instantiate a FrameGroup and a SingleFrame, then add the SingleFrame
object to the FrameGroup, using the SingleFrame's name as the key and the SingleFrame object as the value.
```python
from allegroai import FrameGroup, SingleFrame
# Create a FrameGroup object
frame_group = FrameGroup()
# Create a SingleFrame
frame = SingleFrame(source='https://allegro-datasets.s3.amazonaws.com/tutorials/000012.jpg',
width=512, height=512, preview_uri='https://allegro-datasets.s3.amazonaws.com/tutorials/000012.jpg')
# Add the first SingleFrame to the FrameGroup.
frame_group['FrameOne'] = frame
```
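Since a FrameGroup typically combines multiple sources (for example, several cameras on a car), additional SingleFrames can
be added to the same FrameGroup under their own keys. The second source name and URI below are illustrative:
```python
# Add a second SingleFrame to the FrameGroup under its own name
frame_group['FrameTwo'] = SingleFrame(
    source='https://allegro-datasets.s3.amazonaws.com/tutorials/000013.jpg'
)
```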
### Adding FrameGroups to a Dataset Version
To add FrameGroups to a Dataset Version:
1. Create a FrameGroup object
1. Add SingleFrames to the FrameGroup, where the key of each SingleFrame in the FrameGroup is the SingleFrame's name
1. Append the FrameGroup object to a list of frames
1. Add that list to a DatasetVersion.
```python
# Create a FrameGroup object
frame_group = FrameGroup()
# Create SingleFrame
single_frame = SingleFrame(source='https://allegro-datasets.s3.amazonaws.com/tutorials/000012.jpg')
# Add the first SingleFrame to the FrameGroup.
frame_group['FrameOne'] = single_frame
# The DatasetVersion.add_frames requires a list of frames.
frames = []
frames.append(frame_group)
# Add the FrameGroup to the version (myVersion is an existing DatasetVersion object)
myVersion.add_frames(frames)
```
### Accessing a FrameGroup
To access a FrameGroup, use the [DatasetVersion.get_single_frame](google.com) method, just like when
[accessing a SingleFrame](single_frames.md#accessing-singleframes).
```python
# Get the FrameGroup
frame_group = DatasetVersion.get_single_frame(frame_id='f3ed0e09bf23fc947f426a0d254c652c',
dataset_name='MyDataset',
version_name='FrameGroup')
```
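Individual SingleFrames in the retrieved FrameGroup can then be accessed by name, for example:
```python
# Access a SingleFrame in the FrameGroup by its name (the hypothetical 'FrameOne')
single_frame = frame_group['FrameOne']
```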
### Updating FrameGroups
Updating FrameGroups is similar to [updating SingleFrames](single_frames.md#updating-singleframes), except that each
SingleFrame needs to be referenced using its name as the key in the FrameGroup.
```python
frames = []
# Get the FrameGroup
frame_group = DatasetVersion.get_single_frame(frame_id='f3ed0e09bf23fc947f426a0d254c652c',
dataset_name='MyDataset', version_name='FrameGroup')
# Add metadata by referencing the name of the SingleFrame in the FrameGroup
frame_group['FrameOne'].metadata['new_key'] = 'new_value'
# Update change to the FrameGroup
frames.append(frame_group)
myVersion.update_frames(frames)
```
### Deleting frames
To delete a FrameGroup, use the [DatasetVersion.delete_frames](google.com) method, just like when deleting a
SingleFrame, except that a FrameGroup is being referenced.
```python
frames = []
# Get the FrameGroup
frame_group = DatasetVersion.get_single_frame(frame_id='f3ed0e09bf23fc947f426a0d254c652c',
dataset_name='MyDataset', version_name='FrameGroup')
# Delete the FrameGroup
frames.append(frame_group)
myVersion.delete_frames(frames)
```

View File

@ -0,0 +1,14 @@
---
title: Frames Overview
---
A **Frame** is the basic building block of data in ClearML Enterprise.
Two types of frames are supported:
* [SingleFrames](single_frames.md) - A frame with one source. For example, one image.
* [FrameGroups](frame_groups.md) - A frame with multiple sources. For example, multiple images.
**SingleFrames** and **FrameGroups** contain data sources, metadata, and other data. A Frame can be added to [Datasets](dataset.md)
and then modified or removed. [Versions](dataset.md#dataset-versioning) of the Datasets can be created, which enables
documenting changes and reproducing data for experiments.

241
docs/hyperdatasets/masks.md Normal file
View File

@ -0,0 +1,241 @@
---
title: Masks
---
When applicable, [`sources`](sources.md) contains `masks`, a list of dictionaries used to connect a special type of
source data to the ClearML Enterprise platform. That source data is a **mask**.
Masks are used in deep learning for semantic segmentation.
Masks correspond to raw data where the objects to be detected are marked with colors in the masks. The colors
are RGB values and represent the objects, which are labeled for segmentation.
In frames used for semantic segmentation, the metadata connecting the mask files / images to the ClearML Enterprise platform,
and the RGB values and labels used for segmentation are separate. They are contained in two different dictionaries of
a SingleFrame:
* **`masks`** (plural) is in [`sources`](sources.md) and contains the mask files / images `URI` (in addition to other keys
and values).
* **`mask`** (singular) is in the `rois` array of a Frame.
Each `rois` dictionary contains:
* RGB values and labels of a **mask** (in addition to other keys and values)
* Metadata and data for the labeled area of an image
See [Example 1](#example-1), which shows `masks` in `sources`, `mask` in `rois`, and the key-value pairs used to relate
a mask to its source in a frame.
## Masks structure
The chart below explains the keys and values of the `masks` dictionary (in the [`sources`](sources.md)
section of a Frame).
|Key|Value Description|
|---|----|
|`id`|**Type**: integer. <ul><li> The ID is used to relate this mask data source to the `mask` dictionary containing the label and RGB value for the mask.</li><li> See the `mask` key in `rois`.</li></ul>|
|`content_type`| **Type**: string. <ul><li> Type of mask data. For example, `image/png` or `video/mp4`.</li></ul>|
|`timestamp`|**Type**: integer. <ul><li>For masks from a video, indicates the absolute position of the frame in the source video. For example, video from a camera on a car, at 30 frames per second, would have a timestamp of 0 for the first frame, and 33 for the second frame.</li><li>For still images, set this to 0.</li></ul>|
|`uri`|**Type**: string. <ul><li> URI of the mask file / image.</li></ul>|
## Examples
### Example 1
This example demonstrates an original image, its masks, and its frame containing
the `sources` and ROI metadata.
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">Example 1: View the frame</summary>
<div className="cml-expansion-panel-content">
This frame contains the `masks` list of dictionaries in `sources`,
and the `rois` array, as well as several top-level key-value pairs.
```json
{
"timestamp": 1234567889,
"context_id": "car_1",
"meta": {
"velocity": "60"
},
"sources": [
{
"id": "front",
"content_type": "video/mp4",
"width": 800,
"height": 600,
"uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
"timestamp": 1234567889,
"meta" :{
"angle":45,
"fov":129
},
"masks": [
{
"id": "seg",
"content_type": "video/mp4",
"uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
"timestamp": 123456789
},
{
"id": "seg_instance",
"content_type": "video/mp4",
"uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
"timestamp": 123456789
}
]
}
],
"rois": [
{
"sources":["front"],
"label": ["seg"],
"mask": {
"id": "car",
"value": [210,210,120]
}
},
{
"sources":["front"],
"label": ["seg"],
"mask": {
"id": "person",
"value": [147,44,209]
}
},
{
"sources":["front"],
"label": ["seg"],
"mask": {
"id": "road",
"value": [197,135,146]
}
},
{
"sources":["front"],
"label": ["seg"],
"mask": {
"id": "street",
"value": [135,198,145]
}
},
{
"sources":["front"],
"label": ["seg"],
"mask": {
"id": "building",
"value": [72,191,65]
}
}
]
}
```
</div>
</details>
<br/>
* In `sources`:
* The source ID is `front`.
* In the `masks` dictionary, the source contains mask sources with IDs of `seg` and `seg_instance`.
* In `rois`:
* Each ROI source is `front`, relating the ROI to its original source image.
* Each ROI has a label of `seg`, indicating segmentation.
* Each `mask` has an `id` (`car`, `person`, `road`, `street`, and `building`) and a unique RGB `value`
(color-coding).
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">Example image and masks</summary>
<div className="cml-expansion-panel-content">
Original Image
![image](../img/hyperdatasets/concepts_masks_image_only.png)
Mask image
![image](../img/hyperdatasets/concepts_masks.png)
</div>
</details>
<br/>
### Example 2
This example shows two masks for video from a camera. The masks label cars and the road.
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">Example 2: View the frame</summary>
<div className="cml-expansion-panel-content">
```json
"sources": [
{
"id": "front",
"content_type": "video/mp4",
"width": 800,
"height": 600,
"uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
"timestamp": 1234567889,
"meta" :{
"angle":45,
"fov":129
},
"masks": [
{
"id": "car",
"content_type": "video/mp4",
"uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
"timestamp": 123456789
},
{
"id": "road",
"content_type": "video/mp4",
"uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
"timestamp": 123456789
}
]
}
],
"rois": [
{
"sources":["front"],
"label": ["right_lane"],
"mask": {
"id": "car",
"value": [210,210,120]
}
},
{
"sources":["front"],
"label": ["right_lane"],
"mask": {
"id": "road",
"value": [197,135,146]
}
}
```
</div>
</details>
<br/>
* In `sources`:
* The source ID is `front`.
* The source contains mask sources with IDs of `car` and `road`.
* In `rois`:
* Each ROI source is `front` relating the ROI to its original source image.
* Each ROI has a label of `right_lane` indicating the ROI object.
* Each `mask` has an `id` (`car`, `road`) and a unique RGB `value` (color-coding).

View File

@ -0,0 +1,34 @@
---
title: Hyper Datasets
---
ClearML's Hyper Datasets are an MLOps-oriented abstraction of your data, which facilitates traceable, reproducible model development
through parametrized data access and meta-data version control.
The basic premise is that a user-formed query is a full representation of the dataset used by the ML/DL process.
ClearML Enterprise's Hyper Datasets support rapid prototyping, creating new opportunities such as:
* Hyperparameter optimization of the data itself
* QA/QC pipelining
* CD/CT (continuous training) during deployment
* Enabling complex applications like collaborative (federated) learning.
## Hyperdataset Components
A hyperdataset is composed of the following components:
* [Frames](frames.md)
* [SingleFrames](single_frames.md)
* [FrameGroups](frame_groups.md)
* [Datasets and Dataset Versions](dataset.md)
* [Dataviews](dataviews.md)
These components interact in a way that enables revising data and tracking and accessing all of its versions.
Frames are the basic units of data in ClearML Enterprise. SingleFrames and FrameGroups make up a Dataset version.
Dataset versions can be created, modified, and removed. The different versions are recorded and available,
so experiments and their data are reproducible and traceable.
Lastly, Dataviews manage views of the dataset with queries, so the input data to an experiment can be defined from a
subset of a Dataset or combinations of Datasets.

View File

@ -0,0 +1,105 @@
---
title: Previews
---
A `preview` is a dictionary containing metadata for optional thumbnail images that can be used in the ClearML Enterprise
WebApp (UI) to view selected images in a Dataset. `previews` includes the `uri` of the thumbnail image.
Previews are not mandatory. Their primary use is to view images with formats that cannot be displayed in a web browser
(such as TIFF and 3D formats).
## Example
The following is an example of preview metadata.
```json
"preview": {
"content_type": "image/jpg",
"uri": "https://s3.amazonaws.com/my_previews/car_1/front_preview.jpg",
"timestamp": 0
}
```
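When registering frames with the SDK, a preview is typically attached by passing a browser-accessible `preview_uri` when
creating the frame. A minimal sketch, using the thumbnail URI from the example above:
```python
from allegroai import SingleFrame

# Attach a browser-viewable thumbnail to a frame via preview_uri
frame = SingleFrame(
    source='https://s3.amazonaws.com/my_cars/car_1/front.mp4',
    preview_uri='https://s3.amazonaws.com/my_previews/car_1/front_preview.jpg'
)
```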
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">View an entire frame containing a preview</summary>
<div className="cml-expansion-panel-content">
```json
{
"timestamp": 1234567889,
"context_id": "car_1",
"meta": {
"velocity": "60"
},
"sources": [
{
"id": "front",
"content_type": "video/mp4",
"width": 800,
"height": 600,
"uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
"timestamp": 1234567889,
"meta" :{
"angle":45,
"fov":129
},
"preview": {
"content_type": "image/jpg",
"uri": "https://s3.amazonaws.com/my_previews/car_1/front_preview.jpg",
"timestamp": 0
},
"masks": [
{
"id": "seg",
"content_type": "video/mp4",
"uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
"timestamp": 1234567889
},
{
"id": "instances_seg",
"content_type": "video/mp4",
"uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
"timestamp": 1234567889
}
]
},
{
"id": "rear",
"uri": "https://s3.amazonaws.com/my_cars/car_1/rear.mp4",
"content_type": "video/mp4",
"timestamp": 1234567889
}
],
"rois": [
{
"sources":["front"],
"label": ["right_lane"],
"mask": {
"id": "seg",
"value": [-1, 1, 255]
}
},
{
"sources": ["front"],
"label": ["bike"],
"poly":[30, 50, 50,50, 100,50, 100,100],
"meta": {
"velocity": 5.4
}
},
{
"sources": ["front", "rear"],
"label": ["car"],
"poly":[30, 50, 50,50, 100,50, 100,100]
}
]
}
```
</div>
</details>
<br/>
Here's an example of Previews in the ClearML Enterprise WebApp (UI). Each thumbnail is a Preview.
![image](../img/hyperdatasets/web-app/previews.png)

View File

@ -0,0 +1,311 @@
---
title: SingleFrames
---
A `SingleFrame` contains metadata pointing to raw data, and other metadata and data, which supports experimentation and
**ClearML Enterprise**'s Git-like Dataset versioning.
## Frame Components
A `SingleFrame` contains the following components:
* [Sources](#sources)
* [Annotations](#annotation)
* [Masks](#masks)
* [Previews](#previews)
* [Metadata](#metadata)
### Sources
Every `SingleFrame` includes a [`sources`](sources.md) dictionary, which contains attributes of the raw data, including:
* URI pointing to the source data (image or video)
* Sources for masks used in semantic segmentation
* Image previews, which are thumbnails used in the WebApp (UI).
For more information, see [Sources](sources.md).
### Annotation
Each `SingleFrame` contains a list of dictionaries, where each dictionary includes information about a specific annotation.
Two types of annotations are supported:
* **FrameGroup objects** - labels for Regions of Interest (ROIs)
* **FrameGroup labels** - labels for the entire frame
For more information, see [Annotations](annotations.md).
### Masks
A `SingleFrame` includes a URI link to a mask file if applicable. Masks correspond to raw data where the objects to be
detected in raw data are marked with colors in the masks.
For more information, see [Masks](masks.md).
### Previews
`previews` is a dictionary containing metadata for optional thumbnail images that can be used in the ClearML Enterprise WebApp (UI)
to view selected images in a Dataset. `previews` includes the `uri` of the thumbnail image.
For more information, see [Previews](previews.md).
### Metadata
`metadata` is a dictionary with general information about the `SingleFrame`.
For more information, see [Custom Metadata](custom_metadata.md).
## Frame structure
The panel below describes the details contained within a `frame`:
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">Frame Structure</summary>
<div className="cml-expansion-panel-content">
* `id` (*string*) - The unique ID of this frame.
* `blob` (*string*) - Raw data.
* `context_id` (*string*) - Source URL.
* `dataset` (*dict*) - The Dataset and version containing the frame.
* `id` - ID of the Dataset.
* `version` - ID of the version.
* `meta` (*dict*) - Frame custom metadata. Any custom key-value pairs (`sources` and `rois` can also contain a meta
dictionary for custom key-value pairs associated with individual sources and rois). See [Custom Metadata](custom_metadata.md).
* `num_frames`
* `rois` (*[dict]*) - Metadata for annotations, which can be Regions of Interest (ROIs) related to this frame's source data,
or frame labels applied to the entire frame (not a region). ROIs are labeled areas bounded by polygons or labeled RGB
values used for object detection and segmentation. See [Annotations](annotations.md).
* `id` - ID of the ROI.
* `confidence` (*float*) - Confidence level of the ROI label (between 0 and 1.0).
* `labels` (*[string]*)
* For [FrameGroup objects](#frame-objects) (Regions of Interest), these are the labels applied to the ROI.
* For [FrameGroup labels](#frame-labels), this is the label applied to the entire frame.
* `mask` (*dict*) - RGB value of the mask applied to the ROI, if a mask is used (for example, for semantic segmentation).
The ID points to the source of the mask.
* `id` - ID of the mask dictionary in `sources`.
* `value` - RGB value of the mask.
* `poly` (*[int]*) - Bounding area vertices.
* `sources` (*[string]*) - The `id` in the `sources` dictionary which relates an annotation to its raw data source.
* `sources` (*[dict]*) - Sources of the raw data in this frame. For a SingleFrame this is one source. For a FrameGroup,
this is multiple sources. See [Sources](sources.md).
* `id` - ID of the source.
* `uri` - URI of the raw data.
* `width` - Width of the image or video.
* `height` - Height of the image or video.
* `mask` - Sources of masks used in the `rois`.
* `id` - ID of the mask source. This relates a mask source to an ROI.
* `content_type` - The type of mask source. For example, `image/jpeg`.
* `uri` - URI of the mask source.
* `timestamp`
* `preview` - URI of the thumbnail preview image used in the ClearML Enterprise WebApp (UI)
* `timestamp` - For images from video, a timestamp that indicates the absolute position of this frame from the source (video).
For example, if video from a camera on a car is taken at 30 frames per second, it would have a timestamp of 0 for
the first frame, and 33 for the second frame. For still images, set this to 0.
* `saved_in_version` - The version in which the frame is saved.
* `saved` - The epoch time that the frame was saved.
* `timestamp` - For images from video, a timestamp that indicates the absolute position of this frame from the source (video).
</div>
</details>
<br/>
## WebApp
A frame that has been connected to the **ClearML Enterprise** platform is available to view and analyze on the
WebApp (UI).
When viewing a frame on the WebApp, all the information associated with it can be viewed, including its frame labels and
object annotations, its metadata, and other details.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">SingleFrame in the WebApp frame viewer</summary>
<div className="cml-expansion-panel-content">
This image shows a SingleFrame in the ClearML Enterprise WebApp (UI) [frame viewer](webapp/webapp_datasets_frames.md#frame-viewer).
![image](../img/hyperdatasets/frame_overview_01.png)
</div>
</details>
<br/>
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">SingleFrame details represented in the WebApp</summary>
<div className="cml-expansion-panel-content">
id : "287024"
timestamp : 0
rois : Array[2] [
{
"label":["tennis racket"],
"poly":[174,189,149,152,117,107,91,72,68,45,57,33,53,30,49,32,48,34,46,35,46,37,84,92,112,128,143,166,166,191,170,203,178,196,179,194,-999999999,194,238,204,250,212,250,221,250,223,249,206,230,205,230],
"confidence":1,
"sources":["default"],
"id":"f9fc8629d99b4e65aecacedd32ac356e"
},
{
"label":["person"],
"poly":[158,365,161,358,165,335,170,329,171,321,171,307,173,299,172,292,171,277,171,269,170,260,170,254,171,237,177,225,172,218,167,215,164,207,167,205,171,199,174,196,183,193,188,192,192,192,202,199,207,200,232,187,238,182,240,178,244,172,245,169,245,166,241,163,235,164,233,159,239,150,240,146,240,134,237,137,231,141,222,142,217,136,216,130,215,123,215,116,224,102,229,99,233,96,245,108,256,92,272,84,292,87,309,92,319,101,328,121,329,134,327,137,325,140,331,152,327,155,323,159,324,167,320,174,319,183,327,196,329,232,328,243,323,248,315,254,316,262,314,269,314,280,317,302,313,326,311,330,301,351,299,361,288,386,274,410,269,417,260,427,256,431,249,439,244,448,247,468,249,486,247,491,245,493,243,509,242,524,241,532,237,557,232,584,233,608,233,618,228,640,172,640,169,640,176,621,174,604,147,603,146,609,151,622,144,634,138,638,128,640,49,640,0,640,0,636,0,631,0,630,0,629,37,608,55,599,66,594,74,594,84,593,91,593,99,571,110,534,114,523,117,498,116,474,113,467,113,459,113,433,113,427,118,412,137,391,143,390,147,386,157,378,157,370],
"confidence":1,
"sources":["default"],
"id":"eda8c727fea24c49b6438e5e17c0a846"
}
]
sources : Array[1] [
{
"id":"default",
"uri":"https://s3.amazonaws.com/allegro-datasets/coco/train2017/000000287024.jpg",
"content_type":"image/jpeg",
"width":427,
"height":640,
"timestamp":0
}
]
dataset : Object
{
"id":"f7edb3399164460d82316fa5ab549d5b",
"version":"6ad8b10c668e419f9dd40422f667592c"
}
context_id : https://s3.amazonaws.com/allegro-datasets/coco/train2017/000000287024.jpg
saved : 1598982880693
saved_in_version : "6ad8b10c668e419f9dd40422f667592c"
num_frames : 1
</div>
</details>
<br/>
For more information about using Frames in the WebApp, see [Working with Frames](webapp/webapp_datasets_frames.md).
## Usage
### Creating a SingleFrame
To create a `SingleFrame`, instantiate a `SingleFrame` object and populate it with:
* The URI link to the source file of the data frame
* A preview URI that is accessible by browser, so you will be able to visualize the data frame in the web UI
```python
from allegroai import SingleFrame
frame = SingleFrame(
source='/home/user/woof_meow.jpg',
width=None,
height=None,
preview_uri='https://storage.googleapis.com/kaggle-competitions/kaggle/3362/media/woof_meow.jpg',
metadata=None,
annotations=None,
mask_source=None,
)
```
There are also options to populate the instance with:
* Dimensions - `width` and `height`
* General information about the frame - `metadata`
* A dictionary of annotation objects - `annotations`
* A URI link to a mask file for the frame - `mask_source`
For more information, see [SingleFrame](google.com).
### Adding SingleFrames to a Dataset Version
Use the [`DatasetVersion.add_frames`](google.com) method to add SingleFrames to a [Dataset version](dataset.md#dataset-versioning)
(see [Creating snapshots](dataset.md#creating-snapshots) or [Creating child versions](dataset.md#creating-child-versions)).
```python
from allegroai import DatasetVersion, SingleFrame
# a frames list is required for adding frames
frames = []
# create a frame
frame = SingleFrame(
source='https://allegro-datasets.s3.amazonaws.com/tutorials/000012.jpg',
width=512,
height=512,
preview_uri='https://allegro-datasets.s3.amazonaws.com/tutorials/000012.jpg',
metadata={'alive':'yes'},
)
frames.append(frame)
# add the frame to the Dataset version (myDatasetVersion is an existing DatasetVersion object)
myDatasetVersion.add_frames(frames)
```
### Accessing SingleFrames
To access a SingleFrame, use the [DatasetVersion.get_single_frame](google.com) method.
```python
from allegroai import DatasetVersion
frame = DatasetVersion.get_single_frame(frame_id='dcd81d094ab44e37875c13f2014530ae',
dataset_name='MyDataset', # OR dataset_id='80ccb3ae11a74b91b1c6f25f98539039'
version_name='SingleFrame' # OR version_id='b07b626e3b6f4be7be170a2f39e14bfb'
)
```
To access a SingleFrame, the following must be specified:
* `frame_id`, which can be found in the WebApp, in the frame's **FRAMEGROUP DETAILS**
* The frame's dataset - either with `dataset_name` or `dataset_id`
* The dataset version - either with `version_id` or `version_name`
### Updating SingleFrames
To update a SingleFrame:
* Access the SingleFrame by calling the [DatasetVersion.get_single_frame](google.com) method,
* Make changes to the frame
* Update the frame in a DatasetVersion using the [DatasetVersion.update_frames](google.com) method.
```python
frames = []
# get the SingleFrame
frame = DatasetVersion.get_single_frame(frame_id='dcd81d094ab44e37875c13f2014530ae',
dataset_name='MyDataset',
version_name='SingleFrame')
# make changes to the frame
## add a new annotation
frame.add_annotation(poly2d_xy=[154, 343, 209, 343, 209, 423, 154, 423],
labels=['tire'], metadata={'alive': 'no'},
confidence=0.5)
## add metadata
frame.meta['road_hazard'] = 'yes'
# update the SingleFrame
frames.append(frame)
myDatasetVersion.update_frames(frames)
```
### Deleting frames
To delete a SingleFrame, use the [DatasetVersion.delete_frames](google.com) method.
```python
frames = []
# get the SingleFrame
frame = DatasetVersion.get_single_frame(frame_id='f3ed0e09bf23fc947f426a0d254c652c',
dataset_name='MyDataset', version_name='FrameGroup')
# delete the SingleFrame
frames.append(frame)
myDatasetVersion.delete_frames(frames)
```

View File

@ -0,0 +1,207 @@
---
title: Sources
---
Each frame contains `sources`, a list of dictionaries containing:
* Attributes of the source data (image raw data)
* A `URI` pointing to the source data (image or video)
* Sources for [masks](masks.md) used in semantic segmentation
* Image [previews](previews.md), which are thumbnails used in the ClearML Enterprise WebApp (UI).
`sources` does not contain:
* `rois` even though ROIs are directly associated with the images and `masks` in `sources`
* ROI metadata, because ROIs can be used over multiple frames.
Instead, frames contain a top-level `rois` array, which is a list of ROI dictionaries, where each dictionary contains a
list of source IDs. Those IDs connect `sources` to ROIs.
## Examples
The examples below demonstrate the `sources` section of a Frame for different types of content.
### Example 1: Video sources
This example demonstrates `sources` for video.
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">Example 1</summary>
<div className="cml-expansion-panel-content">
```json
/* video from one of four cameras on car */
"sources": [
{
"id": "front",
"content_type": "video/mp4",
"width": 800,
"height": 600,
"uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
"timestamp": 1234567889,
"meta" :{
"angle":45,
"fov":129
},
},
{
"id": "rear",
"uri": "https://s3.amazonaws.com/my_cars/car_1/rear.mp4",
"content_type": "video/mp4",
"timestamp": 1234567889
}
```
</div>
</details>
<br/>
The `sources` example above details a video from a car that has two cameras. One camera
is the source with the ID `front` and the other is the source with the ID `rear`.
`sources` includes the following information about the Frame:
* `content_type` - The video is an `mp4` file
* `width` and `height` - Each frame in the video is `800` pixels by `600` pixels,
* `uri` - The raw data is located in `s3.amazonaws.com/my_cars/car_1/front.mp4` and `s3.amazonaws.com/my_cars/car_1/rear.mp4`
(the front and rear camera, respectively)
* `timestamp` - This indicates the absolute position of the frame in the video
* `meta` - Additional metadata is included for the angle of the camera (`angle`) and its field of vision (`fov`).
:::note
`sources` can include a variety of content types. This example shows `mp4` video.
:::
### Example 2: Images sources
This example demonstrates `sources` for images.
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">Example 2</summary>
<div className="cml-expansion-panel-content">
```json
/* camera images */
"sources": [
{
"id": "default",
"content_type": "png",
"width": 800,
"height": 600,
"uri": "https://s3.amazonaws.com/my_images/imag1000.png",
"timestamp": 0,
}
```
</div>
</details>
<br/>
The `sources` of this frame contains the following information:
* `content_type` - This frame contains an image in `png` format.
* `width` and `height` - The image is `800` pixels by `600` pixels
* `uri` - The raw data is located in `https://s3.amazonaws.com/my_images/imag1000.png`
* `timestamp` is 0 (timestamps are used for video).
### Example 3: Sources and regions of interest
This example demonstrates `sources` for video, `masks`, and `preview`.
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">Example 3</summary>
<div className="cml-expansion-panel-content">
```json
{
"timestamp": 1234567889,
"context_id": "car_1",
"meta": {
"velocity": "60"
},
"sources": [
{
"id": "front",
"content_type": "video/mp4",
"width": 800,
"height": 600,
"uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
"timestamp": 1234567889,
"meta" :{
"angle":45,
"fov":129
},
"preview": {
"content_type": "image/jpg",
"uri": "https://s3.amazonaws.com/my_previews/car_1/front_preview.jpg",
"timestamp": 0
},
"masks": [
{
"id": "seg",
"content_type": "video/mp4",
"uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
"timestamp": 1234567889
},
{
"id": "instances_seg",
"content_type": "video/mp4",
"uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
"timestamp": 1234567889
}
]
},
{
"id": "rear",
"uri": "https://s3.amazonaws.com/my_cars/car_1/rear.mp4",
"content_type": "video/mp4",
"timestamp": 1234567889
}
],
"rois": [
{
"sources":["front"],
"label": ["right_lane"],
"mask": {
"id": "seg",
"value": [-1, 1, 255]
}
},
{
"sources": ["front"],
"label": ["bike"],
"poly":[30, 50, 50,50, 100,50, 100,100],
"meta": {
"velocity": 5.4
}
},
{
"sources": ["front", "rear"],
"label": ["car"],
"poly":[30, 50, 50,50, 100,50, 100,100]
}
]
}
```
</div>
</details>
<br/>
This frame shows the `masks` section in `sources`, and the top-level `rois` array.
In `sources`, the `masks` subsection contains the sources for the two masks associated with the raw data.
The raw mask data is located in:
* `https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4`
* `https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4`
In `rois`, the `mask` section is associated with its `masks` source using the `id` key.
In this example:
* In the `rois` array, there is a region of interest that has a `mask` with the ID `seg` and an RGB
value
* The `masks` section in `sources` contains the location of each mask. The first dictionary of `masks`
details the mask with the ID `seg`. The ID connects it to the `seg` mask in `rois`
`sources` also contains the source of a preview. It is located in: `https://s3.amazonaws.com/my_previews/car_1/front_preview.jpg`.

View File

@ -0,0 +1,31 @@
---
title: Tasks
---
Hyper Datasets extend the **ClearML** [**Task**](../fundamentals/task.md) with [Dataviews](dataviews.md).
## Usage
Hyper Datasets are supported through the `allegroai` Python package.
### Connecting Dataviews to a Task
Use [`Task.connect`](../references/sdk/task.md#connect) to connect a Dataview object to a Task:
```python
from allegroai import DataView

# `task` is a ClearML Task object (e.g. created with Task.init)
dataview = DataView()
task.connect(dataview)
```
### Accessing a Task's Dataviews
Use the [Task.get_dataviews](google.com) method to access the Dataviews that are connected to a Task.
```python
task.get_dataviews()
```
This returns a dictionary of Dataview objects and their names.
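For example, the returned dictionary can be iterated to inspect each connected Dataview (a sketch, assuming a `task` object
as above and that the dictionary maps names to DataView objects):
```python
# Print every Dataview connected to the Task
for name, dataview in task.get_dataviews().items():
    print(name, dataview)
```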

View File

@ -0,0 +1,162 @@
---
title: Annotation Tasks
---
Use the Annotations page to access and manage annotation Tasks.
Use annotation tasks to efficiently organize the annotation of frames in Dataset versions and manage the work of annotators
(see [Annotating Images and Videos](#annotating-images-and-video)).
## Managing Annotation Tasks
### Creating Annotation Tasks
![image](../../img/hyperdatasets/annotation_task_01.png)
**To create an annotation task:**
1. On the Annotator page, click **+ ADD NEW ANNOTATION**.
1. Enter a name for your new annotation task.
1. Choose a Dataset version to annotate. If the selected Dataset version's status is *Published*, then creating this
annotation task also creates a child version of the selected version. The new child version's status is *Draft*, and
its name is the same as the annotation task.
1. Set the filters for the frames this annotation task presents to the annotator.
* In the **SET FILTERS** list, choose either:
* **All Frames** - Include all frames in this task.
* **Empty Frames** - Include only frames without any annotations in this task.
* **By Label** - Include only frames with specific labels, and optionally filter these frames by confidence level and
the number of instances. You can also click <img src="/static/icons/ico-code.svg" className="icon size-md space-sm" /> and then add a Lucene query for this ROI label filter.
1. Choose the iteration parameters specifying how frames in this version are presented to the annotator.
1. In **ITERATION**, in the **ORDER** list, choose either:
* **Sequential** - Frames are sorted by the frame top-level `context_id` (primary sort key) and `timestamp` (secondary sort key) metadata key values, and returned by the iterator in the sorted order.
* **Random** - Frames are randomly returned using the value of the `random_seed` argument. The random seed is maintained with the experiments. Therefore, the random order is reproducible if the experiment is rerun.
1. In **REPETITION**, choose either **Use Each Frame Once** or **Limit Frames**. If you select **Limit Frames**, then in **Use Max. Frames**, type the number of frames to annotate.
1. If iterating randomly, in **RANDOM SEED** type your seed or leave blank, and the ClearML Enterprise platform generates a seed for you.
1. If annotating video, then in **CLIP LENGTH (FOR VIDEO)**, type the number of sequential frames per iteration to annotate.
1. Click **Create**.
### Completing annotation tasks
To mark an annotation task as **Completed**:
* In the annotation task card, click <img src="/static/icons/ico-bars-menu.svg" className="icon size-md space-sm" /> (menu) **>** **Complete** **>** **CONFIRM**.
### Deleting annotation tasks
To delete an annotation task:
* In the annotation task card, click <img src="/static/icons/ico-bars-menu.svg" className="icon size-md space-sm" /> (menu) **>** **Delete** **>** **CONFIRM**.
### Filtering annotation tasks
There are two options for filtering annotation tasks:
* Active / Completed Filter - Toggle to show annotation tasks that are either **Active** or **Completed**
* Dataset Filter - Use to view only the annotation tasks for a specific Dataset.
### Sorting annotation tasks
Sort the annotation tasks using either **RECENT** or **NAME** from the drop-down menu on the top left of the page.
### Viewing annotation task information
To view the Dataset version, filters, and iteration information:
* In the annotation task card, click <img src="/static/icons/ico-bars-menu.svg" className="icon size-md space-sm" /> (menu) **>** **Info**
## Annotating Images and Video
Annotate images and video by labeling regions of interest in Dataset version frames. The frames presented for annotation
depend upon the settings in the annotation task (see [Creating Annotation Tasks](#creating-annotation-tasks)).
### Annotating Frames
**To annotate frames:**
1. On the Annotator page, click the annotation task card, or click <img src="/static/icons/ico-bars-menu.svg" className="icon size-md space-sm" /> (menu)
and then click **Annotate**.
1. See instructions below about annotating frames.
#### Add FrameGroup objects
1. Select an annotation mode and add the bounded area to the frame image.
* Rectangle mode - Click <img src="/static/icons/ico-rectangle-icon-purple.svg" className="icon size-md space-sm" /> and then click the image, drag and release.
* Polygon mode - Click <img src="/static/icons/ico-polygon-icon-purple.svg" className="icon size-md space-sm" /> and then click the image for the first vertex,
move to another vertex and click, continue until closing the last vertex.
* Key points mode - Click <img src="/static/icons/ico-keypoint-icon-purple.svg" className="icon size-md space-sm" /> and then click each key point.
1. In the new label area, choose or enter a label.
1. Optionally, add metadata.
1. Optionally, lock the annotation.
#### Add frame labels
1. In **FRAME LABEL**, click **+ Add new**.
1. In the new label area, choose or enter a label.
1. Optionally, add metadata.
1. Optionally, lock the annotation.
#### Copy / paste an annotation
1. Click the annotation or bounded area in the image or video clip.
1. Optionally, navigate to a different frame.
1. Click **PASTE**. The new annotation appears in the same location as the one you copied.
1. Optionally, to paste the same annotation again, click **PASTE**.
#### Copy / paste all annotations
1. Click **COPY ALL**.
1. Optionally, navigate to a different frame.
1. Click **PASTE**.
#### Move annotations
* Move a bounded area by clicking on it and dragging.
#### Resize annotations
* Resize a bounded area by clicking on a vertex and dragging.
#### Delete annotations
1. Click the annotation or bounded area in the image or video clip.
1. Press **DELETE** or, in the annotation, click **X**.
#### Add labels
* Click in the annotation and choose a label from the label list, or type a new label.
#### Modify labels
* In the annotation label textbox, choose a label from the list or type a new label.
#### Delete labels
* In the annotation, in the label area, click the label's **X**.
#### Modify annotation metadata
* In the label, click edit and then in the popup modify the metadata dictionary (in JSON format).
#### Modify annotation color
* Modify the color of an area by clicking the circle in the label name and select a new color.
#### Lock / unlock annotations
* Click the lock.
#### Modify frame metadata
* Expand the **FRAME METADATA** area, click edit, and then in the popup modify the metadata dictionary (in JSON format).

View File

@ -0,0 +1,44 @@
---
title: Datasets Page
---
The Datasets page offers the following functionalities:
* Managing the ClearML Enterprise **Datasets** and **versions**, which connect raw data to the ClearML Enterprise platform
* Using ClearML Enterprise's Git-like Dataset versioning features
* Managing SingleFrames and FrameGroups.
![image](../../img/hyperdatasets/datasets_01.png)
## Dataset cards
Dataset cards show summary information about versions, frames, and labels in a Dataset, the elapsed time since the Dataset was last updated, and the user who performed the update. Dataset cards allow you to open a specific Dataset to perform Dataset versioning and frames management.
* Dataset name
* Elapsed time since the last update. Hover over elapsed time and view date of last update.
* User updating the Dataset
* The number of versions in the Dataset
* The total number of frames in all versions of the Dataset. If an asterisk (\*) appears next to **FRAMES**, hover over it to see the name of the version whose frames were last updated.
* The percentage of frames annotated in all versions of the Dataset. If an asterisk (\*) appears next to **ANNOTATED**, hover over it to see the name of the version whose frames were last annotated.
* If the Dataset version's status is *Published*, then the top labels in the Dataset appear, color coded (colors are editable). If the Dataset version is *Draft*, then no labels appear.
:::note
To change the label color coding, hover over a label color, click the hand pointer, and then select a new color.
:::
## Creating new Datasets
Create a new Dataset which will contain one version named `Current`. The new version will not contain any frames.
* Click **+ NEW DATASET** **>** Enter a name and optionally a description **>** **CREATE DATASET**.
## Sort Datasets
* In **RECENT**, choose either:
* **RECENT** - Sort by the most recently updated Datasets.
* **NAME** - Sort alphabetically by Dataset name.

View File

@ -0,0 +1,310 @@
---
title: Working with Frames
---
View and edit SingleFrames in the Dataset page. After selecting a Dataset version, the **Version Browser** shows a sample
of frames and enables viewing SingleFrames and FrameGroups, and editing SingleFrames, in the [frame viewer](#frame-viewer).
Before opening the frame viewer, you can filter the frames by applying [simple](#simple-frame-filtering) or [advanced](#advanced-frame-filtering)
filtering logic.
![image](../../img/hyperdatasets/frames_01.png)
## Frame viewer
The frame viewer allows you to view and edit annotations, which can be FrameGroup objects (Regions of Interest) or FrameGroup
labels applied to the entire frame (not a region of the frame). It also shows the frame details (see [frames](../frames.md)),
frame metadata, and the raw data source URI, and provides navigation and viewing tools.
![image](../../img/hyperdatasets/web-app/dataset_example_frame_editor.png)
### Frame viewer controls
Use frame viewer controls to navigate between frames in a Dataset Version, and control frame changes and viewing.
|Control Icon|Actions|
|-----|------|
|<img src="/static/icons/ico-skip-backward.svg" className="icon size-md space-sm" />|Jump backwards (CTRL + Left). Jumps backwards by five frames.|
|<img src="/static/icons/ico-skip-previous.svg" className="icon size-md space-sm" />|Go to the previous frame containing a non-filtered annotation. The filter is the minimum confidence level setting. If the confidence level filter is set to zero, any frame containing annotations matches the filter.|
|<img src="/static/icons/ico-arrow-left.svg" className="icon size-md space-sm" />|Go to the previous frame (Left Arrow).|
|<img src="/static/icons/ico-arrow-right.svg" className="icon size-md space-sm" />|Go to the next frame (Right Arrow).|
|<img src="/static/icons/ico-skip-next.svg" className="icon size-md space-sm" />|Go to the next frame containing a non-filtered annotation (same filter as <img src="/static/icons/ico-skip-previous.svg" className="icon size-md space-sm" />).|
|<img src="/static/icons/ico-skip-forward.svg" className="icon size-md space-sm" />|Jump forwards (CTRL + Right). Jumps 5 frames forwards.|
|<img src="/static/icons/ico-revert.svg" className="icon size-md space-sm" />|Reload the frame.|
|<img src="/static/icons/ico-undo.svg" className="icon size-md space-sm" />|Undo changes.|
|<img src="/static/icons/ico-redo.svg" className="icon size-md space-sm" />|Redo changes.|
|<img src="/static/icons/ico-reset_1.svg" className="icon size-md space-sm" />|Autofit|
|<img src="/static/icons/ico-zoom-in.svg" className="icon size-md space-sm" />|Zoom in|
|<img src="/static/icons/ico-zoom-out.svg" className="icon size-md space-sm" />|Zoom out|
|Percentage textbox|Zoom percentage|
### Viewing and editing frames
**To view / edit a frame in the frame editor**
1. Locate your frame by applying a [simple frame filter](#simple-frame-filtering) or [advanced frame filter](#advanced-frame-filtering), and clicking <span class="tr_gui">LOAD MORE</span>, if required.
1. Click the frame thumbnail. The frame editor appears.
1. Do any of the following:
* View frame details, including:
* Frame file path
* Dimensions of the image or video
* Frame details
* Frame metadata
* Annotations
* Frame objects - Labeled Regions of Interest, with confidence levels and custom metadata per frame object.
* Frame labels - Labels applied to the entire frame, not a region in the frame.
* Optionally, filter annotations by confidence level using the <span class="tr_gui">Minimum confidence</span> slider.
* Add, change, and delete [annotations](#annotations) and [frame metadata](#frame-metadata).
:::important
To save frames changes at any time, click **SAVE** (below the annotation list area).
:::
### Viewing FrameGroups
Viewing and editing frames in a FrameGroup is similar to viewing and editing SingleFrames.
Click the FrameGroup in the Dataset. In the frame viewer, select a SingleFrame to view / modify from
a dropdown list in the **Current Source** section.
![image](../../img/hyperdatasets/framegroup_01.png)
## Filtering frames
### Simple frame filtering
Simple frame filtering applies one annotation object (ROI) label and returns frames containing at least one annotation
with that label.
**To apply a simple frame filter:**
* In the **Version Browser**, choose a label on the label list.
For example:
* Before filtering, the **Version Browser** in the image below contains seven frames.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">View a screenshot</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/frame_filtering_01.png)
</div>
</details>
<br/>
* A simple label filter for `person` shows three frames with each containing at least one ROI labeled `person`.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">View a screenshot</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/frame_filtering_02.png)
</div>
</details>
### Advanced frame filtering
Advanced frame filtering applies sophisticated filtering logic, which is composed of as many frame filters as needed,
where each frame filter can be a combination of ROI, frame, and source rules.
* ROI rules use include and exclude logic to match frames by ROI label; an ROI rule matches frames containing at least
one annotation object (ROI) with all the labels in the rule.
* Frame rules and source rules use Lucene queries with AND, OR, and NOT logic. Frame rules apply to frame metadata.
* Source rules apply to frame source information.
**To apply advanced filters:**
1. In the **Version Browser**, click **Switch to advanced filters**.
1. In a **FRAME FILTER**, create one of the following rules:
* ROI rule
* Choose **Include** or **Exclude**, select ROI labels, and optionally set the confidence level range.
* To switch from the ROI dropdown list to a Lucene query mode, click <img src="/static/icons/ico-edit.svg" className="icon size-md space-sm" />.
* Frame rule - Enter a Lucene query using frame metadata fields in the format `meta.<key>:<value>`.
* Source rule - Enter a Lucene query using frame source fields in the format `sources.<key>:<value>` (see the sample queries below).
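For instance, a frame rule and a source rule matching the examples below might look like the following (the `dangerous`
metadata key and the wildcarded URI are illustrative):
```
meta.dangerous:"no"
sources.uri:*front*
```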
### Examples
#### ROI rules
* Creating one ROI rule for <code>person</code> shows the same three frames as the simple frame filter (above).
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">View a screenshot</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/frame_filtering_03.png)
</div>
</details>
<br/>
* In the ROI rule, add a second label. Add `partially_occluded`. Only frames containing at least one ROI labeled as both <code>person</code> and <code>partially_occluded</code> match the filter.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">View a screenshot</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/frame_filtering_04.png)
</div>
</details>
<br/>
By opening a frame in the frame viewer, you can see an ROI labeled with both.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">View a screenshot</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/frame_filtering_05.png)
</div>
</details>
<br/>
#### Frame rules
Filter by metadata using Lucene queries.
* Add a frame rule to filter by the metadata key <code>dangerous</code> for the value of <code>no</code>.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">View a screenshot</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/frame_filtering_08.png)
</div>
</details>
<br/>
By opening a frame in the frame viewer, you can see the metadata.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">View a screenshot</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/frame_filtering_09.png)
</div>
</details>
<br/>
#### Source rules
Filter by sources using Lucene queries.
* Add a source rule to filter for source URIs with wildcards.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">View a screenshot</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/frame_filtering_10.png)
</div>
</details>
<br/>
Use Lucene queries in ROI label filters and frame rules.
## Annotations
### Frame objects (Regions of Interest)
You can add annotations by drawing new bounding areas, and copying existing annotations in the same or other frames.
**To draw a bounding area for a new annotation:**
1. Optionally, select a default label in the Default ROI Label(s) list. New annotations are automatically assigned this label.
1. Click one of the following modes and create a bounding area in the frame:
* <img src="/static/icons/ico-rectangle-icon-purple.svg" className="icon size-md space-sm" /> - Rectangle mode: Drag a
rectangle onto the frame.
* <img src="/static/icons/ico-ellipse-icon-purple.svg" className="icon size-md space-sm" /> - Ellipse mode: Drag an ellipse
onto the frame.
* <img src="/static/icons/ico-polygon-icon-purple.svg" className="icon size-md space-sm" /> - Polygon mode: Click the polygon
vertices onto the frame.
* <img src="/static/icons/ico-keypoint-icon-purple.svg" className="icon size-md space-sm" /> - Key points mode: Click each
keypoint onto the frame. After clicking the last keypoint, click the first again to close the bounding area.
A new annotation is created.
1. In the newly created annotation, select or type a label.
1. Optionally, add metadata. This is metadata for the annotation, not the entire frame.
1. Optionally, lock the annotation.
1. If you move to another frame, the frame editor automatically saves changes. Otherwise, if you exit the frame editor,
you are prompted to save.
**To copy an annotation:**
1. Click the annotation or bounded area in the image or video clip.
1. Optionally, navigate to a different frame.
1. Click **PASTE**. The new annotation appears in the same location as the one you copied.
1. Optionally, to paste the same annotation, click **PASTE** again in the desired frame.
**To copy all annotations:**
1. Click **COPY ALL**.
1. Optionally, navigate to a different frame.
1. Click **PASTE**.
### Frame labels
**To add frame labels:**
1. Expand the **FRAME LABELS** area.
1. Click **+ Add new**.
1. Enter a label.
1. Optionally, click <img src="/static/icons/ico-edit.svg" className="icon size-md space-sm" />.
### Annotation management
**To move annotations:**
* Move a bounded area by clicking on it and dragging.
**To resize annotations:**
* Resize a bounded area by clicking on a vertex and dragging.
**To modify annotation metadata:**
* In the label, click edit and then in the popup modify the metadata dictionary (in JSON format).
**To modify annotation colors:**
* Modify the color of an area by clicking the colored circle next to the label name and selecting a new color.
**To lock annotations:**
* All annotations - Above the annotations, click the lock / unlock button.
* A specific annotation - In the annotation, click its lock / unlock button.
**To delete annotations:**
1. Click the annotation or bounded area in the image or video clip.
1. Press the **DELETE** key, or in the annotation, click **X**.
**To add, change, or delete annotation labels:**
* Add - Click in the annotation and choose a label from the label list, or type a new label.
* Change - In the annotation label textbox, choose a label from the list or type a new label.
* Delete - In the annotation, in the label area, click the label's **X**.
## Frame metadata
**To edit frame metadata:**
* Expand the **FRAME METADATA** area, click edit, and then in the popup modify the metadata dictionary (in JSON format).
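For reference, the metadata dictionary edited in this popup is a flat JSON object. A minimal sketch follows; the <code>dangerous</code> key comes from the filtering walkthrough above, and the second key is hypothetical:

```python
# Frame metadata as it might appear in the JSON editor popup.
frame_metadata = {
    "dangerous": "no",    # key used in the frame-rule filtering example above
    "weather": "cloudy",  # hypothetical additional key
}
```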

View File

@ -0,0 +1,113 @@
---
title: Dataset Versioning
---
Use the Dataset versioning WebApp (UI) features for viewing, creating, modifying, and
deleting Dataset versions.
From the Datasets page, click a Dataset to view and work with its versions.
### Viewing snapshots
View snapshots in the simple version structure using either:
* The simple view, a table of snapshots.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">Simple view (snapshot table)</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/web-app/dataset_simple_adv_01.png)
</div>
</details>
<br/>
* The advanced view, a tree of versions. The tree contains one version whose status is <i>Draft</i>, and snapshots appear in
chronological order, with oldest at the top, and the most recent at the bottom.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">Advanced view (version tree)</summary>
<div className="cml-expansion-panel-content">
![image](../../img/hyperdatasets/web-app/dataset_simple_adv_02.png)
</div>
</details>
### Creating snapshots
To create a snapshot, you must be in the simple (version table) view.
**To create a snapshot, do the following:**
1. If you are in the advanced view, click **Switch to Simple View** (In certain situations, this may not be possible,
see [Dataset Versioning](../dataset.md#dataset-versioning))
1. If the **DATASET HISTORY** section is not open, click it to expand it.
1. If a snapshot is currently selected, click **RETURN TO CURRENT VERSION**.
1. Click **+ CREATE SNAPSHOT**.
1. Enter a version name, and optionally a description.
1. Click **CREATE**.
:::note
The WebApp (UI) does not currently support the automatic naming of snapshots with timestamps appended. You must provide a snapshot name.
:::
### Creating versions
To create a version, you must be in the advanced (version tree) view.
**To create a child version, do the following:**
1. If you are in the simple view, click **Switch to Advanced View**.
1. Click the (parent) version from which to create a child (inherit all frames).
1. Click **+ CREATE NEW VERSION**.
1. Enter a version name, and optionally a description.
1. Click **CREATE**.
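The steps above use the WebApp (UI). For scripted workflows, the ClearML Enterprise `allegroai` SDK offers equivalent calls; the sketch below assumes `DatasetVersion.create_version()` and `DatasetVersion.create_snapshot()` with the parameter names shown, so verify them against your SDK version:

```python
from allegroai import DatasetVersion  # ClearML Enterprise SDK (assumed import path)

# Create a child version under an existing parent version (all names are placeholders).
child_version = DatasetVersion.create_version(
    dataset_name="MyDataset",
    version_name="Experiments v2",
    parent_version_names=["Experiments v1"],  # assumed parameter name
    description="Child of Experiments v1",
)

# Create a named snapshot of the current version of a simple-structure Dataset.
snapshot = DatasetVersion.create_snapshot(
    dataset_name="MyDataset",
    version_name="snapshot 2021-06-21",  # an explicit name is required, as in the UI
)
```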
### Publishing versions
Publish (make read-only) any Dataset version whose status is *Draft*. If the Dataset is in the simple structure,
and you publish the current version, then only the advanced view is available,
and you cannot create snapshots.
**To publish a version, do the following:**
1. If you are in the simple view, click **Switch to Advanced View**.
1. Click the version to publish.
1. Click **PUBLISH**.
1. Click **PUBLISH** again to confirm.
### Exporting frames
Frame export downloads the filtered frames as a JSON file.
**To export frames, do the following:**
* In the Thumbnails area, click **EXPORT FRAMES**. The frames JSON file downloads.
### Modifying version names
**To modify a Dataset version name, do the following:**
* At the top right of the Dataset page, hover over the Dataset version name, click <img src="/static/icons/ico-edit.svg" className="icon size-md space-sm" /> , edit the name, and then click <img src="/static/icons/ico-save.svg" className="icon size-md space-sm" /> (check).
### Modifying version descriptions
**To modify a version description, do the following:**
* Expand the **INFO** area, hover over the **Description**, click <img src="/static/icons/ico-edit.svg" className="icon size-md space-sm" />,
edit the description, and then click <img src="/static/icons/ico-save.svg" className="icon size-md space-sm" /> (check).
### Deleting versions
You can delete versions whose status is *Draft*.
**To delete a version, do the following:**
1. If you are in the simple view, click **Switch to Advanced View**.
1. Click the version to delete.
1. Click **DELETE**.
1. Click **DELETE** again to confirm.

View File

@ -0,0 +1,95 @@
---
title: Dataviews Table
---
[Dataviews](../dataviews.md) appear in the same Project as the experiment that stored the Dataview in the **ClearML Enterprise** platform,
as well as in the **DATAVIEWS** tab of the **All Projects** page.
The **Dataviews table** is a [customizable](#customizing-the-dataviews-table) list of Dataviews associated with a project.
Use it to [view, create, and edit Dataviews](#viewing-adding-and-editing-dataviews) in the info panel. The Dataviews table
can be filtered by full or partial name and / or ID using the search bar.
![image](../../img/hyperdatasets/dataviews_table_01.png)
The Dataviews table columns in their default order are below. Dynamically order the columns by dragging a column heading
to a new position.
* **DATAVIEW** - Dataview name.
* **USER** - User who created the Dataview.
* **CREATED** - Elapsed time since the Dataview was created.
* **DESCRIPTION**
## Customizing the Dataviews table
The Dataviews table can be customized. Changes are persistent (cached in the browser), and represented in the URL.
Save customized settings in a browser bookmark, and share the URL with teammates.
Customize any combination of the following:
* Dynamic column ordering - Drag a column title to a different position.
* Filter by user
* Sort columns - By Dataview name and / or elapsed time since creation.
* Column autofit - In the column heading, double click a resizer (column separator).
## Viewing, adding, and editing Dataviews
**To view, add, or edit a Dataview:**
1. Do one of the following:
* Create a new Dataview - Click **+ NEW DATAVIEW**.
* View or edit a Dataview - In the Dataview table, click the Dataview.
1. To edit sections of the Dataview, follow the steps on the "Modifying Dataviews" page for the following:
1. [Selecting Dataset versions](webapp_exp_modifying.md#selecting-dataset-versions)
1. [Filtering frames](webapp_exp_modifying.md#filtering-frames)
1. [Mapping labels (label translation)](webapp_exp_modifying.md#mapping-labels-label-translation) (if appropriate for
the data and experiment)
1. [Label enumeration](webapp_exp_modifying.md#label-enumeration)
1. [Data augmentation](webapp_exp_modifying.md#data-augmentation) (if appropriate for the data
and experiment)
1. [Iteration controls](webapp_exp_modifying.md#iteration-controls)
## Cloning Dataviews
Create an exact editable copy of a Dataview. For example, when tuning an experiment, clone a Dataview to apply the same
frame filters to different Dataset versions.
**To clone a Dataview:**
1. Do one of the following:
* In the Dataview table, right click a Dataview and then click **Clone**.
* If the info panel is opened, click <img src="/docs/img/svg/bars-menu.svg" alt="Menu" className="icon size-lg space-sm" />
(menu) and then click **Clone**.
1. Select a project or accept the current project, enter a name, and optionally enter a description.
1. Click **CLONE**.
## Archiving Dataviews
Archive Dataviews to keep the active Dataviews table focused on current work. Archived Dataviews do not appear in the active
Dataviews table; they appear only in the archive, and can be restored from the archive later.
**To archive a Dataview:**
* In the Dataview table:
* Archive one Dataview - Right click the Dataview **>** **Archive**.
* Archive multiple Dataviews - Select the Dataview checkboxes **>** In the footer menu that appears at the bottom of
the page, click **ARCHIVE**.
* In the Dataview info panel - Click <img src="/docs/img/svg/bars-menu.svg" alt="Menu" className="icon size-lg space-sm" />
(menu) **>** **ARCHIVE**.
**To restore a Dataview:**
1. Go to the Dataview table of the archived Dataview's project, or of the **All Projects** page.
1. Click **OPEN ARCHIVE**.
1. Do any of the following:
* In the Dataview table:
* Restore one Dataview - Right click the Dataview **>** **Restore**.
* Restore multiple Dataviews - Select the Dataview checkboxes **>** **Restore**.
* In the info panel, restore one Dataview - Click <img src="/docs/img/svg/bars-menu.svg" alt="Menu" className="icon size-lg space-sm" />
(menu) **>** **Restore**.

View File

@ -0,0 +1,37 @@
---
title: Comparing Dataviews
---
In addition to [**ClearML**'s comparison features](../../webapp/webapp_exp_comparing.md), the **ClearML Enterprise** WebApp
provides a deep comparison of the input data selection criteria of experiment Dataviews, making it easy to locate, visualize, and analyze differences.
## Selecting experiments
**To select experiments to compare:**
1. In the experiment's table, select the checkbox of each experiment to compare, or select the top checkbox for all experiments.
After selecting the second checkbox, the bottom bar appears.
1. In the bottom bar, click **COMPARE**. The comparison page appears, showing a column for each experiment, with differences highlighted
by background color. The experiment on the left is the base experiment; the other experiments are compared to it.
## Dataviews (input data)
**To locate the input data differences:**
1. Click the **DETAILS** tab **>** Expand the **DATAVIEWS** section, or, in the header, click <img src="/static/icons/ico-previous-diff.svg" alt="Previous diff" className="icon size-md" />
(Previous diff) or <img src="/static/icons/ico-next-diff.svg" className="icon size-md space-sm" /> (Next diff).
1. Expand any of the following sections:
* **Augmentation** - On-the-fly data augmentation.
* **Filtering**
* Frame inclusion and exclusion rules based on ROI labels
* Frame metadata
* Frame sources
* Number of instances of a rule matching ROIs in each frame
* Confidence levels.
* **Iteration** - Iteration controls.
* **Labels Enumeration** - Class label enumeration.
* **Mapping** - ROI label translation.
* **View**
![image](../../img/hyperdatasets/web-app/compare_dataviews.png)

View File

@ -0,0 +1,168 @@
---
title: Modifying Dataviews
---
An experiment that has been executed can be [cloned](../../webapp/webapp_exp_reproducing.md). The cloned experiment's
execution details can then be modified, and the modified experiment can be executed.
In addition to all the [**ClearML** tuning capabilities](../../webapp/webapp_exp_tuning.md), the **ClearML Enterprise WebApp** (UI)
enables modifying Dataviews, including:
* [Selected Dataview](#selected-dataview)
* [Dataset versions](#selecting-dataset-versions)
* [Frame filtering](#filtering-frames)
* [Label mapping](#mapping-labels-label-translation)
* [Class label enumeration](#label-enumeration)
* [Data augmentation](#data-augmentation)
* [Input frame iteration controls](#iteration-controls)
The selection and control of input data can be modified in *Draft* experiments that are not [development experiments](../task.md#development-experiments).
Do this by modifying the Dataview used by the experiment. The Dataview specifies the Dataset versions from which frames
are iterated, and the frame filters (see [Dataviews](webapp_dataviews.md)).
**To choose a Dataview**, do any of the following:
* Create a new Dataview
* Click **+** and then follow the instructions below to select Dataset versions, filter frames, map labels (label translation),
and set label enumeration, data augmentation, and iteration controls.
* Select a different Dataview already associated with the experiment.
* In the **SELECTED DATAVIEW** list, choose a Dataview.
* Import a different Dataview associated with the same or another project.
* Click <img src="/static/icons/ico-import.svg" className="icon size-md space-sm" /> (**Import dataview**) and then
select **Import to current dataview** or **Import to aux dataview**.
:::note
After importing a Dataview, it can be renamed and / or removed.
:::
### Selecting Dataset versions
To input data from a different data source or different version of a data source, select a different Dataset version used
by the Dataview.
**To select Dataset versions for input data:**
1. In the **INPUT** area, click **EDIT**.
1. Do any of the following:
* Add a Dataset version - Input frames from another Dataset version.
* Click **+**
* Select a Dataset and a Dataset version
* Remove a Dataset version - Do not input frames from a Dataset version.
Select frames from as many Dataset versions as are needed.
1. Click **SAVE**.
## Filtering frames
Frame filters determine which SingleFrames the Dataview iterates as input to the experiment.
For more detailed information, see [Filtering](../dataviews.md#filtering).
**To modify frame filtering:**
1. In the **FILTERING** area, click **EDIT**.
1. For each frame filter:
1. Select the Dataset version to which the frame filter applies.
1. Add, change, or remove any combination of the following rules:
* ROI rule - Include or exclude frames containing at least one ROI with any combination of labels available in the Dataset
version. Specify a range for the number of matching ROIs (instances) per frame, and a range of confidence levels.
* Frame rule - Filter by frame metadata key-value pairs, or ROI labels.
* Source rule - Filter by frame `source` dictionary key-value pairs.
1. Optionally, debias input data by setting ratios (weights) for the frames returned by the Dataview for each frame filter. These
ratios help to adjust an imbalance in the input data.
1. Click **SAVE**.
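The same filters can be defined programmatically. A minimal sketch follows, assuming the Enterprise `allegroai` SDK's `DataView.add_query()` accepts Lucene-style `roi_query` / `frame_query` strings and a `weight` used for debiasing; verify the exact signature for your SDK version:

```python
from allegroai import DataView  # ClearML Enterprise SDK (assumed import path)

dataview = DataView()

# ROI rule: frames with at least one ROI labeled both "person" and "partially_occluded",
# sampled with a higher weight to debias the input data.
dataview.add_query(
    dataset_name="MyDataset",
    version_name="Experiments v1",
    roi_query='label.keyword:"person" AND label.keyword:"partially_occluded"',
    weight=5,  # assumed debiasing parameter
)

# Frame rule: frames whose metadata contains dangerous="no".
dataview.add_query(
    dataset_name="MyDataset",
    version_name="Experiments v1",
    frame_query='meta.dangerous:"no"',
)
```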
## Mapping labels (label translation)
Modify the ROI label mapping rules, which translate one or more input labels to another label for the output model. Labels
that are not mapped are ignored.
**To modify label mapping:**
1. In the **MAPPING** section, click **EDIT**.
* Add (**+**) or edit a mapping:
1. Select the Dataset and version whose labels will be mapped.
1. Select one or more labels to map.
1. Select or enter the label to map to in the output model.
* Remove (<img src="/static/icons/ico-trash.svg" className="icon size-md space-sm" />) a mapping.
1. Click **SAVE**.
## Label enumeration
Modify the label enumeration assigned to output models.
**To modify label enumeration:**
1. In the **LABELS ENUMERATION** section, click **EDIT**.
* Add (**+**) or edit an enumeration:
* Select a label and then enter an integer for it.
* Remove (<img src="/static/icons/ico-trash.svg" className="icon size-md space-sm" />) an enumeration.
1. Click **SAVE**.
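Programmatically, the enumeration is a mapping from label string to integer. A sketch follows, assuming the SDK's `DataView.set_labels()` takes such a mapping (signature to verify):

```python
from allegroai import DataView  # ClearML Enterprise SDK (assumed import path)

dataview = DataView()
# Map each (possibly translated) label to the integer the output model will use.
dataview.set_labels({"person": 0, "partially_occluded": 1})  # assumed signature: dict of label -> int
```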
## Data augmentation
Modify the on-the-fly data augmentation applied to frames input from the selected Dataset versions and filtered by the frame filters. Data augmentation is applied in steps, where each step applies a method, an operation, and a strength.
For more detailed information, see [Data Augmentation](../dataviews.md#data-augmentation).
**To modify data augmentation:**
1. In the **AUGMENTATION** section, click **EDIT**.
* Add (**+**) or edit an augmentation step - Select a **METHOD**, **OPERATION**, and **STRENGTH**.
* Remove (<img src="/static/icons/ico-trash.svg" className="icon size-md space-sm" />) an augmentation step.
1. Click **SAVE**.
## Iteration controls
Modify the frame iteration performed by the Dataview to control the order, number, timing, and reproducibility of frames
for training.
For more detailed information, see [Iteration Control](../dataviews.md#iteration-control).
**To modify iteration controls:**
1. In the **ITERATION** section, click **EDIT**.
1. Select the **ORDER** of the SingleFrames returned by the iteration, either:
* **Sequential** - Iterate SingleFrames in sorted order by context ID and timestamp.
* **Random** - Iterate SingleFrames randomly using the random seed you can set (see Random Seed below).
1. Select the frame **REPETITION** option, either:
* **Use Each Frame Once**
* **Limit Frames**
* **Infinite Iterations**
1. Select the **RANDOM SEED** - If the experiment is rerun and the seed remains unchanged, the frame iteration order is the same.
1. For video data sources, enter a **CLIP LENGTH** - The number of sequential frames from a clip to iterate.
1. Click **SAVE**.
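A programmatic sketch of the same iteration controls, assuming the SDK exposes an `IterationOrder` enum and `DataView.set_iteration_parameters()`; treat the parameter names as assumptions to verify:

```python
from allegroai import DataView, IterationOrder  # ClearML Enterprise SDK (assumed import paths)

dataview = DataView()
# Reproducible random iteration: rerunning with the same seed yields the same frame order.
dataview.set_iteration_parameters(
    order=IterationOrder.random,  # or IterationOrder.sequential
    infinite=False,               # corresponds to "Use Each Frame Once"
    random_seed=1980,             # assumed parameter name
)
```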

View File

@ -0,0 +1,69 @@
---
title: Viewing Experiments
---
While an experiment is running, and any time after it finishes, results are tracked and can be visualized in the ClearML
Enterprise WebApp (UI).
In addition to all of **ClearML**'s offerings, ClearML Enterprise keeps track of the Dataviews associated with an
experiment, which can be viewed and [modified](webapp_exp_modifying.md) in the WebApp.
## Viewing an experiment's Dataviews
In an experiment's page, go to the **DATAVIEWS** tab to view all the experiment's Dataview details, including:
* Input data [selection](#dataset-versions) and [filtering](#filtering)
* ROI [mapping](#mapping) (label translation)
* [Label enumeration](#label-enumeration)
* On-the-fly [data augmentation](#augmentation)
* [Iteration controls](#iteration-control)
![image](../../img/hyperdatasets/web-app/dataview_tab.png)
### Input
SingleFrames are iterated from the Dataset versions specified in the **INPUT** area, in the **SELECTED DATAVIEW** drop-down
menu.
### Filtering
The **FILTERING** section lists the frame filters that the Dataview applies to the SingleFrames it iterates for the experiment data.
Each frame filter is composed of:
* A Dataset version to input from
* ROI rules for including and / or excluding SingleFrames that match certain criteria.
* Weights for debiasing input data.
Combinations of frame filters can implement complex querying.
For more detailed information, see [Filtering](../dataviews.md#filtering).
### Mapping
ROI label mapping (label translation) applies to the new model. For example, use ROI label mapping to accomplish the following:
* Combine several labels under another more generic label.
* Consolidate disparate datasets containing different names for the ROI.
* Hide labeled objects from the training process.
For detailed information, see [Mapping ROI labels](../dataviews.md#mapping-roi-labels).
### Label enumeration
Assign label enumeration in the **LABELS ENUMERATION** area.
### Augmentation
On-the-fly data augmentation is applied to SingleFrames and does not create new data. Data augmentation is applied in steps,
where each step is composed of a method, an operation, and a strength.
For detailed information, see [Data augmentation](../dataviews.md#data-augmentation).
### Iteration control
The input data iteration control settings determine the order, number, timing, and reproducibility of the Dataview iterating
SingleFrames. Depending upon the combination of iteration control settings, some SingleFrames may not be iterated at all, and others may repeat.
For detailed information, see [Iteration control](../dataviews.md#iteration-control).

Binary image files (screenshots and icons) added; contents not shown.

View File

@ -2,12 +2,11 @@
title: Libraries
---
ClearML integrates with many frameworks and tools out of the box! <br/>
Just follow the [getting started](/getting_started/ds/ds_first_steps.md) to automatically capture metrics, models and artifacts or check out examples for each library.
![integrations](../img/integrations.png)
![image](../img/integration_tools.png)
**Frameworks**
- [Pytorch](https://github.com/allegroai/clearml/tree/master/examples/frameworks/pytorch)

View File

@ -8,10 +8,7 @@ and charts.
Supported storage mediums include:
<ImageSwitcher alt="ClearML Supported Storage"
lightImageSrc="/docs/latest/icons/ClearML_Supported_Storage--on-light.png"
darkImageSrc="/docs/latest/icons/ClearML_Supported_Storage--on-dark.png"
/>
![image](../../static/icons/ClearML_Supported_Storage--on-light.png)
:::note
Once uploading an object to a storage medium, each machine that uses the object must have access to it.

View File

@ -2,74 +2,6 @@
title: Version 1.0
---
### ClearML 1.0.3
**Features**
- Use default `boto` credential chain if no keys are provided in the configuration file or environment variables [ClearML GitHub PR 342](https://github.com/allegroai/clearml/issues/342)
- Support `DummyModel` configuration [Slack Channel](https://clearml.slack.com/archives/CTK20V944/p1621469235085400)
- Add `report_matplotlib_figure(..., report_interactive=False)` allowing to upload a matplotlib as a non-interactive (high quality png) plot
- Add `Logger.matplotlib_force_report_non_interactive()`
- Remove matplotlib axis range (`plotly.js` auto-range can adjust it in real-time)
- Add object-storage support in cleanup-service
- Add `dataset_tags` argument to `Dataset.create()`
- Expose `docker_args` and `docker_bash_setup_script` in `clearml-task` CLI
- Add logging for Nvidia driver and Cuda version
- Add optional ignored packages in script requirements (currently used for `pywin32`)
- Update examples
* Increase channel result to support max of 1K channels for finding slack channel and use cursor in Slack Alerts monitoring service
* Add `csv` data sample to `data_samples`
* Remove deprecated examples
**Bug Fixes**
- Fix Hydra should not store the full resolved OmegaConf [ClearML GitHub issue 327](https://github.com/allegroai/clearml/issues/327)
- Fix direct import of keras save/load model functions [ClearML GitHub issue 355](https://github.com/allegroai/clearml/issues/355)
- Fix run as module [ClearML GitHub issue 359](https://github.com/allegroai/clearml/issues/359)
- Fix Python 2.7 support [ClearML GitHub issue 366](https://github.com/allegroai/clearml/issues/366)
- Fix `Task.add_requirements()` passing `package_version` starting with `@`, `;` or `#`
- Fix import keras from TF
- Fix support for Hydra's `run_job()` change in parameter order by passing `config` and `task_function` as keyword arguments
- Fix background upload retries with Google Storage (`gs://`)
- Fix Python 3.8 race condition in `Task.close()`
- Fix shutting down a Task immediately after creation might block
- Fix `Task.execute_remotely()` from Jupyter notebook
- Fix Jupyter Notebook inside VSCode
- Fix support for `Dataset.create()` argument `use_current_task`
- Fix `Dataset.finalize()` can hang in extreme scenario
- Protect against wrong file object type when auto-binding models
- Fix matplotlib date convertor
- Fix automation controller overrides nodes clone
### ClearML Server 1.0.2
**Bug Fixes**
- Fix Task container does not accept `null` values [Slack Channel](https://clearml.slack.com/archives/CTK20V944/p1622119047293300) [ClearML GitHub issue 365](https://github.com/allegroai/clearml/issues/365)
- Fix debug images exception in Results page
- Fix a typo in Worker Setup help popup
### ClearML Server 1.0.1
**Bug Fixes**
- Fix clearing experiment requirements causes "empty" requirements (as opposed to "no requirements")
- Fix logout fails with `endpoint not found` error [ClearML GitHub issue 349](https://github.com/allegroai/clearml/issues/349)
- Fix hamburger side menu `Manage Queues` does nothing and returns console error [Slack Channel](https://clearml.slack.com/archives/CTK20V944/p1620308724418100)
- Fix broken config dir backwards compatibility (`/opt/trains/config` should also be supported)
### ClearML 1.0.2
**Bug Fixes**
- Fix in rare scenarios process stuck on exit, again :)
### ClearML 1.0.1
**Bug Fixes**
- Fix in rare scenarios process stuck on exit
### ClearML 1.0.0
**Breaking Changes**

View File

@ -63,6 +63,11 @@ module.exports = {
label: 'Docs',
position: 'left',
},
{
to:'/docs/hyperdatasets/overview',
label: 'Hyper Datasets',
position: 'left',
},
// {to: 'tutorials', label: 'Tutorials', position: 'left'},
// Please keep GitHub link to the right for consistency.
{to: '/docs/guides', label: 'Examples', position: 'left'},

View File

@ -48,9 +48,11 @@ module.exports = {
},
'deploying_clearml/clearml_server_config', 'deploying_clearml/clearml_config_for_clearml_server', 'deploying_clearml/clearml_server_security'
]},
//'Comments': ['Notes'],
],
guidesSidebar: [
'guides/guidemain',
@ -88,7 +90,7 @@ module.exports = {
},
{'XGboost': ['guides/frameworks/xgboost/xgboost_sample']}
]},
{'IDEs': ['guides/ide/remote_jupyter_tutorial', 'guides/ide/integration_pycharm']},
{'IDEs': ['guides/ide/remote_jupyter_tutorial', 'guides/ide/integration_pycharm', 'guides/ide/google_colab']},
{'Optimization': ['guides/optimization/hyper-parameter-optimization/examples_hyperparam_opt']},
{'Pipelines': ['guides/pipeline/pipeline_controller']},
@ -96,7 +98,7 @@ module.exports = {
'guides/reporting/hyper_parameters', 'guides/reporting/image_reporting', 'guides/reporting/manual_matplotlib_reporting', 'guides/reporting/media_reporting',
'guides/reporting/model_config', 'guides/reporting/pandas_reporting', 'guides/reporting/plotly_reporting',
'guides/reporting/scalar_reporting', 'guides/reporting/scatter_hist_confusion_mat_reporting', 'guides/reporting/text_reporting']},
{'Services': ['guides/services/aws_autoscaler', 'guides/services/cleanup_service', 'guides/services/execute_jupyter_notebook_server', 'guides/services/slack_alerts']},
{'Services': ['guides/services/aws_autoscaler', 'guides/services/cleanup_service', 'guides/services/slack_alerts']},
{'Storage': ['guides/storage/examples_storagehelper']},
{'Web UI': ['guides/ui/building_leader_board','guides/ui/tuning_exp']}
@ -128,8 +130,42 @@ module.exports = {
'references/sdk/hpo_parameters_parameterset',
]},
],
hyperdatasetsSidebar: [
'hyperdatasets/overview',
{'Frames': [
'hyperdatasets/frames',
'hyperdatasets/single_frames',
'hyperdatasets/frame_groups',
'hyperdatasets/sources',
'hyperdatasets/annotations',
'hyperdatasets/masks',
'hyperdatasets/previews',
'hyperdatasets/custom_metadata'
]},
'hyperdatasets/dataset',
'hyperdatasets/dataviews',
'hyperdatasets/task',
{'WebApp': [
{'Dataviews': [
'hyperdatasets/webapp/webapp_dataviews',
'hyperdatasets/webapp/webapp_exp_modifying',
'hyperdatasets/webapp/webapp_exp_track_visual',
'hyperdatasets/webapp/webapp_exp_comparing',
]},
{'Datasets': [
'hyperdatasets/webapp/webapp_datasets',
'hyperdatasets/webapp/webapp_datasets_versioning',
'hyperdatasets/webapp/webapp_datasets_frames'
]},
'hyperdatasets/webapp/webapp_annotator'
]}
],
apiSidebar: [
'references/api/definitions',
'references/api/endpoints',
]
],
};

View File

@ -0,0 +1,4 @@
<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" viewBox="0 0 16 16">
<title>ellipse-icon-purple</title>
<path d="M15.8,6H14.41A6.89,6.89,0,0,0,9.5,3.12V2h-3V3.12A6.89,6.89,0,0,0,1.59,6H.2V9h.94c.55,1.93,2.67,3.46,5.36,3.88V14h3v-1.1c2.69-.42,4.81-2,5.36-3.88h.94ZM9.5,11.86V11h-3v.88C4.43,11.49,2.76,10.38,2.21,9h1V6H2.84A6.4,6.4,0,0,1,6.5,4.14V5h3V4.14A6.4,6.4,0,0,1,13.16,6H12.8V9h1C13.24,10.38,11.57,11.49,9.5,11.86Z" fill="#8f9dc9"/>
</svg>

After

Width:  |  Height:  |  Size: 464 B

View File

@ -0,0 +1,9 @@
<svg xmlns="http://www.w3.org/2000/svg" width="16px" height="16px" viewBox="0 0 13 13">
<g id="Group_1063" data-name="Group 1063" transform="translate(-374 -163)">
<rect id="Rectangle_824" data-name="Rectangle 824" width="3" height="3" rx="1" transform="translate(379 163)" fill="#8f9dc9"/>
<rect id="Rectangle_825" data-name="Rectangle 825" width="3" height="3" rx="1" transform="translate(384 167)" fill="#8f9dc9"/>
<rect id="Rectangle_826" data-name="Rectangle 826" width="3" height="3" rx="1" transform="translate(382 173)" fill="#8f9dc9"/>
<rect id="Rectangle_827" data-name="Rectangle 827" width="3" height="3" rx="1" transform="translate(376 173)" fill="#8f9dc9"/>
<rect id="Rectangle_828" data-name="Rectangle 828" width="3" height="3" rx="1" transform="translate(374 167)" fill="#8f9dc9"/>
</g>
</svg>

After

Width:  |  Height:  |  Size: 835 B

View File

@ -0,0 +1,3 @@
<svg xmlns="http://www.w3.org/2000/svg" width="16px" height="16px" viewBox="0 0 13 13">
<path id="np_vector-path_1158607_000000" d="M17.567,9.044h-1.69L12.8,6.806V5.433A.426.426,0,0,0,12.367,5H10.633a.426.426,0,0,0-.433.433V6.806L7.123,9.044H5.433A.426.426,0,0,0,5,9.478v1.733a.434.434,0,0,0,.433.433h.621L7.282,15.4a.44.44,0,0,0-.4.433v1.733A.434.434,0,0,0,7.311,18H9.044a.434.434,0,0,0,.433-.433v-.433h4.044v.433a.434.434,0,0,0,.433.433h1.733a.434.434,0,0,0,.433-.433V15.833a.429.429,0,0,0-.4-.433l1.228-3.756h.621A.434.434,0,0,0,18,11.211V9.478A.426.426,0,0,0,17.567,9.044ZM14.807,15.4h-.852a.426.426,0,0,0-.433.433v.433H9.478v-.433a.426.426,0,0,0-.433-.433H8.192L6.965,11.644h.2a.434.434,0,0,0,.433-.433V9.767L10.59,7.6h1.82L15.4,9.767v1.444a.434.434,0,0,0,.433.433h.2Z" transform="translate(-5 -5)" fill="#8f9dc9"/>
</svg>

After

Width:  |  Height:  |  Size: 830 B

View File

@ -0,0 +1,3 @@
<svg xmlns="http://www.w3.org/2000/svg" width="16px" height="16px" viewBox="0 0 13 13">
<path id="np_vector-path-square_1158609_000000" d="M17.567,7.6A.434.434,0,0,0,18,7.167V5.433A.426.426,0,0,0,17.567,5H15.833a.426.426,0,0,0-.433.433v.433H7.6V5.433A.426.426,0,0,0,7.167,5H5.433A.426.426,0,0,0,5,5.433V7.167a.434.434,0,0,0,.433.433h.433v7.8H5.433A.426.426,0,0,0,5,15.833v1.733A.434.434,0,0,0,5.433,18H7.167a.434.434,0,0,0,.433-.433v-.433h7.8v.433a.434.434,0,0,0,.433.433h1.733A.434.434,0,0,0,18,17.567V15.833a.426.426,0,0,0-.433-.433h-.433V7.6Zm-1.3,7.8h-.433a.426.426,0,0,0-.433.433v.433H7.6v-.433a.426.426,0,0,0-.433-.433H6.733V7.6h.433A.434.434,0,0,0,7.6,7.167V6.733h7.8v.433a.434.434,0,0,0,.433.433h.433Z" transform="translate(-5 -5)" fill="#8f9dc9"/>
</svg>

After

Width:  |  Height:  |  Size: 766 B