---
title: Masks
---

Masks are source data used in deep learning for image segmentation. Mask URIs are a property of a SingleFrame.

ClearML applies the masks in one of two modes:
* [Pixel segmentation](#pixel-segmentation-masks) - Pixel RGB values are each mapped to segmentation labels.
* [Alpha channel](#alpha-channel-masks) - Pixel RGB values are interpreted as opacity levels.

In the WebApp's [frame viewer](webapp/webapp_datasets_frames.md#frame-viewer), you can select how to apply a mask over
a frame.

## Pixel Segmentation Masks
For pixel segmentation, mask RGB pixel values are mapped to labels.

Mask-label mapping is defined at the dataset level, through the `mask_labels` property in a version's metadata.

`mask_labels` is a list of dictionaries, where each dictionary includes the following keys:
* `value` - Mask's RGB pixel value
* `labels` - Labels associated with the value

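
As an illustration, the structure described above can be written out in plain Python. This is a minimal sketch, not ClearML code: the sample values and the `to_lookup` helper are hypothetical, and it assumes that `value` is stored as a three-element RGB list and `labels` as a list of strings.

```python
# Hypothetical example of a version's `mask_labels` metadata.
# Assumption: each `value` is an [R, G, B] list and `labels` is a list of strings.
mask_labels = [
    {"value": [0, 0, 0], "labels": ["background"]},
    {"value": [1, 1, 1], "labels": ["person", "sitting"]},
    {"value": [2, 2, 2], "labels": ["cat"]},
]


def to_lookup(entries):
    """Convert the list of dictionaries into an RGB-tuple -> labels mapping."""
    return {tuple(entry["value"]): list(entry["labels"]) for entry in entries}


lookup = to_lookup(mask_labels)
print(lookup[(1, 1, 1)])  # ['person', 'sitting']
```
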
To manage a dataset version's mask labels in Python, see [Managing Version Mask Labels](dataset.md#managing-version-mask-labels).

In the UI, you can view the mapping in a dataset version's [Metadata](webapp/webapp_datasets_versioning.md#metadata) tab.

![Dataset metadata panel](../img/hyperdatasets/dataset_metadata.png)

When viewing a frame with a mask that corresponds to the version’s mask-label mapping, the UI arbitrarily assigns a color
to each label. The color assignment can be [customized](webapp/webapp_datasets_frames.md#labels).

For example:
* Original frame image:

  ![Frame without mask](../img/hyperdatasets/dataset_pixel_masks_1.png)

* Frame image with the semantic segmentation mask enabled. Labels are applied according to the dataset version’s
  mask-label mapping:

  ![Frame with semantic seg mask](../img/hyperdatasets/dataset_pixel_masks_2.png)

Each entry in the frame's `sources` array contains a `masks` list of dictionaries that looks something like this:

```editorconfig
{
  "id": "<framegroup_id>",
  "timestamp": "<timestamp>",
  "context_id": "car_1",
  "sources": [
    {
      "id": "<source_id>",
      "content_type": "<type>",
      "uri": "<image_uri>",
      "timestamp": 1234567889,
      ...
      "masks": [
        {
          "id": "<mask_id>",
          "content_type": "video/mp4",
          "uri": "<mask_uri>",
          "timestamp": 123456789
        }
      ]
    }
  ]
}
```

Each dictionary in the `masks` list includes the mask's URI and ID.

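
To show how such a record might be read, the following sketch walks a frame dictionary shaped like the example above and collects every mask's ID and URI. The `collect_masks` helper and the placeholder values are hypothetical. The sketch works on a plain dictionary and does not call the ClearML SDK.

```python
# Hypothetical frame record, shaped like the example above with placeholder values.
frame_record = {
    "id": "<framegroup_id>",
    "sources": [
        {
            "id": "<source_id>",
            "uri": "<image_uri>",
            "masks": [
                {"id": "<mask_id>", "uri": "<mask_uri>"},
            ],
        }
    ],
}


def collect_masks(record):
    """Return (source_id, mask_id, mask_uri) tuples for every mask in the record."""
    found = []
    for source in record.get("sources", []):
        for mask in source.get("masks", []):
            found.append((source["id"], mask["id"], mask["uri"]))
    return found


print(collect_masks(frame_record))
```
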
## Alpha Channel Masks
For alpha channel masks, mask RGB pixel values are interpreted as opacity values so that when the mask is applied, only the
desired sections of the source are visible.

For example:
* Original frame:

  ![Maskless frame](../img/hyperdatasets/dataset_alpha_masks_1.png)

* Same frame with an alpha channel mask, emphasizing the troll doll:

  ![Alpha mask frame](../img/hyperdatasets/dataset_alpha_masks_2.png)

Each entry in the frame's `sources` array contains a `masks` list of dictionaries that looks something like this:

```editorconfig
{
  "sources" : [
    {
      "id" : "321",
      "uri" : "https://i.ibb.co/bs7R9k6/troll.png",
      "masks" : [
        {
          "id" : "troll",
          "uri" : "https://i.ibb.co/TmJ3mvT/troll-alpha.png"
        }
      ],
      "timestamp" : 0
    }
  ]
}
```

Note that for alpha channel masks, no labels are used.

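
The opacity interpretation described in this section can be illustrated with a small, self-contained sketch. It is not how the ClearML UI renders masks. It assumes pixels are `(R, G, B)` tuples with 0-255 channels, that the mask's first channel carries the opacity, and that hidden areas fade to black. The `apply_alpha_mask` function is hypothetical.

```python
def apply_alpha_mask(image, mask):
    """Scale each image pixel by the opacity encoded in the matching mask pixel.

    Assumptions (not taken from ClearML): pixels are (R, G, B) tuples with
    0-255 channels, the mask's first channel holds the opacity, and hidden
    areas fade to black.
    """
    result = []
    for image_row, mask_row in zip(image, mask):
        row = []
        for (r, g, b), mask_pixel in zip(image_row, mask_row):
            opacity = mask_pixel[0] / 255
            row.append((round(r * opacity), round(g * opacity), round(b * opacity)))
        result.append(row)
    return result


# One row, two pixels: the first is fully visible, the second fully hidden.
image = [[(200, 100, 50), (200, 100, 50)]]
mask = [[(255, 255, 255), (0, 0, 0)]]
print(apply_alpha_mask(image, mask))  # [[(200, 100, 50), (0, 0, 0)]]
```
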
## Usage
### Registering Frames with a Mask
To register frames with a mask, create a frame and specify the URI of the frame's mask file.

```python
from allegroai import DatasetVersion, SingleFrame

# create dataset version
version = DatasetVersion.create_version(
    dataset_name="Example",
    version_name="Registering frame with mask"
)

# create frame with mask
frame = SingleFrame(
    source='https://s3.amazonaws.com/allegro-datasets/cityscapes/leftImg8bit_trainvaltest/leftImg8bit/val/frankfurt/frankfurt_000000_000294_leftImg8bit.png',
    mask_source='https://s3.amazonaws.com/allegro-datasets/cityscapes/gtFine_trainvaltest/gtFine/val/frankfurt/frankfurt_000000_000294_gtFine_labelIds.png'
)

# add frame to version
version.add_frames([frame])
```

To use the mask for pixel segmentation, define the pixel-label mapping for the DatasetVersion:

```python
version.set_masks_labels(
    {(0,0,0): ["background"], (1,1,1): ["person", "sitting"], (2,2,2): ["cat"]}
)
```

The relevant labels are applied to all masks in the version according to the version’s mask-label mapping dictionary.

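
To make the lookup concrete, the sketch below resolves labels for a tiny mask using the same mapping passed to `set_masks_labels` above. It is plain Python for illustration only. The `labels_for_mask` helper and the sample pixels are hypothetical, and returning an empty list for unmapped values is an assumption of this sketch, not documented ClearML behavior.

```python
# Same mapping as passed to `set_masks_labels` above.
mapping = {(0, 0, 0): ["background"], (1, 1, 1): ["person", "sitting"], (2, 2, 2): ["cat"]}


def labels_for_mask(mask_pixels, mapping):
    """Look up the labels of every pixel; unmapped values get an empty list."""
    return [[mapping.get(tuple(pixel), []) for pixel in row] for row in mask_pixels]


# A hypothetical 2x2 mask given as rows of RGB tuples.
mask_pixels = [
    [(0, 0, 0), (1, 1, 1)],
    [(2, 2, 2), (9, 9, 9)],
]
for row in labels_for_mask(mask_pixels, mapping):
    print(row)
```
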
### Registering Frames with Multiple Masks
Frames can contain multiple masks. To add multiple masks, use the SingleFrame’s `masks_source` property. Set it to one of
the following:
* A dictionary with mask string ID keys and mask URI values
* A list of mask URIs. Number IDs are automatically assigned to the masks ("00", "01", etc.)

```python
from allegroai import SingleFrame

frame = SingleFrame(
    source='https://s3.amazonaws.com/allegro-datasets/cityscapes/leftImg8bit_trainvaltest/leftImg8bit/val/frankfurt/frankfurt_000000_000294_leftImg8bit.png'
)

# add multiple masks
# with a dictionary of mask IDs and URIs
frame.masks_source = {"ID 1": "<mask_URI_1>", "ID 2": "<mask_URI_2>"}
# alternatively, with a list of mask URIs
frame.masks_source = ["<mask_URI_1>", "<mask_URI_2>"]
```
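
The automatic numbering for list input can be mimicked in plain Python. This is only a sketch of the pattern suggested by the "00", "01" examples above: the `assign_mask_ids` helper is hypothetical, and the zero-padded two-digit format is an assumption. The SDK's own assignment may differ.

```python
def assign_mask_ids(mask_uris):
    """Pair each URI with a zero-padded, two-digit positional ID ("00", "01", ...)."""
    return {f"{index:02d}": uri for index, uri in enumerate(mask_uris)}


print(assign_mask_ids(["<mask_URI_1>", "<mask_URI_2>"]))
# {'00': '<mask_URI_1>', '01': '<mask_URI_2>'}
```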