diff --git a/docs/hyperdatasets/dataset.md b/docs/hyperdatasets/dataset.md
index f5213ec2..4d8c05a3 100644
--- a/docs/hyperdatasets/dataset.md
+++ b/docs/hyperdatasets/dataset.md
@@ -5,15 +5,14 @@ title: Datasets and Dataset Versions
ClearML Enterprise's **Datasets** and **Dataset versions** provide the internal data structure
and functionality for the following purposes:
* Connecting source data to the ClearML Enterprise platform
-* Using ClearML Enterprise's GIT-like [Dataset versioning](#dataset-versioning)
+* Using ClearML Enterprise's Git-like [Dataset versioning](#dataset-versioning)
* Integrating the powerful features of [Dataviews](dataviews.md) with an experiment
* [Annotating](webapp/webapp_datasets_frames.md#annotations) images and videos
Datasets consist of versions with SingleFrames and/or FrameGroups. Each Dataset can contain multiple versions, which
can have multiple children that inherit their parent's contents.
-Mask-labels can be defined globally, for a DatasetVersion. When defined this way, they will be applied to all masks in
-that version.
+Mask-labels are defined at the DatasetVersion level and apply to all masks in that version.
## Example Datasets
diff --git a/docs/hyperdatasets/masks.md b/docs/hyperdatasets/masks.md
index 809733e3..3a8e8cb8 100644
--- a/docs/hyperdatasets/masks.md
+++ b/docs/hyperdatasets/masks.md
@@ -2,241 +2,152 @@
title: Masks
---
-When applicable, [`sources`](sources.md) contains `masks`, a list of dictionaries used to connect a special type of
-source data to the ClearML Enterprise platform. That source data is a **mask**.
+Masks are source data used in deep learning for image segmentation. Mask URIs are a property of a SingleFrame.
-Masks are used in deep learning for semantic segmentation.
+ClearML applies the masks in one of two modes:
+* [Pixel segmentation](#pixel-segmentation-masks) - Pixel RGB values are each mapped to segmentation labels.
+* [Alpha channel](#alpha-channel-masks) - Pixel RGB values are interpreted as opacity levels.
-Masks correspond to raw data where the objects to be detected are marked with colors in the masks. The colors
-are RGB values and represent the objects that are labeled for segmentation.
+In the WebApp's [frame viewer](webapp/webapp_datasets_frames.md#frame-viewer), you can select how to apply a mask over
+a frame.
-In frames used for semantic segmentation, the metadata connecting the mask files / images to the ClearML Enterprise platform,
-and the RGB values and labels used for segmentation are separate. They are contained in two different dictionaries of
-a SingleFrame:
+## Pixel Segmentation Masks
+For pixel segmentation, mask RGB pixel values are mapped to labels.
-* **`masks`** (plural) is in [`sources`](sources.md) and contains the mask files / images `URI` (in addition to other keys
- and values).
+Mask-label mapping is defined at the dataset version level, through the `mask_labels` property in a version's metadata.
-* **`mask`** (singular) is in the `rois` array of a Frame.
-
- Each `rois` dictionary contains:
+`mask_labels` is a list of dictionaries, where each dictionary includes the following keys:
+* `value` - Mask's RGB pixel value
+* `labels` - Labels associated with the value
- * RGB values and labels of a **mask** (in addition to other keys and values)
+To manage a dataset version's mask labels programmatically, see [Managing Version Mask Labels](dataset.md#managing-version-mask-labels).
- * Metadata and data for the labeled area of an image
-
-
-See [Example 1](#example-1), which shows `masks` in `sources`, `mask` in `rois`, and the key-value pairs used to relate
-a mask to its source in a frame.
+In the UI, you can view the mapping in a dataset version's [Metadata](webapp/webapp_datasets_versioning.md#metadata) tab.
+
-## Masks Structure
+When viewing a frame with a mask that corresponds to the version’s mask-label mapping, the UI arbitrarily assigns a color
+to each label. The color assignment can be [customized](webapp/webapp_datasets_frames.md#labels).
-The chart below explains the keys and values of the `masks` dictionary (in the [`sources`](sources.md)
-section of a Frame).
+For example:
+* Original frame image:
-|Key|Value Description|
-|---|----|
-|`id`|**Type**: integer.
- The ID is used to relate this mask data source to the `mask` dictionary containing the label and RGB value for the mask.
- See the `mask` key in `rois`.
|
-|`content_type`| **Type**: string. - Type of mask data. For example, image / png or video / mp4.
|
-|`timestamp`|**Type**: integer. - For images from a video, indicates the absolute position of the frame from the source (video)
- For still images, set this to 0 (for example, video from a camera on a car, at 30 frames per second, would have a timestamp of 0 for the first frame, and 33 for the second frame).
|
-|`uri`|**Type**: string. - URI of the mask file / image.
|
+ 
+* Frame image with the semantic segmentation mask enabled. Labels are applied according to the dataset version’s
+ mask-label mapping:
-## Examples
-### Example 1
+ 
-This example demonstrates an original image, its masks, and its frame containing
-the `sources` and ROI metadata.
+The frame's `sources` array contains a `masks` list of dictionaries that looks something like this:
-
-This frame contains the `masks` list of dictionaries in `sources`,
-and the `rois` array, as well as several top-level key-value pairs.
-
-
-```json
+```editorconfig
{
- "timestamp": 1234567889,
- "context_id": "car_1",
- "meta": {
- "velocity": "60"
- },
- "sources": [
- {
- "id": "front",
- "content_type": "video/mp4",
- "width": 800,
- "height": 600,
- "uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
- "timestamp": 1234567889,
- "meta" :{
- "angle":45,
- "fov":129
- },
- "masks": [
- {
- "id": "seg",
- "content_type": "video/mp4",
- "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
- "timestamp": 123456789
- },
- {
- "id": "seg_instance",
- "content_type": "video/mp4",
- "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
- "timestamp": 123456789
- }
- ]
- }
- ],
- "rois": [
- {
- "sources":["front"],
- "label": ["seg"],
- "mask": {
- "id": "car",
- "value": [210,210,120]
- }
- },
- {
- "sources":["front"],
- "label": ["seg"],
- "mask": {
- "id": "person",
- "value": [147,44,209]
- }
- },
- {
- "sources":["front"],
- "label": ["seg"],
- "mask": {
- "id": "road",
- "value": [197,135,146]
- }
- },
- {
- "sources":["front"],
- "label": ["seg"],
- "mask": {
- "id": "street",
- "value": [135,198,145]
- }
- },
- {
- "sources":["front"],
- "label": ["seg"],
- "mask": {
- "id": "building",
- "value": [72,191,65]
- }
- }
- ]
+ "id": "",
+ "timestamp": "",
+ "context_id": "car_1",
+ "sources": [
+ {
+ "id": "",
+ "content_type": "",
+ "uri": "",
+ "timestamp": 1234567889,
+ ...
+ "masks": [
+ {
+ "id": "",
+ "content_type": "video/mp4",
+ "uri": "",
+ "timestamp": 123456789
+ }
+ ]
+ }
+ ]
}
```
-
+Each dictionary in the `masks` list includes the URI and ID of one of the frame's masks.
+## Alpha Channel Masks
+For alpha channel masks, mask RGB pixel values are interpreted as opacity values so that when the mask is applied, only the
+desired sections of the source are visible.
-* In `sources`:
- * The source ID is `front`.
- * In the `masks` dictionary, the source contains mask sources with IDs of `seg` and `seg_instance`.
-* In `rois`:
- * Each ROI source is `front`, relating the ROI to its original source image.
- * Each ROI has a label of `seg`, indicating segmentation.
- * Each `mask` has an `id` (`car`, `person`, `road`, `street`, and `building`) and a unique RGB `value`
- (color-coding).
-
+For example:
+* Original frame:
+ 
-
-Original Image
+* Same frame with an alpha channel mask, emphasizing the troll doll:
+
+ 
-
-Mask image
+The frame's `sources` array contains a `masks` list of dictionaries that looks something like this:
-
-
-
-
-### Example 2
-
-This example shows two masks for video from a camera. The masks label cars and the road.
-
-
-
-```json
-"sources": [
- {
- "id": "front",
- "content_type": "video/mp4",
- "width": 800,
- "height": 600,
- "uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
- "timestamp": 1234567889,
- "meta" :{
- "angle":45,
- "fov":129
- },
- "masks": [
- {
- "id": "car",
- "content_type": "video/mp4",
- "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
- "timestamp": 123456789
- },
- {
- "id": "road",
- "content_type": "video/mp4",
- "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
- "timestamp": 123456789
- }
- ]
- }
- ],
- "rois": [
- {
- "sources":["front"],
- "label": ["right_lane"],
- "mask": {
- "id": "car",
- "value": [210,210,120]
- }
- },
- {
- "sources":["front"],
- "label": ["right_lane"],
- "mask": {
- "id": "road",
- "value": [197,135,146]
- }
- }
+```editorconfig
+{
+ "sources" : [
+ {
+ "id" : "321",
+ "uri" : "https://i.ibb.co/bs7R9k6/troll.png",
+ "masks" : [
+ {
+ "id" : "troll",
+ "uri" : "https://i.ibb.co/TmJ3mvT/troll-alpha.png"
+ }
+ ],
+ "timestamp" : 0
+ }
+ ]
+}
```
-
+Note that for alpha channel masks, no labels are used.
-* In `sources`:
- * The source ID is `front`.
- * The source contains mask sources with IDs of `car` and `road`.
-* In `rois`:
- * Each ROI source is `front` relating the ROI to its original source image.
- * Each ROI has a label of `right_lane` indicating the ROI object.
- * Each `mask` has an `id` (`car`, `person`) and a unique RGB `value` (color-coding).
-
## Usage
-
-### Adding Mask Annotations
-
-To add a mask annotation to a frame, use the [`SingleFrame.add_annotation`](../references/hyperdataset/singleframe.md#add_annotation).
-This method is generally used to add ROI annotations, but it can also be used to add frame specific mask labels. Input the
-mask value as a list with the RGB values in the `mask_rgb` parameter, and a list of labels in the `labels` parameter.
+### Registering Frames with Masks
+To register a frame with a mask, create the frame and specify the URI of its mask file.
```python
-frame = SingleFrame(
- source='/home/user/woof_meow.jpg',
- preview_uri='https://storage.googleapis.com/kaggle-competitions/kaggle/3362/media/woof_meow.jpg',
+from allegroai import DatasetVersion, SingleFrame
+
+# create dataset version
+version = DatasetVersion.create_version(
+ dataset_name="Example",
+ version_name="Registering frame with mask"
)
-
-frame.add_annotation(mask_rgb=[0, 0, 0], labels=['cat'])
+
+# create frame with mask
+frame = SingleFrame(
+ source='https://s3.amazonaws.com/allegro-datasets/cityscapes/leftImg8bit_trainvaltest/leftImg8bit/val/frankfurt/frankfurt_000000_000294_leftImg8bit.png',
+ mask_source='https://s3.amazonaws.com/allegro-datasets/cityscapes/gtFine_trainvaltest/gtFine/val/frankfurt/frankfurt_000000_000294_gtFine_labelIds.png'
+)
+
+# add frame to version
+version.add_frames([frame])
```
+To use the mask for pixel segmentation, define the pixel-label mapping for the DatasetVersion:
+
+```python
+version.set_masks_labels(
+ {(0,0,0): ["background"], (1,1,1): ["person", "sitting"], (2,2,2): ["cat"]}
+)
+```
+
+The relevant labels are applied to all masks in the version according to the version’s mask-label mapping dictionary.
+
+### Registering Frames with Multiple Masks
+Frames can contain multiple masks. To add multiple masks, use the SingleFrame’s `masks_source` property. Input one of
+the following:
+* A dictionary with mask string ID keys and mask URI values
+* A list of mask URIs. Number IDs are automatically assigned to the masks ("00", "01", etc.)
+
+```python
+frame = SingleFrame(source='https://s3.amazonaws.com/allegro-datasets/cityscapes/leftImg8bit_trainvaltest/leftImg8bit/val/frankfurt/frankfurt_000000_000294_leftImg8bit.png')
+
+# add multiple masks
+# with dictionary (mask ID -> mask URI)
+frame.masks_source = {"ID 1": "", "ID 2": ""}
+# with list (IDs are assigned automatically)
+frame.masks_source = ["", ""]
+```
+
diff --git a/docs/hyperdatasets/previews.md b/docs/hyperdatasets/previews.md
index b1b1b44a..8d4a4b3d 100644
--- a/docs/hyperdatasets/previews.md
+++ b/docs/hyperdatasets/previews.md
@@ -69,14 +69,6 @@ The following is an example of preview metadata.
}
],
"rois": [
- {
- "sources":["front"],
- "label": ["right_lane"],
- "mask": {
- "id": "seg",
- "value": [-1, 1, 255]
- }
- },
{
"sources": ["front"],
"label": ["bike"],
diff --git a/docs/hyperdatasets/single_frames.md b/docs/hyperdatasets/single_frames.md
index f905fd98..1459e3cb 100644
--- a/docs/hyperdatasets/single_frames.md
+++ b/docs/hyperdatasets/single_frames.md
@@ -35,8 +35,8 @@ For more information, see [Annotations](annotations.md).
### Masks
-A `SingleFrame` includes a URI link to a mask file if applicable. Masks correspond to raw data where the objects to be
-detected in raw data are marked with colors in the masks.
+A `SingleFrame` can include URI links to mask files, if applicable. Masks correspond to raw data where the objects to be
+detected are marked with colors or different opacity levels in the masks.
For more information, see [Masks](masks.md).
@@ -100,7 +100,12 @@ The panel below describes the details contained within a `frame`:
* `id` - ID of the mask dictionary in `sources`.
* `value` - RGB value of the mask.
-
+
+ :::info
+ The `mask` dictionary is deprecated. Mask labels and their associated pixel values are now stored in the dataset
+ version’s metadata. See [Masks](masks.md).
+ :::
+
* `poly` (*[int]*) - Bounding area vertices.
* `sources` (*[string]*) - The `id` in the `sources` dictionary which relates an annotation to its raw data source.
@@ -112,11 +117,11 @@ The panel below describes the details contained within a `frame`:
* `uri` - URI of the raw data.
* `width` - Width of the image or video.
* `height` - Height of the image or video.
- * `mask` - Sources of masks used in the `rois`.
+ * `masks` - List of available masks.
- * `id` - ID of the mask source. This relates a mask source to an ROI.
- * `content_type` - The type of mask source. For example, `image/jpeg`.
- * `uri` - URI of the mask source.
+ * `id` - Mask ID
+ * `content_type` - Mask type. For example, `image/jpeg`.
+ * `uri` - Mask URI
* `timestamp`
* `preview` - URI of the thumbnail preview image used in the ClearML Enterprise WebApp (UI)
diff --git a/docs/hyperdatasets/sources.md b/docs/hyperdatasets/sources.md
index 93192e9e..2d1c4142 100644
--- a/docs/hyperdatasets/sources.md
+++ b/docs/hyperdatasets/sources.md
@@ -7,12 +7,9 @@ Each frame contains `sources`, a list of dictionaries containing:
* A `URI` pointing to the source data (image or video)
* Sources for [masks](masks.md) used in semantic segmentation
* Image [previews](previews.md), which are thumbnails used in the ClearML Enterprise WebApp (UI).
-
-`sources` does not contain:
-* `rois` even though ROIs are directly associated with the images and `masks` in `sources`
-* ROI metadata, because ROIs can be used over multiple frames.
-
-Instead, frames contain a top-level `rois` array, which is a list of ROI dictionaries, where each dictionary contains a
+
+`sources` does not contain ROI metadata, because ROIs can be used over multiple frames. Instead, frames contain a
+top-level `rois` array, which is a list of ROI dictionaries, where each dictionary contains a
list of source IDs. Those IDs connect `sources` to ROIs.
## Examples
diff --git a/docs/img/hyperdatasets/concepts_masks.png b/docs/img/hyperdatasets/concepts_masks.png
deleted file mode 100644
index dc441dca..00000000
Binary files a/docs/img/hyperdatasets/concepts_masks.png and /dev/null differ
diff --git a/docs/img/hyperdatasets/concepts_masks_image_only.png b/docs/img/hyperdatasets/concepts_masks_image_only.png
deleted file mode 100644
index ff57ac91..00000000
Binary files a/docs/img/hyperdatasets/concepts_masks_image_only.png and /dev/null differ
diff --git a/docs/img/hyperdatasets/dataset_alpha_masks_1.png b/docs/img/hyperdatasets/dataset_alpha_masks_1.png
new file mode 100644
index 00000000..fef7b564
Binary files /dev/null and b/docs/img/hyperdatasets/dataset_alpha_masks_1.png differ
diff --git a/docs/img/hyperdatasets/dataset_alpha_masks_2.png b/docs/img/hyperdatasets/dataset_alpha_masks_2.png
new file mode 100644
index 00000000..1a325374
Binary files /dev/null and b/docs/img/hyperdatasets/dataset_alpha_masks_2.png differ
diff --git a/docs/img/hyperdatasets/dataset_metadata.png b/docs/img/hyperdatasets/dataset_metadata.png
new file mode 100644
index 00000000..ffc4a68c
Binary files /dev/null and b/docs/img/hyperdatasets/dataset_metadata.png differ
diff --git a/docs/img/hyperdatasets/dataset_pixel_masks_1.png b/docs/img/hyperdatasets/dataset_pixel_masks_1.png
new file mode 100644
index 00000000..233e28fc
Binary files /dev/null and b/docs/img/hyperdatasets/dataset_pixel_masks_1.png differ
diff --git a/docs/img/hyperdatasets/dataset_pixel_masks_2.png b/docs/img/hyperdatasets/dataset_pixel_masks_2.png
new file mode 100644
index 00000000..a9cd7a37
Binary files /dev/null and b/docs/img/hyperdatasets/dataset_pixel_masks_2.png differ