diff --git a/docs/hyperdatasets/dataset.md b/docs/hyperdatasets/dataset.md index f5213ec2..4d8c05a3 100644 --- a/docs/hyperdatasets/dataset.md +++ b/docs/hyperdatasets/dataset.md @@ -5,15 +5,14 @@ title: Datasets and Dataset Versions ClearML Enterprise's **Datasets** and **Dataset versions** provide the internal data structure and functionality for the following purposes: * Connecting source data to the ClearML Enterprise platform -* Using ClearML Enterprise's GIT-like [Dataset versioning](#dataset-versioning) +* Using ClearML Enterprise's Git-like [Dataset versioning](#dataset-versioning) * Integrating the powerful features of [Dataviews](dataviews.md) with an experiment * [Annotating](webapp/webapp_datasets_frames.md#annotations) images and videos Datasets consist of versions with SingleFrames and/or FrameGroups. Each Dataset can contain multiple versions, which can have multiple children that inherit their parent's contents. -Mask-labels can be defined globally, for a DatasetVersion. When defined this way, they will be applied to all masks in -that version. +Mask-labels are defined at the DatasetVersion level, and are applied to all masks in a DatasetVersion. ## Example Datasets diff --git a/docs/hyperdatasets/masks.md b/docs/hyperdatasets/masks.md index 809733e3..3a8e8cb8 100644 --- a/docs/hyperdatasets/masks.md +++ b/docs/hyperdatasets/masks.md @@ -2,241 +2,152 @@ title: Masks --- -When applicable, [`sources`](sources.md) contains `masks`, a list of dictionaries used to connect a special type of -source data to the ClearML Enterprise platform. That source data is a **mask**. +Masks are source data used in deep learning for image segmentation. Mask URIs are a property of a SingleFrame. -Masks are used in deep learning for semantic segmentation. +ClearML applies the masks in one of two modes: +* [Pixel segmentation](#pixel-segmentation-masks) - Pixel RGB values are each mapped to segmentation labels. 
+* [Alpha channel](#alpha-channel-masks) - Pixel RGB values are interpreted as opacity levels. -Masks correspond to raw data where the objects to be detected are marked with colors in the masks. The colors -are RGB values and represent the objects that are labeled for segmentation. +In the WebApp's [frame viewer](webapp/webapp_datasets_frames.md#frame-viewer), you can select how to apply a mask over +a frame. -In frames used for semantic segmentation, the metadata connecting the mask files / images to the ClearML Enterprise platform, -and the RGB values and labels used for segmentation are separate. They are contained in two different dictionaries of -a SingleFrame: +## Pixel Segmentation Masks +For pixel segmentation, mask RGB pixel values are mapped to labels. -* **`masks`** (plural) is in [`sources`](sources.md) and contains the mask files / images `URI` (in addition to other keys - and values). +Mask-label mapping is defined at the dataset level, through the `mask_labels` property in a version's metadata. -* **`mask`** (singular) is in the `rois` array of a Frame. - - Each `rois` dictionary contains: +`mask_labels` is a list of dictionaries, where each dictionary includes the following keys: +* `value` - Mask's RGB pixel value +* `labels` - Label associated with the value. - * RGB values and labels of a **mask** (in addition to other keys and values) +See how to manage dataset version mask labels pythonically [here](dataset.md#managing-version-mask-labels). - * Metadata and data for the labeled area of an image - - -See [Example 1](#example-1), which shows `masks` in `sources`, `mask` in `rois`, and the key-value pairs used to relate -a mask to its source in a frame. +In the UI, you can view the mapping in a dataset version's [Metadata](webapp/webapp_datasets_versioning.md#metadata) tab. 
+![Dataset metadata panel](../img/hyperdatasets/dataset_metadata.png) -## Masks Structure +When viewing a frame with a mask that corresponds to the version’s mask-label mapping, the UI arbitrarily assigns a color +to each label. The color assignment can be [customized](webapp/webapp_datasets_frames.md#labels). -The chart below explains the keys and values of the `masks` dictionary (in the [`sources`](sources.md) -section of a Frame). +For example: +* Original frame image: -|Key|Value Description| -|---|----| -|`id`|**Type**: integer. | -|`content_type`| **Type**: string. | -|`timestamp`|**Type**: integer. | -|`uri`|**Type**: string. | + ![Frame without mask](../img/hyperdatasets/dataset_pixel_masks_1.png) +* Frame image with the semantic segmentation mask enabled. Labels are applied according to the dataset version’s + mask-label mapping: -## Examples -### Example 1 + ![Frame with semantic seg mask](../img/hyperdatasets/dataset_pixel_masks_2.png) -This example demonstrates an original image, its masks, and its frame containing -the `sources` and ROI metadata. +The frame's `sources` array contains a `masks` list of dictionaries that looks something like this: - -This frame contains the `masks` list of dictionaries in `sources`, -and the `rois` array, as well as several top-level key-value pairs. 
- - -```json +```editorconfig { - "timestamp": 1234567889, - "context_id": "car_1", - "meta": { - "velocity": "60" - }, - "sources": [ - { - "id": "front", - "content_type": "video/mp4", - "width": 800, - "height": 600, - "uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4", - "timestamp": 1234567889, - "meta" :{ - "angle":45, - "fov":129 - }, - "masks": [ - { - "id": "seg", - "content_type": "video/mp4", - "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4", - "timestamp": 123456789 - }, - { - "id": "seg_instance", - "content_type": "video/mp4", - "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4", - "timestamp": 123456789 - } - ] - } - ], - "rois": [ - { - "sources":["front"], - "label": ["seg"], - "mask": { - "id": "car", - "value": [210,210,120] - } - }, - { - "sources":["front"], - "label": ["seg"], - "mask": { - "id": "person", - "value": [147,44,209] - } - }, - { - "sources":["front"], - "label": ["seg"], - "mask": { - "id": "road", - "value": [197,135,146] - } - }, - { - "sources":["front"], - "label": ["seg"], - "mask": { - "id": "street", - "value": [135,198,145] - } - }, - { - "sources":["front"], - "label": ["seg"], - "mask": { - "id": "building", - "value": [72,191,65] - } - } - ] + "id": "", + "timestamp": "" , + "context_id": "car_1", + "sources": [ + { + "id": "", + "content_type": "", + "uri": "", + "timestamp": 1234567889, + ... + "masks": [ + { + "id": "", + "content_type": "video/mp4", + "uri": "", + "timestamp": 123456789 + } + ] + } + ] } ``` - +The masks dictionary includes the frame's masks’ URIs and IDs. +## Alpha Channel Masks +For alpha channel, mask RGB pixel values are interpreted as opacity values so that when the mask is applied, only the +desired sections of the source are visible. -* In `sources`: - * The source ID is `front`. - * In the `masks` dictionary, the source contains mask sources with IDs of `seg` and `seg_instance`. 
-* In `rois`: - * Each ROI source is `front`, relating the ROI to its original source image. - * Each ROI has a label of `seg`, indicating segmentation. - * Each `mask` has an `id` (`car`, `person`, `road`, `street`, and `building`) and a unique RGB `value` - (color-coding). - +For example: +* Original frame: + ![Maskless frame](../img/hyperdatasets/dataset_alpha_masks_1.png) - -Original Image +* Same frame with an alpha channel mask, emphasizing the troll doll: + + ![Alpha mask frame](../img/hyperdatasets/dataset_alpha_masks_2.png) -![image](../img/hyperdatasets/concepts_masks_image_only.png) -Mask image +The frame's `sources` array contains a `masks` list of dictionaries that looks something like this: -![image](../img/hyperdatasets/concepts_masks.png) - - - -### Example 2 - -This example shows two masks for video from a camera. The masks label cars and the road. - - - -```json -"sources": [ - { - "id": "front", - "content_type": "video/mp4", - "width": 800, - "height": 600, - "uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4", - "timestamp": 1234567889, - "meta" :{ - "angle":45, - "fov":129 - }, - "masks": [ - { - "id": "car", - "content_type": "video/mp4", - "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4", - "timestamp": 123456789 - }, - { - "id": "road", - "content_type": "video/mp4", - "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4", - "timestamp": 123456789 - } - ] - } - ], - "rois": [ - { - "sources":["front"], - "label": ["right_lane"], - "mask": { - "id": "car", - "value": [210,210,120] - } - }, - { - "sources":["front"], - "label": ["right_lane"], - "mask": { - "id": "road", - "value": [197,135,146] - } - } +```editorconfig +{ + "sources" : [ + { + "id" : "321", + "uri" : "https://i.ibb.co/bs7R9k6/troll.png", + "masks" : [ + { + "id" : "troll", + "uri" : "https://i.ibb.co/TmJ3mvT/troll-alpha.png" + } + ], + "timestamp" : 0 + } + ] +} ``` - +Note that for alpha channel masks, no labels are used. 
-* In `sources`: - * The source ID is `front`. - * The source contains mask sources with IDs of `car` and `road`. -* In `rois`: - * Each ROI source is `front` relating the ROI to its original source image. - * Each ROI has a label of `right_lane` indicating the ROI object. - * Each `mask` has an `id` (`car`, `person`) and a unique RGB `value` (color-coding). - ## Usage - -### Adding Mask Annotations - -To add a mask annotation to a frame, use the [`SingleFrame.add_annotation`](../references/hyperdataset/singleframe.md#add_annotation). -This method is generally used to add ROI annotations, but it can also be used to add frame specific mask labels. Input the -mask value as a list with the RGB values in the `mask_rgb` parameter, and a list of labels in the `labels` parameter. +### Registering Frames with Masks +To register a frame with a mask, create the frame and specify the URI of its mask file. ```python -frame = SingleFrame( - source='/home/user/woof_meow.jpg', - preview_uri='https://storage.googleapis.com/kaggle-competitions/kaggle/3362/media/woof_meow.jpg', +# create dataset version +version = DatasetVersion.create_version( + dataset_name="Example", + version_name="Registering frame with mask" ) - -frame.add_annotation(mask_rgb=[0, 0, 0], labels=['cat']) + +# create frame with mask +frame = SingleFrame( + source='https://s3.amazonaws.com/allegro-datasets/cityscapes/leftImg8bit_trainvaltest/leftImg8bit/val/frankfurt/frankfurt_000000_000294_leftImg8bit.png', + mask_source='https://s3.amazonaws.com/allegro-datasets/cityscapes/gtFine_trainvaltest/gtFine/val/frankfurt/frankfurt_000000_000294_gtFine_labelIds.png' +) + +# add frame to version +version.add_frames([frame]) ``` +To use the mask for pixel segmentation, define the pixel-label mapping for the DatasetVersion: + +```python +version.set_masks_labels( + {(0,0,0): ["background"], (1,1,1): ["person", "sitting"], (2,2,2): ["cat"]} +) +``` + +The relevant label is applied to all masks in the version according to the 
version’s mask-label mapping dictionary. + +### Registering Frames with Multiple Masks +Frames can contain multiple masks. To add multiple masks, use the SingleFrame’s `masks_source` property. Input one of +the following: +* A dictionary with mask string ID keys and mask URI values +* A list of mask URIs. Number IDs are automatically assigned to the masks ("00", "01", etc.) + +```python +frame = SingleFrame(source='https://s3.amazonaws.com/allegro-datasets/cityscapes/leftImg8bit_trainvaltest/leftImg8bit/val/frankfurt/frankfurt_000000_000294_leftImg8bit.png') + +# add multiple masks +# with dictionary +frame.masks_source={"ID 1": "", "ID 2": ""} +# with list +frame.masks_source=["", ""] +``` + diff --git a/docs/hyperdatasets/previews.md b/docs/hyperdatasets/previews.md index b1b1b44a..8d4a4b3d 100644 --- a/docs/hyperdatasets/previews.md +++ b/docs/hyperdatasets/previews.md @@ -69,14 +69,6 @@ The following is an example of preview metadata. } ], "rois": [ - { - "sources":["front"], - "label": ["right_lane"], - "mask": { - "id": "seg", - "value": [-1, 1, 255] - } - }, { "sources": ["front"], "label": ["bike"], diff --git a/docs/hyperdatasets/single_frames.md b/docs/hyperdatasets/single_frames.md index f905fd98..1459e3cb 100644 --- a/docs/hyperdatasets/single_frames.md +++ b/docs/hyperdatasets/single_frames.md @@ -35,8 +35,8 @@ For more information, see [Annotations](annotations.md). ### Masks -A `SingleFrame` includes a URI link to a mask file if applicable. Masks correspond to raw data where the objects to be -detected in raw data are marked with colors in the masks. +A `SingleFrame` can include a URI link to a mask file if applicable. Masks correspond to raw data where the objects to be +detected are marked with colors or different opacity levels in the masks. For more information, see [Masks](masks.md). @@ -100,7 +100,12 @@ The panel below describes the details contained within a `frame`: * `id` - ID of the mask dictionary in `sources`. 
* `value` - RGB value of the mask. - + + :::info + The `mask` dictionary is deprecated. Mask labels and their associated pixel values are now stored in the dataset + version’s metadata. See [Masks](masks.md). + ::: + * `poly` (*[int]*) - Bounding area vertices. * `sources` (*[string]*) - The `id` in the `sources` dictionary which relates an annotation to its raw data source. @@ -112,11 +117,11 @@ The panel below describes the details contained within a `frame`: * `uri` - URI of the raw data. * `width` - Width of the image or video. * `height` - Height of the image or video. - * `mask` - Sources of masks used in the `rois`. + * `masks` - List of available masks. - * `id` - ID of the mask source. This relates a mask source to an ROI. - * `content_type` - The type of mask source. For example, `image/jpeg`. - * `uri` - URI of the mask source. + * `id` - Mask ID + * `content_type` - Mask type. For example, `image/jpeg`. + * `uri` - Mask URI * `timestamp` * `preview` - URI of the thumbnail preview image used in the ClearML Enterprise WebApp (UI) diff --git a/docs/hyperdatasets/sources.md b/docs/hyperdatasets/sources.md index 93192e9e..2d1c4142 100644 --- a/docs/hyperdatasets/sources.md +++ b/docs/hyperdatasets/sources.md @@ -7,12 +7,9 @@ Each frame contains `sources`, a list of dictionaries containing: * A `URI` pointing to the source data (image or video) * Sources for [masks](masks.md) used in semantic segmentation * Image [previews](previews.md), which are thumbnails used in the ClearML Enterprise WebApp (UI). - -`sources` does not contain: -* `rois` even though ROIs are directly associated with the images and `masks` in `sources` -* ROI metadata, because ROIs can be used over multiple frames. - -Instead, frames contain a top-level `rois` array, which is a list of ROI dictionaries, where each dictionary contains a + +`sources` does not contain ROI metadata, because ROIs can be used over multiple frames. 
Instead, frames contain a +top-level `rois` array, which is a list of ROI dictionaries, where each dictionary contains a list of source IDs. Those IDs connect `sources` to ROIs. ## Examples diff --git a/docs/img/hyperdatasets/concepts_masks.png b/docs/img/hyperdatasets/concepts_masks.png deleted file mode 100644 index dc441dca..00000000 Binary files a/docs/img/hyperdatasets/concepts_masks.png and /dev/null differ diff --git a/docs/img/hyperdatasets/concepts_masks_image_only.png b/docs/img/hyperdatasets/concepts_masks_image_only.png deleted file mode 100644 index ff57ac91..00000000 Binary files a/docs/img/hyperdatasets/concepts_masks_image_only.png and /dev/null differ diff --git a/docs/img/hyperdatasets/dataset_alpha_masks_1.png b/docs/img/hyperdatasets/dataset_alpha_masks_1.png new file mode 100644 index 00000000..fef7b564 Binary files /dev/null and b/docs/img/hyperdatasets/dataset_alpha_masks_1.png differ diff --git a/docs/img/hyperdatasets/dataset_alpha_masks_2.png b/docs/img/hyperdatasets/dataset_alpha_masks_2.png new file mode 100644 index 00000000..1a325374 Binary files /dev/null and b/docs/img/hyperdatasets/dataset_alpha_masks_2.png differ diff --git a/docs/img/hyperdatasets/dataset_metadata.png b/docs/img/hyperdatasets/dataset_metadata.png new file mode 100644 index 00000000..ffc4a68c Binary files /dev/null and b/docs/img/hyperdatasets/dataset_metadata.png differ diff --git a/docs/img/hyperdatasets/dataset_pixel_masks_1.png b/docs/img/hyperdatasets/dataset_pixel_masks_1.png new file mode 100644 index 00000000..233e28fc Binary files /dev/null and b/docs/img/hyperdatasets/dataset_pixel_masks_1.png differ diff --git a/docs/img/hyperdatasets/dataset_pixel_masks_2.png b/docs/img/hyperdatasets/dataset_pixel_masks_2.png new file mode 100644 index 00000000..a9cd7a37 Binary files /dev/null and b/docs/img/hyperdatasets/dataset_pixel_masks_2.png differ