---
title: Masks
---
When applicable, [`sources`](sources.md) contains `masks`, a list of dictionaries that connects mask files to the
ClearML Enterprise platform. A **mask** is a special type of source data used in deep learning for semantic segmentation.

A mask corresponds to raw data (an image or a video) in which each object to be detected is marked with a color. The
colors are RGB values, and each one represents an object that is labeled for segmentation.

In frames used for semantic segmentation, the metadata connecting the mask files / images to the ClearML Enterprise
platform is kept separate from the RGB values and labels used for segmentation. They are contained in two different
dictionaries of a SingleFrame:

* **`masks`** (plural) is in [`sources`](sources.md) and contains the mask files / images `URI` (in addition to other keys
  and values).
* **`mask`** (singular) is in the `rois` array of a Frame.

  Each `rois` dictionary contains:

  * RGB values and labels of a **mask** (in addition to other keys and values)
  * Metadata and data for the labeled area of an image

See [Example 1](#example-1), which shows `masks` in `sources`, `mask` in `rois`, and the key-value pairs used to relate
a mask to its source in a frame.
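
To make the color-coding concrete, the following minimal sketch decodes a tiny mask into labels using plain Python.
The `decode_mask` helper, the nested-list pixel layout, and the 2x3 sample mask are assumptions made for illustration
and are not part of the ClearML SDK. The two RGB values are borrowed from Example 1.

```python
from collections import Counter

# RGB value -> label, using two of the colors from Example 1
COLOR_TO_LABEL = {
    (210, 210, 120): "car",
    (197, 135, 146): "road",
}


def decode_mask(pixels, color_to_label, unknown="unlabeled"):
    """Replace each RGB pixel in a mask with the label assigned to its color."""
    return [
        [color_to_label.get(tuple(pixel), unknown) for pixel in row]
        for row in pixels
    ]


# A hypothetical 2x3 mask: rows of [R, G, B] pixels
mask_pixels = [
    [[210, 210, 120], [210, 210, 120], [197, 135, 146]],
    [[197, 135, 146], [197, 135, 146], [0, 0, 0]],
]

labels = decode_mask(mask_pixels, COLOR_TO_LABEL)

# Count how many pixels belong to each label
pixel_counts = Counter(label for row in labels for label in row)
print(pixel_counts)
```

Pixels whose color has no label assigned fall back to `unlabeled`, which makes stray colors in a mask easy to spot.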
## Masks Structure

The table below describes the keys and values of each dictionary in the `masks` list (in the [`sources`](sources.md)
section of a Frame).

|Key|Value Description|
|---|----|
|`id`|**Type**: integer. <ul><li>Relates this mask data source to the `mask` dictionary containing the label and RGB value for the mask.</li><li>See the `mask` key in `rois`.</li></ul>|
|`content_type`|**Type**: string. <ul><li>Type of mask data. For example, `image/png` or `video/mp4`.</li></ul>|
|`timestamp`|**Type**: integer. <ul><li>For images from a video, indicates the absolute position of the frame in the source video. For example, video from a camera on a car, at 30 frames per second, would have a timestamp of 0 for the first frame and 33 for the second frame.</li><li>For still images, set this to 0.</li></ul>|
|`uri`|**Type**: string. <ul><li>URI of the mask file / image.</li></ul>|

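
As a rough illustration of the table, the sketch below builds one `masks` entry as a plain dictionary. The helpers
`frame_timestamp` and `make_mask_entry` are hypothetical names, not SDK functions. The timestamp arithmetic assumes
whole milliseconds from the start of the video, which is inferred from the 30 frames per second example (0, then 33).
The ID follows the string style used in the examples on this page.

```python
def frame_timestamp(frame_index, fps):
    """Timestamp of a video frame, assuming whole milliseconds from the start."""
    return (frame_index * 1000) // fps


def make_mask_entry(mask_id, content_type, uri, timestamp=0):
    """Build one `masks` entry with the four keys described in the table."""
    return {
        "id": mask_id,
        "content_type": content_type,
        "uri": uri,
        "timestamp": timestamp,  # 0 for still images
    }


# Second frame (index 1) of a 30 fps video; the URI is taken from Example 1
entry = make_mask_entry(
    mask_id="seg",
    content_type="video/mp4",
    uri="https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
    timestamp=frame_timestamp(1, 30),
)
print(entry["timestamp"])  # 33
```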
## Examples
### Example 1
This example demonstrates an original image, its masks, and its frame containing
the `sources` and ROI metadata.
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">Example 1: View the frame</summary>
<div className="cml-expansion-panel-content">
This frame contains the `masks` list of dictionaries in `sources`,
and the `rois` array, as well as several top-level key-value pairs.
```json
{
    "timestamp": 1234567889,
    "context_id": "car_1",
    "meta": {
        "velocity": "60"
    },
    "sources": [
        {
            "id": "front",
            "content_type": "video/mp4",
            "width": 800,
            "height": 600,
            "uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
            "timestamp": 1234567889,
            "meta": {
                "angle": 45,
                "fov": 129
            },
            "masks": [
                {
                    "id": "seg",
                    "content_type": "video/mp4",
                    "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
                    "timestamp": 123456789
                },
                {
                    "id": "seg_instance",
                    "content_type": "video/mp4",
                    "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
                    "timestamp": 123456789
                }
            ]
        }
    ],
    "rois": [
        {
            "sources": ["front"],
            "label": ["seg"],
            "mask": {
                "id": "car",
                "value": [210, 210, 120]
            }
        },
        {
            "sources": ["front"],
            "label": ["seg"],
            "mask": {
                "id": "person",
                "value": [147, 44, 209]
            }
        },
        {
            "sources": ["front"],
            "label": ["seg"],
            "mask": {
                "id": "road",
                "value": [197, 135, 146]
            }
        },
        {
            "sources": ["front"],
            "label": ["seg"],
            "mask": {
                "id": "street",
                "value": [135, 198, 145]
            }
        },
        {
            "sources": ["front"],
            "label": ["seg"],
            "mask": {
                "id": "building",
                "value": [72, 191, 65]
            }
        }
    ]
}
```
</div>
</details>
<br/>

* In `sources`:
  * The source ID is `front`.
  * In the `masks` list, the source contains mask sources with IDs of `seg` and `seg_instance`.
* In `rois`:
  * Each ROI source is `front`, relating the ROI to its original source image.
  * Each ROI has a label of `seg`, indicating segmentation.
  * Each `mask` has an `id` (`car`, `person`, `road`, `street`, and `building`) and a unique RGB `value`
    (color-coding).

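
The relationships listed above can be read back out of the frame with a few lines of Python. This is a minimal sketch
over a trimmed copy of the Example 1 frame, treated as a plain dictionary. The helper names `mask_sources_by_source`
and `mask_colors` are hypothetical, and the uniqueness check is an assumption based on the note that each mask has a
unique RGB value.

```python
# Trimmed copy of the Example 1 frame: one source, two mask sources, two ROIs
frame = {
    "sources": [
        {
            "id": "front",
            "masks": [
                {"id": "seg", "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4"},
                {"id": "seg_instance", "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4"},
            ],
        }
    ],
    "rois": [
        {"sources": ["front"], "label": ["seg"], "mask": {"id": "car", "value": [210, 210, 120]}},
        {"sources": ["front"], "label": ["seg"], "mask": {"id": "person", "value": [147, 44, 209]}},
    ],
}


def mask_sources_by_source(frame):
    """Map each source ID to the IDs of the mask sources listed under it."""
    return {
        source["id"]: [mask["id"] for mask in source.get("masks", [])]
        for source in frame.get("sources", [])
    }


def mask_colors(frame):
    """Map each ROI `mask` ID to its RGB value, rejecting colors shared by two IDs."""
    colors = {}
    for roi in frame.get("rois", []):
        mask = roi.get("mask")
        if mask is None:
            continue  # ROI without mask information
        rgb = tuple(mask["value"])
        if rgb in colors.values() and colors.get(mask["id"]) != rgb:
            raise ValueError(f"RGB value {rgb} is used by more than one mask")
        colors[mask["id"]] = rgb
    return colors


print(mask_sources_by_source(frame))
print(mask_colors(frame))
```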
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">Example image and masks</summary>
<div className="cml-expansion-panel-content">
Original Image
![image](../img/hyperdatasets/concepts_masks_image_only.png)
Mask image
![image](../img/hyperdatasets/concepts_masks.png)
</div>
</details>
<br/>
### Example 2
This example shows two masks for video from a camera. The masks label cars and the road.
<details className="cml-expansion-panel info">
<summary className="cml-expansion-panel-summary">Example 2: View the frame</summary>
<div className="cml-expansion-panel-content">
```json
{
    "sources": [
        {
            "id": "front",
            "content_type": "video/mp4",
            "width": 800,
            "height": 600,
            "uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
            "timestamp": 1234567889,
            "meta": {
                "angle": 45,
                "fov": 129
            },
            "masks": [
                {
                    "id": "car",
                    "content_type": "video/mp4",
                    "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
                    "timestamp": 123456789
                },
                {
                    "id": "road",
                    "content_type": "video/mp4",
                    "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
                    "timestamp": 123456789
                }
            ]
        }
    ],
    "rois": [
        {
            "sources": ["front"],
            "label": ["right_lane"],
            "mask": {
                "id": "car",
                "value": [210, 210, 120]
            }
        },
        {
            "sources": ["front"],
            "label": ["right_lane"],
            "mask": {
                "id": "road",
                "value": [197, 135, 146]
            }
        }
    ]
}
```
</div>
</details>
<br/>

* In `sources`:
  * The source ID is `front`.
  * The source contains mask sources with IDs of `car` and `road`.
* In `rois`:
  * Each ROI source is `front`, relating the ROI to its original source image.
  * Each ROI has a label of `right_lane`, indicating the ROI object.
  * Each `mask` has an `id` (`car`, `road`) and a unique RGB `value` (color-coding).

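
In this example the `mask` IDs in `rois` match the IDs of the mask sources, so the two can be joined. The sketch below
shows one way to do that over a trimmed copy of the Example 2 frame, treated as a plain dictionary. The
`resolve_roi_masks` helper is a hypothetical name, and joining on the pair (source ID, mask ID) is an assumption based
on the `id` description in the Masks Structure table.

```python
def resolve_roi_masks(frame):
    """Pair each ROI with the URI of the mask source that shares its `mask` ID."""
    # Index mask sources as (source ID, mask ID) -> URI
    uris = {
        (source["id"], mask["id"]): mask["uri"]
        for source in frame.get("sources", [])
        for mask in source.get("masks", [])
    }
    resolved = []
    for roi in frame.get("rois", []):
        mask = roi.get("mask")
        if mask is None:
            continue  # ROI without mask information
        for source_id in roi.get("sources", []):
            resolved.append({
                "labels": roi["label"],
                "rgb": tuple(mask["value"]),
                # None when the source has no mask source with this ID
                "uri": uris.get((source_id, mask["id"])),
            })
    return resolved


# Trimmed copy of the Example 2 frame
frame = {
    "sources": [
        {
            "id": "front",
            "masks": [
                {"id": "car", "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4"},
                {"id": "road", "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4"},
            ],
        }
    ],
    "rois": [
        {"sources": ["front"], "label": ["right_lane"], "mask": {"id": "car", "value": [210, 210, 120]}},
        {"sources": ["front"], "label": ["right_lane"], "mask": {"id": "road", "value": [197, 135, 146]}},
    ],
}

resolved = resolve_roi_masks(frame)
for item in resolved:
    print(item["labels"], item["rgb"], item["uri"])
```

An entry with a `uri` of `None` indicates a `mask` ID that has no matching mask source under the referenced source.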
## Usage
### Adding Mask Annotations
To add a mask annotation to a frame, use the [`SingleFrame.add_annotation`](../references/hyperdataset/singleframe.md#add_annotation)
method. This method is generally used to add ROI annotations, but it can also be used to add frame-specific mask labels.
Pass the mask's RGB values as a list in the `mask_rgb` parameter, and a list of labels in the `labels` parameter.

```python
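from allegroai import SingleFrame

# Create a frame from a local image, with a remote preview image
frame = SingleFrame(
    source='/home/user/woof_meow.jpg',
    preview_uri='https://storage.googleapis.com/kaggle-competitions/kaggle/3362/media/woof_meow.jpg',
)

# Associate the mask color RGB (0, 0, 0) with the label 'cat'
frame.add_annotation(mask_rgb=[0, 0, 0], labels=['cat'])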
```
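
When a mask uses several colors, the same call can be repeated once per color. The following is a minimal sketch
that relies only on the `mask_rgb` and `labels` parameters described above. The `add_mask_labels` helper and the
color-to-label mapping are hypothetical and are not part of the SDK.

```python
def add_mask_labels(frame, color_to_labels):
    """Call `add_annotation` once per mask color.

    `frame` is expected to be a SingleFrame; `color_to_labels` maps an
    RGB tuple to the list of labels for that color.
    """
    for rgb, labels in color_to_labels.items():
        frame.add_annotation(mask_rgb=list(rgb), labels=list(labels))


# Hypothetical colors and labels for the woof_meow.jpg frame
COLOR_TO_LABELS = {
    (0, 0, 0): ["cat"],
    (255, 255, 255): ["dog"],
}

# With the frame created in the previous snippet:
# add_mask_labels(frame, COLOR_TO_LABELS)
```

Keeping the colors and labels in one mapping makes it easier to reuse the same color-coding across all frames of a
dataset version.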