Edit hyperdataset examples (#895)

pollfly 2023-01-24 12:30:55 +02:00 committed by GitHub
parent 7d80406290
commit 3585eff49b
GPG Key ID: 4AEE18F83AFDEB23
4 changed files with 13 additions and 12 deletions


@@ -538,7 +538,7 @@ class Dataset(object):
         return removed
     def sync_folder(self, local_path, dataset_path=None, verbose=False):
-        # type: (Union[Path, _Path, str], Union[Path, _Path, str], bool) -> (int, int)
+        # type: (Union[Path, _Path, str], Union[Path, _Path, str], bool) -> (int, int, int)
         """
         Synchronize the dataset with a local folder. The dataset is synchronized from the
         relative_base_folder (default: dataset root) and deeper with the specified local path.
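The hunk above only changes the type comment: `sync_folder` is now documented as returning three integers rather than two. As a rough illustration of what a three-count folder sync can look like, here is a minimal standalone sketch. It is not ClearML's implementation: the helper names `file_hash` and `sync_counts`, the manifest format, and the `(removed, added, modified)` ordering are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def file_hash(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def sync_counts(manifest, local_path):
    """Compare a {relative_path: sha256} manifest with a local folder.

    Hypothetical helper: returns (removed, added, modified) and updates
    the manifest in place so that it mirrors the folder.
    """
    root = Path(local_path)
    # Hash every file under the folder, keyed by its POSIX-style relative path.
    local = {
        p.relative_to(root).as_posix(): file_hash(p)
        for p in root.rglob("*")
        if p.is_file()
    }
    removed = [name for name in manifest if name not in local]
    added = [name for name in local if name not in manifest]
    modified = [
        name for name in local
        if name in manifest and manifest[name] != local[name]
    ]
    manifest.clear()
    manifest.update(local)
    return len(removed), len(added), len(modified)
```

Calling `sync_counts(manifest, "some/folder")` a second time without touching the folder yields three zeros, since the manifest already matches the files on disk.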


@@ -1,15 +1,15 @@
 # ClearML HyperDatasets #
-Hyper-Datasets is a data management system thats designed for unstructured data such as text, audio, or visual data. It is part of the ClearML enterprise offering, which means it includes quite a few upgrades over the open source clearml-data.
+Hyper-Datasets is a data management system thats designed for unstructured data such as text, audio, or visual data. It is part of the ClearML Enterprise offering, which means it includes quite a few upgrades over the open source clearml-data.
-The main conceptual difference between the two is that Hyper-Datasets decouples the metadata from the raw data files. This allows you to manipulate the metadata in all kinds of ways, while abstracting away the logistics of having to deal with large amounts of data.
+The main conceptual difference between the two is that Hyper-Datasets decouple the metadata from the raw data files. This allows you to manipulate the metadata in all kinds of ways, while abstracting away the logistics of having to deal with large amounts of data.
-To leverage Hyper-Datasets power, users define Dataviews which are sophisticated queries connecting specific data from one or more datasets to an experiment in the Experiment Manager. Essentially it creates and manages local views of remote Datasets.
+To leverage Hyper-Datasets power, users define Dataviews, which are sophisticated queries connecting specific data from one or more datasets to an experiment in the Experiment Manager. Essentially it creates and manages local views of remote Datasets.
 ![Dataview in the UI](../../docs/screenshots/hpd.png)
 ## Examples Overview ##
-- Hyperdataset registration into ClearML Enterprise
-- Hypderdataset usage exmaples, retrieving frames using the Dataview Class and connecting to pytorch dataloader
+- Hyper-Dataset registration into ClearML Enterprise
+- Hypder-Dataset usage examples, retrieving frames using the DataView Class and connecting to pytorch dataloader
 ## Further Resources ##


@@ -1,13 +1,14 @@
 """
-How to register data with ROIs and metadata from a json file.
-Create a list of ROI's for each image in the metadata format required by a frame.
+How to register data with masks from a json file.
+Create a list of masks for each image and add to a DatasetVersion.
+Define DatasetVersion-level mask-label mapping, which maps RGB values from the mask to class labels.
 Notice: This is a custom parser for a specific dataset. Each dataset requires a different parser.
 You can run this example from this dir with:
-python registration_with_roi_and_meta.py
---path data/sample_ds --ext jpg --ds_name my_uploaded_dataset --version_name my_version
+python register_dataset_masks.py
+--ext jpg --ds_name my_uploaded_dataset --version_name my_version
 """
 import glob
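The new docstring describes a DatasetVersion-level mask-label mapping that maps RGB values in a mask to class labels. The idea can be sketched in plain Python as below. This does not use the ClearML Enterprise API: the `MASK_LABELS` table, its colors and labels, and the `labels_in_mask` helper are all hypothetical and exist only to illustrate the lookup.

```python
# Hypothetical mapping for illustration only; real RGB values and labels
# depend on how the dataset's masks were produced.
MASK_LABELS = {
    (0, 0, 0): ["background"],
    (255, 0, 0): ["person"],
    (0, 255, 0): ["car"],
}


def labels_in_mask(pixels, mapping):
    """Return the sorted class labels present in a mask.

    `pixels` is a list of rows, each row a list of (R, G, B) tuples.
    Raises KeyError if a pixel color has no entry in `mapping`.
    """
    found = set()
    for row in pixels:
        for rgb in row:
            key = tuple(rgb)
            if key not in mapping:
                raise KeyError("no label defined for RGB value %r" % (key,))
            found.update(mapping[key])
    return sorted(found)


# A tiny 2x2 mask containing only background and "person" pixels.
mask = [
    [(0, 0, 0), (255, 0, 0)],
    [(255, 0, 0), (0, 0, 0)],
]
print(labels_in_mask(mask, MASK_LABELS))
```

Keeping the mapping at the version level, as the docstring describes, means each mask only has to store colors, and the label names can be changed in one place.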


@@ -6,8 +6,8 @@ Notice: This is a custom parser for a specific dataset. Each dataset requires a
 You can run this example from this dir with:
-python registration_with_roi_and_meta.py
---path data/sample_ds --ext jpg --ds_name my_uploaded_dataset --version_name my_version
+python register_dataset_with_roi.py
+--ext jpg --ds_name my_uploaded_dataset --version_name my_version
 """
 import glob