mirror of
https://github.com/clearml/clearml-docs
synced 2025-06-26 18:17:44 +00:00
Add Hyper-Datasets
34
docs/hyperdatasets/overview.md
Normal file
@@ -0,0 +1,34 @@
---
title: Hyper Datasets
---

ClearML's Hyper Datasets are an MLOps-oriented abstraction of your data that facilitates traceable, reproducible model development
through parametrized data access and metadata version control.

The basic premise is that a user-formed query is a full representation of the dataset used by the ML/DL process.

ClearML Enterprise's Hyper Datasets support rapid prototyping, creating new opportunities such as:
* Hyperparameter optimization of the data itself
* QA/QC pipelining
* CD/CT (continuous training) during deployment
* Complex applications such as collaborative (federated) learning


## Hyper Dataset Components

A Hyper Dataset is composed of the following components:

* [Frames](frames.md)
* [SingleFrames](single_frames.md)
* [FrameGroups](frame_groups.md)
* [Datasets and Dataset Versions](dataset.md)
* [Dataviews](dataviews.md)

These components interact so that data can be revised while every version of it remains tracked and accessible.

Frames are the basic units of data in ClearML Enterprise. SingleFrames and FrameGroups make up a Dataset version.
Dataset versions can be created, modified, and removed. The different versions are recorded and remain available,
so experiments and their data are reproducible and traceable.

Lastly, Dataviews manage views of a Dataset through queries, so the input data for an experiment can be defined as a
subset of a Dataset or a combination of Datasets.

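The relationship between frames, Dataset versions, and Dataviews can be illustrated with a minimal plain-Python sketch. This is a toy model, not the ClearML Enterprise SDK: the class names `Frame`, `DatasetVersion`, `Query`, and `Dataview`, their fields, and the `add_query` / `resolve` methods are all hypothetical and exist only to show how a stored query, rather than a copy of the data, can stand in for the dataset an experiment consumes.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical toy model for illustration only; NOT the ClearML Enterprise API.


@dataclass(frozen=True)
class Frame:
    source: str           # location of the underlying data (assumed field)
    labels: frozenset     # annotation labels attached to the frame (assumed field)


@dataclass
class DatasetVersion:
    dataset: str
    name: str
    frames: list = field(default_factory=list)


@dataclass(frozen=True)
class Query:
    dataset: str
    version: str
    label: Optional[str] = None   # None matches every frame in the version


class Dataview:
    """Holds queries only; frames are looked up when the view is resolved."""

    def __init__(self):
        self.queries = []

    def add_query(self, dataset, version, label=None):
        self.queries.append(Query(dataset, version, label))

    def resolve(self, versions):
        """Return the frames matched by each query, in query order.

        `versions` maps (dataset name, version name) to a DatasetVersion.
        """
        selected = []
        for q in self.queries:
            version = versions[(q.dataset, q.version)]
            selected.extend(
                f for f in version.frames
                if q.label is None or q.label in f.labels
            )
        return selected


# Two recorded versions of the same dataset; v2 adds one frame to v1.
v1 = DatasetVersion("street_scenes", "v1", [
    Frame("a.jpg", frozenset({"car"})),
    Frame("b.jpg", frozenset({"person"})),
])
v2 = DatasetVersion("street_scenes", "v2", v1.frames + [
    Frame("c.jpg", frozenset({"car", "person"})),
])
versions = {(v.dataset, v.name): v for v in (v1, v2)}

# The same kind of query, pinned to different versions, selects different inputs.
view_v1 = Dataview()
view_v1.add_query("street_scenes", "v1", label="car")

view_v2 = Dataview()
view_v2.add_query("street_scenes", "v2", label="car")

sources_v1 = [f.source for f in view_v1.resolve(versions)]
sources_v2 = [f.source for f in view_v2.resolve(versions)]
```

Because each view stores only the query and the version it is pinned to, rerunning an experiment against `view_v1` selects the same frames even after `v2` has been created, which is the reproducibility property described above.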