Add Dataset alias explanation (#449)

2025-06-26 18:17:44 +00:00 · 2023-01-22 14:46:30 +02:00
parent 9fea8f1e33
commit 6df1cd9561
3 changed files with 24 additions and 3 deletions
--- a/docs/clearml_data/data_management_examples/data_man_cifar_classification.md
+++ b/docs/clearml_data/data_management_examples/data_man_cifar_classification.md
@@ -85,7 +85,8 @@ from clearml import Dataset

 dataset_path = Dataset.get(
    dataset_name=dataset_name, 
-    dataset_project=dataset_project
+    dataset_project=dataset_project,
+    alias="Cifar dataset"
 ).get_local_copy()

 trainset = datasets.CIFAR10(
@@ -95,7 +96,13 @@ trainset = datasets.CIFAR10(
    transform=transform
 )
 ```
+
+When a task uses a dataset, as in this example, you can store the dataset's ID in the task's 
+hyperparameters. Passing `alias=<dataset_alias_string>` stores the dataset's ID in the 
+`dataset_alias_string` parameter in the experiment's **CONFIGURATION > HYPERPARAMETERS > Datasets** section, so 
+you can easily track which dataset the task is using. 
+
 The Dataset's [`get_local_copy`](../../references/sdk/dataset.md#get_local_copy) method will return a path to the cached, 
-downloaded dataset. Then we provide the path to Pytorch's dataset object.
+downloaded dataset. Then we provide the path to PyTorch's dataset object.

 The script then trains a neural network to classify images using the dataset created above.
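
Note on the change above: a minimal sketch of how the `alias` argument could be made optional in a script like this one. The helper names `dataset_get_kwargs` and `fetch_local_copy` are hypothetical and not part of ClearML; only `Dataset.get` (with `dataset_name`, `dataset_project`, `alias`) and `get_local_copy` come from the diff. It assumes a ClearML version whose `Dataset.get` accepts `alias`.

```python
from typing import Optional


def dataset_get_kwargs(name: str, project: str, alias: Optional[str] = None) -> dict:
    """Build keyword arguments for Dataset.get (hypothetical helper, not a ClearML API)."""
    kwargs = {"dataset_name": name, "dataset_project": project}
    if alias is not None:
        # Per the docs text in this commit, the alias becomes the name of the
        # parameter that holds the dataset ID in the task's Datasets section.
        kwargs["alias"] = alias
    return kwargs


def fetch_local_copy(name: str, project: str, alias: Optional[str] = None) -> str:
    """Return the path to a cached local copy of the dataset."""
    # Imported lazily so the helper above can be used without ClearML installed.
    from clearml import Dataset

    return Dataset.get(**dataset_get_kwargs(name, project, alias)).get_local_copy()
```

Calling `fetch_local_copy(dataset_name, dataset_project, alias="Cifar dataset")` would then be equivalent to the `Dataset.get(...).get_local_copy()` call in the first hunk, while omitting `alias` reproduces the call as it was before this commit.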