mirror of https://github.com/clearml/clearml-docs (synced 2025-06-26 18:17:44 +00:00)
Small edits (#689)
@@ -34,7 +34,7 @@ most recent dataset in a project. The same is true with tags; if a tag is specif
 In cases where you use a dataset in a task (e.g. consuming a dataset), you can easily track which dataset the task is
 using by using `Dataset.get`'s `alias` parameter. Pass `alias=<dataset_alias_string>`, and the task using the dataset
-will store the dataset’s ID in the `dataset_alias_string` parameter under the task's **CONFIGURATION > HYPERPARAMETERS >
+will store the dataset's ID in the `dataset_alias_string` parameter under the task's **CONFIGURATION > HYPERPARAMETERS >
 Datasets** section.
@@ -20,7 +20,7 @@ ClearML Data Management solves two important challenges:
 Moreover, it can be difficult and inefficient to find on a git tree the commit associated with a certain version of a dataset.
 
 Use ClearML Data to create, manage, and version your datasets. Store your files in any storage location of your choice
-(S3 / GS / Azure / Network Storage) by setting the dataset’s upload destination (see [`--storage`](clearml_data_cli.md#upload)
+(S3 / GS / Azure / Network Storage) by setting the dataset's upload destination (see [`--storage`](clearml_data_cli.md#upload)
 CLI option or [`output_url`](clearml_data_sdk.md#uploading-files) parameter).
 
 Datasets can be set up to inherit from other datasets, so data lineages can be created, and users can track when and how
@@ -8,7 +8,7 @@ See [Hyper-Datasets](../hyperdatasets/overview.md) for ClearML's advanced querya
 :::
 
 Datasets can be created, modified, and managed with ClearML Data's python interface. You can upload your dataset to any
-storage service of your choice (S3 / GS / Azure / Network Storage) by setting the dataset’s upload destination (see
+storage service of your choice (S3 / GS / Azure / Network Storage) by setting the dataset's upload destination (see
 [`output_url`](#uploading-files) parameter of `Dataset.upload()`). Once you have uploaded your dataset, you can access
 it from any machine.
@@ -97,8 +97,8 @@ trainset = datasets.CIFAR10(
 )
 ```
 
-In cases like this, where you use a dataset in a task, you can have the dataset's ID stored in the task’s
-hyperparameters. Passing `alias=<dataset_alias_string>` stores the dataset’s ID in the
+In cases like this, where you use a dataset in a task, you can have the dataset's ID stored in the task's
+hyperparameters. Passing `alias=<dataset_alias_string>` stores the dataset's ID in the
 `dataset_alias_string` parameter in the experiment's **CONFIGURATION > HYPERPARAMETERS > Datasets** section. This way
 you can easily track which dataset the task is using.
@@ -118,7 +118,7 @@ You'll need to input the Dataset ID you received when created the dataset above
 ```bash
 clearml-data add --files new_data.txt
 ```
-Which should return this output:
+The console should display this output:
 
 ```console
 clearml-data - Dataset Management & Versioning CLI
 ```