Small edits (#476)

pollfly
2023-02-16 12:17:53 +02:00
committed by GitHub
parent 5458f8036b
commit 2cf096f7ec
27 changed files with 64 additions and 64 deletions


@@ -2,13 +2,13 @@
title: Dataset Management with CLI and SDK
---
-In this tutorial, we are going to manage the CIFAR dataset with `clearml-data` CLI, and then use ClearML's [`Dataset`](../../references/sdk/dataset.md)
+In this tutorial, you are going to manage the CIFAR dataset with `clearml-data` CLI, and then use ClearML's [`Dataset`](../../references/sdk/dataset.md)
class to ingest the data.
## Creating the Dataset
### Downloading the Data
-Before we can register the CIFAR dataset with `clearml-data`, we need to obtain a local copy of it.
+Before registering the CIFAR dataset with `clearml-data`, you need to obtain a local copy of it.
Execute this python script to download the data
```python
@@ -43,7 +43,7 @@ New dataset created id=ee1c35f60f384e65bc800f42f0aca5ec
Where `ee1c35f60f384e65bc800f42f0aca5ec` is the dataset ID.
## Adding Files
-Add the files we just downloaded to the dataset:
+Add the files that were just downloaded to the dataset:
```
clearml-data add --files <dataset_path>
@@ -72,7 +72,7 @@ In the panel's **CONTENT** tab, you can see a table summarizing version contents
## Using the Dataset
-Now that we have a new dataset registered, we can consume it.
+Now that a new dataset is registered, you can consume it.
The [data_ingestion.py](https://github.com/allegroai/clearml/blob/master/examples/datasets/data_ingestion.py) example
script demonstrates using the dataset within Python code.
@@ -103,6 +103,6 @@ hyperparameters. Passing `alias=<dataset_alias_string>` stores the datasets I
you can easily track which dataset the task is using.
The Dataset's [`get_local_copy`](../../references/sdk/dataset.md#get_local_copy) method will return a path to the cached,
-downloaded dataset. Then we provide the path to PyTorch's dataset object.
+downloaded dataset. Then the dataset path is input to PyTorch's `datasets` object.
The script then trains a neural network to classify images using the dataset created above.
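
For readers of this hunk, the ingestion step it describes can be sketched roughly as follows. This is a minimal illustration, not the contents of `data_ingestion.py`: the helper names `local_dataset_root` and `load_cifar_trainset` are hypothetical, and the project and dataset names are whatever you registered earlier. It assumes `clearml` and `torchvision` are installed and that the dataset holds the extracted CIFAR-10 files.

```python
from pathlib import Path


def local_dataset_root(dataset) -> Path:
    """Return the cached local folder of a dataset as a Path.

    `dataset` is any object exposing `get_local_copy()`, such as the one
    returned by `clearml.Dataset.get(...)`. (Hypothetical helper.)
    """
    root = Path(dataset.get_local_copy())
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset copy not found at {root}")
    return root


def load_cifar_trainset(project: str, name: str):
    """Fetch a registered dataset and hand its path to torchvision.

    Hypothetical helper; `project` and `name` must match the dataset
    you created with `clearml-data`.
    """
    # Imported lazily so the helper above works without these packages.
    from clearml import Dataset
    from torchvision import datasets

    # `alias` stores the dataset ID under that name in the task's configuration.
    cifar = Dataset.get(dataset_project=project, dataset_name=name, alias=name)
    root = local_dataset_root(cifar)

    # download=False: the files are expected to already be present in `root`.
    return datasets.CIFAR10(root=str(root), train=True, download=False)
```

Keeping the path lookup separate from the ClearML call makes it possible to check the path handling without a ClearML server.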


@@ -18,7 +18,7 @@ demonstrates how to do the following:
### Downloading the Data
-We first need to obtain a local copy of the CIFAR dataset.
+You first need to obtain a local copy of the CIFAR dataset.
```python
from clearml import StorageManager
@@ -79,7 +79,7 @@ In the panel's **CONTENT** tab, you can see a table summarizing version contents
## Data Ingestion
-Now that we have a new dataset registered, we can consume it!
+Now that a new dataset is registered, you can consume it!
The [data_ingestion.py](https://github.com/allegroai/clearml/blob/master/examples/datasets/data_ingestion.py) script
demonstrates data ingestion using the dataset created in the first script.