Mirror of https://github.com/clearml/clearml-docs (synced 2025-03-03 18:53:37 +00:00)
Edit ClearML Data CLI page (#248)
parent 88b53fa5c9
commit 24f42c6026
@@ -16,7 +16,8 @@ The following page provides a reference to `clearml-data`'s CLI commands.
 Creates a new dataset.
 
 ```bash
-clearml-data create [-h] [--parents [PARENTS [PARENTS ...]]] [--project PROJECT] --name NAME [--tags [TAGS [TAGS ...]]]
+clearml-data create [-h] [--parents [PARENTS [PARENTS ...]]] [--project PROJECT]
+--name NAME [--tags [TAGS [TAGS ...]]]
 ```
 
 **Parameters**
@@ -75,7 +76,8 @@ clearml-data add [-h] [--id ID] [--dataset-folder DATASET_FOLDER]
 Remove files/links from the dataset.
 
 ```bash
-clearml-data remove [-h] [--id ID] [--files [FILES [FILES ...]]] [--non-recursive] [--verbose]
+clearml-data remove [-h] [--id ID] [--files [FILES [FILES ...]]]
+[--non-recursive] [--verbose]
 ```
 
 **Parameters**
@@ -99,7 +101,8 @@ Upload the local dataset changes to the server. By default, it's uploaded to the
 medium by entering an upload destination, such as `s3://bucket`, `gs://`, `azure://`, `/mnt/shared/`.
 
 ```bash
-clearml-data upload [--id <dataset_id>] [--storage <upload_destination>]
+clearml-data upload [-h] [--id ID] [--storage STORAGE] [--chunk-size CHUNK_SIZE]
+[--verbose]
 ```
 
 
@@ -111,6 +114,7 @@ clearml-data upload [--id <dataset_id>] [--storage <upload_destination>]
 |---|---|---|
 |`--id`| Dataset's ID. Default: previously created / accessed dataset| <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
 |`--storage`| Remote storage to use for the dataset files. Default: files_server | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+|`--chunk-size`| Set dataset artifact upload chunk size in MB. Default 512, (pass -1 for a single chunk). Example: 512, dataset will be split and uploaded in 512 MB chunks. | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--verbose` | Verbose reporting | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 
 </div>
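The `--chunk-size` row added here describes splitting an upload into chunks of a given size in MB, with `-1` meaning a single chunk. As a rough illustration of that arithmetic only, the sketch below estimates a chunk count by ceiling division. The `estimate_chunks` function is hypothetical and is not part of `clearml-data`; the actual number of chunks ClearML produces may differ, for example if whole files are kept together.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of clearml-data): estimate how many chunks
# an upload might produce for a total size and a --chunk-size, both in MB.
estimate_chunks() {
  local total_mb=$1
  local chunk_mb=${2:-512}   # 512 MB is the documented default
  if [ "$chunk_mb" -eq -1 ]; then
    echo 1                   # -1 is documented as "a single chunk"
    return
  fi
  # Ceiling division: a partial final chunk still counts as a chunk.
  echo $(( (total_mb + chunk_mb - 1) / chunk_mb ))
}

estimate_chunks 1300        # default chunk size, prints 3
estimate_chunks 1300 256    # prints 6
estimate_chunks 1300 -1     # prints 1
```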
@@ -123,7 +127,8 @@ Finalize the dataset and makes it ready to be consumed. This automatically uploa
 Once a dataset is finalized, it can no longer be modified.
 
 ```bash
-clearml-data close --id <dataset_id>
+clearml-data close [-h] [--id ID] [--storage STORAGE] [--disable-upload]
+[--chunk-size CHUNK_SIZE] [--verbose]
 ```
 
 **Parameters**
@@ -135,6 +140,7 @@ clearml-data close --id <dataset_id>
 |`--id`| Dataset's ID. Default: previously created / accessed dataset| <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
 |`--storage`| Remote storage to use for the dataset files. Default: files_server | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
 |`--disable-upload` | Disable automatic upload when closing the dataset | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+|`--chunk-size`| Set dataset artifact upload chunk size in MB. Default 512, (pass -1 for a single chunk). Example: 512, dataset will be split and uploaded in 512 MB chunks. | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--verbose` | Verbose reporting | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 
 </div>
@@ -152,7 +158,10 @@ and the changes (either file addition, modification and removal) will be reflect
 This command also uploads the data and finalizes the dataset automatically.
 
 ```bash
-clearml-data sync [--id <dataset_id] --folder <folder_location> [--parents '<parent_id>']
+clearml-data sync [-h] [--id ID] [--dataset-folder DATASET_FOLDER] --folder FOLDER
+[--parents [PARENTS [PARENTS ...]]] [--project PROJECT] [--name NAME]
+[--tags [TAGS [TAGS ...]]] [--storage STORAGE] [--skip-close]
+[--chunk-size CHUNK_SIZE] [--verbose]
 ```
 
 **Parameters**
@@ -162,6 +171,7 @@ clearml-data sync [--id <dataset_id] --folder <folder_location> [--parents '<pa
 |Name|Description|Optional|
 |---|---|---|
 |`--id`| Dataset's ID. Default: previously created / accessed dataset| <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+|`--dataset-folder`|Dataset base folder to add the files to (default: Dataset root)|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--folder`|Local folder to sync. Wildcard selection is supported, for example: `~/data/*.jpg ~/data/json`|<img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" />|
 |`--storage`|Remote storage to use for the dataset files. Default: files_server |<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--parents`|IDs of the dataset's parents (i.e. merge all parents). All modifications made to the folder since the parents were synced will be reflected in the dataset|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
@@ -169,6 +179,7 @@ clearml-data sync [--id <dataset_id] --folder <folder_location> [--parents '<pa
 |`--name`|If creating a new dataset, specify the dataset's name|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--tags`|Dataset user tags|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--skip-close`|Do not auto close dataset after syncing folders|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
+|`--chunk-size`| Set dataset artifact upload chunk size in MB. Default 512, (pass -1 for a single chunk). Example: 512, dataset will be split and uploaded in 512 MB chunks. |<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--verbose` | Verbose reporting |<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 
 </div>
@@ -180,7 +191,8 @@ clearml-data sync [--id <dataset_id] --folder <folder_location> [--parents '<pa
 List a dataset's contents.
 
 ```bash
-clearml-data list [--id <dataset_id>]
+clearml-data list [-h] [--id ID] [--project PROJECT] [--name NAME]
+[--filter [FILTER [FILTER ...]]] [--modified]
 ```
 
 **Parameters**
@@ -205,8 +217,8 @@ Delete an entire dataset from ClearML. This can also be used to delete a newly c
 
 This does not work on datasets with children.
 
-```
-clearml-data delete [--id <dataset_id_to_delete>]
+```bash
+clearml-data delete [-h] [--id ID] [--force]
 ```
 
 
@@ -230,7 +242,8 @@ Search datasets in the system by project, name, ID, and/or tags.
 Returns list of all datasets in the system that match the search request, sorted by creation time.
 
 ```bash
-clearml-data search [--name <name>] [--ids [IDS [IDS ...]]] [--project <project_name>] [--tags <tag>]
+clearml-data search [-h] [--ids [IDS [IDS ...]]] [--project PROJECT]
+[--name NAME] [--tags [TAGS [TAGS ...]]]
 ```
 
 **Parameters**
@@ -255,7 +268,7 @@ Compare two datasets (target vs. source). The command returns a comparison summa
 `Comparison summary: 4 files removed, 3 files modified, 0 files added`
 
 ```bash
-clearml-data compare [--source SOURCE] [--target TARGET]
+clearml-data compare [-h] --source SOURCE --target TARGET [--verbose]
 ```
 
 
@@ -276,7 +289,7 @@ clearml-data compare [--source SOURCE] [--target TARGET]
 Squash multiple datasets into a single dataset version (merge down).
 
 ```bash
-clearml-data squash --name NAME --ids [IDS [IDS ...]]
+clearml-data squash [-h] --name NAME --ids [IDS [IDS ...]] [--storage STORAGE] [--verbose]
 ```
 
 **Parameters**
@@ -297,7 +310,7 @@ clearml-data squash --name NAME --ids [IDS [IDS ...]]
 Verify that the dataset content matches the data from the local source.
 
 ```bash
-clearml-data verify [--id ID] [--folder FOLDER]
+clearml-data verify [-h] [--id ID] [--folder FOLDER] [--filesize] [--verbose]
 ```
 
 **Parameters**
@@ -319,7 +332,8 @@ Get a local copy of a dataset. By default, you get a read only cached folder, bu
 `--copy` flag.
 
 ```bash
-clearml-data get [--id ID] [--copy COPY] [--link LINK] [--overwrite]
+clearml-data get [-h] [--id ID] [--copy COPY] [--link LINK] [--part PART]
+[--num-parts NUM_PARTS] [--overwrite] [--verbose]
 ```
 
 **Parameters**
@@ -331,8 +345,10 @@ clearml-data get [--id ID] [--copy COPY] [--link LINK] [--overwrite]
 |`--id`| Specify dataset ID. Default: previously created / accessed dataset|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--copy`| Get a writable copy of the dataset to a specific output folder|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--link`| Create a soft link (not supported on Windows) to a read-only cached folder containing the dataset|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
+|`--part`|Retrieve a partial copy of the dataset. Part number (0 to `--num-parts`-1) of total parts `--num-parts`.|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
+|`--num-parts`|Total number of parts to divide the dataset into. Notice, minimum retrieved part is a single chunk in a dataset (or its parents). Example: Dataset gen4, with 3 parents, each with a single chunk, can be divided into 4 parts |<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 |`--overwrite`| If `True`, overwrite the target folder|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
-|`--verbose`| Verbose report all file changes (instead of summary)|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
+|`--verbose`| Verbose report all file changes (instead of summary)| <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
 
 </div>
 
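The new `--part` and `--num-parts` rows describe zero-based part numbers, from 0 to `--num-parts` minus 1. A minimal sketch of how those indices line up is shown below: it only prints one `clearml-data get` command per part, using flags taken from the `get` usage line in this commit, and does not execute anything. The `print_part_commands` function and the `<dataset_id>` placeholder are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of clearml-data): print one `clearml-data get`
# command per part, using only flags that appear in the `get` usage line.
print_part_commands() {
  local dataset_id=$1
  local num_parts=$2
  local part
  # Part numbers are zero-based: 0 .. num_parts-1.
  for (( part = 0; part < num_parts; part++ )); do
    echo "clearml-data get --id ${dataset_id} --part ${part} --num-parts ${num_parts}"
  done
}

# "<dataset_id>" is a placeholder; the commands are printed, not executed.
print_part_commands "<dataset_id>" 4
```

Each printed command could then be run on a different worker, so that every worker retrieves a different part, assuming the dataset has at least as many chunks as requested parts.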
@@ -341,7 +357,7 @@ clearml-data get [--id ID] [--copy COPY] [--link LINK] [--overwrite]
 Publish the dataset for public use. The dataset must be [finalized](#close) before it is published.
 
 ```bash
-clearml-data publish --id ID
+clearml-data publish [-h] --id ID
 ```
 
 **Parameters**
@@ -350,6 +366,6 @@ clearml-data publish --id ID
 
 |Name|Description|Optional|
 |---|---|---|
-|`--id`| The dataset task ID to be published.|<img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" />|
+|`--id`| The dataset task ID to be published.| <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" />|
 
 </div>