Update docs for ClearML server 1.6 (#280)

pollfly
2022-06-30 20:16:13 +03:00
committed by GitHub
parent 92a4826dcb
commit b2af25b52d
312 changed files with 209 additions and 75 deletions

@@ -26,6 +26,9 @@ Dataset changes are stored using differentiable storage, meaning a version will
Local copies of datasets are always cached, so the same data never needs to be downloaded twice.
When a dataset is pulled it will automatically pull all parent datasets and merge them into one output folder for you to work with.
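As a rough mental model of the two behaviors described above (a version storing only its own changes, and a pull merging all parent datasets into one folder), the following toy sketch may help. It is not ClearML's implementation: `ToyVersion` and `resolve` are hypothetical names, and files are represented as plain path-to-hash mappings rather than real files.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToyVersion:
    """Hypothetical stand-in for a dataset version: it stores only its own changes."""
    name: str
    changed: Dict[str, str] = field(default_factory=dict)  # path -> hash, added or modified here
    removed: Set[str] = field(default_factory=set)          # paths deleted relative to the parents
    parents: List["ToyVersion"] = field(default_factory=list)


def resolve(version: ToyVersion) -> Dict[str, str]:
    """Merge every parent's resolved content, then apply this version's own changes."""
    merged: Dict[str, str] = {}
    for parent in version.parents:
        merged.update(resolve(parent))
    for path in version.removed:
        merged.pop(path, None)
    merged.update(version.changed)
    return merged


base = ToyVersion("v1", changed={"a.csv": "h1", "b.csv": "h2"})
child = ToyVersion(
    "v2",
    changed={"b.csv": "h3", "c.csv": "h4"},  # b.csv modified, c.csv added
    removed={"a.csv"},                        # a.csv dropped in this version
    parents=[base],
)
print(sorted(resolve(child).items()))  # [('b.csv', 'h3'), ('c.csv', 'h4')]
```

In this sketch the child version records three changes only, yet resolving it yields the full merged file listing, which is the idea behind pulling a dataset together with its parents.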
The [Dataset Versions](../webapp/pipelines/webapp_pipeline_viewing.md) page in the web UI displays dataset versions'
lineage and content information. See [dataset UI](../webapp/datasets/webapp_dataset_page.md) for more details.
## Setup
`clearml-data` comes built-in with the `clearml` python package! Just check out the [Getting Started](../getting_started/ds/ds_first_steps.md)
@@ -39,59 +42,15 @@ ClearML Data offers two interfaces:
For an overview of our recommendations for ClearML Data workflows and practices, see [Best Practices](best_practices.md).
## WebApp
ClearML's WebApp provides a visual interface to your datasets through dataset tasks. Dataset tasks are categorized
as data-processing [task type](../fundamentals/task.md#task-types), and they are labeled with a `DATASET` system tag.
A full log (calls / CLI) of the dataset creation process can be found in a dataset's **EXECUTION** section.
A listing of the dataset differential snapshot, a summary of files added / modified / removed, and details of the files in the
differential snapshot (location / size / hash) are available in the **ARTIFACTS** section. Download the dataset
by clicking <img src="/docs/latest/icons/ico-download-json.svg" alt="Download" className="icon size-sm space-sm" />,
next to the **FILE PATH**.
The full dataset listing (all files included) is available in the **CONFIGURATION** section under **Dataset Content**.
This allows you to quickly compare the contents of two datasets and see the differences.
The dataset genealogy DAG and change-set summary table are visualized in **PLOTS**.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">Dataset Contents</summary>
<div className="cml-expansion-panel-content">
![Dataset data WebApp](../img/dataset_data.png)
</div>
</details>
<br/>
View a DAG of the dataset dependencies (all previous dataset versions and their parents) in the dataset's page **> ARTIFACTS > state**.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">Data Dependency DAG</summary>
<div className="cml-expansion-panel-content">
![Dataset state WebApp](../img/dataset_data_state.png)
</div>
</details>
Once a dataset has been finalized, view its genealogy in the dataset's
page **>** **PLOTS**.
<details className="cml-expansion-panel screenshot">
<summary className="cml-expansion-panel-summary">Dataset Genealogy</summary>
<div className="cml-expansion-panel-content">
![Dataset genealogy and summary](../img/dataset_genealogy_summary.png)
</div>
</details>
## Dataset Version States
The following table displays the possible states for a dataset version.
| State | Description |
|---|---|
|*Uploading*| Dataset creation is in progress |
|*Failed*| Dataset creation was terminated with an error|
|*Aborted*| Dataset creation was aborted by the user before it was finalized |
|*Final*| A dataset was created and finalized successfully |
|*Published*| The dataset is read-only. Publish a dataset to prevent changes to it |

@@ -7,6 +7,11 @@ This page covers `clearml-data`, ClearML's file-based data management solution.
See [Hyper-Datasets](../hyperdatasets/overview.md) for ClearML's advanced queryable dataset management solution.
:::
:::tip version compatibility
To use the WebApp's [Dataset pages](../webapp/datasets/webapp_dataset_page.md), you must use `clearml` and
`clearml-server` versions 1.6+.
:::
The `clearml-data` utility is a CLI tool for controlling and managing your data with ClearML.
The following page provides a reference to `clearml-data`'s CLI commands.
@@ -38,9 +43,9 @@ clearml-data create [-h] [--parents [PARENTS [PARENTS ...]]] [--project PROJECT]
:::tip Dataset ID
* To locate a dataset's ID, go to the dataset task's info panel in the [WebApp](../webapp/webapp_exp_track_visual.md). At the top of the panel,
to the right of the dataset task name, click `ID` and the dataset ID appears.
* To locate a dataset's ID, go to the dataset version's info panel in the [Dataset UI](../webapp/datasets/webapp_dataset_viewing.md),
where the ID is listed. If using `clearml` or `clearml-server` versions older than 1.6, go to the [dataset task's info
panel](../webapp/webapp_exp_track_visual.md), where the ID is displayed in the task header.
* `clearml-data` works in a stateful mode, so once a new dataset is created, the commands that follow
do not require the `--id` flag.
:::

@@ -7,6 +7,11 @@ This page covers `clearml-data`, ClearML's file-based data management solution.
See [Hyper-Datasets](../hyperdatasets/overview.md) for ClearML's advanced queryable dataset management solution.
:::
:::tip version compatibility
To use the WebApp's [Dataset pages](../webapp/datasets/webapp_dataset_page.md), you must use `clearml` and
`clearml-server` versions 1.6+.
:::
Datasets can be created, modified, and managed with ClearML Data's Python interface. The following page provides an overview
of the most basic methods of the `Dataset` class. See the [Dataset reference page](../references/sdk/dataset.md)
for a complete list of available methods.
@@ -43,8 +48,9 @@ dataset = Dataset.create(
```
:::tip Locating Dataset ID
To locate a dataset's ID, go to the dataset task's info panel in the [WebApp](../webapp/webapp_overview.md). At the top of the panel,
to the right of the dataset task name, click `ID` and the dataset ID appears.
To locate a dataset's ID, go to the dataset version's info panel in the [Dataset UI](../webapp/datasets/webapp_dataset_viewing.md),
where the ID is listed. If using `clearml` or `clearml-server` versions older than 1.6, go to the [dataset task's info
panel](../webapp/webapp_exp_track_visual.md), where the ID is displayed in the task header.
:::
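The ID can also be read programmatically. The sketch below assumes a dataset identified by project and name already exists on a configured ClearML server. `find_dataset_id` and its `dataset_cls` parameter are hypothetical helpers introduced here for illustration; only `Dataset.get()` and the `id` property come from the SDK.

```python
def find_dataset_id(project, name, dataset_cls=None):
    """Return the ID of the dataset matching the given project and name."""
    if dataset_cls is None:
        # Imported lazily so the helper can be exercised without a configured server
        from clearml import Dataset as dataset_cls
    dataset = dataset_cls.get(dataset_project=project, dataset_name=name)
    return dataset.id


# Example call (the project and dataset names are placeholders):
# print(find_dataset_id("dataset_examples", "my_dataset"))
```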
Use the `output_uri` parameter to specify a network storage target to upload the dataset files, and associated information