Refactor ClearML Data docs (#108)

pollfly
2021-11-08 13:21:44 +02:00
committed by GitHub
parent 43751dc64b
commit e155c49cfd
17 changed files with 847 additions and 683 deletions

@@ -29,9 +29,9 @@ Once we have a Task in ClearML, we can clone and edit its definitions in the UI,
- Once there are two or more experiments that run one after another, group them together into a [pipeline](../../fundamentals/pipelines.md).
## Manage Your Data
-Use [ClearML Data](../../clearml_data.md) to version your data, then link it to running experiments for easy reproduction.
-Make datasets machine agnostic (i.e. store original dataset in a shared storage location, e.g. shared-folder/S3/Gs/Azure).
-ClearML Data supports efficient Dataset storage and caching, differentiable & compressed.
+Use [ClearML Data](../../clearml_data/clearml_data.md) to version your data, then link it to running experiments for easy reproduction.
+Make datasets machine agnostic (i.e. store original dataset in a shared storage location, e.g. shared-folder/S3/Gs/Azure)
+ClearML Data supports efficient Dataset storage and caching, differentiable & compressed
## Scale Your Work
Use [ClearML Agent](../../clearml_agent.md) to scale work. Install the agent on machines (remote or local) and manage

@@ -87,7 +87,7 @@ Task.enqueue(cloned_task, queue_name='default')
## Logging Artifacts
Artifacts are a great way to pass and reuse data between Tasks in the system.
From anywhere in the code you can upload [multiple](../../fundamentals/artifacts.md#logging-artifacts) types of data, objects, and files.
-Artifacts are the base of ClearML's [Data Management](../../clearml_data.md) solution and as a way to communicate complex objects between different
+Artifacts are the base of ClearML's [Data Management](../../clearml_data/clearml_data.md) solution and as a way to communicate complex objects between different
stages of a [pipeline](../../fundamentals/pipelines.md).
```python
@@ -139,7 +139,7 @@ You can also search and query Tasks in the system.
Use the `Task.get_tasks` call to retrieve Task objects and filter them by specific Task values such as status, parameters, and metrics.
```python
from clearml import Task
-tasks = Task.get_tasks(project_name='examples', task_name='partial_name_match', task_filter={'status': 'in_proress'})
+tasks = Task.get_tasks(project_name='examples', task_name='partial_name_match', task_filter={'status': 'in_progress'})
```
## Manage Your Data
@@ -147,7 +147,7 @@ Data is probably one of the biggest factors that determines the success of a pro
Associating the data a model used with the model's configuration, code and results (such as accuracy) is key to deducing meaningful insights into how
models behave.
-[ClearML Data](../../clearml_data.md) allows you to version your data so it's never lost, fetch it from every machine with minimal code changes,
+[ClearML Data](../../clearml_data/clearml_data.md) allows you to version your data so it's never lost, fetch it from every machine with minimal code changes,
and associate data to experiment results.
Logging data can be done via the command line or via code. If any preprocessing code is involved, ClearML logs it as well. Once data is logged, it can be used by other experiments.
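
As a rough illustration of the "via code" route, the sketch below logs a local folder as a dataset version and later fetches a copy of it on another machine. It is not taken from this commit: the helper names `log_dataset` and `fetch_dataset`, the default dataset name, and the default project name are hypothetical, and actually calling the helpers assumes the `clearml` package is installed and configured against a ClearML server.

```python
def log_dataset(folder, dataset_name="example_dataset", dataset_project="examples"):
    """Register the files under `folder` as a new dataset version and return its ID.

    The default name and project are placeholders chosen for this sketch.
    """
    # Imported lazily so the module can be loaded without a configured ClearML setup.
    from clearml import Dataset

    dataset = Dataset.create(dataset_name=dataset_name, dataset_project=dataset_project)
    dataset.add_files(path=folder)  # stage the local files
    dataset.upload()                # push the staged files to storage
    dataset.finalize()              # close this version so it can no longer change
    return dataset.id


def fetch_dataset(dataset_id):
    """Return the path of a local, cached copy of a previously logged dataset."""
    from clearml import Dataset

    return Dataset.get(dataset_id=dataset_id).get_local_copy()
```

An experiment would typically call `fetch_dataset` with the ID returned by `log_dataset` and read its files from the returned path, which keeps the training code independent of where the original data was stored.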