Refactor ClearML Data docs (#108)

2025-06-26 18:17:44 +00:00 · 2021-11-08 13:21:44 +02:00
parent 43751dc64b
commit e155c49cfd
17 changed files with 847 additions and 683 deletions
--- a/docs/getting_started/mlops/mlops_best_practices.md
+++ b/docs/getting_started/mlops/mlops_best_practices.md
@@ -29,9 +29,9 @@ Once we have a Task in ClearML, we can clone and edit its definitions in the UI,
 - Once there are two or more experiments that run one after another, group them together into a [pipeline](../../fundamentals/pipelines.md).

 ## Manage Your Data
-Use [ClearML Data](../../clearml_data.md) to version your data, then link it to running experiments for easy reproduction.
-Make datasets machine agnostic (i.e. store original dataset in a shared storage location, e.g. shared-folder/S3/Gs/Azure).
-ClearML Data supports efficient Dataset storage and caching, differentiable & compressed.
+Use [ClearML Data](../../clearml_data/clearml_data.md) to version your data, then link it to running experiments for easy reproduction.
+Make datasets machine agnostic (i.e. store the original dataset in a shared storage location, e.g. shared-folder/S3/GS/Azure).
+ClearML Data supports efficient dataset storage and caching, both differential and compressed.

 ## Scale Your Work
 Use [ClearML Agent](../../clearml_agent.md) to scale work. Install the agent on machines (remote or local) and manage
--- a/docs/getting_started/mlops/mlops_first_steps.md
+++ b/docs/getting_started/mlops/mlops_first_steps.md
@@ -87,7 +87,7 @@ Task.enqueue(cloned_task, queue_name='default')
 ## Logging Artifacts
 Artifacts are a great way to pass and reuse data between Tasks in the system.
 From anywhere in the code you can upload [multiple](../../fundamentals/artifacts.md#logging-artifacts) types of data, objects, and files.
-Artifacts are the base of ClearML's [Data Management](../../clearml_data.md) solution and as a way to communicate complex objects between different
+Artifacts are the base of ClearML's [Data Management](../../clearml_data/clearml_data.md) solution and a way to communicate complex objects between different
 stages of a [pipeline](../../fundamentals/pipelines.md).

 ```python
@@ -139,7 +139,7 @@ You can also search and query Tasks in the system.
 Use the `Task.get_tasks` call to retrieve Task objects and filter them based on specific values of the Task: status, parameters, metrics, and more!
 ```python
 from clearml import Task
-tasks = Task.get_tasks(project_name='examples', task_name='partial_name_match', task_filter={'status': 'in_proress'})
+tasks = Task.get_tasks(project_name='examples', task_name='partial_name_match', task_filter={'status': 'in_progress'})
 ```

 ## Manage Your Data
@@ -147,7 +147,7 @@ Data is probably one of the biggest factors that determines the success of a pro
 Associating the data a model used to the model's configuration, code and results (such as accuracy) is key to deducing meaningful insights into how
 models behave. 

-[ClearML Data](../../clearml_data.md) allows you to version your data so it's never lost, fetch it from every machine with minimal code changes, 
+[ClearML Data](../../clearml_data/clearml_data.md) allows you to version your data so it's never lost, fetch it from every machine with minimal code changes, 
 and associate data to experiment results.

 Logging data can be done via command line, or via code. If any preprocessing code is involved, ClearML logs it as well! Once data is logged, it can be used by other experiments.