Small edits (#451)

pollfly 2023-01-23 15:04:24 +02:00 committed by GitHub
parent 2fd532b2c3
commit e8d0267bbd
18 changed files with 44 additions and 42 deletions


@@ -96,7 +96,7 @@ To reconnect to a previous session, execute `clearml-session` with no additional
to an existing session will show up:
```console
-Connect to active session id=c7302b564aa945408aaa40ac5c69399c [Y]/n?`
+Connect to active session id=c7302b564aa945408aaa40ac5c69399c [Y]/n?
```
If multiple sessions were launched from a local machine and are still active, choose the desired session:


@@ -307,7 +307,7 @@ clearml-agent execute --id <task-id> --docker
### Debugging
-* Run a `clearml-agent` daemon in foreground mode, sending all output to the console.
+Run a `clearml-agent` daemon in foreground mode, sending all output to the console.
```bash
clearml-agent daemon --queue default --foreground
```


@@ -35,7 +35,7 @@ most recent dataset in a project. The same is true with tags; if a tag is specif
In cases where you use a dataset in a task (e.g. consuming a dataset), you can easily track which dataset the task is
using by using `Dataset.get`'s `alias` parameter. Pass `alias=<dataset_alias_string>`, and the task using the dataset
will store the dataset's ID in the `dataset_alias_string` parameter under the task's **CONFIGURATION > HYPERPARAMETERS >
-Datasets` section.
+Datasets** section.
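For context on the change above, here is a minimal sketch of consuming a dataset with an alias; the project, task, and dataset names are illustrative and not taken from the commit:
```python
from clearml import Dataset, Task

task = Task.init(project_name="examples", task_name="train with aliased dataset")

# Passing alias= makes the dataset ID appear under the task's
# CONFIGURATION > HYPERPARAMETERS > Datasets section
dataset = Dataset.get(
    dataset_project="examples",     # illustrative project name
    dataset_name="my_dataset",      # illustrative dataset name
    alias="training_data",
)
local_path = dataset.get_local_copy()  # cached local copy of the dataset files
```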
## Document your Datasets


@@ -66,11 +66,11 @@ After invoking `Task.init` in a script, ClearML starts its automagical logging,
* [Python Fire](https://github.com/google/python-fire) - see code examples [here](https://github.com/allegroai/clearml/tree/master/examples/frameworks/fire).
* [LightningCLI](https://pytorch-lightning.readthedocs.io/en/latest/api/pytorch_lightning.cli.LightningCLI.html) - see code example [here](https://github.com/allegroai/clearml/blob/master/examples/frameworks/jsonargparse/pytorch_lightning_cli.py).
* TensorFlow Definitions (`absl-py`)
-* [Hydra](https://github.com/facebookresearch/hydra) - the Omegaconf which holds all the configuration files, as well as overridden values.
+* [Hydra](https://github.com/facebookresearch/hydra) - the OmegaConf which holds all the configuration files, as well as overridden values.
* **Models** - ClearML automatically logs and updates the models and all snapshot paths saved with the following frameworks:
-* Tensorflow (see [code example](../guides/frameworks/tensorflow/tensorflow_mnist.md))
+* TensorFlow (see [code example](../guides/frameworks/tensorflow/tensorflow_mnist.md))
* Keras (see [code example](../guides/frameworks/keras/keras_tensorboard.md))
-* Pytorch (see [code example](../guides/frameworks/pytorch/pytorch_mnist.md))
+* PyTorch (see [code example](../guides/frameworks/pytorch/pytorch_mnist.md))
* scikit-learn (only using joblib) (see [code example](../guides/frameworks/scikit-learn/sklearn_joblib_example.md))
* XGBoost (only using joblib) (see [code example](../guides/frameworks/xgboost/xgboost_sample.md))
* FastAI (see [code example](../guides/frameworks/fastai/fastai_with_tensorboard.md))
@@ -143,7 +143,7 @@ train/loss scalar reported was for iteration 100, when continued, the next repor
:::note Reproducibility
Continued tasks may not be reproducible. In order to guarantee task reproducibility, you must ensure that all steps are
-done in the same order (e.g. maintaining learning rate profile, ensuring data is fed in same order).
+done in the same order (e.g. maintaining learning rate profile, ensuring data is fed in the same order).
:::
Pass one of the following in the `continue_last_task` parameter:
@@ -396,7 +396,8 @@ a_func_task = task.create_function_task(
some_argument=123
)
```
-Arguments passed to the function will be automatically logged under the `Function` section in the Hyperparameters tab.
+Arguments passed to the function will be automatically logged in the
+experiment's **CONFIGURATION** tab under the **HYPERPARAMETER > Function** section .
Like any other arguments, they can be changed from the UI or programmatically.
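For context, a minimal sketch of creating a function task; the function, task names, and argument values are illustrative:
```python
from clearml import Task


def process_data(dataset_url, some_argument=10):
    # Runs as its own task; these arguments are logged under the Function section
    print(dataset_url, some_argument)
    return some_argument * 2


task = Task.init(project_name="examples", task_name="main task")

a_func_task = task.create_function_task(
    func=process_data,                # function to wrap as a task
    func_name="process_data_task",    # illustrative name for the generated task
    task_name="Process Data",         # illustrative display name
    dataset_url="s3://bucket/data",   # illustrative argument, logged automatically
    some_argument=123,
)
```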
:::note Function Task Creation
@@ -567,7 +568,7 @@ Accessing a task's previously trained model is quite similar to accessing task
through the task's models property which lists the input models and output model snapshots locations.
The models can subsequently be retrieved from their respective locations by using `get_local_copy()` which downloads the
-model and caches it for later use, returning the path to the cached copy (if using Tensorflow, the snapshots are stored
+model and caches it for later use, returning the path to the cached copy (if using TensorFlow, the snapshots are stored
in a folder, so the `local_weights_path` will point to a folder containing the requested snapshot).
```python
@@ -584,7 +585,7 @@ Models loaded by the ML framework appear under the "Input Models" section, under
### Setting Upload Destination
-ClearML automatically captures the storage location of Models created by frameworks such as TF, Pytorch, and scikit-learn.
+ClearML automatically captures the storage location of Models created by frameworks such as TensorFlow, PyTorch, and scikit-learn.
By default, it stores the local path they are saved at.
To automatically store all created models by a specific experiment, modify the `Task.init` function as such:
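The snippet that follows this sentence in the docs is cut off by the hunk boundary; it presumably resembles the sketch below (the bucket path and names are illustrative):
```python
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="storing model",
    output_uri="s3://my-bucket/model-folder",  # illustrative destination for all model snapshots
)
```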


@@ -11,7 +11,7 @@ solution.
* Easy to deploy & configure
* Support Machine Learning Models (Scikit Learn, XGBoost, LightGBM)
-* Support Deep Learning Models (Tensorflow, PyTorch, ONNX)
+* Support Deep Learning Models (TensorFlow, PyTorch, ONNX)
* Customizable RestAPI for serving (i.e. allow per model pre/post-processing for easy integration)
* Flexible
* On-line model deployment


@@ -13,9 +13,9 @@ interface.
Once integrated into code, ClearML automatically logs and tracks models and any snapshots created by the following
frameworks:
-- Tensorflow (see [code example](../guides/frameworks/tensorflow/tensorflow_mnist.md))
+- TensorFlow (see [code example](../guides/frameworks/tensorflow/tensorflow_mnist.md))
- Keras (see [code example](../guides/frameworks/keras/keras_tensorboard.md))
-- Pytorch (see [code example](../guides/frameworks/pytorch/pytorch_mnist.md))
+- PyTorch (see [code example](../guides/frameworks/pytorch/pytorch_mnist.md))
- scikit-learn (only using joblib) (see [code example](../guides/frameworks/scikit-learn/sklearn_joblib_example.md))
- XGBoost (only using joblib) (see [code example](../guides/frameworks/xgboost/xgboost_sample.md))
- FastAI (see [code example](../guides/frameworks/fastai/fastai_with_tensorboard.md))


@@ -29,7 +29,7 @@ the following types of parameters:
* TensorFlow Definitions (`absl-py`). See examples of ClearML's automatic logging of TF Defines:
* [TensorFlow MNIST](../guides/frameworks/tensorflow/tensorflow_mnist.md)
* [TensorBoard PR Curve](../guides/frameworks/tensorflow/tensorboard_pr_curve.md)
-* [Hydra](https://github.com/facebookresearch/hydra) - ClearML logs the `Omegaconf` which holds all the configuration files,
+* [Hydra](https://github.com/facebookresearch/hydra) - ClearML logs the `OmegaConf` which holds all the configuration files,
as well as values overridden during runtime. See code example [here](https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py).
:::tip Disabling Automatic Logging
@@ -47,7 +47,7 @@ Environment variables can be logged by modifying the [clearml.conf](../configs/c
parameter specifying parameters to log.
```editorconfig
-log_os_environments: ["AWS_*", "CUDA_VERSION"]`
+log_os_environments: ["AWS_*", "CUDA_VERSION"]
```
It's also possible to specify environment variables using the `CLEARML_LOG_ENVIRONMENT` variable.
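As an illustration only — assuming the variable takes a comma-separated list like the config entry above and is read when `Task.init` runs:
```python
import os

# Assumption: set before Task.init so ClearML picks it up when the task starts
os.environ["CLEARML_LOG_ENVIRONMENT"] = "AWS_*,CUDA_VERSION"

from clearml import Task

task = Task.init(project_name="examples", task_name="env var logging")
```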


@@ -48,7 +48,7 @@ Check out some of ClearML's automatic reporting examples for supported packages:
* [Tensorboard with PyTorch](../guides/frameworks/pytorch/pytorch_tensorboard.md) - logging TensorBoard scalars, debug samples, and text integrated into
code that uses PyTorch
* TensorBoardX
-* [TensorBoardX with Pytorch](../guides/frameworks/tensorboardx/tensorboardx.md) - logging TensorBoardX scalars, debug
+* [TensorBoardX with PyTorch](../guides/frameworks/tensorboardx/tensorboardx.md) - logging TensorBoardX scalars, debug
samples, and text in code using PyTorch
* [MegEngine MNIST](../guides/frameworks/megengine/megengine_mnist.md) - logging scalars using TensorBoardX's `SummaryWriter`
* Matplotlib


@@ -24,7 +24,7 @@ Once we have a Task object we can query the state of the Task, get its Model, sc
## Log Hyperparameters
For full reproducibility, it's paramount to save Hyperparameters for each experiment. Since Hyperparameters can have substantial impact
-on Model performance, saving and comparing these between experiments is sometimes the key to understand model behavior.
+on Model performance, saving and comparing these between experiments is sometimes the key to understanding model behavior.
ClearML supports logging `argparse` module arguments out of the box, so once ClearML is integrated into the code, it automatically logs all parameters provided to the argument parser.
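A minimal sketch of that behavior (the hyperparameter names are illustrative):
```python
import argparse
from clearml import Task

task = Task.init(project_name="examples", task_name="argparse logging")

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.01)       # illustrative hyperparameter
parser.add_argument("--batch-size", type=int, default=32)   # illustrative hyperparameter
args = parser.parse_args()
# After Task.init, the parsed arguments are logged automatically with no extra code
```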
@@ -93,8 +93,8 @@ Check out the [artifacts retrieval](https://github.com/allegroai/clearml/blob/ma
### Models
-Models are a special kind artifact.
-Models created by popular frameworks (such as Pytorch, Tensorflow, Scikit-learn) are automatically logged by ClearML.
+Models are a special kind of artifact.
+Models created by popular frameworks (such as PyTorch, TensorFlow, Scikit-learn) are automatically logged by ClearML.
All snapshots are automatically logged. In order to make sure we also automatically upload the model snapshot (instead of saving its local path),
we need to pass a storage location for the model files to be uploaded to.
@@ -107,11 +107,12 @@ task = Task.init(
)
```
-Now, whenever the framework (TF/Keras/PyTorch etc.) stores a snapshot, the model file is automatically uploaded to the bucket to a specific folder for the experiment.
+Now, whenever the framework (TensorFlow/Keras/PyTorch etc.) stores a snapshot, the model file is automatically uploaded to the bucket to a specific folder for the experiment.
-Loading models by a framework is also logged by the system, these models appear under the "Input Models" section, under the Artifacts tab.
+Loading models by a framework is also logged by the system; these models appear in an experiment's **Artifacts** tab,
+under the "Input Models" section.
-Check out model snapshots examples for [TF](https://github.com/allegroai/clearml/blob/master/examples/frameworks/tensorflow/tensorflow_mnist.py),
+Check out model snapshots examples for [TensorFlow](https://github.com/allegroai/clearml/blob/master/examples/frameworks/tensorflow/tensorflow_mnist.py),
[PyTorch](https://github.com/allegroai/clearml/blob/master/examples/frameworks/pytorch/pytorch_mnist.py),
[Keras](https://github.com/allegroai/clearml/blob/master/examples/frameworks/keras/keras_tensorboard.py),
[Scikit-Learn](https://github.com/allegroai/clearml/blob/master/examples/frameworks/scikit-learn/sklearn_joblib_example.py).
@@ -127,7 +128,7 @@ local_weights_path = last_snapshot.get_local_copy()
Like before we have to get the instance of the Task training the original weights files, then we can query the task for its output models (a list of snapshots), and get the latest snapshot.
:::note
-Using Tensorflow, the snapshots are stored in a folder, meaning the `local_weights_path` will point to a folder containing our requested snapshot.
+Using TensorFlow, the snapshots are stored in a folder, meaning the `local_weights_path` will point to a folder containing our requested snapshot.
:::
As with Artifacts, all models are cached, meaning the next time we run this code, no model needs to be downloaded.
Once one of the frameworks will load the weights file, the running Task will be automatically updated with "Input Model" pointing directly to the original training Task's Model.
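Putting the hunk together, a sketch of retrieving the latest output snapshot from a previous task (the project and task names are illustrative):
```python
from clearml import Task

# Assumption: a previously executed training task, located by project and name
prev_task = Task.get_task(project_name="examples", task_name="training experiment")

last_snapshot = prev_task.models["output"][-1]        # latest output model snapshot
local_weights_path = last_snapshot.get_local_copy()   # downloads, or reuses the cached copy
```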


@@ -162,7 +162,7 @@ and [pipeline](../../pipelines/pipelines.md) solutions.
#### Log Models
Logging models into the model repository is the easiest way to integrate the development process directly with production.
-Any model stored by a supported framework (Keras / TF / PyTorch / Joblib etc.) will be automatically logged into ClearML.
+Any model stored by a supported framework (Keras / TensorFlow / PyTorch / Joblib etc.) will be automatically logged into ClearML.
ClearML also offers methods to explicitly log models. Models can be automatically stored on a preferred storage medium
(s3 bucket, google storage, etc.).


@@ -1,6 +1,6 @@
---
-title: Hyperdatasets Data Versioning
-description: Learn more about the hyperdatasets, a supercharged version of ClearML Data.
+title: Hyper-Datasets Data Versioning
+description: Learn more about the Hyper-Datasets, a supercharged version of ClearML Data.
keywords: [mlops, components, hyperdatasets]
---
@@ -21,11 +21,11 @@ keywords: [mlops, components, hyperdatasets]
<summary className="cml-expansion-panel-summary">Read the transcript</summary>
<div className="cml-expansion-panel-content">
-Hello and welcome to ClearML. In this video, we're taking a closer look at hyperdatasets, a supercharged version of ClearML Data.
+Hello and welcome to ClearML. In this video, we're taking a closer look at Hyper-Datasets, a supercharged version of ClearML Data.
-Hyperdatasets is a data management system that's designed for unstructured data like text, audio, or visual data. It is part of the ClearML paid offering, which means it brings along quite a bit of upgrades over the open source `clearml-data`.
+Hyper-Datasets is a data management system that's designed for unstructured data like text, audio, or visual data. It is part of the ClearML paid offering, which means it brings along quite a bit of upgrades over the open source `clearml-data`.
-The main conceptual difference between the two is that hyperdatasets decouple the metadata from the raw data files. This allows you to manipulate the metadata in all kinds of ways while abstracting away the logistics of having to deal with large amounts of data.
+The main conceptual difference between the two is that Hyper-Datasets decouple the metadata from the raw data files. This allows you to manipulate the metadata in all kinds of ways while abstracting away the logistics of having to deal with large amounts of data.
Manipulating the metadata is done through queries and parameters, both of which can then be tracked using the experiment manager.
@@ -35,9 +35,9 @@ The data manipulations themselves become part of the experiment, we call it a da
By contrast, in ClearML Data, just like many other data versioning tools, the data and the metadata are entangled. Take this example where the label of the image is defined by which folder it is in, a common dataset structure. What if I want to train only on donuts? Or what if I have a large class imbalance? I still have to download the whole dataset even though I might only be using a small part of it. Then I have to change my code to only grab the donut images or to rebalance my classes by over or under sampling them. If later I want to add waffles to the mix, I have to change my code again.
-Let's take a look at an example that will show you how to use hyperdatasets to debug an underperforming model. But first, we start where any good data science projects starts: data exploration.
+Let's take a look at an example that will show you how to use Hyper-Datasets to debug an underperforming model. But first, we start where any good data science projects starts: data exploration.
-When you open hyperdatasets to explore a dataset, you can find the version history of that dataset here. Datasets can have multiple versions, which in turn can have multiple child versions. Each of the child versions will inherit the contents of their parents.
+When you open Hyper-Datasets to explore a dataset, you can find the version history of that dataset here. Datasets can have multiple versions, which in turn can have multiple child versions. Each of the child versions will inherit the contents of their parents.
By default, a dataset version will be in draft mode, meaning it can still be modified. You can press the publish button to essentially lock it to make sure it will not change anymore. If you want to make changes to a published dataset version, make a new version that's based on it.
@@ -53,13 +53,13 @@ The goal of these queries is not to simply serve as a neat filter for data explo
Enter the dataviews that I introduced in the beginning of this video. Dataviews can use sophisticated queries to connect specific data from one or more datasets to an experiment in the experiment manager. Essentially it creates and manages local views of remote Datasets.
-As an example, imagine you have created an experiment that tries to train a model based on a specific subset of data using hyperdatasets.
+As an example, imagine you have created an experiment that tries to train a model based on a specific subset of data using Hyper-Datasets.
To get the data you need to train on, you can easily create a dataview from code like so. Then you can add all sorts of constraints, like class filters, metadata filters, and class weights which will over or under sample the data as is required.
-After running the task, we can see it in the experiment manager. The model is reporting scalars and training as we would expect. When using hyperdatasets, there is also a dataviews tab with all of the possibilities at your disposal. You can see which input datasets and versions that you used and can see the querying system that is used to subset them. This will already give you a nice, clean way to train your models on a very specific subset of the data, but there is more!
+After running the task, we can see it in the experiment manager. The model is reporting scalars and training as we would expect. When using Hyper-Datasets, there is also a dataviews tab with all of the possibilities at your disposal. You can see which input datasets and versions that you used and can see the querying system that is used to subset them. This will already give you a nice, clean way to train your models on a very specific subset of the data, but there is more!
-If you want to remap labels, or enumerate them to integers on-the-fly, ClearML will keep track of all the transformations that are done and make sure they are reproducible. There is, of course, more still, so if you're interested check out our documentation on hyperdatasets.
+If you want to remap labels, or enumerate them to integers on-the-fly, ClearML will keep track of all the transformations that are done and make sure they are reproducible. There is, of course, more still, so if you're interested check out our documentation on Hyper-Datasets.
ClearML veterans already know what's coming next. Cloning.
@@ -73,6 +73,6 @@ After the remote machine has executed the experiment on the new dataview, we can
If you've been following along with the other Getting Started videos, you should already start to see the potential this approach can have. For example: we could now run hyperparameter optimization on the data itself, because all of the filters and settings previously shown are just parameters on a task. The whole process could be running in parallel on a cloud autoscaler for example. Imagine finding the best training data confidence threshold for each class to optimize the model performance.
-If you're interested in using Hyperdatasets for your team, then contact us using our website and we'll get you going in no time. In the meantime, you can enjoy the power of the open source components at app.clear.ml, and don't forget to join our Slack channel, if you need any help!
+If you're interested in using Hyper-Datasets for your team, then contact us using our website and we'll get you going in no time. In the meantime, you can enjoy the power of the open source components at app.clear.ml, and don't forget to join our Slack channel, if you need any help!
</div>
</details>


@@ -103,6 +103,6 @@ hyperparameters. Passing `alias=<dataset_alias_string>` stores the dataset's I
you can easily track which dataset the task is using.
The Dataset's [`get_local_copy`](../../references/sdk/dataset.md#get_local_copy) method will return a path to the cached,
-downloaded dataset. Then we provide the path to Pytorch's dataset object.
+downloaded dataset. Then we provide the path to PyTorch's dataset object.
The script then trains a neural network to classify images using the dataset created above.
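A rough sketch of that flow; the dataset names are illustrative, and the dataset is assumed to contain CIFAR-10 files:
```python
from clearml import Dataset
from torchvision import datasets, transforms

# Assumption: a dataset with these names was created beforehand with clearml-data
dataset_path = Dataset.get(
    dataset_project="examples",
    dataset_name="cifar10",
    alias="cifar10_train",
).get_local_copy()

# Hand the cached local path to PyTorch's dataset object
train_set = datasets.CIFAR10(
    root=dataset_path,
    train=True,
    download=False,
    transform=transforms.ToTensor(),
)
```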


@@ -48,7 +48,7 @@ title: Version 0.14
* Improve Jupyter support:
* Make sure `trains` is included in Jupyter requirements.
* Ignore IPython directives in converted Python script (like `%` and `!` lines).
-* Update Pytorch / TensorboardX examples.
+* Update PyTorch / TensorboardX examples.
**Bug Fixes**


@@ -33,7 +33,7 @@ title: Version 0.15
* Fix TensorFlow version 2 and later histogram binding.
* Fix `Logger.tensorboard_single_series_per_graph`.
* Fix anonymous named models.
-* Fix incorrect entry point detection when called from Trains wrapper (e.g. `TrainsLogger` in Pytorch Ignite / Lightning).
+* Fix incorrect entry point detection when called from Trains wrapper (e.g. `TrainsLogger` in PyTorch Ignite / Lightning).
### Trains Server


@@ -89,7 +89,7 @@ title: Version 0.16
* Fix diff command output was stripped.
* Make sure local packages with multi-files are marked as `package`.
* Fix `Task.set_base_docker()` should be skipped when running remotely.
-* Fix ArgParser binding handling of string argument with boolean default value (affects Pytorch Lightning integration).
+* Fix ArgParser binding handling of string argument with boolean default value (affects PyTorch Lightning integration).
* When using `detect_with_pip_freeze` make sure that `package @ file://` lines are replaced with `package==x.y.z` as local file will probably not be available.
* Fix git packages to new pip standard `package @ git+`.
* Improve conda package naming `_` and `-` support.


@@ -71,7 +71,7 @@ This release is not backwards compatible
- Fix `default_output_uri` for Dataset creation [ClearML GitHub issue 371](https://github.com/allegroai/clearml/issues/371)
- Fix `clearml-task` failing without a docker script [ClearML GitHub issue 378](https://github.com/allegroai/clearml/issues/378)
-- Fix Pytorch DDP sub-process spawn multi-process
+- Fix PyTorch DDP sub-process spawn multi-process
- Fix `Task.execute_remotely()` on created Task (not initialized Task)
- Fix auto scaler custom bash script should be called last before starting agent
- Fix auto scaler spins too many instances at once then kills the idle ones (spin time is longer than poll time)


@@ -269,7 +269,7 @@ This release is not backwards compatible - see notes below on upgrading
- Fix `PY3.x` fails calling `SemLock._after_fork` with forkserver context, forking while lock is acquired [ClearML Agent GitHub issue #73](https://github.com/allegroai/clearml-agent/issues/73)
- Fix wrong download path in `StorageManager.download_folder()`
- Fix jupyter notebook `display(...)` convert to `print(...)`
-- Fix Tensorflow `add_image()` with `description='text'`
+- Fix TensorFlow `add_image()` with `description='text'`
- Fix `Task.close()` should remove `current_task()` reference
- Fix `TaskScheduler` weekdays, change default `execute_immediately` to `False`
- Fix Python2 compatibility


@@ -73,7 +73,7 @@ title: Version 1.4
* Add manual seaborn logging example [ClearML GitHub PR #628](https://github.com/allegroai/clearml/pull/628)
* Change package author
* Change pipeline example to run locally [ClearML GitHub PR #642](https://github.com/allegroai/clearml/pull/642)
-* Update Pytorch Lightning example for `pytorch-lightning>=v1.6.0` [ClearML GitHub PR #650](https://github.com/allegroai/clearml/pull/650)
+* Update PyTorch Lightning example for `pytorch-lightning>=v1.6.0` [ClearML GitHub PR #650](https://github.com/allegroai/clearml/pull/650)
**Bug Fixes**
* Fix Keras model config serialization in `PatchKerasModelIO` [ClearML GitHub issue #614](https://github.com/allegroai/clearml/issues/614)