From e72ca23b54d70f9a9aa4c69bbc6e0fc0c15f5931 Mon Sep 17 00:00:00 2001 From: pollfly <75068813+pollfly@users.noreply.github.com> Date: Tue, 18 Jan 2022 13:23:47 +0200 Subject: [PATCH] Small edits (#162) --- docs/apps/clearml_session.md | 27 ++++++++-------- docs/apps/clearml_task.md | 6 ++-- docs/clearml_agent.md | 8 ++--- .../data_man_simple.md | 10 +++--- docs/configs/clearml_conf.md | 32 +++++++++---------- .../clearml_server_linux_mac.md | 2 +- .../upgrade_server_linux_mac.md | 4 +-- docs/faq.md | 12 +++---- docs/fundamentals/agents_and_queues.md | 6 ++-- docs/fundamentals/hpo.md | 13 ++++---- docs/fundamentals/logger.md | 2 +- docs/fundamentals/task.md | 2 +- docs/getting_started/ds/best_practices.md | 8 ++--- docs/getting_started/ds/ds_second_steps.md | 22 ++++++------- .../mlops/mlops_first_steps.md | 2 +- .../mlops/mlops_second_steps.md | 15 +++++---- .../distributed_pytorch_example.md | 3 +- docs/guides/ide/remote_jupyter_tutorial.md | 2 +- docs/guides/services/aws_autoscaler.md | 2 +- docs/hyperdatasets/annotations.md | 2 +- docs/hyperdatasets/masks.md | 2 +- docs/webapp/webapp_exp_comparing.md | 3 +- docs/webapp/webapp_exp_table.md | 2 +- docs/webapp/webapp_project_overview.md | 2 +- 24 files changed, 96 insertions(+), 93 deletions(-) diff --git a/docs/apps/clearml_session.md b/docs/apps/clearml_session.md index 1f28bf31..c0d30063 100644 --- a/docs/apps/clearml_session.md +++ b/docs/apps/clearml_session.md @@ -13,8 +13,8 @@ in the UI and send it for long-term training on a remote machine. ## What Does ClearML Session Do? `clearml-session` is a feature that allows to launch a session of JupyterLab and VS Code, and to execute code on a remote -machine that better meets resource needs. With this feature, local links are provided, which can be used to access -JupyterLab and VS Code on a remote machine over a secure and encrypted SSH connection. By default, the JupyterLab and +machine that better meets resource needs. 
This feature provides local links to access JupyterLab and VS Code on a +remote machine over a secure and encrypted SSH connection. By default, the JupyterLab and VS Code remote sessions use ports 8878 and 8898 respectively.
@@ -40,7 +40,7 @@ VS Code remote sessions use ports 8878 and 8898 respectively. ## How it Works ClearML allows you to leverage a resource (e.g. GPU or CPU machine) by utilizing the [ClearML Agent](../clearml_agent). -A ClearML Agent will run on a target machine, and ClearML Session will instruct it to execute the Jupyter / VS Code +A ClearML Agent runs on a target machine, and ClearML Session instructs it to execute the Jupyter / VS Code server to develop remotely. After entering a `clearml-session` command with all specifications: @@ -51,8 +51,8 @@ After entering a `clearml-session` command with all specifications: launches it. 1. Once the agent finishes the initial setup of the interactive Task, the local `clearml-session` connects to the host - machine via SSH, and tunnels both SSH and JupyterLab over the SSH connection. If a specific Docker was specified, the - JupyterLab environment will run inside the Docker. + machine via SSH, and tunnels both SSH and JupyterLab over the SSH connection. If a Docker is specified, the + JupyterLab environment runs inside the Docker. 1. The CLI outputs access links to the remote JupyterLab and VS Code sessions: @@ -73,14 +73,15 @@ To run a session inside a Docker container, use the `--docker` flag and enter the Docker image to use in the session. ### Installing Requirements -`clearml-session` can install required Python packages when setting up the remote environment. A `requirement.txt` file -can be attached to the command using `--requirements `. -Alternatively, packages can be manually specified, using `--packages ""` +`clearml-session` can install required Python packages when setting up the remote environment. +Specify requirements in one of the following ways: +* Attach a `requirements.txt` file to the command using `--requirements `. +* Manually specify packages using `--packages ""` (for example `--packages "keras" "clearml"`), and they'll be automatically installed.
### Accessing a Git Repository -To access a git repository remotely, add a `--git-credentials` flag and set it to `true`, so the local .git-credentials -file will be sent to the interactive session. This is helpful if working on private git repositories, and it allows for seamless +To access a git repository remotely, add a `--git-credentials` flag and set it to `true`, so the local `.git-credentials` +file is sent to the interactive session. This is helpful if working on private git repositories, and it allows for seamless cloning and tracking of git references, including untracked changes. ### Re-launching and Shutting Down Sessions @@ -101,11 +102,11 @@ Active sessions: Connect to session [0-1] or 'N' to skip ``` -To shut down a remote session, which will free the `clearml-agent` and close the CLI, enter "Shutdown". If a session +To shut down a remote session, which frees the `clearml-agent` and closes the CLI, enter "Shutdown". If a session is shutdown, there is no option to reconnect to it. ### Connecting to an Existing Session -If a `clearml-session` is running remotely, it's possible to continue working on the session from any machine. +If a `clearml-session` is running remotely, you can continue working on the session from any machine. When `clearml-session` is launched, it initializes a task with a unique ID in the ClearML Server. To connect to an existing session: @@ -116,7 +117,7 @@ To connect to an existing session: ### Starting a Debugging Session -Previously executed experiments in the ClearML system can be debugged on a remote interactive session. +You can debug previously executed experiments registered in the ClearML system on a remote interactive session. Input into `clearml-session` the ID of a Task to debug, then `clearml-session` clones the experiment's git repository and replicates the environment on a remote machine. Then the code can be interactively executed and debugged on JupyterLab / VS Code. 
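The flags described above combine into single commands. A hedged sketch (the flag names are those documented on this page; the Docker image and package names are placeholder values):

```bash
# Launch a remote session inside a specific Docker image (placeholder image name),
# installing extra packages and forwarding the local .git-credentials file
clearml-session --docker nvidia/cuda:11.0-base --packages "keras" "clearml" --git-credentials true

# Alternatively, install packages from a requirements file
clearml-session --requirements requirements.txt
```

These commands assume a running ClearML Agent on the target machine, as described in the "How it Works" section above.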
diff --git a/docs/apps/clearml_task.md b/docs/apps/clearml_task.md index 5e46b608..33fab2cc 100644 --- a/docs/apps/clearml_task.md +++ b/docs/apps/clearml_task.md @@ -6,7 +6,7 @@ ClearML Task is ClearML's Zero Code Integration Module. Using only the command l you can easily track your work and integrate ClearML with your existing code. `clearml-task` automatically integrates ClearML into any script or **any** python repository. `clearml-task` has the option -to send the task to a queue, where a **ClearML Agent** listening to the queue will fetch the task and execute it on a +to send the task to a queue, where a ClearML Agent assigned to the queue fetches the task and executes it on a remote or local machine. It's even possible to provide command line arguments and provide Python module dependencies and requirements.txt file! ## How Does ClearML Task Work? @@ -14,8 +14,8 @@ remote or local machine. It's even possible to provide command line arguments an 1. Execute `clearml-task`, pointing it to your script or repository, and optionally an execution queue. 1. `clearml-task` does its magic! It creates a new experiment on the [ClearML Server](../deploying_clearml/clearml_server.md), and, if a queue was specified, it sends the experiment to the queue to be fetched and executed by a **ClearML Agent**. -1. The command line will provide you with a link to your task's page in the ClearML web UI, - where you will be able to view the task's details. +1. The command line provides you with a link to your task's page in the ClearML web UI, + where you can view the task's details. ## Features and Options ### Docker diff --git a/docs/clearml_agent.md b/docs/clearml_agent.md index 69ff597c..b192b6be 100644 --- a/docs/clearml_agent.md +++ b/docs/clearml_agent.md @@ -41,7 +41,7 @@ and [configuration options](configs/clearml_conf.md#agent-section). 
## Installation :::note -If **ClearML** was previously configured, follow [this](#adding-clearml-agent-to-a-configuration-file) to add +If ClearML was previously configured, follow [these instructions](#adding-clearml-agent-to-a-configuration-file) to add ClearML Agent-specific configurations. ::: @@ -78,7 +78,7 @@ Install ClearML Agent as a system Python package and not in a Python virtual env Detected credentials key="********************" secret="*******" -1. **Enter** to accept default server URL, which is detected from the credentials or Enter a ClearML web server URL. +1. Press **Enter** to accept the default server URL, which is detected from the credentials, or enter a ClearML web server URL. A secure protocol, https, must be used. **Do not use http.** @@ -531,7 +531,7 @@ clearml-agent daemon --dynamic-gpus --queue dual_gpus=2 single_gpu=1 ### Example -Let's say there are three queues on a server, named: +Let's say a server has three queues: * `dual_gpu` * `quad_gpu` * `opportunistic` @@ -553,7 +553,7 @@ Another option for allocating GPUs: clearml-agent daemon --dynamic-gpus --gpus 0-7 --queue dual=2 opportunistic=1-4 ``` -Notice that a minimum and maximum value of GPUs was specified for the `opportunistic` queue. This means the agent +Notice that a minimum and maximum number of GPUs is specified for the `opportunistic` queue. This means the agent will pull a Task from the `opportunistic` queue and allocate up to 4 GPUs based on availability (i.e. GPUs not currently being used by other agents).
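The example above can be written out as one full invocation. A sketch (queue names as defined in the example; GPU indices assume an 8-GPU machine):

```bash
# Monitor GPUs 0-7 and serve the three queues described above:
# quad_gpu tasks receive 4 GPUs, dual_gpu tasks receive 2, and
# opportunistic tasks receive 1-4 GPUs based on availability
clearml-agent daemon --dynamic-gpus --gpus 0-7 --queue quad_gpu=4 dual_gpu=2 opportunistic=1-4
```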
diff --git a/docs/clearml_data/data_management_examples/data_man_simple.md b/docs/clearml_data/data_management_examples/data_man_simple.md index 4cf3aa5b..f84a5a7c 100644 --- a/docs/clearml_data/data_management_examples/data_man_simple.md +++ b/docs/clearml_data/data_management_examples/data_man_simple.md @@ -100,7 +100,7 @@ Total 5 files, 248771 bytes ## Creating a Child Dataset -In Clear Data, it's possible to create datasets that inherit the content of other datasets, there are called child datasets. +Using ClearML Data, you can create child datasets that inherit the content of other datasets. 1. Create a new dataset, specifying the previously created one as its parent: @@ -111,8 +111,8 @@ In Clear Data, it's possible to create datasets that inherit the content of othe You'll need to input the Dataset ID you received when created the dataset above ::: -1. Now, we want to add a new file. - * Create a new file: `echo "data data data" > new_data.txt` (this will create the file `new_data.txt`), +1. Add a new file. + * Create a new file: `echo "data data data" > new_data.txt` * Now add the file to the dataset: ```bash @@ -126,7 +126,7 @@ You'll need to input the Dataset ID you received when created the dataset above 1 file added ``` -1. Let's also remove a file. We'll need to specify the file's full path (within the dataset, not locally) to remove it. +1. Remove a file. We'll need to specify the file's full path (within the dataset, not locally) to remove it. ```bash clearml-data remove --files data_samples/dancing.jpg @@ -145,7 +145,7 @@ You'll need to input the Dataset ID you received when created the dataset above clearml-data close ``` -1. Let's take a look again at the files in the dataset: +1. 
Look again at the files in the dataset: ``` clearml-data list --id 8b68686a4af040d081027ba3cf6bbca6 diff --git a/docs/configs/clearml_conf.md b/docs/configs/clearml_conf.md index e472f2ea..0e1cf945 100644 --- a/docs/configs/clearml_conf.md +++ b/docs/configs/clearml_conf.md @@ -1,18 +1,18 @@ --- title: Configuration File --- -This reference page provides detailed information about the configurable options for **ClearML** and **ClearML Agent**. -**ClearML** and **ClearML Agent** use the same configuration file `clearml.conf`. +This reference page provides detailed information about the configurable options for ClearML and ClearML Agent. +ClearML and ClearML Agent use the same configuration file `clearml.conf`. This reference page is organized by configuration file section: -* [agent](#agent-section) - Contains **ClearML Agent** configuration options. If **ClearML Agent** was not installed, the configuration +* [agent](#agent-section) - Contains ClearML Agent configuration options. If ClearML Agent was not installed, the configuration file will not have an `agent` section. -* [api](#api-section) - Contains **ClearML** and **ClearML Agent** configuration options for **ClearML Server**. -* [sdk](#sdk-section) - Contains **ClearML** and **ClearML Agent** configuration options for **ClearML Python Package** and **ClearML Server**. +* [api](#api-section) - Contains ClearML and ClearML Agent configuration options for ClearML Server. +* [sdk](#sdk-section) - Contains ClearML and ClearML Agent configuration options for ClearML Python Package and ClearML Server. An example configuration file is located [here](https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf), -in the **ClearML Agent** GitHub repository. +in the ClearML Agent GitHub repository. 
:::info The values in the ClearML configuration file can be overridden by environment variables, the [configuration vault](../webapp/webapp_profile.md#configuration-vault), @@ -23,7 +23,7 @@ and command-line arguments. To add, change, or delete options, edit your configuration file. -**To edit your **ClearML** configuration file:** +**To edit your ClearML configuration file:** 1. Open the configuration file for editing, depending upon your operating system: @@ -60,7 +60,7 @@ for information about using environment variables with Windows in the configurat **`agent`** (*dict*) -* Dictionary of top-level **ClearML Agent** options to configure **ClearML Agent** for Git credentials, package managers, cache management, workers, and Docker for workers. +* Dictionary of top-level ClearML Agent options to configure ClearML Agent for Git credentials, package managers, cache management, workers, and Docker for workers. --- **`agent.cuda_version`** (*float*) @@ -538,25 +538,25 @@ Torch Nightly builds are ephemeral and are deleted from time to time. **`api`** (*dict*) -Dictionary of configuration options for the **ClearML Server** API, web, and file servers and credentials. +Dictionary of configuration options for the ClearML Server API, web, and file servers and credentials. --- **`api.api_server`** (*string*) -* The URL of your **ClearML** API server. For example, `https://api.MyDomain.com`. +* The URL of your ClearML API server. For example, `https://api.MyDomain.com`. --- **`api.web_server`** (*string*) -* The URL of your **ClearML** web server. For example, `https://app.MyDomain.com`. +* The URL of your ClearML web server. For example, `https://app.MyDomain.com`. --- **`api.files_server`** (*string*) -* The URL of your **ClearML** file server. For example, `https://files.MyDomain.com`. +* The URL of your ClearML file server. For example, `https://files.MyDomain.com`. :::warning You must use a secure protocol. 
For ``api.api_server``, ``api.web_server``, and ``api.files_server``, use "https". Do not use "http". @@ -576,13 +576,13 @@ You must use a secure protocol. For ``api.web_server``, ``api.files_server``, an **`api.credentials.access_key`** (*string*) -* Your **ClearML** access key. +* Your ClearML access key. --- **`api.credentials.secret_key`** (*string*) -* Your **ClearML** credentials. +* Your ClearML secret key. --- @@ -607,7 +607,7 @@ Set to False only if required. **`sdk`** (*dict*) -* Dictionary that contains configuration options for the **ClearML Python Package** and related options, including storage, +* Dictionary that contains configuration options for the ClearML Python Package and related options, including storage, metrics, network, AWS S3 buckets and credentials, Google Cloud Storage, Azure Storage, log, and development.
@@ -852,7 +852,7 @@ and limitations on bucket naming. **`sdk.development.worker.report_period_sec`** (*integer*) -* For development mode workers, the interval in seconds for a development mode **ClearML** worker to report. +* For development mode workers, the interval in seconds for a development mode ClearML worker to report.
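The `api` options described above typically appear together in `clearml.conf`. A minimal sketch (all URLs and keys are placeholders):

```
api {
    # All three server URLs must use https
    api_server: https://api.MyDomain.com
    web_server: https://app.MyDomain.com
    files_server: https://files.MyDomain.com
    credentials {
        access_key: "placeholder-access-key"
        secret_key: "placeholder-secret-key"
    }
}
```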
diff --git a/docs/deploying_clearml/clearml_server_linux_mac.md b/docs/deploying_clearml/clearml_server_linux_mac.md index c58a2db2..b7454184 100644 --- a/docs/deploying_clearml/clearml_server_linux_mac.md +++ b/docs/deploying_clearml/clearml_server_linux_mac.md @@ -123,7 +123,7 @@ instructions in the [Security](clearml_server_security.md) page. sudo curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml -1. For Linux only, configure the **ClearML Agent Services**. If `CLEARML_HOST_IP` is not provided, then **ClearML Agent Services** will use the external public address of the **ClearML Server**. If `CLEARML_AGENT_GIT_USER` / `CLEARML_AGENT_GIT_PASS` are not provided, then **ClearML Agent Services** will not be able to access any private repositories for running service tasks. +1. For Linux only, configure the **ClearML Agent Services**. If `CLEARML_HOST_IP` is not provided, then **ClearML Agent Services** uses the external public address of the **ClearML Server**. If `CLEARML_AGENT_GIT_USER` / `CLEARML_AGENT_GIT_PASS` are not provided, then **ClearML Agent Services** can't access any private repositories for running service tasks. export CLEARML_HOST_IP=server_host_ip_here export CLEARML_AGENT_GIT_USER=git_username_here diff --git a/docs/deploying_clearml/upgrade_server_linux_mac.md b/docs/deploying_clearml/upgrade_server_linux_mac.md index 84e8d3dc..d3200faa 100644 --- a/docs/deploying_clearml/upgrade_server_linux_mac.md +++ b/docs/deploying_clearml/upgrade_server_linux_mac.md @@ -15,8 +15,8 @@ This documentation page applies to deploying your own open source ClearML Server For Linux only, if upgrading from Trains Server v0.14 or older, configure the ClearML Agent Services. - * If ``CLEARML_HOST_IP`` is not provided, then **ClearML Agent Services** will use the external public address of the **ClearML Server**. 
- * If ``CLEARML_AGENT_GIT_USER`` / ``CLEARML_AGENT_GIT_PASS`` are not provided, then **ClearML Agent Services** will not be able to access any private repositories for running service tasks. + * If ``CLEARML_HOST_IP`` is not provided, then **ClearML Agent Services** uses the external public address of the **ClearML Server**. + * If ``CLEARML_AGENT_GIT_USER`` / ``CLEARML_AGENT_GIT_PASS`` are not provided, then **ClearML Agent Services** can't access any private repositories for running service tasks. export CLEARML_HOST_IP=server_host_ip_here diff --git a/docs/faq.md index a822fa20..d3203a74 100644 --- a/docs/faq.md +++ b/docs/faq.md @@ -228,7 +228,7 @@ To replace the URL of each model, execute the following commands: sudo docker exec -it clearml-mongo /bin/bash ``` -1. Inside the docker shell, create the following script. Make sure to replace `` and ``, +1. Create the following script inside the Docker shell. Make sure to replace `` and ``, as well as the URL protocol if you aren't using `s3`. ```bash cat << EOT >> script.js db.model.find({uri:{$regex:/^s3/}}).forEach(function(e,i) { db.model.save(e);}); EOT ``` - 1. Run the script against the backend DB: ```bash @@ -258,7 +258,7 @@ To fix this, the registered URL of each model needs to be replaced with its curr sudo docker exec -it clearml-mongo /bin/bash ``` -1. Inside the Docker shell, create the following script. Make sure to replace `` and ``, as well as the URL protocol prefixes if you aren't using S3. +1. Create the following script inside the Docker shell. ```bash cat << EOT >> script.js db.model.find({uri:{$regex:/^s3/}}).forEach(function(e,i) { db.model.save(e);}); EOT ``` - + Make sure to replace `` and ``, as well as the URL protocol prefixes if you aren't using S3. 1.
Run the script against the backend DB: ```bash @@ -930,7 +930,7 @@ If a port conflict occurs, change the MongoDB and / or Elastic ports in the `doc To change the MongoDB and / or Elastic ports for your ClearML Server, do the following: 1. Edit the `docker-compose.yml` file. -1. In the `services/trainsserver/environment` section, add the following environment variable(s): +1. Add the following environment variable(s) in the `services/trainsserver/environment` section: * For MongoDB: @@ -994,7 +994,7 @@ Do the following: 1. If a ClearML configuration file (`clearml.conf`) exists, delete it. 1. Open a terminal session. -1. In the terminal session, set the system environment variable to `127.0.0.1`, for example: +1. Set the system environment variable to `127.0.0.1` in the terminal session. For example: * Linux: diff --git a/docs/fundamentals/agents_and_queues.md b/docs/fundamentals/agents_and_queues.md index 856b9bc0..60e6474b 100644 --- a/docs/fundamentals/agents_and_queues.md +++ b/docs/fundamentals/agents_and_queues.md @@ -26,7 +26,7 @@ can allocate several GPUs to an agent and use the rest for a different workload, ## What is a Queue? -A ClearML queue is an ordered list of Tasks scheduled for execution. A queue can be serviced by one or multiple agents. +A ClearML queue is an ordered list of Tasks scheduled for execution. One or multiple agents can service a queue. Agents servicing a queue pull the queued tasks in order and execute them. A ClearML Agent can service multiple queues in either of the following modes: @@ -51,8 +51,8 @@ The diagram above demonstrates a typical flow where an agent executes a task: 1. Set up the python environment and required packages. 1. The task's script/code is executed. -While the agent is running, it continuously reports system metrics to the ClearML Server (these can be monitored in the -[**Workers and Queues**](../webapp/webapp_workers_queues.md) page). 
+While the agent is running, it continuously reports system metrics to the ClearML Server. You can monitor these metrics +in the [**Workers and Queues**](../webapp/webapp_workers_queues.md) page. ## Resource Management Installing an Agent on machines allows it to monitor all the machine's status (GPU / CPU / Memory / Network / Disk IO). diff --git a/docs/fundamentals/hpo.md b/docs/fundamentals/hpo.md index e8fda619..1a4afcfd 100644 --- a/docs/fundamentals/hpo.md +++ b/docs/fundamentals/hpo.md @@ -6,7 +6,7 @@ title: Hyperparameter Optimization Hyperparameters are variables that directly control the behaviors of training algorithms, and have a significant effect on the performance of the resulting machine learning models. Finding the hyperparameter values that yield the best performing models can be complicated. Manually adjusting hyperparameters over the course of many training trials can be -slow and tedious. Luckily, hyperparameter optimization can be automated and boosted using ClearML's +slow and tedious. Luckily, you can automate and boost hyperparameter optimization with ClearML's [**`HyperParameterOptimizer`**](../references/sdk/hpo_optimization_hyperparameteroptimizer.md) class. ## ClearML's HyperParameter Optimization @@ -77,11 +77,12 @@ optimization. ```python from clearml import Task - task = Task.init(project_name='Hyper-Parameter Optimization', - task_name='Automatic Hyper-Parameter Optimization', - task_type=Task.TaskTypes.optimizer, - reuse_last_task_id=False) - + task = Task.init( + project_name='Hyper-Parameter Optimization', + task_name='Automatic Hyper-Parameter Optimization', + task_type=Task.TaskTypes.optimizer, + reuse_last_task_id=False + ) ``` 1. Define the optimization configuration and resources budget: diff --git a/docs/fundamentals/logger.md b/docs/fundamentals/logger.md index 720f68c0..4e667bdd 100644 --- a/docs/fundamentals/logger.md +++ b/docs/fundamentals/logger.md @@ -8,7 +8,7 @@ member of the [Task](task.md) object. 
ClearML integrates with the leading visualization libraries, and automatically captures reports to them. ## Types of Logged Results -In ClearML, there are four types of reports: +ClearML supports four types of reports: - Text - Mostly captured automatically from stdout and stderr but can be logged manually. - Scalars - Time series data. X-axis is always a sequential number, usually iterations but can be epochs or others. - Plots - General graphs and diagrams, such as histograms, confusion matrices, line plots, and custom plotly charts. diff --git a/docs/fundamentals/task.md index 1782b732..804905ac 100644 --- a/docs/fundamentals/task.md +++ b/docs/fundamentals/task.md @@ -14,7 +14,7 @@ information as well as execution outputs. All the information captured by a task is by default uploaded to the [ClearML Server](../deploying_clearml/clearml_server.md) and it can be visualized in the [ClearML WebApp](../webapp/webapp_overview.md) (UI). ClearML can also be configured to upload model checkpoints, artifacts, and charts to cloud storage (see [Storage](../integrations/storage.md)). Additionally, -there is an option to work with tasks in Offline Mode, in which all information is saved in a local folder (see +you can work with tasks in Offline Mode, in which all information is saved in a local folder (see [Storing Task Data Offline](../guides/set_offline.md)). In the UI and code, tasks are grouped into [projects](projects.md), which are logical entities similar to folders. Users can decide diff --git a/docs/getting_started/ds/best_practices.md index cc10954e..787fe0d6 100644 --- a/docs/getting_started/ds/best_practices.md +++ b/docs/getting_started/ds/best_practices.md @@ -16,10 +16,10 @@ The below is only our opinion.
ClearML was designed to fit into any workflow. During early stages of model development, while code is still being modified heavily, this is the usual setup we'd expect to see used by data scientists: - - A local development machine, usually a laptop (and usually using only CPU) with a fraction of the dataset for faster iterations - this is used for writing the training pipeline code, ensuring it knows to parse the data - and there are no glaring bugs. - - A workstation with a GPU, usually with a limited amount of memory for small batch-sizes. This is used to train the model and ensure the model we chose makes sense and that the training - procedure works. Can be used to provide initial models for testing. + - A local development machine, usually a laptop (and usually using only CPU) with a fraction of the dataset for faster + iterations - use this machine for writing and debugging the training pipeline code. + - A workstation with a GPU, usually with a limited amount of memory for small batch-sizes - use this workstation to train + the model, ensure that the chosen model makes sense, and verify that the training procedure works. It can also provide initial models for testing. The above-mentioned setups might be folded into each other and that's great! If you have a GPU machine for each researcher, that's awesome! The goal of this phase is to get a code, dataset and environment setup, so we can start digging to find the best model!
-A Task is also automatically assigned an auto-generated unique identifier (UUID string) that cannot be changed and will always locate the same Task in the system. +A Task is automatically assigned a unique identifier (UUID string) that cannot be changed and always locates the same Task in the system. It's possible to retrieve a Task object programmatically by querying the system based on either the Task ID, or project & name combination. It's also possible to query tasks based on their properties, like Tags. @@ -26,7 +26,7 @@ Once we have a Task object we can query the state of the Task, get its Model, sc For full reproducibility, it's paramount to save Hyperparameters for each experiment. Since Hyperparameters can have substantial impact on Model performance, saving and comparing these between experiments is sometimes the key to understanding model behavior. -ClearML supports logging `argparse` module arguments out of the box, so once integrating it into the code, it will automatically log all parameters provided to the argument parser. +ClearML supports logging `argparse` module arguments out of the box, so once ClearML is integrated into the code, it automatically logs all parameters provided to the argument parser. It's also possible to log parameter dictionaries (very useful when parsing an external config file and storing as a dict object), whole configuration files or even custom objects or [Hydra](https://hydra.cc/docs/intro/) configurations! @@ -46,7 +46,7 @@ Essentially, artifacts are files (or python objects) uploaded from a script and These Artifacts can be easily accessed by the web UI or programmatically. Artifacts can be stored anywhere, either on the ClearML server, or any object storage solution or shared folder. -See all [storage capabilities](../../integrations/storage). +See all [storage capabilities](../../integrations/storage.md).
### Adding Artifacts @@ -84,9 +84,9 @@ local_csv = preprocess_task.artifacts['data'].get_local_copy() ``` `task.artifacts` is a dictionary where the keys are the Artifact names, and the returned object is the Artifact object. -Calling `get_local_copy()` will return a local cached copy of the artifact, -this means that the next time we execute the code we will not need to download the artifact again. -Calling `get()` will get a deserialized pickled object. +Calling `get_local_copy()` returns a local cached copy of the artifact. Therefore, next time we execute the code, we don't +need to download the artifact again. +Calling `get()` gets a deserialized pickled object. Check out the [artifacts retrieval](https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts_retrieval.py) example code. @@ -94,15 +94,15 @@ Check out the [artifacts retrieval](https://github.com/allegroai/clearml/blob/ma Models are a special kind of artifact. Models created by popular frameworks (such as PyTorch, TensorFlow, scikit-learn) are automatically logged by ClearML. -All snapshots are automatically logged, in order to make sure we also automatically upload the model snapshot (instead of saving its local path) +All snapshots are automatically logged. In order to make sure we also automatically upload the model snapshot (instead of saving its local path), we need to pass a storage location for the model files to be uploaded to. -For example uploading all snapshots to our S3 bucket: +For example, upload all snapshots to an S3 bucket: ```python task = Task.init(project_name='examples', task_name='storing model', output_uri='s3://my_models/') ``` -From now on, whenever the framework (TF/Keras/PyTorch etc.) will be storing a snapshot, the model file will automatically get uploaded to our bucket under a specific folder for the experiment. +Now, whenever the framework (TF/Keras/PyTorch etc.)
stores a snapshot, the model file is automatically uploaded to the bucket, under a specific folder for the experiment. Loading models by a framework is also logged by the system; these models appear under the “Input Models” section, under the Artifacts tab. @@ -124,7 +124,7 @@ Like before, we have to get the instance of the Task training the original weight :::note Using TensorFlow, the snapshots are stored in a folder, meaning the `local_weights_path` will point to a folder containing our requested snapshot. ::: -As with Artifacts all models are cached, meaning the next time we will run this code, no model will need to be downloaded. +As with Artifacts, all models are cached, meaning the next time we run this code, no model needs to be downloaded. Once one of the frameworks loads the weights file, the running Task is automatically updated with “Input Model” pointing directly to the original training Task’s Model. This feature allows you to easily get a full genealogy of every model trained and used by your system! @@ -150,7 +150,7 @@ The experiment table is a powerful tool for creating dashboards and views of you ### Creating Leaderboards -The [experiments table](../../webapp/webapp_exp_table.md) can be customized to your own needs, adding desired views of parameters, metrics and tags. +Customize the [experiments table](../../webapp/webapp_exp_table.md) to fit your own needs, adding desired views of parameters, metrics and tags. It's possible to filter and sort based on parameters and metrics, so creating custom views is simple and flexible. Create a dashboard for a project, presenting the latest Models and their accuracy scores, for immediate insights.
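The retrieval flow described above can be sketched end to end. This is an illustration rather than standalone runnable code: it assumes a configured ClearML server, and the project, task, and bucket names are placeholders:

```python
from clearml import Task

# Current experiment; model snapshots are uploaded to the placeholder bucket
task = Task.init(project_name='examples', task_name='using models',
                 output_uri='s3://my_models/')

# Fetch the task that trained the original weights (placeholder names)
train_task = Task.get_task(project_name='examples', task_name='train model')

# Take the last output model snapshot; the download is cached, so
# re-running this code does not download the model again
local_weights_path = train_task.models['output'][-1].get_local_copy()
```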
diff --git a/docs/getting_started/mlops/mlops_first_steps.md b/docs/getting_started/mlops/mlops_first_steps.md
index 1653a07e..f201de7e 100644
--- a/docs/getting_started/mlops/mlops_first_steps.md
+++ b/docs/getting_started/mlops/mlops_first_steps.md
@@ -115,7 +115,7 @@ Task.enqueue(task=cloned_task, queue_name='default')
```
### Advanced Usage
-Before execution, there are a variety of programmatic methods which can be used to manipulate a task object.
+Before execution, you can use a variety of programmatic methods to manipulate a task object.

#### Modify Hyperparameters
[Hyperparameters](../../fundamentals/hyperparameters.md) are an integral part of Machine Learning code as they let you

diff --git a/docs/getting_started/mlops/mlops_second_steps.md b/docs/getting_started/mlops/mlops_second_steps.md
index 18d903bc..d74d0a64 100644
--- a/docs/getting_started/mlops/mlops_second_steps.md
+++ b/docs/getting_started/mlops/mlops_second_steps.md
@@ -7,7 +7,10 @@ Pipelines provide users with a greater level of abstraction and automation, with

Tasks can interface with other Tasks in the pipeline and leverage other Tasks' work products.

-We'll go through a scenario where users create a Dataset, process the data then consume it with another task, all running as a pipeline.
+The sections below describe the following scenarios:
+* Dataset creation
+* Data processing and consumption
+* Pipeline building

## Building Tasks

@@ -56,11 +59,11 @@ dataset.tags = []
new_dataset.tags = ['latest']
```
-We passed the `parents` argument when we created v2 of the Dataset, this inherits all the parent's version content.
-This will not only help us in tracing back dataset changes with full genealogy, but will also make our storage more efficient,
-as it will only store the files that were changed / added from the parent versions.
-When we will later need access to the Dataset it will automatically merge the files from all parent versions
-in a fully automatic and transparent process, as if they were always part of the requested Dataset.
+We passed the `parents` argument when we created v2 of the Dataset, which inherits all the parent's version content.
+This not only helps trace back dataset changes with full genealogy, but also makes our storage more efficient,
+since it only stores the changed and/or added files from the parent versions.
+When we access the Dataset, it automatically merges the files from all parent versions
+in a fully automatic and transparent process, as if the files were always part of the requested Dataset.

### Training
We can now train our model with the **latest** Dataset we have in the system.

diff --git a/docs/guides/distributed/distributed_pytorch_example.md b/docs/guides/distributed/distributed_pytorch_example.md
index 38ca3221..9c46edac 100644
--- a/docs/guides/distributed/distributed_pytorch_example.md
+++ b/docs/guides/distributed/distributed_pytorch_example.md
@@ -17,8 +17,7 @@ dataset), and reports (uploads) the following to the main Task:

Each Task in a subprocess references the main Task by calling [Task.current_task](../../references/sdk/task#taskcurrent_task),
which always returns the main Task.

-When the script runs, it creates an experiment named `test torch distributed`, which is associated with the `examples` project
-in the **ClearML Web UI**.
+When the script runs, it creates an experiment named `test torch distributed`, which is associated with the `examples` project.
## Artifacts

diff --git a/docs/guides/ide/remote_jupyter_tutorial.md b/docs/guides/ide/remote_jupyter_tutorial.md
index b1527378..808eb9b9 100644
--- a/docs/guides/ide/remote_jupyter_tutorial.md
+++ b/docs/guides/ide/remote_jupyter_tutorial.md
@@ -32,7 +32,7 @@ clearml-session --docker nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 --packages
* Specify the resource queue `--queue default`.

:::note
-There is an option to enter a project name using `--project `. If no project is input, the default project
+Enter a project name using `--project `. If no project is input, the default project
name is "DevOps".
:::

diff --git a/docs/guides/services/aws_autoscaler.md b/docs/guides/services/aws_autoscaler.md
index 4317654c..b1847c3e 100644
--- a/docs/guides/services/aws_autoscaler.md
+++ b/docs/guides/services/aws_autoscaler.md
@@ -6,7 +6,7 @@ The ClearML [AWS autoscaler example](https://github.com/allegroai/clearml/blob/m
demonstrates how to use the [`clearml.automation.auto_scaler`](https://github.com/allegroai/clearml/blob/master/clearml/automation/auto_scaler.py)
module to implement a service that optimizes AWS EC2 instance scaling according to a defined instance budget.

-It periodically polls your AWS cluster and automatically stops idle instances based on a defined maximum idle time or spins
+The autoscaler periodically polls your AWS cluster and automatically stops idle instances based on a defined maximum idle time or spins
up new instances when there aren't enough to execute pending tasks.

## Running the ClearML AWS Autoscaler

diff --git a/docs/hyperdatasets/annotations.md b/docs/hyperdatasets/annotations.md
index 090d3748..bb7df80f 100644
--- a/docs/hyperdatasets/annotations.md
+++ b/docs/hyperdatasets/annotations.md
@@ -38,7 +38,7 @@ frame.add_annotation(box2d_xywh=(10, 10, 30, 20), labels=['test'])

The `box2d_xywh` argument specifies the coordinates of the annotation's bounding box, and the `labels` argument specifies
a list of labels for the annotation.
-When adding an annotation there are a few options for entering the annotation's boundaries, including:
+Enter the annotation's boundaries in one of the following ways:

* `poly2d_xy` - A list of floating points (x,y) to create a single polygon, or a list of floating-point lists for a complex polygon.
* `ellipse2d_xyrrt` - A list consisting of cx, cy, rx, ry, and theta for an ellipse.

diff --git a/docs/hyperdatasets/masks.md b/docs/hyperdatasets/masks.md
index a3bc8426..21e7236b 100644
--- a/docs/hyperdatasets/masks.md
+++ b/docs/hyperdatasets/masks.md
@@ -8,7 +8,7 @@ source data to the ClearML Enterprise platform. That source data is a **mask**.

Masks are used in deep learning for semantic segmentation. Masks correspond to raw data where the objects to be detected are marked with colors in the masks. The colors
-are RGB values and represent the objects, which are labeled for segmentation.
+are RGB values and represent the objects that are labeled for segmentation.

In frames used for semantic segmentation, the metadata connecting the mask files / images to the ClearML Enterprise
platform, and the RGB values and labels used for segmentation are separate. They are contained in two different dictionaries of

diff --git a/docs/webapp/webapp_exp_comparing.md b/docs/webapp/webapp_exp_comparing.md
index 89e61446..46d9ccaa 100644
--- a/docs/webapp/webapp_exp_comparing.md
+++ b/docs/webapp/webapp_exp_comparing.md
@@ -1,8 +1,7 @@
---
title: Comparing Experiments
---
-It is always useful to be able to do some forensics on what causes an experiment to succeed and to better understand
-performance issues.
+It is always useful to investigate what causes an experiment to succeed, and to better understand performance issues.
The **ClearML Web UI** provides a deep experiment comparison, allowing you to locate, visualize, and analyze differences including:
* [Details](#details)

diff --git a/docs/webapp/webapp_exp_table.md b/docs/webapp/webapp_exp_table.md
index 81f8c57c..04969c0a 100644
--- a/docs/webapp/webapp_exp_table.md
+++ b/docs/webapp/webapp_exp_table.md
@@ -170,7 +170,7 @@ Click the checkbox in the top left corner of the table to select all items curre
An extended bulk selection tool is available through the down arrow next to the checkbox in the top left corner, enabling
selecting items beyond the items currently on-screen:
* **All** - Select all experiments in the project
-* **None** - Clear Selection
+* **None** - Clear selection
* **Filtered** - Select **all experiments in the project** that match the current active filters in the project

## Creating an Experiment Leaderboard

diff --git a/docs/webapp/webapp_project_overview.md b/docs/webapp/webapp_project_overview.md
index cde026da..b9fb1a4b 100644
--- a/docs/webapp/webapp_project_overview.md
+++ b/docs/webapp/webapp_project_overview.md
@@ -11,7 +11,7 @@ meaning that it's the first thing that is seen when opening the project.

## Metric Snapshot

-On the top of the **OVERVIEW** tab, there is an option to display a **metric snapshot**. Choose a metric and variant,
+At the top of the **OVERVIEW** tab, you can display a **metric snapshot**. Choose a metric and variant,
and then the window will present an aggregated view of the value for that metric and the time that each experiment
scored that value. This way, the project's progress can be quickly deduced.
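The Dataset versioning behavior described in the `mlops_second_steps.md` hunk earlier in this patch, where a child version stores only the files it changed or added, and accessing it transparently merges the files of all parent versions, can be sketched in plain Python. This is a conceptual illustration under assumed names; `DatasetVersion` and `resolve` are hypothetical and are not the ClearML `Dataset` API.

```python
class DatasetVersion:
    """Hypothetical sketch: a dataset version that stores only its own
    changed/added files and merges parent content on access."""

    def __init__(self, files, parent=None):
        self.files = dict(files)  # only the files changed/added in this version
        self.parent = parent      # optional parent version to inherit from

    def resolve(self):
        # Walk the parent chain first, then let this version's files
        # override or extend the inherited content.
        merged = self.parent.resolve() if self.parent else {}
        merged.update(self.files)
        return merged

# v1 holds two files; v2 changes b.csv and adds c.csv, storing only those two
v1 = DatasetVersion({"a.csv": "v1", "b.csv": "v1"})
v2 = DatasetVersion({"b.csv": "v2", "c.csv": "v2"}, parent=v1)

# Accessing v2 yields the merged view, as if a.csv were always part of it
assert v2.resolve() == {"a.csv": "v1", "b.csv": "v2", "c.csv": "v2"}
```

Storage stays efficient because `v2.files` holds only the two files that differ from its parent, while `resolve()` reconstructs the full dataset on demand.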