Merge branch 'allegroai:main' into uri_links

2025-06-26 18:17:44 +00:00 · 2021-08-22 12:19:54 +03:00 · 2021-08-22 12:19:54 +03:00 · 5ac6108623
commit 5ac6108623
parent 29c9dc5b11 d5942a025c
7 changed files with 72 additions and 16 deletions
--- a/docs/apps/clearml_session.md
+++ b/docs/apps/clearml_session.md
@@ -12,12 +12,13 @@ in the UI and send it for long-term training on a remote machine.
 **If you are not that lucky**, this section is for you :)

 ## What does ClearML Session do?
-`clearml-session` is a feature that allows to launch a session of Jupyterlab and VS Code, and to execute code on a remote 
+`clearml-session` is a feature that lets you launch a JupyterLab and VS Code session and execute code on a remote 
 machine that better meets resource needs. With this feature, local links are provided, which can be used to access 
-JupyterLab and VS Code on a remote machine over a secure and encrypted SSH connection.
+JupyterLab and VS Code on a remote machine over a secure and encrypted SSH connection. By default, the JupyterLab and 
+VS Code remote sessions use ports 8878 and 8898 respectively. 

 <details className="cml-expansion-panel screenshot">
-<summary className="cml-expansion-panel-summary">Jupyter-Lab Window</summary>
+<summary className="cml-expansion-panel-summary">JupyterLab Window</summary>
 <div className="cml-expansion-panel-content">

 ![image](../img/session_jupyter.png)
@@ -138,7 +139,7 @@ The Task must be connected to a git repository, since currently single script de

 | Command line options | Description | Default value |
 |-----|---|---|
-| `--jupyter-lab` | Download a Jupyter-Lab environment | `true` |
+| `--jupyter-lab` | Download a JupyterLab environment | `true` |
 | `--vscode-server` | Download a VSCode environment | `true` |
 | `--public-ip` | Register the public IP of the remote machine (if you are running the session on a public cloud) | Session runs on the machine whose agent is executing the session|
 | `--init-script` | Specify a BASH init script file to be executed when the interactive session is being set up | `none` or previously entered BASH script |
--- a/docs/apps/clearml_task.md
+++ b/docs/apps/clearml_task.md
@@ -6,7 +6,7 @@ ClearML Task is ClearML's Zero Code Integration Module. Using only the command l
 you can easily track your work and integrate ClearML with your existing code.

 `clearml-task` automatically integrates ClearML into any script or **any** python repository. `clearml-task` has the option 
-to send the Task to a queue, where a **ClearML Agent** listening to the queue will fetch the Task it and executes it on a 
+to send the task to a queue, where a **ClearML Agent** listening to the queue will fetch the task and execute it on a 
 remote or local machine. It's even possible to provide command line arguments and provide Python module dependencies and requirements.txt file! 

 ## How Does ClearML Task Work?
@@ -14,8 +14,8 @@ remote or local machine. It's even possible to provide command line arguments an
 1. Execute `clearml-task`, pointing it to your script or repository, and optionally an execution queue. 
 1. `clearml-task` does its magic! It creates a new experiment on the [ClearML Server](../deploying_clearml/clearml_server.md), 
   and, if a queue was specified, it sends the experiment to the queue to be fetched and executed by a **ClearML Agent**.
-1. The command line will provide you with a link to your Task's page in the ClearML web UI, 
-   where you will be able to view the Task's details. 
+1. The command line will provide you with a link to your task's page in the ClearML web UI, 
+   where you will be able to view the task's details. 
   
 ## Features and Options
 ### Docker
@@ -24,12 +24,12 @@ The ClearML Agent will pull it from dockerhub or a docker artifactory automatica

 ### Package Dependencies
 If the local script requires packages to be installed or the remote repository doesn't have a requirements.txt file,
-specify manually the required python packages using<br/>
+manually specify the required Python packages using <br/>
 `--packages "<package_name>"`, for example `--packages "keras" "tensorflow>2.2"`.

 ### Queue
-Tasks are passed to ClearML Agents via [Queues](../fundamentals/agents_and_queues.md). Specify a queue to enqueue the Task to.
-If a queue isn't chosen in the `clearml-task` command, the Task will not be executed; it will be left in draft mode,
+Tasks are passed to ClearML Agents via [Queues](../fundamentals/agents_and_queues.md). Specify a queue to enqueue the task to.
+If a queue isn't chosen in the `clearml-task` command, the task will not be executed; it will be left in draft mode,
 and can be enqueued at a later point. 

 ### Branch and Working Directory
@@ -37,5 +37,34 @@ A specific branch and commit ID, other than latest commit in master, to be execu
 `--branch <branch_name> --commit <commit_id>` flags.
 If unspecified, `clearml-task` will use the latest commit from the master branch.

-Learn how to use the `clearml-task` feature [here](../guides/clearml-task/clearml_task_tutorial.md).
+### Command line options

+<div className="tbl-cmd">
+
+|Name | Description| Optional |
+|---|----|---|
+| `--version` | Display the `clearml-task` utility version | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--project`| Set the project name for the task (Required, unless using `--base-task-id`) | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
+| `--name` | Select a name for the remote task | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
+| `--repo` | URL of remote repository. Example: `--repo https://github.com/allegroai/clearml.git` | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--branch` | Select specific repository branch / tag. By default, latest commit from the master branch | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--commit` | Select specific commit ID to use. By default, latest commit, or local commit ID when using local repository | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--folder` | Remotely execute the code in a local folder. Notice! It assumes a git repository already exists. Current state of the repo (commit ID and uncommitted changes) is logged and will be replicated on the remote machine | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> | 
+| `--script` | Entry point script for the remote execution. When used in tandem with `--repo`, the script should be a relative path inside the repository. For example: `--script source/train.py`. When used with `--folder`, it supports a direct path to a file inside the local repository itself, for example: `--script ~/project/source/train.py` | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
+| `--cwd` | Working directory to launch the script from. Relative to repo root or local `--folder` | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--args` | Arguments to pass to the remote task, list of `<argument>=<value>` strings. Currently only argparse arguments are supported. Example: `--args lr=0.003 batch_size=64` | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--queue` | Select task's execution queue. If not provided, a task will be created but it will not be launched | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--requirements` | Specify a `requirements.txt` file to install when setting up the session. By default, the `requirements.txt` from the repository will be used | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--packages` | Manually specify a list of required packages. Example: `--packages "tqdm>=2.1" "scikit-learn"` | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--docker` | Select the docker image to use in the remote task | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--docker_args` | Add docker arguments, pass a single string | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--docker_bash_setup_script` | Add bash script to be executed inside the docker before setting up the task's environment | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--output-uri` | Set the task `output_uri`, the upload destination for task models and artifacts | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--task-type` | Set the task type. Optional values: training, testing, inference, data_processing, application, monitor, controller, optimizer, service, qc, custom | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--skip-task-init` | If set, `Task.init()` call is not added to the entry point, and is assumed to be called within the script. Default: Add `Task.init()` call to entry point script | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+| `--base-task-id` | Use a pre-existing task in the system, instead of a local repo / script. Essentially clones an existing task and overrides arguments / requirements | <img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" /> |
+
+</div>
+
+## Tutorial
+Learn how to use the `clearml-task` feature [here](../guides/clearml-task/clearml_task_tutorial.md).
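
Reviewer sketch (not part of this commit): the options in the new table can be combined into a single `clearml-task` invocation. The Python below only assembles and prints such a command line; it does not run it. The flag spellings come from the table above, while the helper name `build_clearml_task_command` and the project, task, and queue names are placeholders chosen for illustration.

```python
import shlex


def build_clearml_task_command(project, name, script, repo=None, branch=None,
                               queue=None, args=None, packages=None):
    """Assemble a clearml-task argv list from the options documented in the table."""
    cmd = ["clearml-task", "--project", project, "--name", name, "--script", script]
    if repo:
        cmd += ["--repo", repo]
    if branch:
        cmd += ["--branch", branch]
    if args:
        # --args takes a list of <argument>=<value> strings
        cmd += ["--args"] + [f"{key}={value}" for key, value in args.items()]
    if packages:
        cmd += ["--packages"] + list(packages)
    if queue:
        # Without --queue the task is created but not launched
        cmd += ["--queue", queue]
    return cmd


# Placeholder values; only the flag names are taken from the documentation.
command = build_clearml_task_command(
    project="examples",
    name="remote run",
    script="source/train.py",
    repo="https://github.com/allegroai/clearml.git",
    branch="master",
    queue="default",
    args={"lr": 0.003, "batch_size": 64},
    packages=["keras", "tensorflow>2.2"],
)
print(shlex.join(command))
```

`shlex.join` quotes values such as `tensorflow>2.2`, so the printed line can be pasted into a shell without the `>` being read as a redirect.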
--- a/docs/deploying_clearml/clearml_server_config.md
+++ b/docs/deploying_clearml/clearml_server_config.md
@@ -300,6 +300,7 @@ the watchdog marks them as `aborted`. The non-responsive experiment watchdog is

 Modify the following settings for the watchdog:

+* Watchdog status - enabled / disabled
 * The time threshold (in seconds) of experiment inactivity (default value is 7200 seconds (2 hours)).
 * The time interval (in seconds) between watchdog cycles.
 
@@ -312,6 +313,8 @@ Modify the following settings for the watchdog:

        tasks {
            non_responsive_tasks_watchdog {
+                enabled: true
+
                # In-progress tasks that haven't been updated for at least 'value' seconds will be stopped by the watchdog
                threshold_sec: 7200
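
Reviewer sketch (not part of this commit): the rule that the `enabled` and `threshold_sec` settings describe can be illustrated in a few lines of Python. This is not the ClearML Server implementation; the function name `find_non_responsive` and the `last_update_by_task` mapping are hypothetical and exist only to show how the threshold is applied.

```python
import time


def find_non_responsive(last_update_by_task, threshold_sec=7200, enabled=True, now=None):
    """Return IDs of tasks whose last update is at least threshold_sec seconds old.

    last_update_by_task maps a task ID to its last-update time (seconds since the epoch).
    """
    if not enabled:
        # A disabled watchdog never stops anything
        return []
    now = time.time() if now is None else now
    return [
        task_id
        for task_id, last_update in last_update_by_task.items()
        if now - last_update >= threshold_sec
    ]


# Hypothetical data: task "a" was last updated 9000 s ago, task "b" 1000 s ago.
stale = find_non_responsive({"a": 1000, "b": 9000}, threshold_sec=7200, now=10000)
print(stale)
```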
        
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -94,6 +94,7 @@ title: FAQ
 * [How do I bypass a proxy configuration to access my local ClearML Server?](#proxy-localhost)
 * [Trains is failing to update ClearML Server. I get an error 500 (or 400). How do I fix this?](#elastic_watermark)
 * [Why is my Trains Web-App (UI) not showing any data?](#web-ui-empty)
+* [Why can't I access my ClearML Server when I run my code in a virtual machine?](#vm_server)

 **ClearML Agent**

@@ -816,7 +817,7 @@ Do the following:

 <br/>

-**The ClearML Server keeps returning HTTP 500 (or 400) errors. How do I fix this?**
+**The ClearML Server keeps returning HTTP 500 (or 400) errors. How do I fix this?** <a id="elastic_watermark"></a>

 The ClearML Server will return HTTP error responses (5XX, or 4XX) when some of its [backend components](deploying_clearml/clearml_server.md) 
 are failing. 
@@ -839,6 +840,28 @@ A likely indication of this situation can be determined by searching your clearm

 If your ClearML Web-App (UI) does not show anything, it may be an error authenticating with the server. Try clearing the application cookies for the site in your browser's developer tools. 
    
+**Why can't I access my ClearML Server when I run my code in a virtual machine?** <a id="vm_server"></a>
+
+The network definitions inside a virtual machine (or container) are different from those of the host. The virtual machine's 
+and the server machine's IP addresses are different, so you have to make sure that the machine that is executing the 
+experiment can access the server's machine. 
+
+Make sure to have an independent configuration file for the virtual machine where you are running your experiments. 
+Edit the `api` section of your `clearml.conf` file and insert IP addresses of the server machine that are accessible 
+from the VM. It should look something like this:
+
+```
+api {
+    web_server: http://192.168.1.2:8080
+    api_server: http://192.168.1.2:8008
+    credentials {
+        "access_key" = "KEY"
+        "secret_key" = "SECRET"
+    }
+}
+```
+
+
 ## ClearML Agent

 **How can I execute ClearML Agent without installing packages each time?** <a className="tr_top_negative" id="system_site_packages"></a>
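
Reviewer sketch (not part of this commit): before editing `clearml.conf` as the new FAQ entry describes, it can help to confirm from inside the VM that the server address is reachable at all. The helper below only attempts a TCP connection to the host and port in a URL; it does not validate credentials or speak the ClearML API. The name `can_reach` is hypothetical.

```python
import socket
from urllib.parse import urlparse


def can_reach(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts
        return False
```

Run inside the VM, `can_reach("http://192.168.1.2:8008")` and `can_reach("http://192.168.1.2:8080")` would check the example `api_server` and `web_server` addresses from the FAQ entry; a `False` result suggests a network or addressing problem rather than a ClearML configuration problem.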
--- a/docs/getting_started/mlops/mlops_first_steps.md
+++ b/docs/getting_started/mlops/mlops_first_steps.md
@@ -30,7 +30,7 @@ pip install clearml-agent
 Connect the Agent to the server by [creating credentials](https://app.community.clear.ml/profile), then run this:

 ```bash
-clearml-init
+clearml-agent init
 ```

 :::note
--- a/docs/guides/frameworks/pytorch/pytorch_mnist.md
+++ b/docs/guides/frameworks/pytorch/pytorch_mnist.md
@@ -9,7 +9,7 @@ The example script does the following:
 * Trains a simple deep neural network on the PyTorch built-in [MNIST](https://pytorch.org/docs/stable/torchvision/datasets.html#mnist)
  dataset.
 * Uses **ClearML** automatic logging. 
-* Calls the [Logger.report_scalar](../../../references/sdk/logger.md#report_scalar) method to demonstrate explicit reporting and explicit reporting, 
+* Calls the [Logger.report_scalar](../../../references/sdk/logger.md#report_scalar) method to demonstrate explicit reporting, 
  which allows adding customized reporting to the code.
 * Creates an experiment named `pytorch mnist train`, which is associated with the `examples` project.

--- a/sidebars.js
+++ b/sidebars.js
@@ -59,7 +59,7 @@ module.exports = {
            'guides/guidemain',
            {'Automation': ['guides/automation/manual_random_param_search_example', 'guides/automation/task_piping']},
            {'Data Management': ['guides/data management/data_man_simple', 'guides/data management/data_man_folder_sync', 'guides/data management/data_man_cifar_classification']},
-            {'Clearml Task': ['guides/clearml-task/clearml_task_tutorial']},
+            {'ClearML Task': ['guides/clearml-task/clearml_task_tutorial']},
            {'Distributed': ['guides/distributed/distributed_pytorch_example', 'guides/distributed/subprocess_example']},
            {'Docker': ['guides/docker/extra_docker_shell_script']},
            {'Frameworks': [
@@ -106,7 +106,7 @@ module.exports = {

    ],
    rnSidebar: {
-        'Release Notes': ['release_notes/ver_1_0', 'release_notes/ver_0_17', 'release_notes/ver_0_16', 'release_notes/ver_0_15', 'release_notes/ver_0_14',
+        'Release Notes': ['release_notes/ver_1_1', 'release_notes/ver_1_0', 'release_notes/ver_0_17', 'release_notes/ver_0_16', 'release_notes/ver_0_15', 'release_notes/ver_0_14',
            'release_notes/ver_0_13', 'release_notes/ver_0_12', 'release_notes/ver_0_11', 'release_notes/ver_0_10',
            'release_notes/ver_0_9',
        ],