Merge branch 'allegroai:main' into main

2025-06-26 18:17:44 +00:00 · 2021-08-18 15:42:29 +03:00
parent 2d22f21ec7 5b51117434
commit f295d639af
4 changed files with 33 additions and 6 deletions
--- a/docs/apps/clearml_session.md
+++ b/docs/apps/clearml_session.md
@@ -12,12 +12,13 @@ in the UI and send it for long-term training on a remote machine.
 **If you are not that lucky**, this section is for you :)

 ## What does ClearML Session do?
-`clearml-session` is a feature that allows to launch a session of Jupyterlab and VS Code, and to execute code on a remote 
+`clearml-session` is a feature that lets you launch a JupyterLab and VS Code session and execute code on a remote 
 machine that better meets resource needs. With this feature, local links are provided, which can be used to access 
-JupyterLab and VS Code on a remote machine over a secure and encrypted SSH connection.
+JupyterLab and VS Code on a remote machine over a secure and encrypted SSH connection. By default, the JupyterLab and 
+VS Code remote sessions use ports 8878 and 8898 respectively. 

 <details className="cml-expansion-panel screenshot">
-<summary className="cml-expansion-panel-summary">Jupyter-Lab Window</summary>
+<summary className="cml-expansion-panel-summary">JupyterLab Window</summary>
 <div className="cml-expansion-panel-content">

 ![image](../img/session_jupyter.png)
@@ -138,7 +139,7 @@ The Task must be connected to a git repository, since currently single script de

 | Command line options | Description | Default value |
 |-----|---|---|
-| `--jupyter-lab` | Download a Jupyter-Lab environment | `true` |
+| `--jupyter-lab` | Download a JupyterLab environment | `true` |
 | `--vscode-server` | Download a VSCode environment | `true` |
 | `--public-ip` | Register the public IP of the remote machine (if you are running the session on a public cloud) | Session runs on the machine whose agent is executing the session|
 | `--init-script` | Specify a BASH init script file to be executed when the interactive session is being set up | `none` or previously entered BASH script |
--- a/docs/deploying_clearml/clearml_server_config.md
+++ b/docs/deploying_clearml/clearml_server_config.md
@@ -300,6 +300,7 @@ the watchdog marks them as `aborted`. The non-responsive experiment watchdog is

 Modify the following settings for the watchdog:

+* The watchdog status (enabled or disabled).
 * The time threshold (in seconds) of experiment inactivity (default value is 7200 seconds (2 hours)).
 * The time interval (in seconds) between watchdog cycles.
 
@@ -312,6 +313,8 @@ Modify the following settings for the watchdog:

        tasks {
            non_responsive_tasks_watchdog {
+                enabled: true
+
                # In-progress tasks that haven't been updated for at least 'value' seconds will be stopped by the watchdog
                threshold_sec: 7200
        
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -94,6 +94,7 @@ title: FAQ
 * [How do I bypass a proxy configuration to access my local ClearML Server?](#proxy-localhost)
 * [Trains is failing to update ClearML Server. I get an error 500 (or 400). How do I fix this?](#elastic_watermark)
 * [Why is my Trains Web-App (UI) not showing any data?](#web-ui-empty)
+* [Why can't I access my ClearML Server when I run my code in a virtual machine?](#vm_server)

 **ClearML Agent**

@@ -816,7 +817,7 @@ Do the following:

 <br/>

-**The ClearML Server keeps returning HTTP 500 (or 400) errors. How do I fix this?**
+**The ClearML Server keeps returning HTTP 500 (or 400) errors. How do I fix this?** <a id="elastic_watermark"></a>

 The ClearML Server will return HTTP error responses (5XX, or 4XX) when some of its [backend components](deploying_clearml/clearml_server.md) 
 are failing. 
@@ -839,6 +840,28 @@ A likely indication of this situation can be determined by searching your clearm

 If your ClearML Web-App (UI) does not show anything, it may be an error authenticating with the server. Try clearing the application cookies for the site in your browser's developer tools. 
    
+**Why can't I access my ClearML Server when I run my code in a virtual machine?** <a id="vm_server"></a>
+
+The network configuration inside a virtual machine (or container) differs from that of the host. The virtual machine 
+and the server machine have different IP addresses, so make sure that the machine executing the 
+experiment can reach the server machine. 
+
+Use a separate configuration file for the virtual machine where you run your experiments. 
+Edit the `api` section of its `clearml.conf` file and insert IP addresses of the server machine that are reachable 
+from the VM. It should look something like this:
+
+```
+api {
+    web_server: http://192.168.1.2:8080
+    api_server: http://192.168.1.2:8008
+    credentials {
+        "access_key" = "KEY"
+        "secret_key" = "SECRET"
+    }
+}
+```
+
+
 ## ClearML Agent

 **How can I execute ClearML Agent without installing packages each time?** <a className="tr_top_negative" id="system_site_packages"></a>
--- a/docs/getting_started/mlops/mlops_first_steps.md
+++ b/docs/getting_started/mlops/mlops_first_steps.md
@@ -30,7 +30,7 @@ pip install clearml-agent
 Connect the Agent to the server by [creating credentials](https://app.community.clear.ml/profile), then run this:

 ```bash
-clearml-init
+clearml-agent init
 ```

 :::note