Mirror of https://github.com/clearml/clearml-docs (synced 2025-06-26 18:17:44 +00:00)
Merge branch 'allegroai:main' into main
@@ -12,12 +12,13 @@ in the UI and send it for long-term training on a remote machine.
**If you are not that lucky**, this section is for you :)

## What does ClearML Session do?
-`clearml-session` is a feature that allows to launch a session of Jupyterlab and VS Code, and to execute code on a remote
+`clearml-session` is a feature that allows to launch a session of JupyterLab and VS Code, and to execute code on a remote
machine that better meets resource needs. With this feature, local links are provided, which can be used to access
-JupyterLab and VS Code on a remote machine over a secure and encrypted SSH connection.
+JupyterLab and VS Code on a remote machine over a secure and encrypted SSH connection. By default, the JupyterLab and
+VS Code remote sessions use ports 8878 and 8898 respectively.

<details className="cml-expansion-panel screenshot">
-<summary className="cml-expansion-panel-summary">Jupyter-Lab Window</summary>
+<summary className="cml-expansion-panel-summary">JupyterLab Window</summary>
<div className="cml-expansion-panel-content">

@@ -138,7 +139,7 @@ The Task must be connected to a git repository, since currently single script de

| Command line options | Description | Default value |
|-----|---|---|
-| `--jupyter-lab` | Download a Jupyter-Lab environment | `true` |
+| `--jupyter-lab` | Download a JupyterLab environment | `true` |
| `--vscode-server` | Download a VSCode environment | `true` |
| `--public-ip` | Register the public IP of the remote machine (if you are running the session on a public cloud) | Session runs on the machine whose agent is executing the session|
| `--init-script` | Specify a BASH init script file to be executed when the interactive session is being set up | `none` or previously entered BASH script |
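As an illustration of the options table in the hunk above, here is a minimal Python sketch that assembles a `clearml-session` command line. The flag names come from the table; the helper `build_session_command`, the script path `./setup.sh`, and passing booleans as the literal strings `true`/`false` are assumptions made for this example, so check `clearml-session --help` before relying on the exact syntax.

```python
import shlex


def build_session_command(jupyter_lab=True, vscode_server=True,
                          public_ip=False, init_script=None):
    """Assemble a clearml-session argument list (hypothetical helper).

    Assumption: boolean options are passed as the strings "true"/"false".
    """
    cmd = [
        "clearml-session",
        "--jupyter-lab", str(jupyter_lab).lower(),
        "--vscode-server", str(vscode_server).lower(),
    ]
    if public_ip:
        # Only relevant when the remote machine runs on a public cloud.
        cmd += ["--public-ip", "true"]
    if init_script is not None:
        # Path to a BASH script executed while the session is set up.
        cmd += ["--init-script", init_script]
    return cmd


# Example: JupyterLab only, with a (hypothetical) init script.
command = build_session_command(vscode_server=False, init_script="./setup.sh")
print(shlex.join(command))
# clearml-session --jupyter-lab true --vscode-server false --init-script ./setup.sh
```

The list form can be handed to `subprocess.run(command)` on a machine where `clearml-session` is installed; the sketch itself only builds and prints the command.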
@@ -300,6 +300,7 @@ the watchdog marks them as `aborted`. The non-responsive experiment watchdog is

Modify the following settings for the watchdog:

* Watchdog status - enabled / disabled
* The time threshold (in seconds) of experiment inactivity (default value is 7200 seconds (2 hours)).
* The time interval (in seconds) between watchdog cycles.
@@ -312,6 +313,8 @@ Modify the following settings for the watchdog:

tasks {
    non_responsive_tasks_watchdog {
        enabled: true

        # In-progress tasks that haven't been updated for at least 'value' seconds will be stopped by the watchdog
        threshold_sec: 7200
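To make the `threshold_sec` setting in the hunk above concrete, the following Python sketch models one watchdog cycle. It is not the ClearML Server implementation: the `Task` class, its field names, and the `run_watchdog_cycle` function are hypothetical, and only the 7200-second default and the `aborted` status are taken from the documentation text.

```python
import time
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical, simplified stand-in for a server-side task record.
    task_id: str
    status: str
    last_update: float  # Unix timestamp of the most recent update


def run_watchdog_cycle(tasks, threshold_sec=7200, now=None):
    """Abort in-progress tasks that have been idle for at least threshold_sec."""
    now = time.time() if now is None else now
    aborted = []
    for task in tasks:
        idle_sec = now - task.last_update
        if task.status == "in_progress" and idle_sec >= threshold_sec:
            task.status = "aborted"
            aborted.append(task.task_id)
    return aborted


# Example with a fixed clock so the result is deterministic.
tasks = [
    Task("stalled", "in_progress", last_update=1_000),  # idle for 9000 s
    Task("active", "in_progress", last_update=5_000),   # idle for 5000 s
    Task("done", "completed", last_update=0),           # not in progress
]
print(run_watchdog_cycle(tasks, threshold_sec=7200, now=10_000))
# ['stalled']
```

A real watchdog would repeat this cycle, sleeping for the configured interval between runs.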
25
docs/faq.md
@@ -94,6 +94,7 @@ title: FAQ
* [How do I bypass a proxy configuration to access my local ClearML Server?](#proxy-localhost)
* [Trains is failing to update ClearML Server. I get an error 500 (or 400). How do I fix this?](#elastic_watermark)
* [Why is my Trains Web-App (UI) not showing any data?](#web-ui-empty)
+* [Why can't I access my ClearML Server when I run my code in a virtual machine?](#vm_server)

**ClearML Agent**
@@ -816,7 +817,7 @@ Do the following:

<br/>

-**The ClearML Server keeps returning HTTP 500 (or 400) errors. How do I fix this?**
+**The ClearML Server keeps returning HTTP 500 (or 400) errors. How do I fix this?** <a id="elastic_watermark"></a>

The ClearML Server will return HTTP error responses (5XX, or 4XX) when some of its [backend components](deploying_clearml/clearml_server.md)
are failing.
@@ -839,6 +840,28 @@ A likely indication of this situation can be determined by searching your clearm

If your ClearML Web-App (UI) does not show anything, it may be an error authenticating with the server. Try clearing the application cookies for the site in your browser's developer tools.

+**Why can't I access my ClearML Server when I run my code in a virtual machine?** <a id="vm_server"></a>
+
+The network definitions inside a virtual machine (or container) are different from those of the host. The virtual machine's
+and the server machine's IP addresses are different, so you have to make sure that the machine that is executing the
+experiment can access the server's machine.
+
+Make sure to have an independent configuration file for the virtual machine where you are running your experiments.
+Edit the `api` section of your `clearml.conf` file and insert IP addresses of the server machine that are accessible
+from the VM. It should look something like this:
+
+```
+api {
+    web_server: http://192.168.1.2:8080
+    api_server: http://192.168.1.2:8008
+    credentials {
+        "access_key" = "KEY"
+        "secret_key" = "SECRET"
+    }
+}
+```
+
+
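Before editing `clearml.conf` inside the VM, it can help to confirm that the server addresses are reachable from there at all. The sketch below is one possible check using only the Python standard library: it opens a plain TCP connection to the host and port of each URL. The function name `can_reach` and the 3-second timeout are choices made for this example, and a successful connection only shows that the port is open, not that the credentials are valid.

```python
import socket
from urllib.parse import urlparse


def can_reach(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

Run from inside the VM, `can_reach("http://192.168.1.2:8008")` and `can_reach("http://192.168.1.2:8080")` would test the example `api_server` and `web_server` addresses shown in the configuration; substitute the addresses of your own server.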
## ClearML Agent

**How can I execute ClearML Agent without installing packages each time?** <a className="tr_top_negative" id="system_site_packages"></a>
@@ -30,7 +30,7 @@ pip install clearml-agent
Connect the Agent to the server by [creating credentials](https://app.community.clear.ml/profile), then run this:

```bash
-clearml-init
+clearml-agent init
```

:::note