mirror of
https://github.com/clearml/clearml-session
synced 2025-01-31 02:46:56 +00:00
<div align="center">

<a href="https://app.clear.ml"><img src="https://github.com/allegroai/clearml/blob/master/docs/clearml-logo.svg?raw=true" width="250px"></a>

## **`clearml-session` </br> CLI for launching JupyterLab / VSCode on a remote machine**

[![GitHub license](https://img.shields.io/github/license/allegroai/clearml-session.svg)](https://img.shields.io/github/license/allegroai/clearml-session.svg)
[![PyPI pyversions](https://img.shields.io/pypi/pyversions/clearml-session.svg)](https://img.shields.io/pypi/pyversions/clearml-session.svg)
[![PyPI version shields.io](https://img.shields.io/pypi/v/clearml-session.svg)](https://img.shields.io/pypi/v/clearml-session.svg)
[![PyPI status](https://img.shields.io/pypi/status/clearml-session.svg)](https://pypi.python.org/pypi/clearml-session/)
[![Slack Channel](https://img.shields.io/badge/slack-%23clearml--community-blueviolet?logo=slack)](https://joinslack.clear.ml)

</div>

**`clearml-session`** is a utility for launching detachable remote interactive sessions (MacOS, Windows, Linux)

### tl;dr
CLI to launch remote sessions for JupyterLab / VSCode-server / SSH, inside any docker image!

### What does it do?
Starting a clearml (ob)session from your local machine triggers the following:
- ClearML allocates a remote instance (GPU) from your dedicated pool
- On the allocated instance it will spin up **jupyter-lab** + **vscode server** + **SSH** access for
  interactive usage (i.e., development)
- ClearML will start monitoring machine performance, allowing DevOps to detect stale instances and spin them down

> ℹ️ **Remote PyCharm:** You can also work with PyCharm in a remote session over SSH. Use the [PyCharm Plugin](https://github.com/allegroai/clearml-pycharm-plugin) to automatically sync local configurations with a remote session.

### Use-cases for remote interactive sessions:
1. Development requires resources not available on the current developer's machines
2. Team resource sharing (e.g. how to dynamically assign GPUs to developers)
3. Spin up a copy of a previously executed experiment for remote debugging purposes (:open_mouth:!)
4. Scale-out development to multiple clouds, assign development machines on AWS/GCP/Azure in a seamless way

## Prerequisites:
* **An SSH client installed on your machine** - To verify, open your terminal and run `ssh`. If it prints its usage message rather than a command-not-found error, you are good to go.
* At least one `clearml-agent` running on a remote host. See installation [details](https://github.com/allegroai/clearml-agent).

Supported OS: MacOS, Windows, Linux

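The local part of the prerequisites can be checked from a POSIX shell. This is a minimal sketch, not part of `clearml-session` itself: `check_tool` is a hypothetical helper name, and it only confirms that the `ssh` and `clearml-session` executables are on your `PATH`. It cannot tell whether a `clearml-agent` is running on a remote host.

```bash
# Report whether a command is available on PATH (uses POSIX `command -v`).
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Print a hint for each missing tool instead of aborting.
check_tool ssh || echo "Install an SSH client before continuing."
check_tool clearml-session || echo "Run: pip install clearml-session"
```
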
## Secure & Stable
**clearml-session** creates a single, secure, and encrypted connection to the remote machine over SSH.
SSH credentials are automatically generated by the CLI and contain a fully random 32-byte password.

All HTTP connections are tunneled over the SSH connection,
allowing users to add additional services on the remote machine (!)

Furthermore, all tunneled connections have a special stable network layer allowing you to refresh the underlying SSH
connection without breaking any network sockets!

This means that if the network connection is unstable, you can refresh
the base SSH network tunnel, without breaking JupyterLab/VSCode-server or your own SSH connection
(e.g. debugging over SSH with PyCharm)

---

## How to use: Interactive Session

1. Run `clearml-session`
2. Select the requested queue (resource)
3. Wait until a machine is up and ready
4. Click on the link to the remote JupyterLab/VSCode OR connect with the provided SSH details

**Notice! You can also**: Select a **docker image** to execute in, install required **python packages**, run a **bash script**,
pass **git credentials**, etc.
See below for full CLI options.

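As a hedged illustration of combining these options, the sketch below assembles one launch command from flags that appear in the CLI help (`--queue`, `--docker`, `--packages`, `--init-script`, `--git-credentials`). The queue name `default` and the script path `./setup.sh` are placeholders, and `launch` is a hypothetical dry-run wrapper that only prints the command unless `RUN=1` is set in the environment.

```bash
# Dry-run helper (hypothetical): prints the command unless RUN=1 is set.
launch() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# "default" and ./setup.sh are placeholders; replace them with your own
# queue name and init script.
launch clearml-session \
  --queue default \
  --docker nvcr.io/nvidia/pytorch:20.11-py3 \
  --packages "torch==1.7" tqdm \
  --init-script ./setup.sh \
  --git-credentials
```

Printing first makes it easy to review the command, since all arguments are stored as defaults for the next session once it actually runs.
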
## Frequently Asked Questions:

#### How Does ClearML enable this?

`clearml-session` creates a new interactive `Task` in the system (default project: DevOps).

This `Task` is responsible for setting up SSH and JupyterLab/VSCode on the host machine.

The local `clearml-session` waits for the interactive Task to finish the initial setup, then
it connects via SSH to the host machine (see "Secure & Stable" above), and tunnels
both SSH and JupyterLab over the SSH connection.

The end result is a local link which you can use to access the JupyterLab/VSCode on the remote machine, over a **secure and encrypted** connection!

#### How can this be used to scale up/out development resources?

**ClearML** has a cloud autoscaler, so you can easily and automatically spin up machines for development!

There is also a default docker image to use when initiating a task.

This means that using **clearml-session**
with the autoscaler enabled gives you a turn-key, secure development environment inside a docker of your choosing.

Learn more about it [here](https://clear.ml/docs/latest/docs/guides/services/aws_autoscaler)

#### Does this fit Work From Home situations?
**YES**. Install `clearml-agent` on target machines inside the organization, connect over your company VPN
and use `clearml-session` to gain access to a dedicated on-prem machine with the docker of your choosing
(with out-of-the-box support for any internal docker artifactory).

Learn more about how to utilize your office workstations and on-prem machines [here](https://clear.ml/docs/latest/docs/clearml_agent)

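The cloud and on-prem scenarios in this FAQ relate to the `--public-ip` option, which the CLI help describes as something to set when running on the cloud and to leave `false` for local / on-premises machines. A minimal sketch under that reading follows. The queue names `aws_gpu` and `office_gpu` and the `session_cmd` helper are hypothetical, and the function only prints the command rather than running it.

```bash
# Print a launch command for a queue (hypothetical helper).
# Second argument: "cloud" registers the public IP, anything else does not.
session_cmd() {
  queue="$1"
  where="${2:-onprem}"
  if [ "$where" = "cloud" ]; then
    echo "clearml-session --queue $queue --public-ip true"
  else
    echo "clearml-session --queue $queue --public-ip false"
  fi
}

session_cmd aws_gpu cloud   # autoscaler-backed queue (assumed name)
session_cmd office_gpu      # on-prem queue reached over VPN (assumed name)
```
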
## Tutorials

### Getting started

Requirements: the `clearml` Python package installed and configured (see detailed [instructions](https://clear.ml/docs/latest/docs/getting_started/ds/ds_first_steps))
``` bash
pip install clearml-session
clearml-session --docker nvcr.io/nvidia/pytorch:20.11-py3 --git-credentials
```

Wait for the machine to spin up. The CLI output should look something like:
``` console
Creating new session
New session created [id=3d38e738c5ff458a9ec465e77e19da23]
Waiting for remote machine allocation [id=3d38e738c5ff458a9ec465e77e19da23]
.Status [queued]
....Status [in_progress]
Remote machine allocated
Setting remote environment [Task id=3d38e738c5ff458a9ec465e77e19da23]
Setup process details: https://app.community.clear.ml/projects/64ae77968db24b27abf86a501667c330/experiments/3d38e738c5ff458a9ec465e77e19da23/output/log
Waiting for environment setup to complete [usually about 20-30 seconds]
..............
Remote machine is ready
Setting up connection to remote session
Starting SSH tunnel
Warning: Permanently added '[192.168.0.17]:10022' (ECDSA) to the list of known hosts.
root@192.168.0.17's password: f7bae03235ff2a62b6bfbc6ab9479f9e28640a068b1208b63f60cb097b3a1784

Interactive session is running:
SSH: ssh root@localhost -p 8022 [password: f7bae03235ff2a62b6bfbc6ab9479f9e28640a068b1208b63f60cb097b3a1784]
Jupyter Lab URL: http://localhost:8878/?token=df52806d36ad30738117937507b213ac14ed638b8c336a7e
VSCode server available at http://localhost:8898/

Connection is up and running
Enter "r" (or "reconnect") to reconnect the session (for example after suspend)
`s` (or "shell") to connect to the SSH session
`Ctrl-C` (or "quit") to abort (remote session remains active)
or "Shutdown" to shut down remote interactive session
```

Click on the JupyterLab link (http://localhost:8878/?token=xyz).
Open your terminal, clone your code & start working :)

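Because the session exposes a local SSH endpoint (`ssh root@localhost -p 8022` in the sample output), one plausible way to reach an additional service running on the remote machine is a standard OpenSSH local port forward. This is a sketch under assumptions: the remote service port `6006` is made up for illustration, `forward_cmd` is a hypothetical helper, and the SSH port on your machine may differ from `8022`. The function only prints the command.

```bash
# Print an OpenSSH local port-forward command for a remote service.
# Arguments: local port, remote port, session SSH port (default 8022,
# taken from the sample output; yours may differ).
forward_cmd() {
  local_port="$1"
  remote_port="$2"
  ssh_port="${3:-8022}"
  # -N: do not run a remote command, only forward; -L: local forward.
  echo "ssh -N -L ${local_port}:localhost:${remote_port} root@localhost -p ${ssh_port}"
}

forward_cmd 6006 6006
```

Running the printed command in a second terminal should prompt for the SSH password shown in the session output, after which `http://localhost:6006` would reach the remote service.
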
### Leaving a session and reconnecting from the same machine

On the `clearml-session` CLI terminal, enter 'quit' or press `Ctrl-C`
It will close the CLI but preserve the remote session (i.e. remote session will remain running)

When you want to reconnect to it, execute:
``` bash
clearml-session
```

Then press "Y" (or enter) to reconnect to the already running session
``` console
clearml-session - launch interactive session
Checking previous session
Connect to active session id=3d38e738c5ff458a9ec465e77e19da23 [Y]/n?
```

### Shutting down a remote session

On the `clearml-session` CLI terminal, enter 'shutdown' (case-insensitive)
It will shut down the remote session, free the resource and close the CLI

``` console
Enter "r" (or "reconnect") to reconnect the session (for example after suspend)
`s` (or "shell") to connect to the SSH session
`Ctrl-C` (or "quit") to abort (remote session remains active)
or "Shutdown" to shut down remote interactive session

shutdown

Shutting down interactive session
Remote session shutdown
Goodbye
```

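The CLI help also lists a `--shutdown [SHUTDOWN]` flag (short form `-S`) that defaults to the previous session, which suggests a session can be shut down without the interactive prompt. A minimal sketch follows; `shutdown_cmd` is a hypothetical helper that prints the command, and the ID is the one from the sample output above.

```bash
# Print the shutdown command; with no argument it targets the previous
# session, otherwise the given session ID (hypothetical helper).
shutdown_cmd() {
  echo "clearml-session --shutdown${1:+ $1}"
}

shutdown_cmd
shutdown_cmd 3d38e738c5ff458a9ec465e77e19da23
```
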
### Connecting to a running interactive session from a different machine

Continue working on an interactive session from **any** machine.
In the `clearml` web UI, go to the DevOps project, and find your interactive session.
Click on the ID button next to the Task name, and copy the unique ID.

``` bash
clearml-session --attach <session_id_here>
```

Click on the JupyterLab/VSCode link, or connect directly to the SSH session

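When the ID is pasted by hand, a quick sanity check before attaching can catch copy errors. The sketch below assumes session IDs look like the one in the sample output, 32 lowercase hexadecimal characters; that shape is an observation from the example, not a documented format, and `is_session_id` is a hypothetical helper.

```bash
# Succeed if the argument is 32 lowercase hex characters, the shape of the
# ID in the sample output (an assumption, not a documented format).
is_session_id() {
  case "$1" in
    *[!0-9a-f]*|"") return 1 ;;
  esac
  [ "${#1}" -eq 32 ]
}

SESSION_ID="${SESSION_ID:-3d38e738c5ff458a9ec465e77e19da23}"
if is_session_id "$SESSION_ID"; then
  # Print rather than run, so the command can be reviewed first.
  echo "clearml-session --attach $SESSION_ID"
else
  echo "not a valid-looking session ID: $SESSION_ID" >&2
fi
```
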
### Debug a previously executed experiment

If you have a previously executed experiment in the system,
you can create an exact copy of the experiment and debug it on the remote interactive session.
`clearml-session` will replicate the exact remote environment, add JupyterLab/VSCode/SSH and allow you to interactively
execute and debug the experiment, on the allocated remote machine.

In the `clearml` web UI, find the experiment (Task) you wish to debug.
Click on the ID button next to the Task name, and copy the unique ID.

``` bash
clearml-session --debugging-session <experiment_id_here>
```

Click on the JupyterLab/VSCode link, or connect directly to the SSH session

## CLI options

``` bash
clearml-session --help
```

``` console
clearml-session - CLI for launching JupyterLab / VSCode on a remote machine
usage: clearml-session [-h] [--version] [--attach [ATTACH]]
                       [--shutdown [SHUTDOWN]] [--shell]
                       [--debugging-session DEBUGGING_SESSION] [--queue QUEUE]
                       [--docker DOCKER] [--docker-args DOCKER_ARGS]
                       [--public-ip [true/false]]
                       [--remote-ssh-port REMOTE_SSH_PORT]
                       [--vscode-server [true/false]]
                       [--vscode-version VSCODE_VERSION]
                       [--vscode-extensions VSCODE_EXTENSIONS]
                       [--jupyter-lab [true/false]]
                       [--upload-files UPLOAD_FILES]
                       [--git-credentials [true/false]]
                       [--user-folder USER_FOLDER]
                       [--packages [PACKAGES [PACKAGES ...]]]
                       [--requirements REQUIREMENTS]
                       [--init-script [INIT_SCRIPT]]
                       [--config-file CONFIG_FILE]
                       [--remote-gateway [REMOTE_GATEWAY]]
                       [--base-task-id BASE_TASK_ID] [--project PROJECT]
                       [--keepalive [true/false]]
                       [--queue-excluded-tag [QUEUE_EXCLUDED_TAG [QUEUE_EXCLUDED_TAG ...]]]
                       [--queue-include-tag [QUEUE_INCLUDE_TAG [QUEUE_INCLUDE_TAG ...]]]
                       [--skip-docker-network [true/false]]
                       [--password PASSWORD] [--username USERNAME]
                       [--force_dropbear [true/false]] [--verbose] [--yes]

clearml-session - CLI for launching JupyterLab / VSCode on a remote machine

optional arguments:
  -h, --help            show this help message and exit
  --version             Display the clearml-session utility version
  --attach [ATTACH]     Attach to running interactive session (default:
                        previous session)
  --shutdown [SHUTDOWN], -S [SHUTDOWN]
                        Shut down an active session (default: previous
                        session)
  --shell               Open the SSH shell session directly, notice quiting
                        the SSH session will Not shutdown the remote session
  --debugging-session DEBUGGING_SESSION
                        Pass existing Task id (experiment), create a copy of
                        the experiment on a remote machine, and launch
                        jupyter/ssh for interactive access. Example
                        --debugging-session <task_id>
  --queue QUEUE         Select the queue to launch the interactive session on
                        (default: previously used queue)
  --docker DOCKER       Select the docker image to use in the interactive
                        session on (default: previously used docker image or
                        `nvidia/cuda:10.1-runtime-ubuntu18.04`)
  --docker-args DOCKER_ARGS
                        Add additional arguments for the docker image to use
                        in the interactive session on (default: previously
                        used docker-args)
  --public-ip [true/false]
                        If True register the public IP of the remote machine.
                        Set if running on the cloud. Default: false (use for
                        local / on-premises)
  --remote-ssh-port REMOTE_SSH_PORT
                        Set the remote ssh server port, running on the agent`s
                        machine. (default: 10022)
  --vscode-server [true/false]
                        Install vscode server (code-server) on interactive
                        session (default: true)
  --vscode-version VSCODE_VERSION
                        Set vscode server (code-server) version, as well as
                        vscode python extension version <vscode:python-ext>
                        (example: "3.7.4:2020.10.332292344")
  --vscode-extensions VSCODE_EXTENSIONS
                        Install additional vscode extensions, as well as
                        vscode python extension (example: "ms-
                        python.python,ms-python.black-formatter,ms-
                        python.pylint,ms-python.flake8")
  --jupyter-lab [true/false]
                        Install Jupyter-Lab on interactive session (default:
                        true)
  --upload-files UPLOAD_FILES
                        Advanced: Upload local files/folders to the remote
                        session. Example: `/my/local/data/` will upload the
                        local folder and extract it into the container in
                        ~/session-files/
  --git-credentials [true/false]
                        If true, local .git-credentials file is sent to the
                        interactive session. (default: false)
  --user-folder USER_FOLDER
                        Advanced: Set the remote base folder (default: ~/)
  --packages [PACKAGES [PACKAGES ...]]
                        Additional packages to add, supports version numbers
                        (default: previously added packages). examples:
                        --packages torch==1.7 tqdm
  --requirements REQUIREMENTS
                        Specify requirements.txt file to install when setting
                        the interactive session. Requirements file is read and
                        stored in `packages` section as default for the next
                        sessions. Can be overridden by calling `--packages`
  --init-script [INIT_SCRIPT]
                        Specify BASH init script file to be executed when
                        setting the interactive session. Script content is
                        read and stored as default script for the next
                        sessions. To clear the init-script do not pass a file
  --config-file CONFIG_FILE
                        Advanced: Change the configuration file used to store
                        the previous state (default: ~/.clearml_session.json)
  --remote-gateway [REMOTE_GATEWAY]
                        Advanced: Specify gateway ip/address:port to be passed
                        to interactive session (for use with k8s ingestion /
                        ELB)
  --base-task-id BASE_TASK_ID
                        Advanced: Set the base task ID for the interactive
                        session. (default: previously used Task). Use `none`
                        for the default interactive session
  --project PROJECT     Advanced: Set the project name for the interactive
                        session Task
  --keepalive [true/false]
                        Advanced: If set, enables the transparent proxy always
                        keeping the sockets alive. Default: False, do not use
                        transparent socket for mitigating connection drops.
  --queue-excluded-tag [QUEUE_EXCLUDED_TAG [QUEUE_EXCLUDED_TAG ...]]
                        Advanced: Excluded queues with this specific tag from
                        the selection
  --queue-include-tag [QUEUE_INCLUDE_TAG [QUEUE_INCLUDE_TAG ...]]
                        Advanced: Only include queues with this specific tag
                        from the selection
  --skip-docker-network [true/false]
                        Advanced: If set, `--network host` is **not** passed
                        to docker (assumes k8s network ingestion) (default:
                        false)
  --password PASSWORD   Advanced: Select ssh password for the interactive
                        session (default: `randomly-generated` or previously
                        used one)
  --username USERNAME   Advanced: Select ssh username for the interactive
                        session (default: `root` or previously used one)
  --force_dropbear [true/false]
                        Force using `dropbear` instead of SSHd
  --verbose             Advanced: If set, print verbose progress information,
                        e.g. the remote machine setup process log
  --yes, -y             Automatic yes to prompts; assume "yes" as answer to
                        all prompts and run non-interactively

Notice! all arguments are stored as new defaults for the next session
```
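Since arguments are stored as defaults for the next session, and `--config-file` changes where that state is kept (default: `~/.clearml_session.json`), one possible pattern is a separate state file per project so that defaults do not carry over between projects. The sketch below illustrates that idea only; the file naming scheme, the `project_config` helper, and the project name `vision` are assumptions, and the command is printed rather than executed.

```bash
# Derive a per-project state file path (naming scheme is an assumption).
project_config() {
  echo "${HOME}/.clearml_session.$1.json"
}

PROJECT_NAME="${PROJECT_NAME:-vision}"   # hypothetical project name
echo "clearml-session --config-file $(project_config "$PROJECT_NAME")"
```
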