Merge branch 'main' of https://github.com/allegroai/clearml-docs into bot_usecase
@@ -47,7 +47,7 @@ that you need.
accessed, [compared](../webapp/webapp_exp_comparing.md) and [tracked](../webapp/webapp_exp_track_visual.md).
- [ClearML Agent](../clearml_agent.md) does the heavy lifting. It reproduces the execution environment, clones your code,
applies code patches, manages parameters (including overriding them on the fly), executes the code, and queues multiple tasks.
It can even [build](../getting_started/clearml_agent_docker_exec.md#exporting-a-task-into-a-standalone-docker-container) the container for you!
- [ClearML Pipelines](../pipelines/pipelines.md) ensure that steps run in the same order,
programmatically chaining tasks together, while giving an overview of the execution pipeline's status.

@@ -17,7 +17,7 @@ title: ClearML Agent

**ClearML Agent** is a virtual environment and execution manager for DL / ML solutions on GPU machines. It integrates with the **ClearML Python Package** and ClearML Server to provide a full AI cluster solution. <br/>
Its main focus is around:
- Reproducing task runs, including their complete environments.
- Scaling workflows on multiple target machines.

ClearML Agent executes a task or other workflow by reproducing the state of the code from the original machine
@@ -46,7 +46,7 @@ install Python, so make sure to use a container or environment with the version

While the agent is running, it continuously reports system metrics to the ClearML Server (these can be monitored in the
[**Orchestration**](webapp/webapp_workers_queues.md) page).

Continue using ClearML Agent once it is running on a target machine. Reproduce task runs and execute
automated workflows in one (or both) of the following ways:
* Programmatically (using [`Task.enqueue()`](references/sdk/task.md#taskenqueue) or [`Task.execute_remotely()`](references/sdk/task.md#execute_remotely))
* Through the ClearML Web UI (without working directly with code), by cloning tasks and enqueuing them to the
@@ -1,8 +1,9 @@

---
title: Dynamic GPU Allocation
---

:::important Enterprise Feature
Dynamic GPU allocation is available under the ClearML Enterprise plan.
:::

The ClearML Enterprise server supports dynamic allocation of GPUs based on queue properties.

@@ -232,7 +232,7 @@ ranging from 2 GB to 12 GB (see [clearml-fractional-gpu repository](https://gith

This example runs the ClearML Ubuntu 22 CUDA 12.3 container on GPU 0, limited to using up to 8 GB of its memory.
:::note
`--pid=host` is required to allow the driver to differentiate between the container's processes and other host processes when limiting memory usage.
:::
1. Run the following command inside the container to verify that the fractional GPU memory limit is working correctly:
```bash
@@ -68,7 +68,8 @@ reproducibility.
Information about the dataset can be viewed in the WebApp, in the dataset's [details panel](../../webapp/datasets/webapp_dataset_viewing.md#version-details-panel).
In the panel's **CONTENT** tab, you can see a table summarizing version contents, including file names, file sizes, and hashes.



## Using the Dataset

@@ -79,7 +79,8 @@ After a dataset has been closed, it can no longer be modified. This ensures futu
Information about the dataset can be viewed in the WebApp, in the dataset's [details panel](../../webapp/datasets/webapp_dataset_viewing.md#version-details-panel).
In the panel's **CONTENT** tab, you can see a table summarizing version contents, including file names, file sizes, and hashes.



## Data Ingestion

@@ -212,7 +212,7 @@ Example:

ClearML serving instances automatically send serving statistics (count/latency) to Prometheus, and Grafana can be used
to visualize and create live dashboards.

The default `docker-compose` installation is preconfigured with Prometheus and Grafana. Notice that by default the data
of both containers is *not* persistent. To add persistence, adding a volume mount is recommended.

You can also add many custom metrics on the input/predictions of your models. Once a model endpoint is registered,

@@ -22,7 +22,7 @@ The values in the ClearML configuration file can be overridden by environment va
and command-line arguments.
:::

## Editing Your Configuration File

To add, change, or delete options, edit your configuration file.

@@ -414,7 +414,7 @@ These settings define which Docker image and arguments should be used unless [ex
* **`agent.default_docker.match_rules`** (*[dict]*)

:::important Enterprise Feature
The `match_rules` configuration option is available under the ClearML Enterprise plan.
:::

* Lookup table of rules that determine the default container and arguments when running a worker in Docker mode. The
@@ -1548,7 +1548,7 @@ environment {
}
```

### files section

**`files`** (*dict*)

@@ -1599,7 +1599,7 @@ sdk {
## Configuration Vault

:::important Enterprise Feature
Configuration vaults are available under the ClearML Enterprise plan.
:::

The ClearML Enterprise Server includes the configuration vault. Users can add configuration sections to the vault and, once

@@ -361,10 +361,16 @@ You can also use hashed passwords instead of plain-text passwords. To do that:

### Non-responsive Task Watchdog

The non-responsive task watchdog monitors for running tasks that have stopped communicating with the ClearML Server for a specified
time interval. If a task remains unresponsive beyond the set threshold, the watchdog marks it as `aborted`.

A task is considered non-responsive if the time since its last communication with the ClearML Server exceeds the
configured threshold. The watchdog starts counting after each successful communication with the server. If no further
updates are received within the specified time, the task is considered non-responsive. This typically happens if:
* The task's main process is stuck but has not exited.
* There is a network issue preventing the task from communicating with the server.

You can configure the following watchdog settings:

* Watchdog status - enabled / disabled
* The time threshold (in seconds) of task inactivity (default value is 7200 seconds (2 hours)).
@@ -372,10 +378,15 @@ Modify the following settings for the watchdog:

**To configure the non-responsive watchdog for the ClearML Server:**

1. Open the ClearML Server `/opt/clearml/config/services.conf` file.

:::tip
If the `services.conf` file does not exist, create your own in ClearML Server's `/opt/clearml/config` directory (or
an alternate folder you configured).
:::

1. Add or edit the `tasks.non_responsive_tasks_watchdog` section and specify the watchdog settings. For example:

```
tasks {
non_responsive_tasks_watchdog {
@@ -389,11 +400,6 @@ Modify the following settings for the watchdog:
}
}
```

1. Restart ClearML Server.

@@ -422,7 +428,7 @@ options.
### Custom UI Context Menu Actions

:::important Enterprise Feature
Custom UI context menu actions are available under the ClearML Enterprise plan.
:::

Create custom UI context menu actions to be performed on ClearML objects (projects, tasks, models, dataviews, or queues)

@@ -129,7 +129,7 @@ and ClearML Server needs to be installed.
1. Add the `clearml-server` repository to the Helm client:

```
helm repo add clearml https://clearml.github.io/clearml-server-helm/
```

Confirm the `clearml-server` repository is now in the Helm client.

@@ -5,8 +5,8 @@ title: Linux and macOS
Deploy the ClearML Server in Linux or macOS using the pre-built Docker image.

For ClearML docker images, including previous versions, see [https://hub.docker.com/r/allegroai/clearml](https://hub.docker.com/r/allegroai/clearml).
However, pulling the ClearML Docker image directly is not required. ClearML provides a `docker-compose` YAML file that does this.
The `docker-compose` file is included in the instructions on this page.

For information about upgrading ClearML Server in Linux or macOS, see [here](upgrade_server_linux_mac.md).

@@ -134,9 +134,9 @@ Deploying the server requires a minimum of 8 GB of memory, 16 GB is recommended.
sudo chown -R $(whoami):staff /opt/clearml
```

2. Download the ClearML Server `docker-compose` YAML file:
```
sudo curl https://raw.githubusercontent.com/clearml/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
```
1. For Linux only, configure the **ClearML Agent Services**:

@@ -54,10 +54,10 @@ Deploying the server requires a minimum of 8 GB of memory, 16 GB is recommended.
mkdir c:\opt\clearml\logs
```

1. Save the ClearML Server `docker-compose` YAML file.

```
curl https://raw.githubusercontent.com/clearml/clearml-server/master/docker/docker-compose-win10.yml -o c:\opt\clearml\docker-compose-win10.yml
```

1. Run `docker-compose`. In PowerShell, execute the following commands:

@@ -2,6 +2,10 @@
title: Installing External Applications Server
---

:::important Enterprise Feature
UI application deployment is available under the ClearML Enterprise plan.
:::

ClearML supports applications, which are extensions that provide additional capabilities, such as cloud auto-scaling and
hyperparameter optimization. For more information, see [ClearML Applications](../../webapp/applications/apps_overview.md).

@@ -2,6 +2,10 @@
title: Application Installation on On-Prem and VPC Servers
---

:::important Enterprise Feature
UI application deployment is available under the ClearML Enterprise plan.
:::

ClearML Applications are like plugins that allow you to manage ML workloads and automatically run recurring workflows
without any coding. Applications are installed on top of the ClearML Server.

@@ -9,7 +13,7 @@ without any coding. Applications are installed on top of the ClearML Server.
To run applications, you will need the following:
* RAM: Make sure you have at least 400 MB of RAM per application instance.
* Applications Service: Make sure that the applications agent service is up and running on your server:
  * If you are using a `docker-compose` solution, make sure that the `clearml-apps-agent` service is running.
  * If you are using a Kubernetes cluster, check for the `clearml-clearml-enterprise-apps` component.
* Installation Files: Each application has its installation zip file. Make sure you have the relevant files for the
applications you wish to install.

@@ -3,7 +3,7 @@ title: AI Application Gateway
---

:::important Enterprise Feature
The AI Application Gateway is available under the ClearML Enterprise plan.
:::

Services running through a cluster orchestrator such as Kubernetes or a cloud hyperscaler require meticulous configuration

@@ -1,4 +1,10 @@
---
title: Docker-Compose Deployment
---

:::important Enterprise Feature
The Application Gateway is available under the ClearML Enterprise plan.
:::

## Requirements

@@ -7,11 +13,11 @@
* Credentials for the ClearML/allegroai docker repository
* A valid ClearML Server installation

## Host Configurations

### Docker Installation

Installing `docker` and `docker-compose` might vary depending on the specific operating system you’re using. Here is an example for AmazonLinux:

```
sudo dnf -y install docker
@@ -27,9 +33,9 @@ sudo docker login

Use the ClearML/allegroai dockerhub credentials when prompted by `docker login`.

### Docker-compose File

This is an example of the `docker-compose` file you will need:

```
version: '3.5'
@@ -97,17 +103,17 @@ Edit it according to the following guidelines:
* `CLEARML_API_ACCESS_KEY`: ClearML server API key
* `CLEARML_API_SECRET_KEY`: ClearML server secret key
* `ROUTER_URL`: URL for this router that was previously configured in the load balancer, starting with `https://`
* `ROUTER_NAME`: Unique name for this router
* `AUTH_ENABLED`: Enable or disable HTTP call authentication when the router is communicating with the ClearML server
* `SSL_VERIFY`: Enable or disable SSL certificate validation when the router is communicating with the ClearML server
* `AUTH_COOKIE_NAME`: Cookie name used by the ClearML server to store the ClearML authentication cookie. This can usually be found in the `value_prefix` key starting with `allegro_token` in the `envoy.yaml` file in the ClearML server installation (`/opt/allegro/config/envoy/envoy.yaml`) (see below)
* `AUTH_SECURE_ENABLED`: Enable the Set-Cookie `secure` parameter
* `AUTH_BASE64_JWKS_KEY`: Value from the `k` key in the `jwks.json` file in the ClearML server installation
* `LISTEN_QUEUE_NAME`: (*optional*) Name of the queue to check for tasks (if none, every task is checked)
* `EXTRA_BASH_COMMAND`: Command to be launched before starting the router
* `TCP_ROUTER_ADDRESS`: Router external address; can be an IP, the host machine, or a load balancer hostname, depending on the network configuration
* `TCP_PORT_START`: Start port for the TCP Session feature
* `TCP_PORT_END`: End port for the TCP Session feature
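Putting these variables together, a hypothetical `runtime.env` might look like the following (every value here is a placeholder to replace with your deployment's own settings):

```
CLEARML_API_ACCESS_KEY=<access_key>
CLEARML_API_SECRET_KEY=<secret_key>
ROUTER_URL=https://router.example.com
ROUTER_NAME=main-router
AUTH_ENABLED=true
SSL_VERIFY=true
AUTH_COOKIE_NAME=allegro_token_<suffix>
AUTH_SECURE_ENABLED=true
AUTH_BASE64_JWKS_KEY=<base64_jwks_key>
TCP_ROUTER_ADDRESS=router.example.com
TCP_PORT_START=30000
TCP_PORT_END=30100
```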

Run the following command to start the router:

@@ -115,11 +121,11 @@ Run the following command to start the router:
sudo docker compose --env-file runtime.env up -d
```

:::note How to find the JWKS key

The *JSON Web Key Set* (*JWKS*) is a set of keys containing the public keys used to verify any JSON Web Token (JWT).

In a `docker-compose` server installation, this can be found in the `CLEARML__secure__auth__token_secret` env var in the apiserver server component.

:::

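For the `AUTH_BASE64_JWKS_KEY` value mentioned above, here is a sketch of pulling the `k` value out of a JWKS document (the file content and key below are made-up placeholders, not a real installation's key):

```python
import base64
import json

# Hypothetical jwks.json content -- a standard JWKS document with the
# signing key stored base64-encoded under "k". Real installations keep
# this file inside the ClearML server deployment.
jwks_text = '{"keys": [{"kty": "oct", "k": "c2VjcmV0LXNpZ25pbmcta2V5"}]}'

jwks = json.loads(jwks_text)
k_value = jwks["keys"][0]["k"]  # use this value for AUTH_BASE64_JWKS_KEY

print(k_value)
print(base64.b64decode(k_value))  # the raw signing secret
```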
@@ -1,4 +1,10 @@
---
title: Kubernetes Deployment
---

:::important Enterprise Feature
The Application Gateway is available under the ClearML Enterprise plan.
:::

This guide details the installation of the ClearML AI Application Gateway, specifically the ClearML Task Router Component.

@@ -6,8 +12,8 @@ This guide details the installation of the ClearML AI Application Gateway, speci

* Kubernetes cluster: `>= 1.21.0-0 < 1.32.0-0`
* Helm installed and configured
* Helm token to access `allegroai` helm-chart repo
* Credentials for `allegroai` docker repo
* A valid ClearML Server installation

## Optional for HTTPS
@@ -21,14 +27,14 @@ This guide details the installation of the ClearML AI Application Gateway, speci

```
helm repo add allegroai-enterprise \
https://raw.githubusercontent.com/clearml/clearml-enterprise-helm-charts/gh-pages \
--username <GITHUB_TOKEN> \
--password <GITHUB_TOKEN>
```

### Prepare Values

Before installing the TTR, create a Helm values override file named `task-traffic-router.values-override.yaml`:

```
imageCredentials:
@@ -49,20 +55,20 @@ tcpSession:
end:
```

Edit it according to these guidelines:

* `clearml.apiServerUrlReference`: URL usually starting with `https://api.`
* `clearml.apiServerKey`: ClearML server API key
* `clearml.apiServerSecret`: ClearML server secret key
* `ingress.hostName`: URL of the router previously configured for the load balancer, starting with `https://`
* `clearml.sslVerify`: Enable or disable SSL certificate validation on apiserver calls
* `clearml.authCookieName`: Value from the `value_prefix` key starting with `allegro_token` in the `envoy.yaml` file in the ClearML server installation.
* `clearml.jwksKey`: Value from the `k` key in the `jwks.json` file in the ClearML server installation (see below)
* `tcpSession.routerAddress`: Router external address; can be an IP, the host machine, or a load balancer hostname, depending on the network configuration
* `tcpSession.portRange.start`: Start port for the TCP Session feature
* `tcpSession.portRange.end`: End port for the TCP Session feature

:::note How to find the JWKS key

The *JSON Web Key Set* (*JWKS*) is a set of keys containing the public keys used to verify any JSON Web Token (JWT).

@@ -2,6 +2,10 @@
title: Custom Billing Events
---

:::important Enterprise Feature
Sending custom billing events is available under the ClearML Enterprise plan.
:::

ClearML supports sending custom events to selected Kafka topics. Event sending is triggered by API calls and
is available only for companies with the `custom_events` setting enabled.

@@ -1,5 +1,5 @@
---
title: Project Migration
---

When migrating from a ClearML Open Server to a ClearML Enterprise Server, you may need to transfer projects. This is done
@@ -36,7 +36,7 @@ them before exporting.
Execute the data tool within the `apiserver` container.

Open a bash session inside the `apiserver` container of the server:
* In `docker-compose`:

```commandline
sudo docker exec -it clearml-apiserver /bin/bash

@@ -100,9 +100,10 @@ Install the ClearML chart with the required configuration:
1. Prepare the `overrides.yaml` file and input the following content. Make sure to replace `<BASE_DOMAIN>` and `<SSO_*>`
with a valid domain that will have records pointing to the ingress controller accordingly.
The credentials specified in `<SUPERVISOR_USER_KEY>` and `<SUPERVISOR_USER_SECRET>` can be used to log in as the
supervisor user in the web UI.

Note that the `<SUPERVISOR_USER_EMAIL>` value must be explicitly quoted. To do so, put `\\"` around the quoted value.
For example `"\\"email@example.com\\""`.

```
imageCredentials:
@@ -192,7 +193,7 @@ Install the ClearML chart with the required configuration:
enabled: true
```

2. Install ClearML:

```
helm install -n clearml \\
@@ -305,9 +306,9 @@ spec:
kubernetes.io/metadata.name: clearml
```

## Application Installation

To install ClearML GUI applications:

1. Get the apps to install and the installation script by downloading and extracting the archive provided by ClearML

@@ -332,8 +333,8 @@ must be substituted with valid domain names or values from responses.

```
APISERVER_URL="https://api.<BASE_DOMAIN>"
APISERVER_KEY="<APISERVER_KEY>"
APISERVER_SECRET="<APISERVER_SECRET>"
```

2. Create a *Tenant* (company):
@@ -491,7 +492,7 @@ To install the ClearML Agent Chart, follow these steps:
-d '{"name":"default"}'
```

### Tenant Namespace Isolation with NetworkPolicies

To ensure network isolation for each tenant, you need to create a `NetworkPolicy` in the tenant namespace. This way,
the entire namespace/tenant will not accept any connection from other namespaces.
@@ -525,7 +526,7 @@ Install the [Task Traffic Router](appgw.md) in your Kubernetes cluster, allowing
apiServerUrlReference: "<http://clearml-enterprise-apiserver.clearml:8008>"
apiserverKey: "<TENANT_KEY>"
apiserverSecret: "<TENANT_SECRET>"
jwksKey: "<JWKS_KEY>"
ingress:
enabled: true
hostName: "<unique url in same domain as apiserver/webserver>"

@@ -43,7 +43,7 @@ should be reviewed and modified prior to the server installation
## Installing ClearML Server
### Preliminary Steps

1. Install Docker CE:

```
https://docs.docker.com/install/linux/docker-ce/ubuntu/
@@ -113,10 +113,10 @@ should be reviewed and modified prior to the server installation
sudo systemctl enable disable-thp
```

1. Restart the machine.

### Installing the Server
1. Remove any previous installation of ClearML Server:

```
sudo rm -R /opt/clearml/
@@ -141,7 +141,7 @@ should be reviewed and modified prior to the server installation
sudo mkdir -pv /opt/allegro/config/onprem_poc
```

1. Copy the following ClearML configuration files to `/opt/allegro`:
* `constants.env`
* `docker-compose.override.yml`
* `docker-compose.yml`
@@ -165,10 +165,13 @@ should be reviewed and modified prior to the server installation
sudo docker login -u=$DOCKERHUB_USER -p=$DOCKERHUB_PASSWORD
```

1. Start `docker-compose` by changing directories to the directory containing the `docker-compose` files and running the following command:

```
sudo docker-compose --env-file constants.env up -d
```

1. Verify web access by browsing to your URL (IP address) and port 8080:

```
http://<server_ip_here>:8080
@@ -191,7 +194,10 @@ the following subdomains should be forwarded to the corresponding ports on the s
* `https://app.<domain>` should be forwarded to port 8080
* `https://files.<domain>` should be forwarded to port 8081

:::warning
**Critical: Ensure no other ports are open to maintain the highest level of security.**
:::

Additionally, ensure that the following URLs are correctly configured in the server's environment file:

@@ -8,13 +8,14 @@ It covers the following:
* Set up security groups and IAM role
* Create EC2 instance with required disks
* Install dependencies and mount disks
* Deploy ClearML version using `docker-compose`
* Set up load balancer and DNS
* Set up server backup

## Prerequisites

* It is recommended to start with 4 CPUs and 32 GB of RAM. An `r6a.xlarge` EC2 instance would accommodate these requirements.
* An AWS account with at least 2 availability zones is required. It is recommended to install on a region with at least
3 availability zones. Having fewer than 3 availability zones would prevent the use of high-availability setups, if
needed in the future.

@@ -116,10 +117,10 @@ Instance requirements:
## Load Balancer

1. Create a TLS certificate:
   1. Choose a domain name to be used with the server. The main URL that will be used by the system’s users will be `app.<domain>`
   2. Create a certificate, with the following DNS names:
      1. `<domain name>`
      2. `*.<domain name>`

2. Create the `envoy` target group for the server:
   1. Port: 10000
@@ -283,7 +284,7 @@ log would usually indicate the reason for the failure.

## Maintenance

### Removing App Containers

To remove old application containers, add the following to the cron:

@@ -2,14 +2,21 @@
title: AWS EC2 AMIs
---

:::note
For upgrade purposes, the terms **Trains Server** and **ClearML Server** are interchangeable.
:::
<Collapsible title="Important: Upgrading to v2.x from v1.16.0 or older" type="info">

The MongoDB major version was upgraded from `v5.x` to `v6.x`. Please note that if your current ClearML Server version is older than
`v1.17` (where MongoDB `v5.x` was first used), you'll need to first upgrade to ClearML Server v1.17.

First upgrade to ClearML Server v1.17 following the procedure below and using [this `docker-compose` file](https://github.com/clearml/clearml-server/blob/2976ce69cc91550a3614996e8a8d8cd799af2efd/upgrade/1_17_to_2_0/docker-compose.yml). Once successfully upgraded,
you can proceed to upgrade to v2.x.

</Collapsible>

The sections below contain the steps to upgrade ClearML Server on the [same AWS instance](#upgrading-on-the-same-aws-instance), and
to upgrade and migrate to a [new AWS instance](#upgrading-and-migrating-to-a-new-aws-instance).

## Upgrading on the Same AWS Instance

This section contains the steps to upgrade ClearML Server on the same AWS instance.

@@ -42,7 +49,7 @@ If upgrading from Trains Server version 0.15 or older, a data migration is requi
|
||||
1. Download the latest `docker-compose.yml` file. Execute the following command:
|
||||
|
||||
```
|
||||
sudo curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
|
||||
sudo curl https://raw.githubusercontent.com/clearml/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
|
||||
```
|
||||
|
||||
1. Startup ClearML Server. This automatically pulls the latest ClearML Server build.
|
||||
@@ -52,7 +59,7 @@ If upgrading from Trains Server version 0.15 or older, a data migration is requi
|
||||
docker-compose -f docker-compose.yml up -d
|
||||
```

### Upgrading and Migrating to a New AWS Instance
## Upgrading and Migrating to a New AWS Instance

This section contains the steps to upgrade ClearML Server on the new AWS instance.

@@ -67,8 +74,9 @@ This section contains the steps to upgrade ClearML Server on the new AWS instanc
1. On the old AWS instance, [backup your data](clearml_server_aws_ec2_ami.md#backing-up-and-restoring-data-and-configuration)
and, if your configuration folder is not empty, backup your configuration.

1. If upgrading from ClearML Server version older than 1.2, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).
If upgrading from Trains Server version 0.15 or older, a data migration is required before continuing this upgrade. See instructions [here](clearml_server_es7_migration.md).
1. If upgrading from Trains Server version 0.15 or older, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_es7_migration.md).

1. If upgrading from ClearML Server version 1.1 or older, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).

1. On the new AWS instance, [restore your data](clearml_server_aws_ec2_ami.md#backing-up-and-restoring-data-and-configuration) and, if the configuration folder is not empty, restore the
configuration.

@@ -19,11 +19,13 @@ you can proceed to upgrade to v2.x.
```
docker-compose -f docker-compose.yml down
```

1. [Backing up data](clearml_server_gcp.md#backing-up-and-restoring-data-and-configuration) is recommended, and if the configuration folder is
not empty, backing up the configuration.

1. If upgrading from **Trains Server** version 0.15 or older to **ClearML Server**, do the following:

1. Follow these [data migration instructions](clearml_server_es7_migration.md),
and then continue this upgrade.
1. Follow these [data migration instructions](clearml_server_es7_migration.md).

1. Rename `/opt/trains` and its subdirectories to `/opt/clearml`:

@@ -31,14 +33,12 @@ you can proceed to upgrade to v2.x.
sudo mv /opt/trains /opt/clearml
```

1. If upgrading from ClearML Server version older than 1.2, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).
1. [Backing up data](clearml_server_gcp.md#backing-up-and-restoring-data-and-configuration) is recommended, and if the configuration folder is
not empty, backing up the configuration.
1. If upgrading from ClearML Server version 1.1 or older, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).

1. Download the latest `docker-compose.yml` file:

```
curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
curl https://raw.githubusercontent.com/clearml/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
```

1. Startup ClearML Server. This automatically pulls the latest ClearML Server build.

@@ -7,13 +7,13 @@ title: Kubernetes

```bash
helm repo update
helm upgrade clearml allegroai/clearml
helm upgrade clearml clearml/clearml
```

**To change the values in an existing installation,** execute the following:

```bash
helm upgrade clearml allegroai/clearml --version <CURRENT CHART VERSION> -f custom_values.yaml
helm upgrade clearml clearml/clearml --version <CURRENT CHART VERSION> -f custom_values.yaml
```

See the [clearml-helm-charts repository](https://github.com/clearml/clearml-helm-charts/tree/main/charts/clearml#local-environment)

@@ -40,24 +40,26 @@ For backwards compatibility, the environment variables ``TRAINS_HOST_IP``, ``TRA
```
docker-compose -f docker-compose.yml down
```

1. If upgrading from **Trains Server** version 0.15 or older, a data migration is required before continuing this upgrade. See instructions [here](clearml_server_es7_migration.md).

1. If upgrading from ClearML Server version older than 1.2, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).

1. [Backing up data](clearml_server_linux_mac.md#backing-up-and-restoring-data-and-configuration) is recommended and, if the configuration folder is
not empty, backing up the configuration.

1. If upgrading from **Trains Server** version 0.15 or older to **ClearML Server**, do the following:

1. If upgrading from **Trains Server** to **ClearML Server**, rename `/opt/trains` and its subdirectories to `/opt/clearml`:
1. Follow these [data migration instructions](clearml_server_es7_migration.md).

1. Rename `/opt/trains` and its subdirectories to `/opt/clearml`:

```
sudo mv /opt/trains /opt/clearml
```

```
sudo mv /opt/trains /opt/clearml
```
1. If upgrading from ClearML Server version 1.1 or older, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).

1. Download the latest `docker-compose.yml` file:

```
curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
curl https://raw.githubusercontent.com/clearml/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
```

1. Startup ClearML Server. This automatically pulls the latest ClearML Server build:
@@ -29,10 +29,7 @@ you can proceed to upgrade to v2.x.
```
docker-compose -f c:\opt\trains\docker-compose-win10.yml down
```

1. If upgrading from **Trains Server** version 0.15 or older, a data migration is required before continuing this upgrade. See instructions [here](clearml_server_es7_migration.md).

1. If upgrading from ClearML Server version older than 1.2, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).

1. Backing up data is recommended, and if the configuration folder is not empty, backing up the configuration.

@@ -40,13 +37,19 @@ you can proceed to upgrade to v2.x.
For example, if the configuration is in ``c:\opt\clearml``, then backup ``c:\opt\clearml\config`` and ``c:\opt\clearml\data``.
Before restoring, remove the old artifacts in ``c:\opt\clearml\config`` and ``c:\opt\clearml\data``, and then restore.
:::

1. If upgrading from **Trains Server** to **ClearML Server**, rename `/opt/trains` and its subdirectories to `/opt/clearml`.

1. If upgrading from **Trains Server** version 0.15 or older to **ClearML Server**, do the following:

1. Follow these [data migration instructions](clearml_server_es7_migration.md).

1. Rename `/opt/trains` and its subdirectories to `/opt/clearml`.

1. If upgrading from ClearML Server version 1.1 or older, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).

1. Download the latest `docker-compose.yml` file:

```
curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose-win10.yml -o c:\opt\clearml\docker-compose-win10.yml
curl https://raw.githubusercontent.com/clearml/clearml-server/master/docker/docker-compose-win10.yml -o c:\opt\clearml\docker-compose-win10.yml
```

1. Startup ClearML Server. This automatically pulls the latest ClearML Server build.

docs/faq.md
@@ -137,7 +137,8 @@ the following numbers are displayed:
* API server version
* API version

![Server version information](../img/faq_server_versions.png)
![Server version information](../img/faq_server_versions.png#light-mode-only)
![Server version information](../img/faq_server_versions_dark.png#dark-mode-only)

ClearML Python package information can be obtained by using `pip freeze`.

@@ -593,7 +594,8 @@ Due to speed/optimization issues, the console displays only the last several hun
You can always download the full log as a file using the ClearML Web UI. In the **ClearML Web UI >** task's **CONSOLE**
tab, click `Download full log`.

![Download full log](../img/faq_download_full_log.png)
![Download full log](../img/faq_download_full_log.png#light-mode-only)
![Download full log](../img/faq_download_full_log_dark.png#dark-mode-only)

<br/>

@@ -604,17 +606,19 @@ and accuracy values of several tasks. In the task comparison page, under the **H
you can visualize tasks' hyperparameter values in relation to performance metrics in a scatter plot or parallel
coordinates plot:
* [Scatter plot](webapp/webapp_exp_comparing.md#scatter-plot): View the correlation between a selected hyperparameter and
metric. For example, the image below shows a scatter plot that displays the values of a performance metric (`epoch_accuracy`)
metric. For example, the image below shows a scatter plot that displays the values of a performance metric (`accuracy`)
and a hyperparameter (`epochs`) of a few tasks:

![Scatter plot example](../img/faq_compare_scatter.png)
![Scatter plot example](../img/faq_compare_scatter.png#light-mode-only)
![Scatter plot example](../img/faq_compare_scatter_dark.png#dark-mode-only)

* [Parallel coordinates plot](webapp/webapp_exp_comparing.md#parallel-coordinates-mode): View the impact of hyperparameters
on selected metric(s). For example, the image below shows
a parallel coordinates plot which displays the values of selected hyperparameters (`base_lr`, `batch_size`, and
`number_of_epochs`) and a performance metric (`accuracy`) of three tasks:
a parallel coordinates plot which displays the values of selected hyperparameters (`epochs`, `lr`, and `batch_size`)
and a performance metric (`accuracy`) of a few tasks:

![Parallel coordinates plot example](../img/faq_compare_parallel.png)
![Parallel coordinates plot example](../img/faq_compare_parallel.png#light-mode-only)
![Parallel coordinates plot example](../img/faq_compare_parallel_dark.png#dark-mode-only)

<br/>
@@ -1,5 +1,5 @@
---
title: Building Executable Task Containers
---

## Exporting a Task into a Standalone Docker Container

@@ -3,7 +3,7 @@ title: Managing Agent Work Schedules
---

:::important Enterprise Feature
This feature is available under the ClearML Enterprise plan.
Agent work schedule management is available under the ClearML Enterprise plan.
:::

The Agent scheduler enables scheduling working hours for each Agent. During working hours, a worker will actively poll

@@ -32,19 +32,19 @@ training, and deploying models at every scale on any AI infrastructure.
<tbody>
<tr>
<td><a href="https://github.com/clearml/clearml/blob/master/docs/tutorials/Getting_Started_1_Experiment_Management.ipynb"><b>Step 1</b></a> - Experiment Management</td>
<td className="align-center"><a className="no-ext-icon" target="_blank" href="https://colab.research.google.com/github/allegroai/clearml/blob/master/docs/tutorials/Getting_Started_1_Experiment_Management.ipynb">
<td className="align-center"><a className="no-ext-icon" target="_blank" href="https://colab.research.google.com/github/clearml/clearml/blob/master/docs/tutorials/Getting_Started_1_Experiment_Management.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a></td>
</tr>
<tr>
<td><a href="https://github.com/clearml/clearml/blob/master/docs/tutorials/Getting_Started_2_Setting_Up_Agent.ipynb"><b>Step 2</b></a> - Remote Execution Agent Setup</td>
<td className="align-center"><a className="no-ext-icon" target="_blank" href="https://colab.research.google.com/github/allegroai/clearml/blob/master/docs/tutorials/Getting_Started_2_Setting_Up_Agent.ipynb">
<td className="align-center"><a className="no-ext-icon" target="_blank" href="https://colab.research.google.com/github/clearml/clearml/blob/master/docs/tutorials/Getting_Started_2_Setting_Up_Agent.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a></td>
</tr>
<tr>
<td><a href="https://github.com/clearml/clearml/blob/master/docs/tutorials/Getting_Started_3_Remote_Execution.ipynb"><b>Step 3</b></a> - Remotely Execute Tasks</td>
<td className="align-center"><a className="no-ext-icon" target="_blank" href="https://colab.research.google.com/github/allegroai/clearml/blob/master/docs/tutorials/Getting_Started_3_Remote_Execution.ipynb">
<td className="align-center"><a className="no-ext-icon" target="_blank" href="https://colab.research.google.com/github/clearml/clearml/blob/master/docs/tutorials/Getting_Started_3_Remote_Execution.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a></td>
</tr>

@@ -14,7 +14,7 @@ powerful remote machine. This is useful for:
* Managing execution through ClearML's queue system.

This guide focuses on transitioning a locally executed process to a remote machine for scalable execution. To learn how
to reproduce a previously executed process on a remote machine, see [Reproducing Tasks](reproduce_tasks.md).
to reproduce a previously executed process on a remote machine, see [Reproducing Task Runs](reproduce_tasks.md).

## Running a Task Remotely

@@ -1,5 +1,5 @@
---
title: Reproducing Tasks
title: Reproducing Task Runs
---

:::note

@@ -31,7 +31,7 @@ The pip package also includes `clearml-data`. It can help you keep track of your

Both the 2 magic lines and the data tool will send all of their information to a ClearML server. This server then keeps an overview of your experiment runs and data sets over time, so you can always go back to a previous experiment, see how it was created and even recreate it exactly. Keep track of your best models by creating leaderboards based on your own metrics, and you can even directly compare multiple experiment runs, helping you to figure out the best way forward for your models.
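For reference, the "2 magic lines" are just an import and a `Task.init()` call; the project and task names below are placeholders:

```
from clearml import Task

# These two lines are all that's needed to start tracking a run.
task = Task.init(project_name="my project", task_name="my experiment")
```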

To get started with a server right away, you can make use of the free tier. And when your needs grow, we've got you covered too! Just check out our website to find a tier that fits your organisation best. But, because we're open source, you can also host your own completely for free. We have AWS images, Google Cloud images, you can run it on docker-compose locally or even, if you really hate yourself, run it on a self-hosted kubernetes cluster using our helm charts.
To get started with a server right away, you can make use of the free tier. And when your needs grow, we've got you covered too! Just check out our website to find a tier that fits your organisation best. But, because we're open source, you can also host your own completely for free. We have AWS images, Google Cloud images, you can run it on `docker-compose` locally or even, if you really hate yourself, run it on a self-hosted kubernetes cluster using our helm charts.

So, to recap: to get started, all you need is a pip package and a server to store everything. Easy right? But MLOps is much more than experiment and data management. It's also about automation and orchestration, which is exactly where the `clearml-agent` comes into play.

@@ -57,24 +57,28 @@ Logger.current_logger().report_scalar(

These scalars can be visualized in plots, which appear in the ClearML web UI, in the task's **SCALARS** tab.

![Task scalars](../../img/fundamentals_task_scalars.png)
![Task scalars](../../img/fundamentals_task_scalars.png#light-mode-only)
![Task scalars](../../img/fundamentals_task_scalars_dark.png#dark-mode-only)

## Hyperparameters

ClearML automatically logs command line options defined with `argparse`. They appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>** **Args**.

![Task hyperparameters](../../img/fundamentals_task_config_hyperparams.png)
![Task hyperparameters](../../img/fundamentals_task_config_hyperparams.png#light-mode-only)
![Task hyperparameters](../../img/fundamentals_task_config_hyperparams_dark.png#dark-mode-only)

## Console

Text printed to the console for training progress, as well as all other console output, appear in **CONSOLE**.

![Task console](../../img/fundamentals_task_console.png)
![Task console](../../img/fundamentals_task_console.png#light-mode-only)
![Task console](../../img/fundamentals_task_console_dark.png#dark-mode-only)

## Artifacts

Models created by the task appear in the task's **ARTIFACTS** tab. ClearML automatically logs and tracks models
and any snapshots created using PyTorch.

![Task models](../../img/fundamentals_task_models.png)
![Task models](../../img/fundamentals_task_models.png#light-mode-only)
![Task models](../../img/fundamentals_task_models_dark.png#dark-mode-only)

@@ -31,4 +31,5 @@ the `examples` project. This starts the parameter search, and creates the tasks:

When these tasks are completed, their [results can be compared](../../webapp/webapp_exp_comparing.md).

![Hyperparameter search summary](../../img/examples_hyperparameter_searcg.png)
![Hyperparameter search summary](../../img/examples_hyperparameter_searcg.png#light-mode-only)
![Hyperparameter search summary](../../img/examples_hyperparameter_searcg_dark.png#dark-mode-only)
@@ -49,7 +49,7 @@ Execution log at: https://app.clear.ml/projects/552d5399112d47029c146d5248570295
### Executing a Local Script

For this example, use a local version of [this script](https://github.com/clearml/events/blob/master/webinar-0620/keras_mnist.py).
1. Clone the [allegroai/events](https://github.com/clearml/events) repository
1. Clone the [clearml/events](https://github.com/clearml/events) repository
1. Go to the root folder of the cloned repository
1. Run the following command:
@@ -34,7 +34,8 @@ Task.current_task().upload_artifact(

All of these artifacts appear in the main Task under **ARTIFACTS** **>** **OTHER**.

![Artifacts](../../../img/examples_subprocess_artifacts.png)
![Artifacts](../../../img/examples_subprocess_artifacts.png#light-mode-only)
![Artifacts](../../../img/examples_subprocess_artifacts_dark.png#dark-mode-only)

## Scalars

@@ -54,7 +55,8 @@ Task.current_task().get_logger().report_scalar(

The single scalar plot for loss appears in **SCALARS**.

![Scalars](../../../img/examples_subprocess_scalars.png)
![Scalars](../../../img/examples_subprocess_scalars.png#light-mode-only)
![Scalars](../../../img/examples_subprocess_scalars_dark.png#dark-mode-only)

## Hyperparameters

@@ -69,12 +71,15 @@ Task.current_task().connect(param)

All the hyperparameters appear in **CONFIGURATION** **>** **HYPERPARAMETERS**.

![Hyperparameters Args](../../../img/examples_subprocess_args.png)
![Hyperparameters Args](../../../img/examples_subprocess_args.png#light-mode-only)
![Hyperparameters Args](../../../img/examples_subprocess_args_dark.png#dark-mode-only)

![Hyperparameters General](../../../img/examples_subprocess_params.png)
![Hyperparameters General](../../../img/examples_subprocess_params.png#light-mode-only)
![Hyperparameters General](../../../img/examples_subprocess_params_dark.png#dark-mode-only)

## Console

Output to the console, including the text messages printed from the main Task object and each subprocess appear in **CONSOLE**.

![Console log](../../../img/examples_subprocess_log.png)
![Console log](../../../img/examples_subprocess_log.png#light-mode-only)
![Console log](../../../img/examples_subprocess_log_dark.png#dark-mode-only)
@@ -26,14 +26,17 @@ Task.current_task().connect(additional_parameters)

Command line options appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>** **Args**.

![Hyperparameters Args](../../../img/examples_multiprocessing_args.png)
![Hyperparameters Args](../../../img/examples_multiprocessing_args.png#light-mode-only)
![Hyperparameters Args](../../../img/examples_multiprocessing_args_dark.png#dark-mode-only)

Parameter dictionaries appear in **General**.

![Hyperparameters General](../../../img/examples_multiprocessing_params.png)
![Hyperparameters General](../../../img/examples_multiprocessing_params.png#light-mode-only)
![Hyperparameters General](../../../img/examples_multiprocessing_params_dark.png#dark-mode-only)

## Console

Output to the console, including the text messages from the Task in each subprocess, appear in **CONSOLE**.

![Console log](../../../img/examples_multiprocessing_log.png)
![Console log](../../../img/examples_multiprocessing_log.png#light-mode-only)
![Console log](../../../img/examples_multiprocessing_log_dark.png#dark-mode-only)
@@ -15,25 +15,30 @@ The example script does the following:
ClearML automatically captures scalars logged by CatBoost. These scalars can be visualized in plots, which appear in the
[ClearML web UI](../../../webapp/webapp_overview.md), in the task's **SCALARS** tab.

![CatBoost scalars](../../../img/examples_catboost_scalars.png)
![CatBoost scalars](../../../img/examples_catboost_scalars.png#light-mode-only)
![CatBoost scalars](../../../img/examples_catboost_scalars_dark.png#dark-mode-only)

## Hyperparameters
ClearML automatically logs command line options defined with argparse. They appear in **CONFIGURATIONS > HYPERPARAMETERS > Args**.

![CatBoost hyperparameters](../../../img/examples_catboost_configurations.png)
![CatBoost hyperparameters](../../../img/examples_catboost_configurations.png#light-mode-only)
![CatBoost hyperparameters](../../../img/examples_catboost_configurations_dark.png#dark-mode-only)

## Console
Text printed to the console for training progress, as well as all other console output, appear in **CONSOLE**.

![CatBoost console](../../../img/examples_catboost_console.png)
![CatBoost console](../../../img/examples_catboost_console.png#light-mode-only)
![CatBoost console](../../../img/examples_catboost_console_dark.png#dark-mode-only)

## Artifacts
Models created by the task appear in the task's **ARTIFACTS** tab. ClearML automatically logs and tracks
models created using CatBoost.

![CatBoost artifacts](../../../img/examples_catboost_artifacts.png)
![CatBoost artifacts](../../../img/examples_catboost_artifacts.png#light-mode-only)
![CatBoost artifacts](../../../img/examples_catboost_artifacts_dark.png#dark-mode-only)

Clicking on the model name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can view
the model's details and access the model.

![CatBoost model page](../../../img/examples_catboost_model.png)
![CatBoost model page](../../../img/examples_catboost_model.png#light-mode-only)
![CatBoost model page](../../../img/examples_catboost_model_dark.png#dark-mode-only)
@@ -18,16 +18,19 @@ The example code does the following:

ClearML automatically logs the histogram output to TensorBoard. They appear in **PLOTS**.

![PR curve histogram](../../../img/examples_tensorboard_pr_curve.png)
![PR curve histogram](../../../img/examples_tensorboard_pr_curve.png#light-mode-only)
![PR curve histogram](../../../img/examples_tensorboard_pr_curve_dark.png#dark-mode-only)

## Plots

Histograms output to TensorBoard. They appear in **PLOTS**.

![Histogram](../../../img/examples_tensorboard_histogram.png)
![Histogram](../../../img/examples_tensorboard_histogram.png#light-mode-only)
![Histogram](../../../img/examples_tensorboard_histogram_dark.png#dark-mode-only)

## Logs

Text printed to the console for training progress, as well as all other console output, appear in **CONSOLE**.

![Console output](../../../img/examples_tensorboard_console.png)
![Console output](../../../img/examples_tensorboard_console.png#light-mode-only)
![Console output](../../../img/examples_tensorboard_console_dark.png#dark-mode-only)

@@ -36,7 +36,8 @@ ClearML captures all of the `TrainingArguments` passed to the Trainer.

View these parameters in the task's **CONFIGURATION** tab **> Hyperparameters** section.

![Transformers parameters](../../../img/integrations_transformers_params.png)
![Transformers parameters](../../../img/integrations_transformers_params.png#light-mode-only)
![Transformers parameters](../../../img/integrations_transformers_params_dark.png#dark-mode-only)

### Models
@@ -47,10 +48,12 @@ variable is set to `True`.
ClearML automatically captures the model snapshots created by the Trainer, and saves them as artifacts. View the snapshots in the
task's **ARTIFACTS** tab.

![Transformers models](../../../img/integrations_transformers_models.png)
![Transformers models](../../../img/integrations_transformers_models.png#light-mode-only)
![Transformers models](../../../img/integrations_transformers_models_dark.png#dark-mode-only)

### Scalars

ClearML automatically captures the Trainer's scalars, which can be viewed in the task's **Scalars** tab.

![Transformers scalars](../../../img/integrations_transformers_scalars.png)
![Transformers scalars](../../../img/integrations_transformers_scalars.png#light-mode-only)
![Transformers scalars](../../../img/integrations_transformers_scalars_dark.png#dark-mode-only)
@@ -18,22 +18,26 @@ The example does the following:

The loss and accuracy metric scalar plots appear in **SCALARS**, along with the resource utilization plots, which are titled **:monitor: machine**.

![Scalars](../../../img/examples_keras_tensorboard_scalars.png)
![Scalars](../../../img/examples_keras_tensorboard_scalars.png#light-mode-only)
![Scalars](../../../img/examples_keras_tensorboard_scalars_dark.png#dark-mode-only)

## Plots

The example calls Matplotlib methods to create several sample plots, and TensorBoard methods to plot histograms for layer density.
They appear in **PLOTS**.

![Plots 1](../../../img/examples_keras_tensorboard_plots_1.png)
![Plots 1](../../../img/examples_keras_tensorboard_plots_1.png#light-mode-only)
![Plots 1](../../../img/examples_keras_tensorboard_plots_1_dark.png#dark-mode-only)

![Plots 2](../../../img/examples_keras_tensorboard_plots_2.png)
![Plots 2](../../../img/examples_keras_tensorboard_plots_2.png#light-mode-only)
![Plots 2](../../../img/examples_keras_tensorboard_plots_2_dark.png#dark-mode-only)

## Debug Samples

The example calls Matplotlib methods to log debug sample images. They appear in **DEBUG SAMPLES**.

![Debug samples](../../../img/examples_keras_tensorboard_debug_samples.png)
![Debug samples](../../../img/examples_keras_tensorboard_debug_samples.png#light-mode-only)
![Debug samples](../../../img/examples_keras_tensorboard_debug_samples_dark.png#dark-mode-only)

## Hyperparameters

@@ -55,17 +59,20 @@ task_params['hidden_dim'] = 512

Parameter dictionaries appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>** **General**.

![Hyperparameters General](../../../img/examples_keras_tensorboard_hyperparams_general.png)
![Hyperparameters General](../../../img/examples_keras_tensorboard_hyperparams_general.png#light-mode-only)
![Hyperparameters General](../../../img/examples_keras_tensorboard_hyperparams_general_dark.png#dark-mode-only)

The TensorFlow Definitions appear in the **TF_DEFINE** subsection.

![Hyperparameters TF_DEFINE](../../../img/examples_keras_tensorboard_hyperparams_tf_define.png)
![Hyperparameters TF_DEFINE](../../../img/examples_keras_tensorboard_hyperparams_tf_define.png#light-mode-only)
![Hyperparameters TF_DEFINE](../../../img/examples_keras_tensorboard_hyperparams_tf_define_dark.png#dark-mode-only)

## Console

Text printed to the console for training appears in **CONSOLE**.

![Console](../../../img/examples_keras_tensorboard_console.png)
![Console](../../../img/examples_keras_tensorboard_console.png#light-mode-only)
![Console](../../../img/examples_keras_tensorboard_console_dark.png#dark-mode-only)

## Artifacts

@@ -74,9 +81,11 @@ created using Keras.

The task info panel shows model tracking, including the model name and design in **ARTIFACTS** **>** **Output Model**.

![Artifacts](../../../img/examples_keras_tensorboard_artifacts.png)
![Artifacts](../../../img/examples_keras_tensorboard_artifacts.png#light-mode-only)
![Artifacts](../../../img/examples_keras_tensorboard_artifacts_dark.png#dark-mode-only)

Clicking on the model name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can view
the model's details and access the model.

![Model details](../../../img/examples_keras_tensorboard_model.png)
![Model details](../../../img/examples_keras_tensorboard_model.png#light-mode-only)
![Model details](../../../img/examples_keras_tensorboard_model_dark.png#dark-mode-only)
@@ -25,31 +25,36 @@ The example script does the following:
The loss and accuracy metric scalar plots appear in **SCALARS**, along with the resource utilization plots,
which are titled **:monitor: machine**.

![Scalars](../../../img/examples_tensorflow_mnist_scalars.png)
![Scalars](../../../img/examples_tensorflow_mnist_scalars.png#light-mode-only)
![Scalars](../../../img/examples_tensorflow_mnist_scalars_dark.png#dark-mode-only)

## Histograms

Histograms for layer density appear in **PLOTS**.

![Histograms](../../../img/examples_tensorflow_mnist_histograms.png)
![Histograms](../../../img/examples_tensorflow_mnist_histograms.png#light-mode-only)
![Histograms](../../../img/examples_tensorflow_mnist_histograms_dark.png#dark-mode-only)

## Hyperparameters

ClearML automatically logs command line options generated with `argparse`, and TensorFlow Definitions.
ClearML automatically logs command line options generated with `argparse` and TensorFlow Definitions.

Command line options appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>** **Args**.

![Hyperparameters Args](../../../img/examples_tensorflow_mnist_hyperparams_args.png)
![Hyperparameters Args](../../../img/examples_tensorflow_mnist_hyperparams_args.png#light-mode-only)
![Hyperparameters Args](../../../img/examples_tensorflow_mnist_hyperparams_args_dark.png#dark-mode-only)

TensorFlow Definitions appear in **TF_DEFINE**.

![Hyperparameters TF_DEFINE](../../../img/examples_tensorflow_mnist_hyperparams_tf_define.png)
![Hyperparameters TF_DEFINE](../../../img/examples_tensorflow_mnist_hyperparams_tf_define.png#light-mode-only)
![Hyperparameters TF_DEFINE](../../../img/examples_tensorflow_mnist_hyperparams_tf_define_dark.png#dark-mode-only)

## Console

Text printed to the console for training progress, as well as all other console output, appear in **CONSOLE**.

![Console](../../../img/examples_tensorflow_mnist_console.png)
![Console](../../../img/examples_tensorflow_mnist_console.png#light-mode-only)
![Console](../../../img/examples_tensorflow_mnist_console_dark.png#dark-mode-only)

## Configuration Objects

@@ -64,4 +69,5 @@ task.connect_configuration(

It appears in **CONFIGURATION** **>** **CONFIGURATION OBJECTS** **>** **MyConfig**.

![Custom configuration](../../../img/examples_tensorflow_mnist_config.png)
![Custom configuration](../../../img/examples_tensorflow_mnist_config.png#light-mode-only)
![Custom configuration](../../../img/examples_tensorflow_mnist_config_dark.png#dark-mode-only)
@@ -83,18 +83,22 @@ if CONDITION:
## WebApp
The model appears in the task's **ARTIFACTS** tab.

![Model artifact](../../../img/examples_model_update_artifacts.png)
![Model artifact](../../../img/examples_model_update_artifacts.png#light-mode-only)
![Model artifact](../../../img/examples_model_update_artifacts_dark.png#dark-mode-only)

Clicking on the model name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can view the
model's details and access the model.

![Model page](../../../img/examples_model_update_model.png)
![Model page](../../../img/examples_model_update_model.png#light-mode-only)
![Model page](../../../img/examples_model_update_model_dark.png#dark-mode-only)

The model's **NETWORK** tab displays its configuration.

![Model network](../../../img/examples_model_update_network.png)
![Model network](../../../img/examples_model_update_network.png#light-mode-only)
![Model network](../../../img/examples_model_update_network_dark.png#dark-mode-only)

The model's **LABELS** tab displays its label enumeration.

![Model labels](../../../img/examples_model_update_labels.png)
![Model labels](../../../img/examples_model_update_labels.png#light-mode-only)
![Model labels](../../../img/examples_model_update_labels_dark.png#dark-mode-only)

@@ -35,7 +35,8 @@ Task.current_task().upload_artifact(

All of these artifacts appear in the main Task, **ARTIFACTS** **>** **OTHER**.

![Artifacts](../../../img/examples_pytorch_distributed_example_07.png)
![Artifacts](../../../img/examples_pytorch_distributed_example_07.png#light-mode-only)
![Artifacts](../../../img/examples_pytorch_distributed_example_07_dark.png#dark-mode-only)

## Scalars

@@ -54,7 +55,8 @@ Task.current_task().get_logger().report_scalar(

The single scalar plot for loss appears in **SCALARS**.

![Scalars](../../../img/examples_pytorch_distributed_example_08.png)
![Scalars](../../../img/examples_pytorch_distributed_example_08.png#light-mode-only)
![Scalars](../../../img/examples_pytorch_distributed_example_08_dark.png#dark-mode-only)

## Hyperparameters

@@ -69,7 +71,8 @@ Task.current_task().connect(param)

Command line options appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>** **Args**.

![Hyperparameters Args](../../../img/examples_pytorch_distributed_example_01.png)
![Hyperparameters Args](../../../img/examples_pytorch_distributed_example_01.png#light-mode-only)
![Hyperparameters Args](../../../img/examples_pytorch_distributed_example_01_dark.png#dark-mode-only)

Parameter dictionaries appear in the **General** section of **HYPERPARAMETERS**.

@@ -78,10 +81,12 @@ param = {'worker_{}_stuff'.format(dist.get_rank()): 'some stuff ' + str(randint(
Task.current_task().connect(param)
```

![Hyperparameters General](../../../img/examples_pytorch_distributed_example_06.png)
![Hyperparameters General](../../../img/examples_pytorch_distributed_example_06.png#light-mode-only)
![Hyperparameters General](../../../img/examples_pytorch_distributed_example_06_dark.png#dark-mode-only)

## Log

Output to the console, including the text messages printed from the main Task object and each subprocess, appears in **CONSOLE**.

![Console log](../../../img/examples_pytorch_distributed_example_04.png)
![Console log](../../../img/examples_pytorch_distributed_example_04.png#light-mode-only)
![Console log](../../../img/examples_pytorch_distributed_example_04_dark.png#dark-mode-only)
@@ -14,15 +14,18 @@ The example does the following:

The images shown in the example script's `imshow` function appear according to metric in **DEBUG SAMPLES**.

![Debug samples](../../../img/examples_pytorch_matplotlib_debug_samples.png)
![Debug samples](../../../img/examples_pytorch_matplotlib_debug_samples.png#light-mode-only)
![Debug samples](../../../img/examples_pytorch_matplotlib_debug_samples_dark.png#dark-mode-only)

Select a debug sample by metric.

![Debug samples by metric](../../../img/examples_pytorch_matplotlib_select_metric.png)
![Debug samples by metric](../../../img/examples_pytorch_matplotlib_select_metric.png#light-mode-only)
![Debug samples by metric](../../../img/examples_pytorch_matplotlib_select_metric_dark.png#dark-mode-only)

Open the debug sample in the image viewer.
Click a debug sample to view it in the image viewer.

![Debug sample in image viewer](../../../img/examples_pytorch_matplotlib_image_viewer.png)
![Debug sample in image viewer](../../../img/examples_pytorch_matplotlib_image_viewer.png#light-mode-only)
![Debug sample in image viewer](../../../img/examples_pytorch_matplotlib_image_viewer_dark.png#dark-mode-only)

@@ -36,28 +36,33 @@ Logger.current_logger().report_scalar(
These scalars can be visualized in plots, which appear in the ClearML [web UI](../../../webapp/webapp_overview.md),
in the task's **SCALARS** tab.

![Scalars](../../../img/examples_pytorch_mnist_07.png)
![Scalars](../../../img/examples_pytorch_mnist_07.png#light-mode-only)
![Scalars](../../../img/examples_pytorch_mnist_07_dark.png#dark-mode-only)

## Hyperparameters

ClearML automatically logs command line options defined with `argparse`. They appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>** **Args**.

![Hyperparameters](../../../img/examples_pytorch_mnist_01.png)
![Hyperparameters](../../../img/examples_pytorch_mnist_01.png#light-mode-only)
![Hyperparameters](../../../img/examples_pytorch_mnist_01_dark.png#dark-mode-only)

## Console

Text printed to the console for training progress, as well as all other console output, appear in **CONSOLE**.

![Console log](../../../img/examples_pytorch_mnist_06.png)
![Console log](../../../img/examples_pytorch_mnist_06.png#light-mode-only)
![Console log](../../../img/examples_pytorch_mnist_06_dark.png#dark-mode-only)

## Artifacts

Models created by the task appear in the task's **ARTIFACTS** tab. ClearML automatically logs and tracks models
and any snapshots created using PyTorch.

![Artifacts](../../../img/examples_pytorch_mnist_02.png)
![Artifacts](../../../img/examples_pytorch_mnist_02.png#light-mode-only)
![Artifacts](../../../img/examples_pytorch_mnist_02_dark.png#dark-mode-only)

Clicking on the model name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can view
the model's details and access the model.

![Model details](../../../img/examples_pytorch_mnist_03.png)
![Model details](../../../img/examples_pytorch_mnist_03.png#light-mode-only)
![Model details](../../../img/examples_pytorch_mnist_03_dark.png#dark-mode-only)
@@ -18,34 +18,40 @@ In the example script, the `train` and `test` functions call the TensorBoard `Su
These scalars, along with the resource utilization plots, which are titled **:monitor: machine**, appear in the task's
page in the [ClearML web UI](../../../webapp/webapp_overview.md) under **SCALARS**.

![Scalars](../../../img/examples_pytorch_tensorboard_07.png)
![Scalars](../../../img/examples_pytorch_tensorboard_07.png#light-mode-only)
![Scalars](../../../img/examples_pytorch_tensorboard_07_dark.png#dark-mode-only)

## Debug Samples

ClearML automatically tracks images and text output to TensorFlow. They appear in **DEBUG SAMPLES**.

![Debug samples](../../../img/examples_pytorch_tensorboard_08.png)
![Debug samples](../../../img/examples_pytorch_tensorboard_08.png#light-mode-only)
![Debug samples](../../../img/examples_pytorch_tensorboard_08_dark.png#dark-mode-only)

## Hyperparameters

ClearML automatically logs TensorFlow Definitions. They appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>** **TF_DEFINE**.

![Hyperparameters](../../../img/examples_pytorch_tensorboard_01.png)
![Hyperparameters](../../../img/examples_pytorch_tensorboard_01.png#light-mode-only)
![Hyperparameters](../../../img/examples_pytorch_tensorboard_01_dark.png#dark-mode-only)

## Console

Text printed to the console for training progress, as well as all other console output, appear in **CONSOLE**.

![Console log](../../../img/examples_pytorch_tensorboard_06.png)
![Console log](../../../img/examples_pytorch_tensorboard_06.png#light-mode-only)
![Console log](../../../img/examples_pytorch_tensorboard_06_dark.png#dark-mode-only)

## Artifacts

Models created by the task appear in the task's **ARTIFACTS** tab. ClearML automatically logs and tracks
models and any snapshots created using PyTorch.

![Artifacts](../../../img/examples_pytorch_tensorboard_02.png)
![Artifacts](../../../img/examples_pytorch_tensorboard_02.png#light-mode-only)
![Artifacts](../../../img/examples_pytorch_tensorboard_02_dark.png#dark-mode-only)

Clicking on a model's name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can view
the model's details and access the model.

![Model details](../../../img/examples_pytorch_tensorboard_03.png)
![Model details](../../../img/examples_pytorch_tensorboard_03.png#light-mode-only)
![Model details](../../../img/examples_pytorch_tensorboard_03_dark.png#dark-mode-only)
@@ -18,29 +18,34 @@ The loss and accuracy metric scalar plots, along with the resource utilization p
appear in the task's page in the [web UI](../../../webapp/webapp_overview.md), under **SCALARS**.

![Scalars](../../../img/examples_pytorch_tensorboardx_03.png)
![Scalars](../../../img/examples_pytorch_tensorboardx_03.png#light-mode-only)
![Scalars](../../../img/examples_pytorch_tensorboardx_03_dark.png#dark-mode-only)

## Hyperparameters

ClearML automatically logs command line options defined with `argparse`. They appear in **CONFIGURATION** **>**
**HYPERPARAMETERS** **>** **Args**.

![Hyperparameters](../../../img/examples_pytorch_tensorboardx_01.png)
![Hyperparameters](../../../img/examples_pytorch_tensorboardx_01.png#light-mode-only)
![Hyperparameters](../../../img/examples_pytorch_tensorboardx_01_dark.png#dark-mode-only)

## Log

Text printed to the console for training progress, as well as all other console output, appear in **CONSOLE**.

![Console log](../../../img/examples_pytorch_tensorboardx_02.png)
![Console log](../../../img/examples_pytorch_tensorboardx_02.png#light-mode-only)
![Console log](../../../img/examples_pytorch_tensorboardx_02_dark.png#dark-mode-only)

## Artifacts

Models created by the task appear in the task's **ARTIFACTS** tab. ClearML automatically logs and tracks
models and any snapshots created using PyTorch.

![Artifacts](../../../img/examples_pytorch_tensorboardx_04.png)
![Artifacts](../../../img/examples_pytorch_tensorboardx_04.png#light-mode-only)
![Artifacts](../../../img/examples_pytorch_tensorboardx_04_dark.png#dark-mode-only)

Clicking on the model name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can view
the model's details and access the model.

![Model page](../../../img/examples_pytorch_tensorboardx_model.png)
![Model page](../../../img/examples_pytorch_tensorboardx_model.png#light-mode-only)
![Model page](../../../img/examples_pytorch_tensorboardx_model_dark.png#dark-mode-only)
@@ -11,10 +11,12 @@ associated with the `examples` project.
|
||||
|
||||
The debug sample images appear according to metric, in the task's **DEBUG SAMPLES** tab.
|
||||
|
||||

|
||||

|
||||

|
||||
|
||||
## Hyperparameters
|
||||
|
||||
ClearML automatically logs TensorFlow Definitions. They appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>** **TF_DEFINE**.
|
||||
|
||||

|
||||

|
||||

|
||||
@@ -12,16 +12,19 @@ and `matplotlib` to create a scatter diagram. When the script runs, it creates a
|
||||
ClearML automatically logs the scatter plot, which appears in the [task's page](../../../webapp/webapp_exp_track_visual.md)
|
||||
in the ClearML web UI, under **PLOTS**.
|
||||
|
||||

|
||||

|
||||

## Artifacts

Models created by the task appear in the task's **ARTIFACTS** tab.

![image](image)
![image](image)
![image](image)

Clicking on the model name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can
view the model's details and access the model.

![image](image)
![image](image)
![image](image)

@@ -16,30 +16,35 @@ The script does the following:

The loss and accuracy metric scalar plots appear in the task's page in the **ClearML web UI**, under
**SCALARS**. The page also includes resource utilization plots, which are titled **:monitor: machine**.

![image](image)
![image](image)
![image](image)

## Hyperparameters

ClearML automatically logs command line options defined with `argparse`. They appear in **CONFIGURATION** **>**
**HYPERPARAMETERS** **>** **Args**.

![image](image)
![image](image)
![image](image)

## Console

Text printed to the console for training progress, as well as all other console output, appears in **CONSOLE**.

![image](image)
![image](image)
![image](image)

## Artifacts

Models created by the task appear in the task's **ARTIFACTS** tab. ClearML automatically logs and tracks
models and any snapshots created using PyTorch.

![image](image)
![image](image)
![image](image)

Clicking on the model's name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can
view the model's details and access the model.

![image](image)
![image](image)
![image](image)

@@ -14,5 +14,6 @@ the `examples` project.

ClearML automatically captures the video data that is added to the `SummaryWriter` object, using the `add_video` method.
The video appears in the task's **DEBUG SAMPLES** tab.

![image](image)
![image](image)
![image](image)
@@ -44,28 +44,33 @@ When the script runs, it logs:

ClearML logs the scalars from training each network. They appear in the task's page in the **ClearML web UI**, under
**SCALARS**.

![image](image)
![image](image)
![image](image)

## Summary of Hyperparameter Optimization

ClearML automatically logs the parameters of each task run in the hyperparameter search. They appear in tabular
form in **PLOTS**.

![image](image)
![image](image)
![image](image)
## Artifacts

ClearML automatically stores the output model. It appears in **ARTIFACTS** **>** **Output Model**.

![image](image)
![image](image)
![image](image)

Model details, such as snapshot locations, appear in the **MODELS** tab.

![image](image)
![image](image)
![image](image)

The model configuration is stored with the model.

![image](image)
![image](image)
![image](image)

## Configuration Objects

@@ -73,12 +78,14 @@ The model configuration is stored with the model.

ClearML automatically logs the TensorFlow Definitions, which appear in **CONFIGURATION** **>** **HYPERPARAMETERS**.

![image](image)
![image](image)
![image](image)

### Configuration

The Task configuration appears in **CONFIGURATION** **>** **General**.

![image](image)
![image](image)
![image](image)

@@ -20,24 +20,29 @@ In the **ClearML Web UI**, the PR Curve summaries appear in the task's page unde

* Blue PR curves

![image](image)
![image](image)
![image](image)

* Green PR curves

![image](image)
![image](image)
![image](image)

* Red PR curves

![image](image)
![image](image)
![image](image)

## Hyperparameters

ClearML automatically logs TensorFlow Definitions. They appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>** **TF_DEFINE**.

![image](image)
![image](image)
![image](image)

## Console

All other console output appears in **CONSOLE**.
All console output appears in the **CONSOLE** tab.

![image](image)
![image](image)
![image](image)

@@ -14,25 +14,29 @@ project.

The `tf.summary.scalar` output appears in the ClearML web UI, in the task's
**SCALARS** tab. Resource utilization plots, which are titled **:monitor: machine**, also appear in the **SCALARS** tab.

![image](image)
![image](image)
![image](image)

## Plots

The `tf.summary.histogram` output appears in **PLOTS**.

![image](image)
![image](image)
![image](image)
## Debug Samples

ClearML automatically tracks images and text output to TensorFlow. They appear in **DEBUG SAMPLES**.

![image](image)
![image](image)
![image](image)

## Hyperparameters

ClearML automatically logs TensorFlow Definitions. They appear in **CONFIGURATION** **>** **HYPERPARAMETERS** **>**
**TF_DEFINE**.

![image](image)
![image](image)
![image](image)

@@ -13,30 +13,35 @@ When the script runs, it creates a task named `Tensorflow v2 mnist with summarie

The loss and accuracy metric scalar plots appear in the task's page in the **ClearML web UI** under
**SCALARS**. Resource utilization plots, which are titled **:monitor: machine**, also appear in the **SCALARS** tab.

![image](image)
![image](image)
![image](image)

## Hyperparameters

ClearML automatically logs TensorFlow Definitions. They appear in **CONFIGURATION** **>** **HYPERPARAMETERS**
**>** **TF_DEFINE**.

![image](image)
![image](image)
![image](image)

## Console

All console output appears in **CONSOLE**.

![image](image)
![image](image)
![image](image)

## Artifacts

Models created by the task appear in the task's **ARTIFACTS** tab. ClearML automatically logs and tracks
models and any snapshots created using TensorFlow.

![image](image)
![image](image)
![image](image)

Clicking on a model's name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can
view the model's details and access the model.

![image](image)
![image](image)
![image](image)

@@ -13,7 +13,8 @@ the `examples` project.

ClearML automatically captures scalars logged with XGBoost, which can be visualized in plots in the
ClearML WebApp, in the task's **SCALARS** tab.

![image](image)
![image](image)
![image](image)

## Models

@@ -21,14 +22,17 @@ ClearML automatically captures the model logged using the `xgboost.save` method,

View saved snapshots in the task's **ARTIFACTS** tab.

![image](image)
![image](image)
![image](image)

To view the model details, click the model name in the **ARTIFACTS** page, which will open the model's info tab. Alternatively, download the model.

![image](image)
![image](image)
![image](image)
## Console

All console output during the script's execution appears in the task's **CONSOLE** page.

![image](image)
![image](image)
![image](image)

@@ -18,25 +18,30 @@ classification dataset using XGBoost

The feature importance plot and tree plot appear in the task's page in the **ClearML web UI**, under
**PLOTS**.

![image](image)
![image](image)
![image](image)

![image](image)
![image](image)
![image](image)

## Console

All other console output appears in **CONSOLE**.

![image](image)
![image](image)
![image](image)

## Artifacts

Models created by the task appear in the task's **ARTIFACTS** tab. ClearML automatically logs and tracks
models and any snapshots created using XGBoost.

![image](image)
![image](image)
![image](image)

Clicking on the model's name takes you to the [model's page](../../../webapp/webapp_model_viewing.md), where you can
view the model's details and access the model.

![image](image)
![image](image)
![image](image)

@@ -16,7 +16,7 @@ and running, users can send Tasks to be executed on Google Colab's hardware.

## Steps
1. Open up [this Google Colab notebook](https://colab.research.google.com/github/allegroai/clearml/blob/master/examples/clearml_agent/clearml_colab_agent.ipynb).
1. Open up [this Google Colab notebook](https://colab.research.google.com/github/clearml/clearml/blob/master/examples/clearml_agent/clearml_colab_agent.ipynb).

1. Run the first cell, which installs all the necessary packages:
   ```
@@ -40,8 +40,8 @@ and running, users can send Tasks to be executed on Google Colab's hardware.
       api_host="https://api.clear.ml",
       web_host="https://app.clear.ml",
       files_host="https://files.clear.ml",
       key='6ZHX9UQMYL874A1NE8',
       secret='=2h6#%@Y&m*tC!VLEXq&JI7QhZPKuJfbaYD4!uUk(t7=9ENv'
       key='<generated_key>',
       secret='<generated_secret>'
   )
   ```

@@ -3,7 +3,7 @@ title: Pipeline from Decorators
---

The [pipeline_from_decorator.py](https://github.com/clearml/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py)
example demonstrates the creation of a pipeline in ClearML using the [`PipelineDecorator`](../../references/sdk/automation_controller_pipelinecontroller.md#class-automationcontrollerpipelinedecorator)
example demonstrates the creation of a pipeline in ClearML using the [`PipelineDecorator`](../../references/sdk/automation_controller_pipelinedecorator.md#class-automationcontrollerpipelinedecorator)
class.

This example creates a pipeline incorporating four tasks, each of which is created from a Python function using a custom decorator:
@@ -14,11 +14,11 @@ This example creates a pipeline incorporating four tasks, each of which is creat
* `step_four` - Uses data from `step_two` and the model from `step_three` to make a prediction.

The pipeline steps, defined in the `step_one`, `step_two`, `step_three`, and `step_four` functions, are each wrapped with the
[`@PipelineDecorator.component`](../../references/sdk/automation_controller_pipelinecontroller.md#pipelinedecoratorcomponent)
[`@PipelineDecorator.component`](../../references/sdk/automation_controller_pipelinedecorator.md#pipelinedecoratorcomponent)
decorator, which creates a ClearML pipeline step for each one when the pipeline is executed.

The logic that executes these steps and controls the interaction between them is implemented in the `executing_pipeline`
function. This function is wrapped with the [`@PipelineDecorator.pipeline`](../../references/sdk/automation_controller_pipelinecontroller.md#pipelinedecoratorpipeline)
function. This function is wrapped with the [`@PipelineDecorator.pipeline`](../../references/sdk/automation_controller_pipelinedecorator.md#pipelinedecoratorpipeline)
decorator which creates the ClearML pipeline task when it is executed.

The sections below describe in more detail what happens in the pipeline controller and steps.
@@ -28,7 +28,7 @@ The sections below describe in more detail what happens in the pipeline controll
In this example, the pipeline controller is implemented by the `executing_pipeline` function.

Using the `@PipelineDecorator.pipeline` decorator creates a ClearML Controller Task from the function when it is executed.
For detailed information, see [`@PipelineDecorator.pipeline`](../../references/sdk/automation_controller_pipelinecontroller.md#pipelinedecoratorpipeline).
For detailed information, see [`@PipelineDecorator.pipeline`](../../references/sdk/automation_controller_pipelinedecorator.md#pipelinedecoratorpipeline).

In the example script, the controller defines the interactions between the pipeline steps in the following way:
1. The controller function passes its argument, `pickle_url`, to the pipeline's first step (`step_one`)
@@ -39,13 +39,13 @@ In the example script, the controller defines the interactions between the pipel

:::info Local Execution
In this example, the pipeline is set to run in local mode by using
[`PipelineDecorator.run_locally()`](../../references/sdk/automation_controller_pipelinecontroller.md#pipelinedecoratorrun_locally)
[`PipelineDecorator.run_locally()`](../../references/sdk/automation_controller_pipelinedecorator.md#pipelinedecoratorrun_locally)
before calling the pipeline function. See pipeline execution options [here](../../pipelines/pipelines_sdk_function_decorators.md#running-the-pipeline).
:::

## Pipeline Steps
Using the `@PipelineDecorator.component` decorator will make the function a pipeline component that can be called from the
pipeline controller, which implements the pipeline's execution logic. For detailed information, see [`@PipelineDecorator.component`](../../references/sdk/automation_controller_pipelinecontroller.md#pipelinedecoratorcomponent).
pipeline controller, which implements the pipeline's execution logic. For detailed information, see [`@PipelineDecorator.component`](../../references/sdk/automation_controller_pipelinedecorator.md#pipelinedecoratorcomponent).

When the pipeline controller calls a pipeline step, a corresponding ClearML task will be created. Notice that all package
imports inside the function will be automatically logged as required packages for the pipeline execution step.
@@ -63,7 +63,7 @@ executing_pipeline(
```

By default, the pipeline controller and the pipeline steps are launched through ClearML [queues](../../fundamentals/agents_and_queues.md#what-is-a-queue).
Use the [`PipelineDecorator.set_default_execution_queue`](../../references/sdk/automation_controller_pipelinecontroller.md#pipelinedecoratorset_default_execution_queue)
Use the [`PipelineDecorator.set_default_execution_queue`](../../references/sdk/automation_controller_pipelinedecorator.md#pipelinedecoratorset_default_execution_queue)
method to specify the execution queue of all pipeline steps. The `execution_queue` parameter of the `@PipelineDecorator.component`
decorator overrides the default queue value for the specific step for which it was specified.
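The decorator pattern described in this section can be condensed into a minimal sketch. It is not the example script: the step body, pipeline name, and queue name are illustrative, and calling `build_and_run_pipeline()` requires the `clearml` package and a configured server, so the sketch only defines the function.

```python
def build_and_run_pipeline(pickle_url):
    # requires the clearml package and a configured ClearML server;
    # the step logic and names below are illustrative placeholders.
    from clearml import PipelineDecorator

    @PipelineDecorator.component(return_values=["data"], execution_queue="default")
    def step_one(url):
        # a real step would download and preprocess the data here
        return url

    @PipelineDecorator.pipeline(name="demo pipeline", project="examples", version="0.1")
    def executing_pipeline(url):
        # calling a component here creates a ClearML task for that step
        data = step_one(url)
        return data

    PipelineDecorator.run_locally()  # debug mode: run steps in the local process
    executing_pipeline(url=pickle_url)
```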
@@ -23,7 +23,7 @@ The Slack API token and channel you create are required to configure the Slack a
1. In **Development Slack Workspace**, select a workspace.
1. Click **Create App**.
1. In **Basic Information**, under **Display Information**, complete the following:
   - In **Short description**, enter "Allegro Train Bot".
   - In **Short description**, enter "ClearML Train Bot".
   - In **Background color**, enter "#202432".
1. Click **Save Changes**.
1. In **OAuth & Permissions**, under **Scopes**, click **Add an OAuth Scope**, and then select the following permissions
Before Width: | Height: | Size: 239 KiB
Before Width: | Height: | Size: 148 KiB  After Width: | Height: | Size: 186 KiB
BIN docs/img/compare_parallel_coordinates_dark.png (new file)  After Width: | Height: | Size: 197 KiB
Before Width: | Height: | Size: 40 KiB  After Width: | Height: | Size: 40 KiB
BIN docs/img/examples_catboost_artifacts_dark.png (new file)  After Width: | Height: | Size: 40 KiB
Before Width: | Height: | Size: 30 KiB  After Width: | Height: | Size: 31 KiB
BIN docs/img/examples_catboost_configurations_dark.png (new file)  After Width: | Height: | Size: 31 KiB
Before Width: | Height: | Size: 222 KiB  After Width: | Height: | Size: 136 KiB
BIN docs/img/examples_catboost_console_dark.png (new file)  After Width: | Height: | Size: 138 KiB
Before Width: | Height: | Size: 50 KiB  After Width: | Height: | Size: 45 KiB
BIN docs/img/examples_catboost_model_dark.png (new file)  After Width: | Height: | Size: 60 KiB
Before Width: | Height: | Size: 62 KiB  After Width: | Height: | Size: 63 KiB
BIN docs/img/examples_catboost_scalars_dark.png (new file)  After Width: | Height: | Size: 66 KiB
Before Width: | Height: | Size: 129 KiB  After Width: | Height: | Size: 122 KiB
BIN docs/img/examples_data_management_cifar_dataset_dark.png (new file)  After Width: | Height: | Size: 125 KiB
Before Width: | Height: | Size: 92 KiB  After Width: | Height: | Size: 106 KiB
BIN docs/img/examples_hpo_parallel_coordinates_dark.png (new file)  After Width: | Height: | Size: 119 KiB
Before Width: | Height: | Size: 40 KiB  After Width: | Height: | Size: 39 KiB
BIN docs/img/examples_keras_00_dark.png (new file)  After Width: | Height: | Size: 40 KiB
Before Width: | Height: | Size: 86 KiB  After Width: | Height: | Size: 76 KiB
BIN docs/img/examples_keras_00a_dark.png (new file)  After Width: | Height: | Size: 78 KiB
Before Width: | Height: | Size: 129 KiB  After Width: | Height: | Size: 120 KiB
BIN docs/img/examples_keras_01_dark.png (new file)  After Width: | Height: | Size: 126 KiB
Before Width: | Height: | Size: 83 KiB  After Width: | Height: | Size: 115 KiB
BIN docs/img/examples_keras_02_dark.png (new file)  After Width: | Height: | Size: 119 KiB
Before Width: | Height: | Size: 83 KiB  After Width: | Height: | Size: 104 KiB
BIN docs/img/examples_keras_jupyter_03_dark.png (new file)  After Width: | Height: | Size: 107 KiB
Before Width: | Height: | Size: 90 KiB  After Width: | Height: | Size: 99 KiB
BIN docs/img/examples_keras_jupyter_03a_dark.png (new file)  After Width: | Height: | Size: 100 KiB
Before Width: | Height: | Size: 39 KiB  After Width: | Height: | Size: 43 KiB
BIN docs/img/examples_keras_jupyter_04_dark.png (new file)  After Width: | Height: | Size: 43 KiB
Before Width: | Height: | Size: 269 KiB  After Width: | Height: | Size: 128 KiB
BIN docs/img/examples_keras_jupyter_07_dark.png (new file)  After Width: | Height: | Size: 130 KiB
Before Width: | Height: | Size: 120 KiB  After Width: | Height: | Size: 118 KiB
BIN docs/img/examples_keras_jupyter_08_dark.png (new file)  After Width: | Height: | Size: 123 KiB
Before Width: | Height: | Size: 48 KiB  After Width: | Height: | Size: 47 KiB
BIN docs/img/examples_keras_jupyter_20_dark.png (new file)  After Width: | Height: | Size: 48 KiB