revital 2025-03-11 09:56:58 +02:00
commit d823a77ba5
19 changed files with 116 additions and 69 deletions


@ -232,7 +232,7 @@ ranging from 2 GB to 12 GB (see [clearml-fractional-gpu repository](https://gith
This example runs the ClearML Ubuntu 22 with CUDA 12.3 container on GPU 0, which is limited to use up to 8GB of its memory.
:::note
--pid=host is required to allow the driver to differentiate between the container's processes and other host processes when limiting memory usage
`--pid=host` is required to allow the driver to differentiate between the container's processes and other host processes when limiting memory usage
:::
1. Run the following command inside the container to verify that the fractional gpu memory limit is working correctly:
```bash


@ -212,7 +212,7 @@ Example:
ClearML serving instances send serving statistics (count/latency) automatically to Prometheus and Grafana can be used
to visualize and create live dashboards.
The default docker-compose installation is preconfigured with Prometheus and Grafana. Notice that by default data/ate
The default `docker-compose` installation is preconfigured with Prometheus and Grafana. Notice that by default data/ate
of both containers is *not* persistent. To add persistence, adding a volume mount is recommended.
You can also add many custom metrics on the input/predictions of your models. Once a model endpoint is registered,


@ -22,7 +22,7 @@ The values in the ClearML configuration file can be overridden by environment va
and command-line arguments.
:::
# Editing Your Configuration File
## Editing Your Configuration File
To add, change, or delete options, edit your configuration file.
@ -1548,7 +1548,7 @@ environment {
}
```
### files section
### files section
**`files`** (*dict*)


@ -5,8 +5,8 @@ title: Linux and macOS
Deploy the ClearML Server in Linux or macOS using the pre-built Docker image.
For ClearML docker images, including previous versions, see [https://hub.docker.com/r/allegroai/clearml](https://hub.docker.com/r/allegroai/clearml).
However, pulling the ClearML Docker image directly is not required. ClearML provides a docker-compose YAML file that does this.
The docker-compose file is included in the instructions on this page.
However, pulling the ClearML Docker image directly is not required. ClearML provides a `docker-compose` YAML file that does this.
The `docker-compose` file is included in the instructions on this page.
For information about upgrading ClearML Server in Linux or macOS, see [here](upgrade_server_linux_mac.md).
@ -134,7 +134,7 @@ Deploying the server requires a minimum of 8 GB of memory, 16 GB is recommended.
sudo chown -R $(whoami):staff /opt/clearml
```
2. Download the ClearML Server docker-compose YAML file.
2. Download the ClearML Server `docker-compose` YAML file:
```
sudo curl https://raw.githubusercontent.com/clearml/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
```


@ -54,7 +54,7 @@ Deploying the server requires a minimum of 8 GB of memory, 16 GB is recommended.
mkdir c:\opt\clearml\logs
```
1. Save the ClearML Server docker-compose YAML file.
1. Save the ClearML Server `docker-compose` YAML file.
```
curl https://raw.githubusercontent.com/clearml/clearml-server/master/docker/docker-compose-win10.yml -o c:\opt\clearml\docker-compose-win10.yml


@ -13,7 +13,7 @@ without any coding. Applications are installed on top of the ClearML Server.
To run applications, you will need the following:
* RAM: Make sure you have at least 400 MB of RAM per application instance.
* Applications Service: Make sure that the applications agent service is up and running on your server:
* If you are using a docker-compose solution, make sure that the clearml-apps-agent service is running.
* If you are using a `docker-compose` solution, make sure that the clearml-apps-agent service is running.
* If you are using a Kubernetes cluster, check for the clearml-clearml-enterprise-apps component.
* Installation Files: Each application has its installation zip file. Make sure you have the relevant files for the
applications you wish to install.


@ -13,11 +13,11 @@ The Application Gateway is available under the ClearML Enterprise plan.
* Credentials for the ClearML/allegroai docker repository
* A valid ClearML Server installation
## Host configurations
## Host Configurations
### Docker installation
### Docker Installation
Installing docker and docker-compose might vary depending on the specific operating system you're using. Here is an example for AmazonLinux:
Installing `docker` and `docker-compose` might vary depending on the specific operating system you're using. Here is an example for AmazonLinux:
```
sudo dnf -y install docker
@ -33,9 +33,9 @@ sudo docker login
Use the ClearML/allegroai dockerhub credentials when prompted by docker login.
### Docker-compose file
### Docker-compose File
This is an example of the docker-compose file you will need:
This is an example of the `docker-compose` file you will need:
```
version: '3.5'
@ -103,17 +103,17 @@ Edit it according to the following guidelines:
* `CLEARML_API_ACCESS_KEY`: ClearML server api key
* `CLEARML_API_SECRET_KEY`: ClearML server secret key
* `ROUTER_URL`: URL for this router that was previously configured in the load balancer starting with `https://`
* `ROUTER_NAME`: unique name for this router
* `AUTH_ENABLED`: enable or disable http calls authentication when the router is communicating with the ClearML server
* `SSL_VERIFY`: enable or disable SSL certificate validation when the router is communicating with the ClearML server
* `AUTH_COOKIE_NAME`: the cookie name used by the ClearML server to store the ClearML authentication cookie. This can usually be found in the `value_prefix` key starting with `allegro_token` in `envoy.yaml` file in the ClearML server installation (`/opt/allegro/config/envoy/envoy.yaml`) (see below)
* `AUTH_SECURE_ENABLED`: enable the Set-Cookie `secure` parameter
* `AUTH_BASE64_JWKS_KEY`: value from the `k` key in the `jwks.json` file in the ClearML server installation
* `LISTEN_QUEUE_NAME`: (optional) name of queue to check for tasks (if none, every task is checked)
* `EXTRA_BASH_COMMAND`: command to be launched before starting the router
* `TCP_ROUTER_ADDRESS`: router external address, can be an IP or the host machine or a load balancer hostname, depends on network configuration
* `TCP_PORT_START`: start port for the TCP Session feature
* `TCP_PORT_END`: end port port for the TCP Session feature
* `ROUTER_NAME`: Unique name for this router
* `AUTH_ENABLED`: Enable or disable http calls authentication when the router is communicating with the ClearML server
* `SSL_VERIFY`: Enable or disable SSL certificate validation when the router is communicating with the ClearML server
* `AUTH_COOKIE_NAME`: Cookie name used by the ClearML server to store the ClearML authentication cookie. This can usually be found in the `value_prefix` key starting with `allegro_token` in `envoy.yaml` file in the ClearML server installation (`/opt/allegro/config/envoy/envoy.yaml`) (see below)
* `AUTH_SECURE_ENABLED`: Enable the Set-Cookie `secure` parameter
* `AUTH_BASE64_JWKS_KEY`: Value from the `k` key in the `jwks.json` file in the ClearML server installation
* `LISTEN_QUEUE_NAME`: (*optional*) Name of queue to check for tasks (if none, every task is checked)
* `EXTRA_BASH_COMMAND`: Command to be launched before starting the router
* `TCP_ROUTER_ADDRESS`: Router external address, can be an IP or the host machine or a load balancer hostname, depends on network configuration
* `TCP_PORT_START`: Start port for the TCP Session feature
* `TCP_PORT_END`: End port for the TCP Session feature
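A few of the guidelines above can be checked mechanically before starting the router. The following is a minimal sketch, not part of the ClearML tooling: `check_router_env` is a hypothetical helper, the example values are placeholders, and it assumes the TCP Session feature is in use so both port variables are set.

```python
from typing import Dict, List
from urllib.parse import urlparse


def check_router_env(env: Dict[str, str]) -> List[str]:
    """Return a list of problems found in router settings (hypothetical helper)."""
    problems = []

    # The guidelines describe ROUTER_URL as starting with https://
    if urlparse(env.get("ROUTER_URL", "")).scheme != "https":
        problems.append("ROUTER_URL should start with https://")

    # ROUTER_NAME is described as a unique name, so it must at least be set
    if not env.get("ROUTER_NAME"):
        problems.append("ROUTER_NAME is empty")

    # Assumption: the TCP Session feature is enabled, so both ports are required
    try:
        start = int(env["TCP_PORT_START"])
        end = int(env["TCP_PORT_END"])
    except (KeyError, ValueError):
        problems.append("TCP_PORT_START and TCP_PORT_END must be integers")
    else:
        if not 0 < start <= end <= 65535:
            problems.append("TCP port range must satisfy 0 < start <= end <= 65535")

    return problems


# Placeholder values for illustration only
example_env = {
    "ROUTER_URL": "https://router.example.com",
    "ROUTER_NAME": "router-1",
    "TCP_PORT_START": "10000",
    "TCP_PORT_END": "10100",
}
print(check_router_env(example_env))
```

An empty list means none of these basic checks failed. It does not validate credentials or connectivity to the ClearML server.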
Run the following command to start the router:
@ -121,11 +121,11 @@ Run the following command to start the router:
sudo docker compose --env-file runtime.env up -d
```
:::Note How to find my jwkskey
:::note How to find my jwkskey
The *JSON Web Key Set* (*JWKS*) is a set of keys containing the public keys used to verify any JSON Web Token (JWT).
In a docker-compose server installation, this can be found in the `CLEARML__secure__auth__token_secret` env var in the apiserver server component.
In a `docker-compose` server installation, this can be found in the `CLEARML__secure__auth__token_secret` env var in the apiserver server component.
:::


@ -32,9 +32,9 @@ https://raw.githubusercontent.com/clearml/clearml-enterprise-helm-charts/gh-page
--password <GITHUB_TOKEN>
```
### Prepare values
### Prepare Values
Before installing the TTR create an helm-override files named `task-traffic-router.values-override.yaml`:
Before installing the TTR, create a `helm-override` file named `task-traffic-router.values-override.yaml`:
```
imageCredentials:
@ -55,20 +55,20 @@ tcpSession:
end:
```
Edit it accordingly to this guidelines:
Edit it according to these guidelines:
* `clearml.apiServerUrlReference`: url usually starting with `https://api.`
* `clearml.apiServerUrlReference`: URL usually starting with `https://api.`
* `clearml.apiServerKey`: ClearML server api key
* `clearml.apiServerSecret`: ClearML server secret key
* `ingress.hostName`: url of router we configured previously for loadbalancer starting with `https://`
* `clearml.sslVerify`: enable or disable SSL certificate validation on apiserver calls check
* `clearml.authCookieName`: value from `value_prefix` key starting with `allegro_token` in `envoy.yaml` file in ClearML server installation.
* `clearml.jwksKey`: value from `k` key in `jwks.json` file in ClearML server installation (see below)
* `tcpSession.routerAddress`: router external address can be an IP or the host machine or a loadbalancer hostname, depends on the network configuration
* `tcpSession.portRange.start`: start port for the TCP Session feature
* `tcpSession.portRange.end`: end port port for the TCP Session feature
* `ingress.hostName`: URL of router we configured previously for load balancer starting with `https://`
* `clearml.sslVerify`: Enable or disable SSL certificate validation on apiserver calls check
* `clearml.authCookieName`: Value from `value_prefix` key starting with `allegro_token` in `envoy.yaml` file in ClearML server installation.
* `clearml.jwksKey`: Value from `k` key in `jwks.json` file in ClearML server installation (see below)
* `tcpSession.routerAddress`: Router external address can be an IP or the host machine or a load balancer hostname, depends on the network configuration
* `tcpSession.portRange.start`: Start port for the TCP Session feature
* `tcpSession.portRange.end`: End port for the TCP Session feature
::: How to find my jwkskey
:::note How to find my jwkskey
The *JSON Web Key Set* (*JWKS*) is a set of keys containing the public keys used to verify any JSON Web Token (JWT).


@ -36,7 +36,7 @@ them before exporting.
Execute the data tool within the `apiserver` container.
Open a bash session inside the `apiserver` container of the server:
* In docker-compose:
* In `docker-compose`:
```commandline
sudo docker exec -it clearml-apiserver /bin/bash


@ -100,9 +100,10 @@ Install the ClearML chart with the required configuration:
1. Prepare the `overrides.yaml` file and input the following content. Make sure to replace `<BASE_DOMAIN>` and `<SSO_*>`
with a valid domain that will have records pointing to the ingress controller accordingly.
The credentials specified in `<SUPERVISOR_USER_KEY>` and `<SUPERVISOR_USER_SECRET>` can be used to log in as the
supervisor user in the web UI.
supervisor user in the web UI.
Note that the `<SUPERVISOR_USER_EMAIL>` value must be explicitly quoted. To do so, put `\\"` around the quoted value.
For example `"\\"email@example.com\\""`
For example `"\\"email@example.com\\""`.
```
imageCredentials:
@ -192,7 +193,7 @@ Install the ClearML chart with the required configuration:
enabled: true
```
2. Install ClearML
2. Install ClearML:
```
helm install -n clearml \\
@ -305,9 +306,9 @@ spec:
kubernetes.io/metadata.name: clearml
```
## Applications Installation
## Application Installation
To install ClearML GUI applications, follow these steps:
To install ClearML GUI applications:
1. Get the apps to install and the installation script by downloading and extracting the archive provided by ClearML
@ -491,7 +492,7 @@ To install the ClearML Agent Chart, follow these steps:
-d '{"name":"default"}'
```
### Tenant Namespace isolation with NetworkPolicies
### Tenant Namespace Isolation with NetworkPolicies
To ensure network isolation for each tenant, you need to create a `NetworkPolicy` in the tenant namespace. This way
the entire namespace/tenant will not accept any connection from other namespaces.


@ -43,7 +43,7 @@ should be reviewed and modified prior to the server installation
## Installing ClearML Server
### Preliminary Steps
1. Install Docker CE
1. Install Docker CE:
```
https://docs.docker.com/install/linux/docker-ce/ubuntu/
@ -113,10 +113,10 @@ should be reviewed and modified prior to the server installation
sudo systemctl enable disable-thp
```
1. Restart the machine
1. Restart the machine.
### Installing the Server
1. Remove any previous installation of ClearML Server
1. Remove any previous installation of ClearML Server:
```
sudo rm -R /opt/clearml/
@ -141,7 +141,7 @@ should be reviewed and modified prior to the server installation
sudo mkdir -pv /opt/allegro/config/onprem_poc
```
1. Copy the following ClearML configuration files to `/opt/allegro`
1. Copy the following ClearML configuration files to `/opt/allegro`:
* `constants.env`
* `docker-compose.override.yml`
* `docker-compose.yml`
@ -165,10 +165,13 @@ should be reviewed and modified prior to the server installation
sudo docker login -u=$DOCKERHUB_USER -p=$DOCKERHUB_PASSWORD
```
1. Start the `docker-compose` by changing directories to the directory containing the docker-compose files and running the following command:
sudo docker-compose --env-file constants.env up -d
1. Verify web access by browsing to your URL (IP address) and port 8080.
1. Start the `docker-compose` by changing directories to the directory containing the `docker-compose` files and running the following command:
```
sudo docker-compose --env-file constants.env up -d
```
1. Verify web access by browsing to your URL (IP address) and port 8080:
```
http://<server_ip_here>:8080
@ -191,7 +194,10 @@ the following subdomains should be forwarded to the corresponding ports on the s
* `https://app.<domain>` should be forwarded to port 8080
* `https://files.<domain>` should be forwarded to port 8081
:::warning
**Critical: Ensure no other ports are open to maintain the highest level of security.**
:::
Additionally, ensure that the following URLs are correctly configured in the server's environment file:


@ -8,7 +8,7 @@ It covers the following:
* Set up security groups and IAM role
* Create EC2 instance with required disks
* Install dependencies and mount disks
* Deploy ClearML version using docker-compose
* Deploy ClearML version using `docker-compose`
* Set up load balancer and DNS
* Set up server backup
@ -117,10 +117,10 @@ Instance requirements:
## Load Balancer
1. Create a TLS certificate:
1. Choose a domain name to be used with the server. The main URL that will be used by the system's users will be app.\<domain\>
1. Choose a domain name to be used with the server. The main URL that will be used by the system's users will be `app.<domain>`
2. Create a certificate, with the following DNS names:
1. \<domain name\>
2. \*.\<domain name\>
1. `<domain name>`
2. `*.<domain name>`
2. Create the `envoy` target group for the server:
1. Port: 10000
@ -284,7 +284,7 @@ log would usually indicate the reason for the failure.
## Maintenance
### Removing app containers
### Removing App Containers
To remove old application containers, add the following to the cron:


@ -31,7 +31,7 @@ The pip package also includes `clearml-data`. It can help you keep track of your
Both the 2 magic lines and the data tool will send all of their information to a ClearML server. This server then keeps an overview of your experiment runs and data sets over time, so you can always go back to a previous experiment, see how it was created and even recreate it exactly. Keep track of your best models by creating leaderboards based on your own metrics, and you can even directly compare multiple experiment runs, helping you to figure out the best way forward for your models.
To get started with a server right away, you can make use of the free tier. And when your needs grow, we've got you covered too! Just check out our website to find a tier that fits your organisation best. But, because we're open source, you can also host your own completely for free. We have AWS images, Google Cloud images, you can run it on docker-compose locally or even, if you really hate yourself, run it on a self-hosted kubernetes cluster using our helm charts.
To get started with a server right away, you can make use of the free tier. And when your needs grow, we've got you covered too! Just check out our website to find a tier that fits your organisation best. But, because we're open source, you can also host your own completely for free. We have AWS images, Google Cloud images, you can run it on `docker-compose` locally or even, if you really hate yourself, run it on a self-hosted kubernetes cluster using our helm charts.
So, to recap: to get started, all you need is a pip package and a server to store everything. Easy right? But MLOps is much more than experiment and data management. It's also about automation and orchestration, which is exactly where the `clearml-agent` comes into play.


@ -8,6 +8,7 @@ ClearML seamlessly integrates with a wide range of popular machine learning fram
* [Keras](keras.md)
* [YOLO v5](yolov5.md)
* [YOLO v8](yolov8.md)
* [Hugging Face Transformers](transformers.md)
* [MMEngine](mmengine.md)
* [MMCV](mmcv.md)
* [MONAI](monai.md)


@ -77,16 +77,29 @@ cloud of your choice (AWS, GCP, Azure) and automatically deploy ClearML agents:
and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
### Reproducing Tasks
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:
* Clone the task
* Edit the hyperparameters and/or other details
* Enqueue the task
Use ClearML's web interface to reproduce tasks and edit their details, like hyperparameters or input models, then execute the tasks
with the new configuration on a remote machine.
When ClearML is integrated into a script, it captures and stores configurations, such as hyperparameters
and model settings. When executing a task, the ClearML Agent will, by default, override runtime configuration values
(such as hyperparameters and environment variables) with the values specified in the task.
However, for tasks using Transformers, the default behavior is different. By default, Transformers tasks ignore UI
overrides and use execution-time parameters (such as environment variables). This is done to prevent potential issues
with environment-specific settings when running tasks on different machines.
**To rerun a task with modified configuration:**
1. Clone the task.
1. Edit the hyperparameters and/or other details.
1. In the **CONFIGURATION > HYPERPARAMETERS > Transformers** section, set both `_ignore_hparams_ui_overrides_` and `_ignore_model_config_ui_overrides_`
to `False`. This allows the task to use the new hyperparameter and model
configuration values, respectively, during execution.
1. Enqueue the task.
The ClearML Agent executing the task will use the new values to [override any hard coded values](../clearml_agent.md).
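The same steps can also be scripted with the ClearML SDK. The following is a minimal sketch rather than an official recipe: the helper names are hypothetical, and the `Transformers/` prefix on the parameter names is an assumption based on the UI section named above. It uses the SDK's `Task.clone`, `Task.set_parameter`, and `Task.enqueue` calls.

```python
from typing import Dict

# Assumption: parameters in the UI "Transformers" section are addressed
# with a "Transformers/" prefix when set through the SDK.
TRANSFORMERS_SECTION = "Transformers"


def transformers_override_flags(use_ui_values: bool = True) -> Dict[str, bool]:
    """Build the two flags that control whether UI overrides are ignored."""
    ignore = not use_ui_values
    return {
        f"{TRANSFORMERS_SECTION}/_ignore_hparams_ui_overrides_": ignore,
        f"{TRANSFORMERS_SECTION}/_ignore_model_config_ui_overrides_": ignore,
    }


def clone_and_enqueue(source_task_id: str, queue_name: str) -> str:
    """Clone a task, allow UI overrides, and enqueue the clone (hypothetical helper)."""
    # Imported here so the flag helper above works without the SDK installed
    from clearml import Task

    cloned = Task.clone(source_task=source_task_id)
    # Set each flag individually so other parameters on the clone are untouched
    for name, value in transformers_override_flags(use_ui_values=True).items():
        cloned.set_parameter(name, value)
    # Edit any other hyperparameters on `cloned` here before enqueuing
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned.id
```

Calling `clone_and_enqueue` requires a configured ClearML server, an existing task ID, and a queue served by a ClearML Agent.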


@ -0,0 +1,17 @@
---
title: Version 1.18
---
### ClearML 1.18.0
**New Features**
* Add support for IP overriding with `CLEARML_AGENT_HOST_IP` environment variable
* Add port mapping support (requires `clearml-agent` v2.0 and up)
**Bug Fixes**
* Fix syntax warnings with Python 3.12 ([ClearML GitHub issue #1318](https://github.com/clearml/clearml/issues/1318))
* Fix adding a dataset folder with modified files uploading all files instead of just the modified ones
* Fix detecting git branch in detached HEAD state
* Fix issue with A100 GPU monitoring
* Fix single series labels shown incorrectly in plotly histogram


@ -25,7 +25,7 @@ The WebApp's sidebar provides access to the following modules:
* Share tasks and their models with other ClearML hosted service users
* <img src="/docs/latest/icons/ico-side-bar-datasets.svg" alt="Datasets" className="icon size-md space-md" />[Datasets](datasets/webapp_dataset_page.md) - View and manage your datasets.
* <img src="/docs/latest/icons/ico-pipelines.svg" alt="Pipelines" className="icon size-md space-md" />[Pipelines](pipelines/webapp_pipeline_page.md) - View and manage your pipelines.
* <img src="/docs/latest/icons/ico-model-endpoints.svg" alt="Model endpoints" className="icon size-md space-md" />[Model Endpoints](webapp_model_endpoints.md) - Monitor your live model endpoints (available in the ClearML Enterprise plan).
* <img src="/docs/latest/icons/ico-model-endpoints.svg" alt="Model endpoints" className="icon size-md space-md" />[Model Endpoints](webapp_model_endpoints.md) - Monitor your live model endpoints.
* <img src="/docs/latest/icons/ico-reports.svg" alt="Reports" className="icon size-md space-md" />[Reports](webapp_reports.md) - View and manage your reports.
* <img src="/docs/latest/icons/ico-workers.svg" alt="Workers and Queues" className="icon size-md space-md" />[Orchestration](webapp_workers_queues.md) - Autoscale, monitor, and manage your resource usage and workers queues.
* <img src="/docs/latest/icons/ico-applications.svg" alt="ClearML Apps" className="icon size-md space-md" />[Applications](applications/apps_overview.md) - ClearML's GUI applications for no-code workflow execution (available in the ClearML Pro and Enterprise plans).


@ -327,9 +327,10 @@ module.exports = {
{
'Open Source':
[
'release_notes/sdk/open_source/ver_1_17',
'release_notes/sdk/open_source/ver_1_18',
{
'Older Versions': [
'release_notes/sdk/open_source/ver_1_17',
'release_notes/sdk/open_source/ver_1_16', 'release_notes/sdk/open_source/ver_1_15',
'release_notes/sdk/open_source/ver_1_14', 'release_notes/sdk/open_source/ver_1_13',
'release_notes/sdk/open_source/ver_1_12', 'release_notes/sdk/open_source/ver_1_11',
@ -643,7 +644,7 @@ module.exports = {
'deploying_clearml/enterprise_deploy/vpc_aws',
'deploying_clearml/enterprise_deploy/on_prem_ubuntu',
],
'Maintenance': [
'Maintenance and Migration': [
'deploying_clearml/enterprise_deploy/import_projects',
'deploying_clearml/enterprise_deploy/change_artifact_links',
'deploying_clearml/enterprise_deploy/delete_tenant',


@ -536,6 +536,10 @@ html[data-theme="dark"] .footer__copyright {
background-image: url('data:image/svg+xml;utf8,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" height="16px" width="16px"><path d="M6.02945,10.20327a4.17382,4.17382,0,1,1,4.17382-4.17382A4.15609,4.15609,0,0,1,6.02945,10.20327Zm9.69195,4.2199L10.8989,9.59979A5.88021,5.88021,0,0,0,12.058,6.02856,6.00467,6.00467,0,1,0,9.59979,10.8989l4.82338,4.82338a.89729.89729,0,0,0,1.29912,0,.89749.89749,0,0,0-.00087-1.29909Z" fill="rgba(255,255,255,0.5)" /></svg>')
}
.navbar .DocSearch-Button {
border: 1px solid var(--ifm-color-emphasis-200);
}
/* GLOBAL SEARCH PAGE */
html[data-theme="dark"] input[class^="searchQueryInput"] {
color: #fff;
@ -793,6 +797,10 @@ html[data-theme="dark"] h4 a.hash-link {
color: #c7cdd2;
}
/* <h5> */
.markdown h5 {
--ifm-h5-font-size: 1.1rem;
}
/* admonition */
.admonition {