mirror of
https://github.com/clearml/clearml-docs
synced 2025-04-19 13:55:04 +00:00
Merge branch 'main' of https://github.com/allegroai/clearml-docs into images_3
This commit is contained in:
commit
bc103ee6ac
@ -40,7 +40,7 @@ it can't do that when running from a virtual environment.
|
||||
If the setup wizard's response indicates that a configuration file already exists, follow the instructions [here](#adding-clearml-agent-to-a-configuration-file).
|
||||
The wizard does not edit or overwrite existing configuration files.
|
||||
|
||||
1. At the command prompt `Paste copied configuration here:`, copy and paste the ClearML credentials and press **Enter**.
|
||||
1. At the command prompt `Paste copied configuration here:`, paste the ClearML credentials and press **Enter**.
|
||||
The setup wizard confirms the credentials.
|
||||
|
||||
```
|
||||
|
@ -68,7 +68,7 @@ pip install clearml
|
||||
The **LOCAL PYTHON** tab shows the data required by the setup wizard (a copy to clipboard action is available on
|
||||
hover).
|
||||
|
||||
1. At the command prompt `Paste copied configuration here:`, copy and paste the ClearML credentials.
|
||||
1. At the command prompt `Paste copied configuration here:`, paste the ClearML credentials.
|
||||
The setup wizard verifies the credentials.
|
||||
```console
|
||||
Detected credentials key="********************" secret="*******"
|
||||
|
@ -65,6 +65,7 @@ After invoking `Task.init` in a script, ClearML starts its automagical logging,
|
||||
* [argparse](../guides/reporting/hyper_parameters.md#argparse-command-line-options)
|
||||
* [Python Fire](../integrations/python_fire.md)
|
||||
* [LightningCLI](../integrations/pytorch_lightning.md)
|
||||
* [jsonargparse](../integrations/jsonargparse.md)
|
||||
* TensorFlow Definitions (`absl-py`)
|
||||
* [Hydra](../integrations/hydra.md) - ClearML logs the OmegaConf which holds all the configuration files, as well as values overridden during runtime.
|
||||
* **Models** - ClearML automatically logs and updates the models and all snapshot paths saved with the following frameworks:
|
||||
|
@ -120,8 +120,8 @@ The `wizard` section defines the entries to display in the application instance
|
||||
* `model`
|
||||
* `queue`
|
||||
* `dataset_version`
|
||||
* `display_field` - The field of the source object to display in the list. Usually “name”
|
||||
* `value_field` - The field of the source object to use for configuring the app instance. Usually “id”
|
||||
* `display_field` - The field of the source object to display in the list. Usually "name"
|
||||
* `value_field` - The field of the source object to use for configuring the app instance. Usually "id"
|
||||
* `filter` - Limits the list of choices by filtering on one or more of the object’s fields. See the Project Selection example below
|
||||
* `target` - Where in the application instance’s task the values will be set. Contains the following:
|
||||
* `field` - Either `configuration` or `hyperparams`
|
||||
|
@ -35,9 +35,10 @@ If your ClearML Deployment does not have the App Gateway Router properly install
|
||||
|
||||
#### Installation
|
||||
|
||||
The App Gateway Router supports two deployment options:
|
||||
The App Gateway Router supports the following deployment options:
|
||||
|
||||
* [Docker Compose](appgw_install_compose.md)
|
||||
* [Docker Compose for hosted servers](appgw_install_compose_hosted.md)
|
||||
* [Kubernetes](appgw_install_k8s.md)
|
||||
|
||||
The deployment configuration specifies the external and internal address and port mappings for routing requests.
|
||||
|
@ -0,0 +1,166 @@
|
||||
---
|
||||
title: Docker-Compose - Hosted Server
|
||||
---
|
||||
|
||||
:::important Enterprise Feature
|
||||
The AI Application Gateway is available under the ClearML Enterprise plan.
|
||||
:::
|
||||
|
||||
The AI Application Gateway enables external access to ClearML tasks and applications running on workload nodes that
|
||||
require HTTP or TCP access. The gateway is configured with an endpoint or external address, making these services
|
||||
accessible from the user's machine, outside the workload nodes’ network.
|
||||
|
||||
This guide details the installation of the App Gateway Router for ClearML users who use ClearML's hosted control
|
||||
plane while hosting their own workload nodes.
|
||||
|
||||
## Requirements
|
||||
|
||||
* Linux OS (x86) machine with root access
|
||||
* The machine needs to be reachable from your user network
|
||||
* The machine needs to have network reachability to workload nodes
|
||||
* Credentials for the ClearML docker repository
|
||||
* A valid ClearML Server installation
|
||||
|
||||
Additionally, for a secure connection, it is recommended to have a DNS entry and a valid SSL Certificate assigned to the machine IP.
|
||||
|
||||
## Host Configuration
|
||||
|
||||
### Docker Installation
|
||||
|
||||
The steps for installing `docker` and `docker-compose` may vary depending on your operating system. Here is an
|
||||
example for AmazonLinux:
|
||||
|
||||
```
|
||||
sudo dnf -y install docker
|
||||
DOCKER_CONFIG="/usr/local/lib/docker"
|
||||
sudo mkdir -p $DOCKER_CONFIG/cli-plugins
|
||||
sudo curl -SL https://github.com/docker/compose/releases/download/v2.17.3/docker-compose-linux-x86_64 -o $DOCKER_CONFIG/cli-plugins/docker-compose
|
||||
sudo chmod +x $DOCKER_CONFIG/cli-plugins/docker-compose
|
||||
sudo systemctl enable docker
|
||||
sudo systemctl start docker
|
||||
|
||||
sudo docker login
|
||||
```
|
||||
|
||||
Use the ClearML Docker Hub credentials when prompted by `docker login`.
|
||||
|
||||
### Docker-compose File
|
||||
|
||||
This is an example of the `docker-compose` file you will need to create:
|
||||
|
||||
```
version: '3.5'
services:
  task_traffic_webserver:
    image: clearml/ai-gateway-proxy:${PROXY_TAG:?err}
    network_mode: "host"
    restart: unless-stopped
    container_name: task_traffic_webserver
    volumes:
      - ./task_traffic_router/config/nginx:/etc/nginx/conf.d:ro
      - ./task_traffic_router/config/lua:/usr/local/openresty/nginx/lua:ro
  task_traffic_router:
    image: clearml/ai-gateway-router:${ROUTER_TAG:?err}
    restart: unless-stopped
    container_name: task_traffic_router
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./task_traffic_router/config/nginx:/etc/nginx/conf.d:rw
      - ./task_traffic_router/config/lua:/usr/local/openresty/nginx/lua:rw
    environment:
      - LOGGER_LEVEL=INFO
      - ROUTER__WEBSERVER__SERVER_PORT="8010"
      - ROUTER_NAME=${ROUTER_NAME:?err}
      - ROUTER_URL=${ROUTER_URL:?err}
      - CLEARML_API_HOST=${CLEARML_API_HOST:?err}
      - CLEARML_API_ACCESS_KEY=${CLEARML_API_ACCESS_KEY:?err}
      - CLEARML_API_SECRET_KEY=${CLEARML_API_SECRET_KEY:?err}
      - AUTH_COOKIE_NAME=${AUTH_COOKIE_NAME:?err}
      - AUTH_SECURE_ENABLED=${AUTH_SECURE_ENABLED}
      - TCP_ROUTER_ADDRESS=${TCP_ROUTER_ADDRESS}
      - TCP_PORT_START=${TCP_PORT_START}
      - TCP_PORT_END=${TCP_PORT_END}
```
|
||||
|
||||
### Configuration File
|
||||
|
||||
You will be provided with a prefilled `runtime.env` file containing the following entries:
|
||||
|
||||
```
|
||||
# PREFILLED SECTION, PROVIDED BY CLEARML
|
||||
PROXY_TAG=
|
||||
ROUTER_TAG=
|
||||
CLEARML_API_HOST=https://api.
|
||||
AUTH_COOKIE_NAME=
|
||||
|
||||
# TO BE FILLED BY USER
|
||||
ROUTER_NAME=main-router
|
||||
ROUTER_URL=http://<ROUTER-HOST-PUBLIC-IP>:8010
|
||||
CLEARML_API_ACCESS_KEY=
|
||||
CLEARML_API_SECRET_KEY=
|
||||
AUTH_SECURE_ENABLED=true
|
||||
TCP_ROUTER_ADDRESS=<ROUTER-HOST-PUBLIC-IP>
|
||||
TCP_PORT_START=
|
||||
TCP_PORT_END=
|
||||
```
|
||||
|
||||
**Configuration Options:**
|
||||
|
||||
* `ROUTER_NAME`: In the case of [multiple routers on the same tenant](#multiple-router-in-the-same-tenant), each router
|
||||
needs to have a unique name.
|
||||
* `CLEARML_API_ACCESS_KEY`, `CLEARML_API_SECRET_KEY`: API credentials for an admin user or a service account with admin privileges
|
||||
created in the ClearML web UI. Make sure to label these credentials clearly, so that they will not be revoked by mistake.
|
||||
* `ROUTER_URL`: External address to access the router. This can be the IP address or DNS of the node where the router
|
||||
is running, or the address of a load balancer if the router operates behind a proxy/load balancer. This URL is used
|
||||
to access AI workload applications (e.g. remote IDE, model deployment, etc.), so it must be reachable and resolvable for them.
|
||||
* `TCP_ROUTER_ADDRESS`: External address of the router. Depending on the network configuration, this can be the IP address of the host machine or a load balancer hostname.
|
||||
* `TCP_PORT_START`: Start port for the TCP Tasks, chosen by the customer. Ensure that ports are open and can be allocated on the host.
|
||||
* `TCP_PORT_END`: End port for the TCP Tasks, chosen by the customer. Ensure that ports are open and can be allocated on the host.
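Before starting the router, it can help to confirm that every entry the compose file marks as required is filled in. The following is a minimal sketch and not part of the ClearML tooling: `missing_env_keys` is a hypothetical helper, and its key list is simply the set of `${VAR:?err}` entries from the compose example.

```shell
# Hypothetical helper (not part of ClearML): print every required key that is
# missing or empty in the given env file. The key list mirrors the ${VAR:?err}
# entries in the compose example.
missing_env_keys() (
  env_file="$1"
  for key in PROXY_TAG ROUTER_TAG ROUTER_NAME ROUTER_URL \
             CLEARML_API_HOST CLEARML_API_ACCESS_KEY \
             CLEARML_API_SECRET_KEY AUTH_COOKIE_NAME; do
    # A key counts as set only if a line "KEY=<non-empty value>" exists.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "$key"
    fi
  done
)
```

Running `missing_env_keys runtime.env` prints nothing when all eight entries have a value, and otherwise lists the keys that still need to be filled in.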
|
||||
|
||||
### Installation
|
||||
|
||||
Run the following command to start the router:
|
||||
|
||||
```
|
||||
sudo docker compose --env-file runtime.env up -d
|
||||
```
|
||||
|
||||
### Advanced Configuration
|
||||
|
||||
#### Using Open HTTP
|
||||
|
||||
To deploy the App Gateway Router on open HTTP (without a certificate), set the `AUTH_SECURE_ENABLED` entry
|
||||
to `false` in the `runtime.env` file.
|
||||
|
||||
#### Multiple Router in the Same Tenant
|
||||
|
||||
If you have workloads running in separate networks that cannot communicate with each other, you need to deploy multiple
|
||||
routers, one for each isolated environment. Each router will only process tasks from designated queues, ensuring that
|
||||
tasks are correctly routed to agents within the same network.
|
||||
|
||||
For example:
|
||||
* If Agent A and Agent B are in separate networks, each must have its own router to receive tasks.
|
||||
* Router A will handle tasks from Agent A’s queues. Router B will handle tasks from Agent B’s queues.
|
||||
|
||||
To achieve this, each router must be configured with:
|
||||
* A unique `ROUTER_NAME`
|
||||
* A distinct set of queues defined in `LISTEN_QUEUE_NAME`.
|
||||
|
||||
##### Example Configuration
|
||||
Each router's `runtime.env` file should include:
|
||||
|
||||
* Router A:
|
||||
|
||||
```
|
||||
ROUTER_NAME=router-a
|
||||
LISTEN_QUEUE_NAME=queue1,queue2
|
||||
```
|
||||
|
||||
* Router B:
|
||||
|
||||
```
|
||||
ROUTER_NAME=router-b
|
||||
LISTEN_QUEUE_NAME=queue3,queue4
|
||||
```
|
||||
|
||||
Make sure `LISTEN_QUEUE_NAME` is set in the [`docker-compose` environment variables](#docker-compose-file) for each router instance.
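A quick way to confirm that two routers were given distinct queue sets is to compare their `LISTEN_QUEUE_NAME` values. This is a minimal sketch that assumes "distinct" means the two lists share no queue name; `shared_queues` is a hypothetical helper, not a ClearML command.

```shell
# Hypothetical helper: print the queue names that appear in both
# comma-separated lists (one name per line). Empty output means no overlap.
shared_queues() (
  a="$(mktemp)"; b="$(mktemp)"
  # Split each list on commas and sort it, as required by comm.
  printf '%s\n' "$1" | tr ',' '\n' | sort -u > "$a"
  printf '%s\n' "$2" | tr ',' '\n' | sort -u > "$b"
  comm -12 "$a" "$b"
  rm -f "$a" "$b"
)
```

For the example configuration, `shared_queues "queue1,queue2" "queue3,queue4"` prints nothing, which is the expected result for two isolated routers.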
|
@ -68,7 +68,7 @@ tcpSession:
|
||||
end:
|
||||
```
|
||||
|
||||
Configuration options:
|
||||
**Configuration options:**
|
||||
|
||||
* `imageCredentials.password`: ClearML DockerHub Access Token.
|
||||
* `clearml.apiServerKey`: ClearML server API key.
|
||||
|
@ -29,7 +29,8 @@ script changes the values in the databases, and can't be undone.
|
||||
## Fixing MongoDB links
|
||||
|
||||
1. Access the `apiserver` Docker container:
|
||||
* In `docker-compose:`
|
||||
|
||||
* In `docker-compose`:
|
||||
|
||||
```commandline
|
||||
sudo docker exec -it allegro-apiserver /bin/bash
|
||||
|
448
docs/deploying_clearml/enterprise_deploy/k8s.md
Normal file
@ -0,0 +1,448 @@
|
||||
---
|
||||
title: Kubernetes
|
||||
---
|
||||
|
||||
|
||||
This guide provides step-by-step instructions for installing the ClearML Enterprise setup in a Kubernetes cluster.
|
||||
|
||||
|
||||
## Prerequisites
|
||||
|
||||
|
||||
* A Kubernetes cluster
|
||||
* An ingress controller (e.g. `nginx-ingress`) and the ability to create LoadBalancer services (e.g. MetalLB) if needed
|
||||
to expose ClearML
|
||||
* Credentials for ClearML Enterprise GitHub Helm chart repository
|
||||
* Credentials for ClearML Enterprise DockerHub repository
|
||||
* URL for downloading the ClearML Enterprise applications configuration
|
||||
|
||||
|
||||
## Control Plane Installation
|
||||
|
||||
|
||||
The following steps cover installing the control plane (server and required charts) and will
|
||||
require some or all of the tokens/deliverables mentioned above.
|
||||
|
||||
|
||||
### Requirements
|
||||
|
||||
|
||||
* Add the ClearML Enterprise repository:
|
||||
|
||||
|
||||
```
|
||||
helm repo add clearml-enterprise https://raw.githubusercontent.com/clearml/clearml-enterprise-helm-charts/gh-pages --username <clearmlenterprise_GitHub_TOKEN> --password <clearmlenterprise_GitHub_TOKEN>
|
||||
```
|
||||
|
||||
|
||||
* Update the repository locally:
|
||||
|
||||
|
||||
```
|
||||
helm repo update
|
||||
```
|
||||
|
||||
|
||||
### Install ClearML Enterprise Chart
|
||||
|
||||
|
||||
#### Configuration
|
||||
|
||||
|
||||
The Helm Chart must be installed with an `overrides.yaml` overriding values as follows:
|
||||
|
||||
|
||||
:::note
|
||||
In the following configuration, replace `<BASE_DOMAIN>` with a valid domain
|
||||
that will have records pointing to the cluster’s ingress controller (see ingress details in the values below).
|
||||
:::
|
||||
|
||||
|
||||
```
imageCredentials:
  password: "<clearml_enterprise_DockerHub_TOKEN>"

clearml:
  cookieDomain: "<BASE_DOMAIN>"
  # Set values for improved security
  apiserverKey: "<GENERATED_API_SERVER_KEY>"
  apiserverSecret: "<GENERATED_API_SERVER_SECRET>"
  fileserverKey: "<GENERATED_FILE_SERVER_KEY>"
  fileserverSecret: "<GENERATED_FILE_SERVER_SECRET>"
  secureAuthTokenSecret: "<GENERATED_AUTH_TOKEN_SECRET>"
  testUserKey: "<GENERATED_TEST_USER_KEY>"
  testUserSecret: "<GENERATED_TEST_USER_SECRET>"

apiserver:
  ingress:
    enabled: true
    hostName: "api.<BASE_DOMAIN>"
  service:
    type: ClusterIP
  extraEnvs:
    - name: CLEARML__services__organization__features__user_management_advanced
      value: "true"
    - name: CLEARML__services__auth__ui_features_per_role__user__show_datasets
      value: "false"
    - name: CLEARML__services__auth__ui_features_per_role__user__show_orchestration
      value: "false"
    - name: CLEARML__services__workers__resource_usages__supervisor_company
      value: "<SUPERVISOR_TENANT_ID>"
    - name: CLEARML__secure__credentials__supervisor__role
      value: "system"
    - name: CLEARML__secure__credentials__supervisor__allow_login
      value: "true"
    - name: CLEARML__secure__credentials__supervisor__user_key
      value: "<SUPERVISOR_USER_KEY>"
    - name: CLEARML__secure__credentials__supervisor__user_secret
      value: "<SUPERVISOR_USER_SECRET>"
    - name: CLEARML__secure__credentials__supervisor__sec_groups
      value: "[\"users\", \"admins\", \"queue_admins\"]"
    - name: CLEARML__secure__credentials__supervisor__email
      value: "\"<SUPERVISOR_USER_EMAIL>\""
    - name: CLEARML__apiserver__company__unique_names
      value: "true"

fileserver:
  ingress:
    enabled: true
    hostName: "file.<BASE_DOMAIN>"
  service:
    type: ClusterIP

webserver:
  ingress:
    enabled: true
    hostName: "app.<BASE_DOMAIN>"
  service:
    type: ClusterIP

clearmlApplications:
  enabled: true
```
|
||||
|
||||
|
||||
The credentials specified in `<SUPERVISOR_USER_KEY>` and `<SUPERVISOR_USER_SECRET>` can be used to log in as the
|
||||
supervisor user from the ClearML Web UI accessible using the URL `app.<BASE_DOMAIN>`.
|
||||
|
||||
|
||||
Note that the `<SUPERVISOR_USER_EMAIL>` value must be explicitly quoted. To do so, put `\"` around the quoted value.
|
||||
For example `"\"email@example.com\""`.
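The `<GENERATED_...>` placeholders in the configuration above are values you choose yourself. One way to produce them is sketched below, under the assumption that any sufficiently random string is accepted; the source does not state a required length or character set, so the 16-byte default is arbitrary. `random_hex` is a hypothetical helper built only from standard utilities.

```shell
# Hypothetical helper: print N random bytes (default 16) as a lowercase hex
# string of 2*N characters, read from /dev/urandom.
random_hex() (
  bytes="${1:-16}"
  head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
)
```

Each call, for example `random_hex 32`, yields a fresh value that can be pasted into one placeholder. Generate a separate value for every key and secret rather than reusing one.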
|
||||
|
||||
|
||||
#### Additional Configuration Options
|
||||
##### Fixed Users (Simple Login)
|
||||
|
||||
|
||||
Enable static login with username and password in `overrides.yaml`.
|
||||
|
||||
|
||||
This step is optional. It is only needed if you do not configure SSO (identity provider).
|
||||
|
||||
|
||||
```
apiserver:
  additionalConfigs:
    apiserver.conf: |
      auth {
        fixed_users {
          enabled: true
          pass_hashed: false
          users: [
            {
              username: "my_user"
              password: "my_password"
              name: "My User"
              admin: true
            },
          ]
        }
      }
```
|
||||
|
||||
|
||||
##### SSO (Identity Provider)
|
||||
|
||||
|
||||
The following examples (Auth0 and Keycloak) show how to configure an identity provider on the ClearML server.
|
||||
|
||||
|
||||
Add the following values, which configure `extraEnvs` for `apiserver`, to the `clearml-enterprise` `overrides.yaml` values file.
|
||||
|
||||
|
||||
Substitute all `<PLACEHOLDER>`s with the correct value for your configuration.
|
||||
|
||||
|
||||
##### Auth0 Identity Provider
|
||||
|
||||
|
||||
```
apiserver:
  extraEnvs:
    - name: CLEARML__secure__login__sso__oauth_client__auth0__client_id
      value: "<SSO_CLIENT_ID>"
    - name: CLEARML__secure__login__sso__oauth_client__auth0__client_secret
      value: "<SSO_CLIENT_SECRET>"
    - name: CLEARML__services__login__sso__oauth_client__auth0__base_url
      value: "<SSO_CLIENT_URL>"
    - name: CLEARML__services__login__sso__oauth_client__auth0__authorize_url
      value: "<SSO_CLIENT_AUTHORIZE_URL>"
    - name: CLEARML__services__login__sso__oauth_client__auth0__access_token_url
      value: "<SSO_CLIENT_ACCESS_TOKEN_URL>"
    - name: CLEARML__services__login__sso__oauth_client__auth0__audience
      value: "<SSO_CLIENT_AUDIENCE>"
```
|
||||
|
||||
|
||||
##### Keycloak Identity Provider
|
||||
|
||||
|
||||
```
apiserver:
  extraEnvs:
    - name: CLEARML__secure__login__sso__oauth_client__keycloak__client_id
      value: "<KC_CLIENT_ID>"
    - name: CLEARML__secure__login__sso__oauth_client__keycloak__client_secret
      value: "<KC_SECRET_ID>"
    - name: CLEARML__services__login__sso__oauth_client__keycloak__base_url
      value: "<KC_URL>/realms/<REALM_NAME>/"
    - name: CLEARML__services__login__sso__oauth_client__keycloak__authorize_url
      value: "<KC_URL>/realms/<REALM_NAME>/protocol/openid-connect/auth"
    - name: CLEARML__services__login__sso__oauth_client__keycloak__access_token_url
      value: "<KC_URL>/realms/<REALM_NAME>/protocol/openid-connect/token"
    - name: CLEARML__services__login__sso__oauth_client__keycloak__idp_logout
      value: "true"
```
|
||||
|
||||
|
||||
#### Installing the Chart
|
||||
|
||||
|
||||
```
|
||||
helm install -n clearml \
|
||||
clearml \
|
||||
clearml-enterprise/clearml-enterprise \
|
||||
--create-namespace \
|
||||
-f overrides.yaml
|
||||
```
|
||||
|
||||
|
||||
### Install ClearML Agent Chart
|
||||
|
||||
|
||||
#### Configuration
|
||||
|
||||
|
||||
To configure the agent you will need to choose a Redis password and use that when setting up Redis as well
|
||||
(see [Shared Redis installation](multi_tenant_k8s.md#shared-redis-installation)).
|
||||
|
||||
|
||||
The Helm Chart must be installed with `overrides.yaml`:
|
||||
|
||||
|
||||
```
imageCredentials:
  password: "<CLEARML_DOCKERHUB_TOKEN>"
clearml:
  agentk8sglueKey: "<ACCESS_KEY>"
  agentk8sglueSecret: "<SECRET_KEY>"
agentk8sglue:
  apiServerUrlReference: "https://api.<BASE_DOMAIN>"
  fileServerUrlReference: "https://files.<BASE_DOMAIN>"
  webServerUrlReference: "https://app.<BASE_DOMAIN>"
  defaultContainerImage: "python:3.9"
```
|
||||
|
||||
|
||||
#### Installing the Chart
|
||||
|
||||
|
||||
```
|
||||
helm install -n <WORKLOAD_NAMESPACE> \
|
||||
clearml-agent \
|
||||
clearml-enterprise/clearml-enterprise-agent \
|
||||
--create-namespace \
|
||||
-f overrides.yaml
|
||||
```
|
||||
|
||||
|
||||
To create a queue by API:
|
||||
|
||||
|
||||
```
|
||||
curl $APISERVER_URL/queues.create \
|
||||
-H "Content-Type: application/json" \
|
||||
-H "X-Clearml-Impersonate-As:<USER_ID>" \
|
||||
-u $APISERVER_KEY:$APISERVER_SECRET \
|
||||
-d '{"name":"default"}'
|
||||
```
|
||||
|
||||
|
||||
## ClearML AI Application Gateway Installation
|
||||
|
||||
|
||||
### Configuring Chart
|
||||
|
||||
|
||||
The Helm Chart must be installed with `overrides.yaml`:
|
||||
|
||||
|
||||
```
imageCredentials:
  password: "<DOCKERHUB_TOKEN>"
clearml:
  apiServerKey: ""
  apiServerSecret: ""
  apiServerUrlReference: "https://api."
  authCookieName: ""
ingress:
  enabled: true
  hostName: "task-router.dev"
tcpSession:
  routerAddress: "<NODE_IP OR EXTERNAL_NAME>"
  portRange:
    start: <START_PORT>
    end: <END_PORT>
```
|
||||
|
||||
|
||||
**Configuration options:**
|
||||
|
||||
|
||||
* **`clearml.apiServerUrlReference`:** URL usually starting with `https://api.`
|
||||
* **`clearml.apiServerKey`:** ClearML server API key
|
||||
* **`clearml.apiServerSecret`:** ClearML server secret key
|
||||
* **`ingress.hostName`:** URL of the router previously configured for the load balancer, starting with `https://`
|
||||
* **`clearml.sslVerify`:** Enable or disable SSL certificate validation for calls to the apiserver
|
||||
* **`clearml.authCookieName`:** Value from `value_prefix` key starting with `allegro_token` in `envoy.yaml` file in ClearML server installation.
|
||||
* **`tcpSession.routerAddress`**: External address of the router. Depending on the network configuration, this can be the IP address of the host machine or a load balancer hostname
|
||||
* **`tcpSession.portRange.start`**: Start port for the TCP Session feature
|
||||
* **`tcpSession.portRange.end`**: End port for the TCP Session feature
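The start and end ports can be sanity-checked before they are written into `overrides.yaml`. The sketch below assumes the range is inclusive and that the start port must not exceed the end port; the source does not spell out either rule. `valid_port_range` is a hypothetical helper.

```shell
# Hypothetical helper: succeed (exit status 0) only if both arguments are
# integers, lie within 1..65535, and the start does not exceed the end.
valid_port_range() (
  start="$1"; end="$2"
  # Reject anything containing a non-digit character.
  case "$start$end" in ''|*[!0-9]*) exit 1 ;; esac
  # Reject the case where one of the two values is empty.
  [ -n "$start" ] && [ -n "$end" ] || exit 1
  [ "$start" -ge 1 ] && [ "$end" -le 65535 ] && [ "$start" -le "$end" ]
)
```

A call such as `valid_port_range 10000 10100 && echo ok` prints `ok`, while a reversed or out-of-bounds range returns a non-zero status. The check says nothing about whether the ports are actually open on the host or load balancer.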
|
||||
|
||||
|
||||
### Installing the Chart
|
||||
|
||||
|
||||
```
|
||||
helm install -n <WORKLOAD_NAMESPACE> \
|
||||
clearml-ttr \
|
||||
clearml-enterprise/clearml-enterprise-task-traffic-router \
|
||||
--create-namespace \
|
||||
-f overrides.yaml
|
||||
```
|
||||
|
||||
|
||||
|
||||
|
||||
## Applications Installation
|
||||
|
||||
|
||||
To install the ClearML Applications on the newly installed ClearML Enterprise control-plane, download the applications
|
||||
package using the URL provided by the ClearML staff.
|
||||
|
||||
|
||||
|
||||
|
||||
### Download and Extract
|
||||
|
||||
|
||||
```
|
||||
wget -O apps.zip "<ClearML enterprise applications configuration download url>"
|
||||
unzip apps.zip
|
||||
```
|
||||
|
||||
|
||||
### Adjust Application Docker Images Location (Air-Gapped Systems)
|
||||
|
||||
|
||||
ClearML applications use pre-built docker images provided by ClearML on the ClearML DockerHub
|
||||
repository. If you are using an air-gapped system, these images must be available as part of your internal docker
|
||||
registry, and the correct docker images location must be specified before installing the applications.
|
||||
|
||||
|
||||
Use the following script to adjust the applications packages accordingly before installing the applications:
|
||||
|
||||
|
||||
```
|
||||
python convert_image_registry.py \
|
||||
--apps-dir /path/to/apps/ \
|
||||
--repo local_registry/clearml-apps
|
||||
```
|
||||
|
||||
|
||||
The script will change the application zip files to point to the new registry, and will output the list of containers
|
||||
that need to be copied to the local registry. For example:
|
||||
|
||||
|
||||
```
|
||||
make sure allegroai/clearml-apps:hpo-1.10.0-1062 was added to local_registry/clearml-apps
|
||||
```
|
||||
|
||||
|
||||
### Install Applications
|
||||
|
||||
|
||||
Use the `upload_apps.py` script to upload the application packages to the ClearML server:
|
||||
|
||||
|
||||
```
|
||||
python upload_apps.py \
|
||||
--host $APISERVER_ADDRESS \
|
||||
--user $APISERVER_USER --password $APISERVER_PASSWORD \
|
||||
--dir apps -ml
|
||||
```
|
||||
|
||||
|
||||
## Configuring Shared Memory for Large Model Deployment
|
||||
|
||||
|
||||
Deploying large models may fail due to shared memory size limitations. This issue commonly arises when the allocated
|
||||
`/dev/shm` space is insufficient, as in the following NCCL log:
|
||||
|
||||
|
||||
```
|
||||
> 3d3e22c3066f:168:168 [0] misc/shmutils.cc:72 NCCL WARN Error: failed to extend /dev/shm/nccl-UbzKZ9 to 9637892 bytes
|
||||
> 3d3e22c3066f:168:168 [0] misc/shmutils.cc:113 NCCL WARN Error while creating shared memory segment /dev/shm/nccl-UbzKZ9 (size 9637888)
|
||||
> 3d3e22c3066f:168:168 [0] NCCL INFO transport/shm.cc:114 -> 2
|
||||
> 3d3e22c3066f:168:168 [0] NCCL INFO transport.cc:33 -> 2
|
||||
> 3d3e22c3066f:168:168 [0] NCCL INFO transport.cc:113 -> 2
|
||||
> 3d3e22c3066f:168:168 [0] NCCL INFO init.cc:1263 -> 2
|
||||
> 3d3e22c3066f:168:168 [0] NCCL INFO init.cc:1548 -> 2
|
||||
> 3d3e22c3066f:168:168 [0] NCCL INFO init.cc:1799 -> 2
|
||||
```
|
||||
|
||||
|
||||
To configure a proper SHM size you can use the following configuration in the agent `overrides.yaml`.
|
||||
|
||||
|
||||
Replace `<SIZE>` with the desired memory allocation in GiB, based on your model requirements.
|
||||
|
||||
|
||||
This example configures a specific queue, but you can include this setting in the `basePodTemplate` if you need to
|
||||
apply it to all tasks.
|
||||
|
||||
|
||||
```
agentk8sglue:
  queues:
    GPUshm:
      templateOverrides:
        env:
          - name: VLLM_SKIP_P2P_CHECK
            value: "1"
        volumeMounts:
          - name: dshm
            mountPath: /dev/shm
        volumes:
          - name: dshm
            emptyDir:
              medium: Memory
              sizeLimit: <SIZE>Gi
```
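Once a task is running on the queue, the size of the mounted volume can be checked from a shell inside the pod. This is a minimal sketch using the standard `df` utility; `mount_size_gib` is a hypothetical helper name, and the value it reports is the total size of the filesystem backing the given path.

```shell
# Hypothetical helper: print the total size, in GiB with one decimal place,
# of the filesystem that holds the given path (default: /dev/shm).
mount_size_gib() (
  path="${1:-/dev/shm}"
  # df -Pk prints one header line, then one line per filesystem with the
  # size in 1024-byte blocks in the second column.
  df -Pk "$path" | awk 'NR==2 { printf "%.1f\n", $2 / 1048576 }'
)
```

Calling `mount_size_gib` with no argument inside the container should report a value close to the `<SIZE>` that was configured, if the override was applied to that pod.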
|
@ -337,7 +337,7 @@ must be substituted with valid domain names or values from responses.
|
||||
APISERVER_SECRET="<APISERVER_SECRET>"
|
||||
```
|
||||
|
||||
2. Create a *Tenant* (company):
|
||||
2. Create a **Tenant** (company):
|
||||
|
||||
```
|
||||
curl $APISERVER_URL/system.create_company \\
|
||||
@ -352,7 +352,7 @@ must be substituted with valid domain names or values from responses.
|
||||
curl -u $APISERVER_KEY:$APISERVER_SECRET $APISERVER_URL/system.get_companies
|
||||
```
|
||||
|
||||
3. Create an *Admin User*:
|
||||
3. Create an **Admin User**:
|
||||
|
||||
```
|
||||
curl $APISERVER_URL/auth.create_user \\
|
||||
@ -363,7 +363,7 @@ must be substituted with valid domain names or values from responses.
|
||||
|
||||
This returns the new User ID (`<USER_ID>`).
|
||||
|
||||
4. Generate *Credentials* for the new Admin User:
|
||||
4. Generate **Credentials** for the new Admin User:
|
||||
|
||||
```
|
||||
curl $APISERVER_URL/auth.create_credentials \\
|
||||
@ -374,7 +374,7 @@ must be substituted with valid domain names or values from responses.
|
||||
|
||||
This returns a set of key and secret credentials associated with the new Admin User.
|
||||
|
||||
5. Create an SSO Domain *Whitelist*. The `<USERS_EMAIL_DOMAIN>` is the email domain setup for users to access through SSO.
|
||||
5. Create an SSO Domain **Whitelist**. The `<USERS_EMAIL_DOMAIN>` is the email domain setup for users to access through SSO.
|
||||
|
||||
```
|
||||
curl $APISERVER_URL/login.set_domains \\
|
||||
@ -541,3 +541,270 @@ Install the App Gateway Router in your Kubernetes cluster, allowing it to manage
|
||||
-f overrides.yaml
|
||||
```
|
||||
|
||||
## Configuring Options per Tenant
|
||||
|
||||
### Override Options When Creating a New Tenant
|
||||
|
||||
When creating a new tenant company, you can specify several tenant options. These include:
|
||||
|
||||
* `features` - Add features to a company
|
||||
* `exclude_features` - Exclude features from a company.
|
||||
* `allowed_users` - Set the maximum number of users for a company.
|
||||
|
||||
#### Example: Create a New Tenant with a Specific Feature Set
|
||||
|
||||
```
|
||||
curl $APISERVER_URL/system.create_company \
|
||||
-H "Content-Type: application/json" \
|
||||
-u $APISERVER_KEY:$APISERVER_SECRET \
|
||||
-d '{"name":"<TENANT_NAME>", "defaults": { "allowed_users": "10", "features": ["experiments"], "exclude_features": ["app_management", "applications", "user_management"] }}'
|
||||
```
|
||||
|
||||
**Note**: make sure to replace the `<TENANT_NAME>` placeholder.
|
||||
|
||||
### Limit Features for all Users
|
||||
|
||||
This Helm Chart value in the `overrides.yaml` will have priority over all tenants, and will limit the features
|
||||
available to any user in the system. This means that even if the feature is enabled for the tenant, if it's not in this
|
||||
list, the user will not see it.
|
||||
|
||||
Example: all users will only have the `applications` feature enabled.
|
||||
|
||||
```
apiserver:
  extraEnvs:
    - name: CLEARML__services__auth__default_groups__users__features
      value: "[\"applications\"]"
```
|
||||
|
||||
**Available Features**:
|
||||
|
||||
* `applications` - Viewing and running applications
|
||||
* `data_management` - Working with hyper-datasets and dataviews
|
||||
* `experiments` - Viewing experiment table and launching experiments
|
||||
* `queues` - Viewing the queues screen
|
||||
* `queue_management` - Creating and deleting queues
|
||||
* `pipelines` - Viewing/managing pipelines in the system
|
||||
* `reports` - Viewing and managing reports in the system
|
||||
* `show_dashboard` - Show the dashboard screen
|
||||
* `show_projects` - Show the projects menu option
|
||||
* `resource_dashboard` - Display the resource dashboard in the orchestration page
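Feature lists are passed as JSON arrays of strings, which are easy to mistype by hand. The sketch below builds such an array from plain feature names. `features_json` is a hypothetical helper, and it assumes the names contain no characters that need JSON escaping, which holds for the feature names listed above.

```shell
# Hypothetical helper: print its arguments as a JSON array of strings.
# Assumes the arguments contain no quotes or backslashes.
features_json() (
  out=""
  for f in "$@"; do
    # Append a comma separator before every item except the first.
    out="${out:+$out, }\"$f\""
  done
  printf '[%s]\n' "$out"
)
```

The output is raw JSON. When it is placed in a double-quoted YAML value, as in the example above, the inner quotes still need to be escaped with `\"`.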
|
||||
|
||||
|
||||
## Configuring Groups
|
||||
|
||||
Groups in ClearML are used to manage user permissions and control access to specific features within the platform.
|
||||
The following section explains the different types of groups and how to configure them, with a focus on configuration-based,
|
||||
cross-tenant groups.
|
||||
|
||||
### Types of Groups
|
||||
|
||||
ClearML utilizes several types of groups:
|
||||
* **Built-in Groups** - These groups exist by default in every ClearML installation:
|
||||
* **`users`**: All registered users automatically belong to this group. It typically defines the baseline set of
|
||||
permissions and features available to everyone.
|
||||
* **`admins`**: Users in this group have administrative privileges.
|
||||
* **`queue_admins`**: Users in this group have specific permissions to manage execution queues.
|
||||
* **Tenant-Specific Groups (UI)** - Additional groups can be created specific to a tenant (organization workspace)
|
||||
directly through the ClearML Web UI (under **Settings > Users & Groups**). Users can be assigned to these groups via
|
||||
the UI. These groups are managed *within* a specific tenant. For more information, see [Users & Groups](../../webapp/settings/webapp_settings_users.md).
|
||||
* **Cross-Tenant Groups (Configuration)** - These groups are defined centrally in the ClearML configuration files
|
||||
(e.g., Helm chart values, docker-compose environment variables). They offer several advantages:
|
||||
* **Cross-Tenant Definition:** Defined once in the configuration, applicable across the deployment.
|
||||
* **Fine-Grained Feature Control:** Allows precise assignment of specific ClearML features to groups.
|
||||
* **Automation:** Suitable for infrastructure-as-code and automated deployment setups.
|
||||
|
||||
|
||||
|
||||
### Configuring Cross-Tenant Groups
|
||||
|
||||
To define a cross-tenant group, you need to set specific configuration variables. These are typically set as environment
|
||||
variables for the relevant ClearML services (like `apiserver`). The naming convention follows this
|
||||
pattern: `CLEARML__services__auth__default_groups__<GroupName>__<Property>`.
|
||||
|
||||
Replace `<GroupName>` with the desired name for your group (e.g., `my_group_name`, `Data_Scientists`, `MLOps_Engineers`).
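Because every property of a group uses the same prefix, the variable names can be generated rather than typed. This is a minimal sketch of the naming pattern only; `group_env_name` is a hypothetical helper and performs no validation of the group or property name.

```shell
# Hypothetical helper: compose the environment variable name for one property
# of a configuration-defined group, following the pattern described above.
group_env_name() (
  group="$1"; property="$2"
  printf 'CLEARML__services__auth__default_groups__%s__%s\n' "$group" "$property"
)
```

For instance, `group_env_name Data_Scientists features` prints `CLEARML__services__auth__default_groups__Data_Scientists__features`.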
|
||||
|
||||
#### Configuration Variables
|
||||
|
||||
For each group you define in the configuration, you need to specify the following properties:
|
||||
|
||||
* **`id`**: A unique identifier for the group. This **must** be a standard UUID (Universally Unique Identifier). You can
|
||||
generate one using various online tools or libraries.
|
||||
|
||||
* Variable Name: `CLEARML__services__auth__default_groups__<GroupName>__id`
|
||||
* Example Value: `"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"` (replace with a newly generated UUID)
|
||||
|
||||
* **`name`**: The display name of the group. This should match the `<GroupName>` used in the variable path.
|
||||
|
||||
* Variable Name: `CLEARML__services__auth__default_groups__<GroupName>__name`
|
||||
* Example Value: `"My Group Name"`
|
||||
|
||||
* **`features`**: A JSON-formatted list of strings, where each string is a feature name to be enabled for this group. See
|
||||
[Available Features](#available-features) for a list of valid feature names. Note that the features must be defined
|
||||
for the tenant or for the entire server in order to affect the group. By default, all the features of the tenant are
|
||||
available to all users.
|
||||
|
||||
* Variable Name: `CLEARML__services__auth__default_groups__<GroupName>__features`
|
||||
* Example Value: `'["applications", "experiments", "pipelines", "reports", "show_dashboard", "show_projects"]'` (Note
|
||||
the single quotes wrapping the JSON string if setting via YAML/environment variables).
|
||||
|
||||
* **`assignable`**: A boolean (`"true"` or `"false"`) indicating whether administrators can add users to this group via
|
||||
the ClearML Web UI. If `false`, group membership is managed externally or implicitly. Configuration-defined groups
|
||||
often have this set to `false`.
|
||||
|
||||
* Variable Name: `CLEARML__services__auth__default_groups__<GroupName>__assignable`
|
||||
* Example Value: `"false"`
|
||||
|
||||
* **`system`**: A boolean flag. This should **always be set to `"false"`** for custom-defined groups.
|
||||
|
||||
* Variable Name: `CLEARML__services__auth__default_groups__<GroupName>__system`
|
||||
* Example Value: `"false"`
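
Since the `id` property must be a standard UUID, one way to produce and sanity-check a value is with Python's standard `uuid` module. This is a minimal sketch; the helper name `is_standard_uuid` is hypothetical and is not part of ClearML.

```python
import uuid

# Generate a random (version 4) UUID to use as the group's `id` value.
group_id = str(uuid.uuid4())
print(group_id)


def is_standard_uuid(value: str) -> bool:
    """Return True if `value` is a canonical, hyphenated UUID string."""
    try:
        # uuid.UUID() raises ValueError for strings that are not 32 hex digits.
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False
```

Checking a value before it goes into the deployment configuration catches placeholder strings that were never replaced.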
|
||||
|
||||
#### Example Configuration
|
||||
|
||||
The following example demonstrates how you would define a group named `my_group_name` with a specific set of features
|
||||
that cannot be assigned via the UI:
|
||||
|
||||
```
|
||||
# Example configuration snippet (e.g., in Helm values.yaml or docker-compose.yml environment section)
|
||||
|
||||
# Unique group id for my_group_name
|
||||
- name: CLEARML__services__auth__default_groups__my_group_name__id
|
||||
value: "abcd-1234-abcd-1234" # Replace with a newly generated UUID
|
||||
|
||||
# Group name for my_group_name
|
||||
- name: CLEARML__services__auth__default_groups__my_group_name__name
|
||||
value: "My Group Name"
|
||||
|
||||
# List of features for my_group_name
|
||||
- name: CLEARML__services__auth__default_groups__my_group_name__features
|
||||
value: '["applications", "experiments", "queues", "pipelines", "reports", "show_dashboard","show_projects"]'
|
||||
|
||||
# Prevent assignment via UI for my_group_name
|
||||
- name: CLEARML__services__auth__default_groups__my_group_name__assignable
|
||||
value: "false"
|
||||
|
||||
# Always false for custom groups
|
||||
- name: CLEARML__services__auth__default_groups__my_group_name__system
|
||||
value: "false"
|
||||
```
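
For infrastructure-as-code setups, the five variables of a group can be generated rather than typed by hand. The sketch below is an illustration only: the function `group_env_vars` and its parameters are hypothetical helpers, and only the variable naming pattern is taken from this page.

```python
import json
import uuid

# Naming prefix taken from the pattern documented above.
PREFIX = "CLEARML__services__auth__default_groups"


def group_env_vars(group_name, display_name, features, assignable=False, group_id=None):
    """Build the environment variables that define one cross-tenant group."""
    base = f"{PREFIX}__{group_name}"
    return {
        f"{base}__id": group_id or str(uuid.uuid4()),
        f"{base}__name": display_name,
        # The features property is a JSON-formatted list of strings.
        f"{base}__features": json.dumps(features),
        f"{base}__assignable": "true" if assignable else "false",
        # Always "false" for custom-defined groups.
        f"{base}__system": "false",
    }


env = group_env_vars(
    "my_group_name", "My Group Name", ["applications", "pipelines", "reports"]
)
for name, value in env.items():
    print(f"- name: {name}")
    print(f"  value: {value!r}")
```

Serializing the feature list with `json.dumps` avoids hand-written quoting mistakes in the JSON string.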
|
||||
|
||||
### Available Features
|
||||
|
||||
The following features can be assigned to groups via the `features` configuration variable:
|
||||
|
||||
| Feature Name | Description | Notes |
|
||||
| :---- | :---- | :---- |
|
||||
| `user_management` | Allows viewing company users and groups, and editing group memberships. | Only effective if the group is `assignable`. |
|
||||
| `user_management_advanced` | Allows direct creation of users (bypassing invites) by admins and system users. | Often requires enabling at the organization level too. |
|
||||
| `permissions` | Enables editing of Role-Based Access Control (RBAC) rules. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `applications` | Allows users to work with ClearML Applications (viewing, running). | Excludes management operations (upload/delete). |
|
||||
| `app_management` | Allows application management operations: upload, delete, enable, disable. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `experiments` | Allows working with experiments. | *Deprecated/Not Used.* All users have access regardless of this flag. |
|
||||
| `queues` | Allows working with queues. | *Deprecated/Not Used.* All users have access regardless of this flag. |
|
||||
| `queue_management` | Allows create, update, and delete operations on queues. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `data_management` | Controls access to Hyper-Datasets. | Actual access might also depend on `apiserver.services.excluded`. |
|
||||
| `config_vault` | Enables the configuration vaults feature. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `pipelines` | Enables access to Pipelines (building and running). | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `reports` | Enables access to Reports. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `resource_dashboard` | Enables access to the compute resource dashboard feature. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `sso_management` | Enables the SSO (Single Sign-On) configuration wizard. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `service_users` | Enables support for creating and managing service users (API keys). | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `resource_policy` | Enables the resource policy feature. | May default to a trial feature if not explicitly enabled. |
|
||||
| `model_serving` | Enables access to the model serving endpoints feature. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `show_dashboard` | Makes the "Dashboard" menu item visible in the UI sidebar. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `show_model_view` | Makes the "Models" menu item visible in the UI sidebar. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `show_projects` | Makes the "Projects" menu item visible in the UI sidebar. | <img src="/docs/latest/icons/ico-optional-no.svg" alt="No" className="icon size-md center-md" /> |
|
||||
| `show_orchestration` | Makes the "Orchestration" menu item visible in the UI sidebar. | Available from apiserver version 3.25 |
|
||||
| `show_datasets` | Makes the "Datasets" menu item visible in the UI sidebar. | Available from apiserver version 3.25 |
|
||||
|
||||
### Feature Assignment Strategy
|
||||
|
||||
#### Combining Features
|
||||
|
||||
If a user belongs to multiple groups (e.g., the default `users` group and a custom `my_group_name` group), their
|
||||
effective feature set is the **union** (combination) of all features from all groups they belong to.
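
The union rule can be illustrated with a short sketch. The group names and feature assignments below are made-up sample data, and `effective_features` is a hypothetical helper, not a ClearML API.

```python
def effective_features(user_groups, group_features):
    """Return the union of the features of every group the user belongs to."""
    effective = set()
    for group in user_groups:
        # Groups with no configured features contribute nothing.
        effective |= set(group_features.get(group, []))
    return effective


# Sample data: `users` grants a small baseline, the custom group adds more.
group_features = {
    "users": ["show_projects"],
    "my_group_name": ["pipelines", "reports", "show_projects"],
}

features = effective_features(["users", "my_group_name"], group_features)
print(sorted(features))
```

Because the result is a union, a feature granted by any one group cannot be taken away by another group.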
|
||||
|
||||
#### Configuring the Default 'users' Group
|
||||
|
||||
Because all users belong to the `users` group, and features are combined, it's crucial to configure the `users` group
|
||||
appropriately. You generally have two options:
|
||||
|
||||
1. **Minimum Shared Features:** Assign only the absolute minimum set of features that *every single user* should have to
|
||||
the `users` group.
|
||||
2. **Empty Feature Set:** Assign an empty list (`[]`) to the `users` group's features. This means users only get features
|
||||
explicitly granted by other groups they are members of. This is often the cleanest approach when using multiple custom groups.
|
||||
|
||||
**Example: Disabling all features by default for the `users` group:**
|
||||
|
||||
```
|
||||
- name: CLEARML__services__auth__default_groups__users__features
|
||||
value: '[]'
|
||||
|
||||
```
|
||||
|
||||
:::note
|
||||
You typically don't need to define the `id`, `name`, `assignable`, or `system` properties for built-in groups like `users` unless
|
||||
you need to override default behavior, but you do configure their features.
|
||||
:::
|
||||
|
||||
|
||||
### Setting Server-Level or Tenant-Level Features
|
||||
|
||||
Features must be enabled for the entire server or for the tenant before they can be assigned to specific groups.
|
||||
Server-wide features are set using a different configuration pattern: `CLEARML__services__organization__features__<FeatureName>`.
|
||||
|
||||
Setting one of these variables to `"true"` enables the feature globally.
|
||||
|
||||
**Example: Enabling `user_management_advanced` for the entire organization:**
|
||||
|
||||
```
|
||||
- name: CLEARML__services__organization__features__user_management_advanced
|
||||
value: "true"
|
||||
```
|
||||
|
||||
To enable a feature for a specific tenant, use the following API call:
|
||||
|
||||
```
|
||||
curl $APISERVER_URL/system.update_company_settings \
|
||||
-H "Content-Type: application/json" \
|
||||
-u $APISERVER_KEY:$APISERVER_SECRET \
|
||||
-d '{
|
||||
"company": "<company_id>",
|
||||
"features": ["sso_management", "user_management_advanced", ...]
|
||||
}'
|
||||
```
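
The same call can be prepared from Python. The sketch below only builds the request with the standard library and does not send it. The endpoint name and payload fields come from the `curl` example, while the function name and its parameters are hypothetical, and mapping `-u` to an HTTP Basic `Authorization` header is an assumption about how the server authenticates.

```python
import base64
import json
import urllib.request


def build_update_company_settings_request(apiserver_url, key, secret, company_id, features):
    """Build (but do not send) the POST request that sets a tenant's features."""
    payload = json.dumps({"company": company_id, "features": list(features)})
    # Equivalent of curl's `-u key:secret` (HTTP Basic authentication).
    token = base64.b64encode(f"{key}:{secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{apiserver_url.rstrip('/')}/system.update_company_settings",
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_update_company_settings_request(
    "https://api.example.com", "my_key", "my_secret", "<company_id>",
    ["sso_management", "user_management_advanced"],
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request)` from a machine that can reach the API server.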
|
||||
|
||||
By default, all users have access to all features, but this can be changed by defining a specific feature set per group, as described above.
|
||||
|
||||
### Example: Defining Full Features for Admins
|
||||
|
||||
While the `admins` group has inherent administrative privileges, you might want to explicitly ensure they have access to
|
||||
*all* configurable features defined via the `features` list, especially if you've restricted the default `users` group
|
||||
significantly. You might also need to enable certain features organization-wide.
|
||||
|
||||
```
|
||||
# Enable advanced user management for the whole organization
|
||||
- name: CLEARML__services__organization__features__user_management_advanced
|
||||
value: "true"
|
||||
|
||||
# (Optional but good practice) Explicitly assign all features to the built-in admins group
|
||||
- name: CLEARML__services__auth__default_groups__admins__features
|
||||
value: '["user_management", "user_management_advanced", "permissions", "applications", "app_management", "queues", "queue_management", "data_management", "config_vault", "pipelines", "reports", "resource_dashboard", "sso_management", "service_users", "resource_policy", "model_serving", "show_dashboard", "show_model_view", "show_projects"]' # List all relevant features
|
||||
|
||||
# You might still want to define other custom groups with fewer features...
|
||||
# - name: CLEARML__services__auth__default_groups__my_group_name__id
|
||||
# value: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" # Replace with a newly generated UUID
|
||||
# - name: CLEARML__services__auth__default_groups__my_group_name__name
|
||||
# value: "my_group_name"
|
||||
# - name: CLEARML__services__auth__default_groups__my_group_name__features
|
||||
# value: '["some_feature", "another_feature"]'
|
||||
# - name: CLEARML__services__auth__default_groups__my_group_name__assignable
|
||||
# value: "false"
|
||||
# - name: CLEARML__services__auth__default_groups__my_group_name__system
|
||||
# value: "false"
|
||||
```
|
||||
|
||||
By combining configuration-defined groups, careful management of the default `users` group features, and organization-level
|
||||
settings, you can create a flexible and secure permission model tailored to your ClearML deployment. Remember to
|
||||
restart the relevant ClearML services after applying configuration changes.
|
||||
|
@ -30,12 +30,12 @@ To configure groups that should automatically become admins in ClearML set the f
|
||||
CLEARML__services__login__sso__saml_client__microsoft_ad__groups__admins=[<admin_group_name1>, <admin_group_name2>, ...]
|
||||
```
|
||||
|
||||
To change the the default Group Claim set the following environment variable:
|
||||
To change the default Group Claim, set the following environment variable:
|
||||
```
|
||||
CLEARML__services__login__sso__saml_client__microsoft_ad__groups__claim=...
|
||||
```
|
||||
|
||||
To make group matching case insensitive set the following environment variable:
|
||||
To make group matching case-insensitive, set the following environment variable:
|
||||
```
|
||||
CLEARML__services__login__sso__saml_client__microsoft_ad__groups__case_sensitive=false
|
||||
```
|
||||
|
@ -10,7 +10,7 @@ browser).
|
||||
|
||||
In the following sections, you will be instructed to set up different environment variables for the ClearML Server. If
|
||||
using a `docker-compose` deployment, these should be defined in your `docker-compose.override.yaml` file, under the
|
||||
`apiserver` service’ environment variables, as follows:
|
||||
`apiserver` service’s environment variables, as follows:
|
||||
|
||||
```
|
||||
services:
|
||||
|
@ -15,7 +15,7 @@ ClearML tenant can be associated with a particular external tenant
|
||||
<clearml_webapp_address>/login
|
||||
<clearml_webapp_address>/login/<external tenant ID>
|
||||
```
|
||||
3. Make sure the external tenant ID and groups are returned as claims for a each user
|
||||
3. Make sure the external tenant ID and groups are returned as claims for each user
|
||||
|
||||
## Configure ClearML to use Multi-Tenant Mode
|
||||
|
||||
|
@ -49,7 +49,7 @@ Your goal is to create an immutable copy of the data to be used by further steps
|
||||
The second step is to preprocess the data. First access the data, then modify it,
|
||||
and lastly create a new version of the data.
|
||||
|
||||
1. Create a task for you data preprocessing (not required):
|
||||
1. Create a task for your data preprocessing (not required):
|
||||
|
||||
```python
|
||||
from clearml import Task, Dataset
|
||||
|
@ -202,7 +202,7 @@ you'll get is the best performance here because our checks already run, so you s
|
||||
open the PR, so basically the dummy task here was found to be the best performance, and it has been tagged but that
|
||||
means that every single time I open a PR or I update a PR, it will search ClearML, and get this dummy task. It will get
|
||||
this one, and then we say if we find the best task, if not we'll just add the best performance anyway because you're the
|
||||
first task in the list, you'll always be getting best performance, but if you're not then we'll get the best latest
|
||||
first task in the list, you'll always be getting the best performance, but if you're not then we'll get the best latest
|
||||
metric. For example `get_reported_scalars().get('Performance Metric').get('Series 1').get('y')`, so the `y` value there
|
||||
so this could basically be the best or the highest mAP from a task or the highest F1 score from a task, or some
|
||||
such. Then you have the best metric. We do the same thing for the current task as well, and then it's fairly easy. We
|
||||
|
@ -28,7 +28,7 @@ moved to be executed by a stronger machine.
|
||||
|
||||
During the execution of the example script, the code does the following:
|
||||
* Uses ClearML's automatic and explicit logging.
|
||||
* Creates an task named `Remote_execution PyTorch MNIST train` in the `examples` project.
|
||||
* Creates a task named `Remote_execution PyTorch MNIST train` in the `examples` project.
|
||||
|
||||
|
||||
## Scalars
|
||||
|
@ -9,7 +9,7 @@ The example script does the following:
|
||||
* Trains a simple deep neural network on the PyTorch built-in [MNIST](https://pytorch.org/vision/stable/datasets.html#mnist)
|
||||
dataset
|
||||
* Creates a task named `pytorch mnist train with abseil` in the `examples` project
|
||||
* ClearML automatically logs the absl.flags, and the models (and their snapshots) created by PyTorch
|
||||
* ClearML automatically logs the `absl.flags`, and the models (and their snapshots) created by PyTorch
|
||||
* Additional metrics are logged by calling [`Logger.report_scalar()`](../../../references/sdk/logger.md#report_scalar)
|
||||
|
||||
## Scalars
|
||||
|
@ -4,7 +4,7 @@ title: TensorFlow MNIST
|
||||
|
||||
The [tensorflow_mnist.py](https://github.com/clearml/clearml/blob/master/examples/frameworks/tensorflow/tensorflow_mnist.py)
|
||||
example demonstrates the integration of ClearML into code that uses TensorFlow and Keras to train a neural network on
|
||||
the Keras built-in [MNIST](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/mnist) handwritten digits dataset.
|
||||
the Keras built-in [MNIST](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/mnist) handwritten digit dataset.
|
||||
|
||||
When the script runs, it creates a task named `Tensorflow v2 mnist with summaries` in the `examples` project.
|
||||
|
||||
|
@ -145,7 +145,7 @@ filters.
|
||||
* Source rule - Query frame source information. Enter a Lucene query of frame metadata fields in the format
|
||||
`sources.<key>:<value>` (can use AND, OR, and NOT operators).
|
||||
|
||||
A frame filter can contain a number of rules. For each frame filter, the rules are applied with a logical AND operator. For example, the dataset version in the image below has one filter. “Frame Filter 1” has two rules:
|
||||
A frame filter can contain a number of rules. For each frame filter, the rules are applied with a logical AND operator. For example, the dataset version in the image below has one filter. "Frame Filter 1" has two rules:
|
||||
1. ROI rule - the frame must include an ROI with the `cat` label
|
||||
2. Source rule - the frames must be 640 pixels wide.
|
||||
|
||||
|
@ -9,7 +9,7 @@ Dataviews are available under the ClearML Enterprise plan.
|
||||
While a task is running, and any time after it finishes, results are tracked and can be visualized in the ClearML
|
||||
Enterprise WebApp (UI).
|
||||
|
||||
In addition to all of ClearML's offerings, ClearML Enterprise keeps track of the Dataviews associated with an
|
||||
In addition to all of ClearML's offerings, ClearML Enterprise keeps track of the Dataviews associated with a
|
||||
task, which can be viewed and [modified](webapp_exp_modifying.md) in the WebApp.
|
||||
|
||||
## Viewing a Task's Dataviews
|
||||
|
@ -167,8 +167,8 @@ Additionally, you can enable automatic logging of a step's metrics / artifacts /
|
||||
following arguments:
|
||||
* `monitor_metrics` (optional) - Automatically log the step's reported metrics also on the pipeline Task. The expected
|
||||
format is one of the following:
|
||||
* List of pairs metric (title, series) to log: [(step_metric_title, step_metric_series), ]. Example: `[('test', 'accuracy'), ]`
|
||||
* List of tuple pairs, to specify a different target metric to use on the pipeline Task: [((step_metric_title, step_metric_series), (target_metric_title, target_metric_series)), ].
|
||||
* List of pairs metric (title, series) to log: `[(step_metric_title, step_metric_series), ]`. Example: `[('test', 'accuracy'), ]`
|
||||
* List of tuple pairs, to specify a different target metric to use on the pipeline Task: `[((step_metric_title, step_metric_series), (target_metric_title, target_metric_series)), ]`.
|
||||
Example: `[[('test', 'accuracy'), ('model', 'accuracy')], ]`
|
||||
* `monitor_artifacts` (optional) - Automatically log the step's artifacts on the pipeline Task.
|
||||
* Provided a list of
|
||||
|
@ -221,8 +221,8 @@ You can enable automatic logging of a step's metrics /artifacts / models to the
|
||||
|
||||
* `monitor_metrics` (optional) - Automatically log the step's reported metrics also on the pipeline Task. The expected
|
||||
format is one of the following:
|
||||
* List of pairs metric (title, series) to log: [(step_metric_title, step_metric_series), ]. Example: `[('test', 'accuracy'), ]`
|
||||
* List of tuple pairs, to specify a different target metric to use on the pipeline Task: [((step_metric_title, step_metric_series), (target_metric_title, target_metric_series)), ].
|
||||
* List of pairs metric (title, series) to log: `[(step_metric_title, step_metric_series), ]`. Example: `[('test', 'accuracy'), ]`
|
||||
* List of tuple pairs, to specify a different target metric to use on the pipeline Task: `[((step_metric_title, step_metric_series), (target_metric_title, target_metric_series)), ]`.
|
||||
Example: `[[('test', 'accuracy'), ('model', 'accuracy')], ]`
|
||||
* `monitor_artifacts` (optional) - Automatically log the step's artifacts on the pipeline Task.
|
||||
* Provided a list of artifact names created by the step function, these artifacts will be logged automatically also
|
||||
|
5
docs/references/api/pipelines.md
Normal file
@ -0,0 +1,5 @@
|
||||
---
|
||||
title: pipelines
|
||||
---
|
||||
|
||||
**AutoGenerated PlaceHolder**
|
5
docs/references/api/reports.md
Normal file
@ -0,0 +1,5 @@
|
||||
---
|
||||
title: reports
|
||||
---
|
||||
|
||||
**AutoGenerated PlaceHolder**
|
5
docs/references/api/serving.md
Normal file
@ -0,0 +1,5 @@
|
||||
---
|
||||
title: serving
|
||||
---
|
||||
|
||||
**AutoGenerated PlaceHolder**
|
@ -3,6 +3,12 @@ title: Version 3.24
|
||||
---
|
||||
|
||||
|
||||
### Enterprise Server 3.24.3
|
||||
|
||||
**New Features**
|
||||
* Add option to limit UI application instance endpoint access to the application instance creator only
|
||||
* Add custom user properties to multi-tenant usage reports
|
||||
|
||||
### Enterprise Server 3.24.2
|
||||
|
||||
**New Features**
|
||||
@ -24,7 +30,7 @@ title: Version 3.24
|
||||
* Add grouped same-event view to UI "Latest Task Events"
|
||||
|
||||
**Bug Fixes**
|
||||
* Fix downloaded CSV file of UI “Latest Task Events” missing some events
|
||||
* Fix downloaded CSV file of UI "Latest Task Events" missing some events
|
||||
* Fix access permissions to UI Reports
|
||||
* Fix configuration modal of UI application instance displays incorrect values
|
||||
* Fix UI Hyper-Dataset frame viewer navigation controls not displaying
|
||||
@ -54,7 +60,8 @@ title: Version 3.24
|
||||
|
||||
**Bug Fixes**
|
||||
* Fix ctrl-f does not open a search bar in UI editor modals ([ClearML Web GitHub issue #99](https://github.com/clearml/clearml-web/issues/99))
|
||||
* Fix UI Incorrect project statistics in project page
|
||||
* Fix webserver configuration environment variables don't load with single-quoted strings ([ClearML Server GitHub issue #271](https://github.com/clearml/clearml-server/issues/271))
|
||||
* Fix UI incorrect project statistics in project page
|
||||
* Fix UI Hyper-Dataset version's "Publish" function is sometimes unnecessarily disabled
|
||||
* Fix UI Task manual refresh function does not work in full screen mode
|
||||
* Fix links to tasks are broken in the Orchestration's Queues’ task lists
|
||||
@ -73,5 +80,5 @@ title: Version 3.24
|
||||
* Fix UI global search results display aborted tasks as completed
|
||||
* Fix UI breadcrumbs sometimes don't display project name of newly cloned task
|
||||
* Fix scroll sometimes doesn't work in UI global search results
|
||||
* Fix Hyper-Dataset FrameGroup Details and FrameGroup Metadata sections are not expanding
|
||||
* Fix Hyper-Dataset FrameGroup Details and Metadata sections are not expanding
|
||||
* Fix unsaved content is not discarded in UI Hyper-Dataset frame viewer when moving to another frame source
|
||||
|
@ -2,6 +2,26 @@
|
||||
title: Version 2.0
|
||||
---
|
||||
|
||||
### ClearML Server 2.0.1
|
||||
|
||||
**New Features**
|
||||
* New UI task creation options
|
||||
* Support bash as well as python scripts
|
||||
* Support file upload
|
||||
|
||||
**Bug Fixes**
|
||||
* Fix ctrl-f does not open a search bar in UI editor modals ([ClearML Web GitHub issue #99](https://github.com/clearml/clearml-web/issues/99))
|
||||
* Fix UI smoothed plots are dimmer than original plots in dark mode ([ClearML Server GitHub issue #270](https://github.com/clearml/clearml-server/issues/270))
|
||||
* Fix webserver configuration environment variables don't load with single-quoted strings ([ClearML Server GitHub issue #271](https://github.com/clearml/clearml-server/issues/271))
|
||||
* Fix image plots sometimes not rendered in UI
|
||||
* Fix "All" tag filter not working in UI model selection modal in comparison pages
|
||||
* Fix manual refresh function sometimes does not work in UI task
|
||||
* Fix UI embedded plot colors do not change upon UI theme change
|
||||
* Fix deleting a parameter in the UI task creation modal incorrectly removes another parameter
|
||||
* Fix UI global search displays aborted tasks as completed
|
||||
* Fix can't show/hide specific UI plot variants
|
||||
* Fix UI breadcrumbs sometimes do not display project name
|
||||
|
||||
### ClearML Server 2.0.0
|
||||
|
||||
**Breaking Changes**
|
||||
@ -18,7 +38,7 @@ Upgrading to ClearML Server v1.17 from a previous version:
|
||||
* New UI task creation options
|
||||
* Support bash as well as Python scripts
|
||||
* Support file upload
|
||||
* New UI setting for configuring cloud storage credentials with which ClearML can clean up cloud storage artifacts on task deletion.
|
||||
* New UI setting for configuring cloud storage credentials with which ClearML can clean up cloud storage artifacts on task deletion ([ClearML Server GitHub issue #144](https://github.com/clearml/clearml-server/issues/144)).
|
||||
* Add UI scalar plots presentation of plots in sections grouped by metrics.
|
||||
* Add UI batch export plot embed codes for all metric plots in a single click.
|
||||
* Add UI pipeline presentation of steps grouped into stages
|
||||
|
10
docs/release_notes/sdk/enterprise/ver_3_13.md
Normal file
@ -0,0 +1,10 @@
|
||||
---
|
||||
title: Version 3.13
|
||||
---
|
||||
|
||||
### AllegroAI 3.13.0
|
||||
|
||||
**New Features**
|
||||
* Add support for changing Hyper-Dataset version name using `DatasetVersion.set_version_name()`
|
||||
* Bump `clearml` dependency version (support `v<1.19`)
|
||||
* Update docstrings
|
@ -11,4 +11,6 @@ configuration and access control administration by allowing administrators to as
|
||||
rules at the group level rather than for each user and/or [service account](../webapp/settings/webapp_settings_users.md#service-accounts)
|
||||
individually. Administrators have the flexibility to create user groups, and add or remove members as needed.
|
||||
|
||||
For more information see [User Groups](../webapp/settings/webapp_settings_users.md#user-groups).
|
||||
For more information about defining groups via the UI, see [User Groups](../webapp/settings/webapp_settings_users.md#user-groups).
|
||||
|
||||
For more information about defining cross-tenant groups on a multi-tenant server deployment, see [Configuring Cross-Tenant Groups](../deploying_clearml/enterprise_deploy/multi_tenant_k8s.md#configuring-groups).
|
@ -18,16 +18,22 @@ The **ClearML Web UI** is the graphical user interface for the ClearML platform,
|
||||
The WebApp's sidebar provides access to the following modules:
|
||||
|
||||
* <img src="/docs/latest/icons/ico-applications.svg" alt="ClearML Apps" className="icon size-md space-md" />[Applications](applications/apps_overview.md) - ClearML's GUI applications for no-code workflow execution (available in the ClearML Pro and Enterprise plans).
|
||||
|
||||
* <img src="/docs/latest/icons/ico-workers.svg" alt="Workers and Queues" className="icon size-md space-md" />[Orchestration](webapp_workers_queues.md) - Autoscaling, resource usage monitoring and allocation management.
|
||||
|
||||
* <img src="/docs/latest/icons/ico-model-endpoints.svg" alt="Model endpoints" className="icon size-md space-md" />[Model Endpoints](webapp_model_endpoints.md) - Monitor your live model endpoints.
|
||||
|
||||
* <img src="/docs/latest/icons/ico-side-bar-datasets.svg" alt="Datasets" className="icon size-md space-md" />[Datasets](datasets/webapp_dataset_page.md) - View and manage your datasets.
|
||||
|
||||
* <img src="/docs/latest/icons/ico-projects.svg" alt="Projects" className="icon size-md space-md" />[Projects](webapp_projects_page.md) - The main experimentation page. Access your tasks and models as they are organized into projects. The tasks and models are displayed in tables which let you:
|
||||
|
||||
* Track ongoing tasks and visualize their results
|
||||
* Reproduce previous task runs
|
||||
* Tune task parameter values with no code change
|
||||
* Compare tasks and models
|
||||
* Share tasks and models with other ClearML hosted service users
|
||||
* Create and share rich content [Reports](webapp_reports.md)
|
||||
|
||||
* <img src="/docs/latest/icons/ico-pipelines.svg" alt="Pipelines" className="icon size-md space-md" />[Pipelines](pipelines/webapp_pipeline_page.md) - View and manage your pipelines.
|
||||
|
||||
## UI Top Bar
|
||||
|
4
package-lock.json
generated
@ -10203,7 +10203,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/image-size": {
|
||||
"version": "1.2.0",
|
||||
"version": "1.2.1",
|
||||
"resolved": "https://registry.npmjs.org/image-size/-/image-size-1.2.1.tgz",
|
||||
"integrity": "sha512-rH+46sQJ2dlwfjfhCyNx5thzrv+dtmBIhPHk0zgRUukHzZ/kRueTJXoYYsclBaKcSMBWuGbOFXtioLpzTb5euw==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"queue": "6.0.2"
|
||||
|
16
sidebars.js
@ -351,9 +351,10 @@ module.exports = {
|
||||
{
|
||||
'Enterprise':
|
||||
[
|
||||
'release_notes/sdk/enterprise/ver_3_12',
|
||||
'release_notes/sdk/enterprise/ver_3_13',
|
||||
{
|
||||
'Older Versions': [
|
||||
'release_notes/sdk/enterprise/ver_3_12',
|
||||
'release_notes/sdk/enterprise/ver_3_11',
|
||||
'release_notes/sdk/enterprise/ver_3_10',
|
||||
]
|
||||
@ -454,14 +455,17 @@ module.exports = {
|
||||
{'Server API': [
|
||||
'references/api/index',
|
||||
'references/api/definitions',
|
||||
'references/api/login',
|
||||
'references/api/debug',
|
||||
'references/api/events',
|
||||
'references/api/login',
|
||||
'references/api/models',
|
||||
'references/api/pipelines',
|
||||
'references/api/projects',
|
||||
'references/api/queues',
|
||||
'references/api/workers',
|
||||
'references/api/events',
|
||||
'references/api/models',
|
||||
'references/api/reports',
|
||||
'references/api/serving',
|
||||
'references/api/tasks',
|
||||
'references/api/workers',
|
||||
]},
|
||||
{
|
||||
type: 'category',
|
||||
@ -643,6 +647,7 @@ module.exports = {
|
||||
label: 'Enterprise Server',
|
||||
items: [
|
||||
{'Deployment Options': [
|
||||
'deploying_clearml/enterprise_deploy/k8s',
|
||||
'deploying_clearml/enterprise_deploy/multi_tenant_k8s',
|
||||
'deploying_clearml/enterprise_deploy/vpc_aws',
|
||||
'deploying_clearml/enterprise_deploy/on_prem_ubuntu',
|
||||
@ -656,6 +661,7 @@ module.exports = {
|
||||
},
|
||||
{'ClearML Application Gateway': [
|
||||
'deploying_clearml/enterprise_deploy/appgw_install_compose',
|
||||
'deploying_clearml/enterprise_deploy/appgw_install_compose_hosted',
|
||||
'deploying_clearml/enterprise_deploy/appgw_install_k8s',
|
||||
]
|
||||
},
|
||||
|