This commit is contained in:
revital 2025-02-26 13:49:29 +02:00
commit 1a008c1cb8
28 changed files with 453 additions and 18 deletions


@ -9,7 +9,8 @@ See [Hyper-Datasets](../hyperdatasets/overview.md) for ClearML's advanced querya
The following are some recommendations for using ClearML Data.
![Dataset UI gif](../img/gif/dataset.gif)
![Dataset UI gif](../img/gif/dataset.gif#light-mode-only)
![Dataset UI gif](../img/gif/dataset_dark.gif#dark-mode-only)
## Versioning Datasets


@ -0,0 +1,78 @@
---
title: Changing ClearML Artifacts Links
---
This guide describes how to update artifact references in the ClearML Enterprise server.
By default, artifacts are stored on the file server; however, external storage such as AWS S3, MinIO, or Google Cloud
Storage may be used to store artifacts. References to these artifacts may exist in the ClearML databases: MongoDB and ElasticSearch.
This procedure should be used if external storage is being migrated to a different location or URL.
:::important
This procedure does not deal with the actual migration of the data--only with changing the references in ClearML that
point to the data.
:::
## Preparation
### Version Confirmation
To change the links, use the `fix_fileserver_urls.py` script, located inside the `allegro-apiserver`
Docker container. This script will be executed from within the `apiserver` container. Make sure the `apiserver` version
is 3.20 or higher.
### Backup
It is highly recommended to back up the ClearML MongoDB and ElasticSearch databases before running the script, because the
script modifies values in the databases and its changes cannot be undone.
## Fixing MongoDB Links
1. Access the `apiserver` Docker container:
* In `docker-compose`:
```commandline
sudo docker exec -it allegro-apiserver /bin/bash
```
* In Kubernetes:
```commandline
kubectl exec -it -n clearml <clearml-apiserver-pod-name> -- bash
```
1. Navigate to the script location in the `upgrade` folder:
```commandline
cd /opt/seematics/apiserver/server/upgrade
```
1. Run the following command:
:::important
Before running the script, verify that you are using the correct version: either the `apiserver` is v3.20 or higher,
or the script provided by ClearML was copied into the container.
:::
```commandline
python3 fix_fileserver_urls.py \
--mongo-host mongodb://mongo:27017 \
--elastic-host elasticsearch:9200 \
--host-source "<old fileserver host and/or port, as in artifact links>" \
--host-target "<new fileserver host and/or port>" --datasets
```
:::note Notes
* If the MongoDB or ElasticSearch services are accessed from the `apiserver` container using custom addresses, update the
`--mongo-host` and `--elastic-host` arguments accordingly.
* If ElasticSearch is set up to require authentication, use the following arguments to pass the user
and password: `--elastic-user <es_user> --elastic-password <es_pass>`.
:::
The script fixes the links in MongoDB, and outputs `cURL` commands for updating the links in ElasticSearch.
## Fixing the ElasticSearch Links
Copy the `cURL` commands printed by the script in the previous stage, and run them one after the other. Verify that
each command returns a "success" result. Depending on the amount of data in ElasticSearch,
running these commands may take some time.
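
As an illustration of what the `--host-source` and `--host-target` arguments express, the following Python sketch rewrites
the host portion of a single artifact URL. It is not the logic of `fix_fileserver_urls.py`: the `rewrite_host` helper and
the example host names are hypothetical, and the exact matching rules used by the script may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_host(url: str, host_source: str, host_target: str) -> str:
    """Return url with its host[:port] replaced when it equals host_source."""
    parts = urlsplit(url)
    if parts.netloc != host_source:
        # Links that point elsewhere (for example, to S3) are left untouched.
        return url
    return urlunsplit(parts._replace(netloc=host_target))


# Hypothetical addresses, used only to show the before/after shape of a link.
old_url = "http://old-files.example.com:8081/my_project/artifacts/model.pkl"
new_url = rewrite_host(old_url, "old-files.example.com:8081", "new-files.example.com:8081")
print(new_url)
```

Checking a few known links this way before running the script can help confirm that the source value matches the host
exactly as it appears in the stored links.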


@ -0,0 +1,240 @@
---
title: Exporting and Importing ClearML Projects
---
When migrating from a ClearML Open Server to a ClearML Enterprise Server, you may need to transfer projects. This is done
using the `data_tool.py` script. This utility is available in the `apiserver` Docker image, and can be used for
exporting and importing ClearML project data for both open source and Enterprise versions.
This guide covers the following:
* Exporting data from Open Source and Enterprise servers
* Importing data into an Enterprise server
* Handling the artifacts stored on the file server
:::note
Export instructions differ for ClearML open and Enterprise servers. Make sure you follow the guidelines that match your
server type.
:::
## Exporting Data
To export data, run the ***data_tool*** script, which generates a zip file containing project and task
data. Then copy this file to the server on which the import will run.
Note that artifacts stored in the ClearML ***file server*** should be copied manually if required (see [Handling Artifacts](#handling-artifacts)).
### Exporting Data from ClearML Open Servers
#### Preparation
* Make sure the `apiserver` is at least Open Source server version 1.12.0.
* Note that any `pending` or `running` tasks will not be exported. If you wish to export them, make sure to stop/dequeue
them before exporting.
#### Running the Data Tool
Execute the data tool within the `apiserver` container.
Open a bash session inside the `apiserver` container of the server:
* In `docker-compose`:
```commandline
sudo docker exec -it clearml-apiserver /bin/bash
```
* In Kubernetes:
```commandline
kubectl exec -it -n <clearml-namespace> <clearml-apiserver-pod-name> -- bash
```
#### Export Commands
**To export specific projects:**
```commandline
python3 -m apiserver.data_tool export \
--projects <project_id1> <project_id2> \
--statuses created stopped published failed completed \
--output <output-file-name>.zip
```
As a result, you should get a `<output-file-name>.zip` file that contains all the data from the specified projects and
their children.
**To export all the projects:**
```commandline
python3 -m apiserver.data_tool export \
--all \
--statuses created stopped published failed completed \
--output <output-file-name>.zip
```
#### Optional Parameters
* `--experiments <list of experiment IDs>` - Export only the listed experiments. If not specified, all experiments from the specified projects are exported.
* `--statuses <list of task statuses>` - Export tasks of specific statuses. If the parameter
is omitted, only `published` tasks are exported.
* `--no-events` - Do not export task events, i.e. logs and metrics (scalars, plots, debug samples).
Make sure to copy the generated zip file containing the exported data.
### Exporting Data from ClearML Enterprise Servers
#### Preparation
* Make sure the `apiserver` is at least Enterprise Server version 3.18.0.
* Note that any `pending` or `running` tasks will not be exported. If you wish to export them, make sure to stop/dequeue
them before exporting.
#### Running the Data Tool
Execute the data tool from within the `apiserver` Docker container.
Open a bash session inside the `apiserver` container of the server:
* In `docker-compose`:
```commandline
sudo docker exec -it allegro-apiserver /bin/bash
```
* In Kubernetes:
```commandline
kubectl exec -it -n <clearml-namespace> <clearml-apiserver-pod-name> -- bash
```
#### Export Commands
**To export specific projects:**
```commandline
PYTHONPATH=/opt/seematics/apiserver/trains-server-repo python3 data_tool.py \
export \
--projects <project_id1> <project_id2> \
--statuses created stopped published failed completed \
--output <output-file-name>.zip
```
As a result, you should get a `<output-file-name>.zip` file that contains all the data from the specified projects and
their children.
**To export all the projects:**
```commandline
PYTHONPATH=/opt/seematics/apiserver/trains-server-repo python3 data_tool.py \
export \
--all \
--statuses created stopped published failed completed \
--output <output-file-name>.zip
```
#### Optional Parameters
* `--experiments <list of experiment IDs>` - Export only the listed experiments. If not specified, all experiments from the specified projects are exported.
* `--statuses <list of task statuses>` - Export tasks of specific statuses. If the parameter is
omitted, only `published` tasks are exported.
* `--no-events` - Do not export task events, i.e. logs and metrics (scalars, plots, debug samples).
Make sure to copy the generated zip file containing the exported data.
## Importing Data
This section explains how to import the exported data into a ClearML Enterprise server.
### Preparation
* It is highly recommended to back up the ClearML databases before importing data, because the import writes data into the
databases and cannot be undone.
* Make sure you are working with `apiserver` version 3.22.3 or higher.
* Make the zip file accessible from within the `apiserver` container by copying the exported data into the
`apiserver` container or to a host folder that is mounted into the container.
### Usage
The data tool should be executed from within the `apiserver` Docker container.
1. Open a bash session inside the `apiserver` container of the server:
* In `docker-compose`:
```commandline
sudo docker exec -it allegro-apiserver /bin/bash
```
* In Kubernetes:
```commandline
kubectl exec -it -n <clearml-namespace> <clearml-apiserver-pod-name> -- bash
```
1. Run the data tool script in *import* mode:
```commandline
PYTHONPATH=/opt/seematics/apiserver/trains-server-repo python3 data_tool.py \
import \
<path to zip file> \
--company <company_id> \
--user <user_id>
```
* `company_id` - The default company ID used in the target deployment. Inside the `apiserver` container, you can
usually get it from the environment variable `CLEARML__APISERVER__DEFAULT_COMPANY`.
If you do not specify the `--company` parameter, all the data is imported as `Examples` (read-only).
* `user_id` - The ID of the user in the target deployment who will become the owner of the imported data.
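
The import arguments can also be assembled programmatically. The following is a minimal Python sketch, assuming it runs
inside the `apiserver` container: it reads the company ID from `CLEARML__APISERVER__DEFAULT_COMPANY` and builds the
argument list of the import command shown above. The `build_import_command` helper, the zip path, and the IDs are
hypothetical, and the `PYTHONPATH` prefix from the command above still has to be set when the command is actually run.

```python
import os
import shlex

COMPANY_ENV_VAR = "CLEARML__APISERVER__DEFAULT_COMPANY"


def build_import_command(zip_path, user_id, env=None):
    """Build the data_tool import argument list, failing early if no company ID is set."""
    env = os.environ if env is None else env
    company_id = env.get(COMPANY_ENV_VAR)
    if not company_id:
        # Without --company the data would be imported as read-only Examples.
        raise RuntimeError(f"{COMPANY_ENV_VAR} is not set; pass the company ID explicitly")
    return [
        "python3", "data_tool.py", "import", zip_path,
        "--company", company_id,
        "--user", user_id,
    ]


# Placeholder values for illustration; nothing is executed here.
command = build_import_command(
    "/tmp/export.zip", "example_user_id", env={COMPANY_ENV_VAR: "example_company_id"}
)
print(shlex.join(command))
```

Failing when the variable is missing is a deliberate choice in this sketch, so that data is not silently imported as
`Examples`.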
## Handling Artifacts
***Artifacts*** refers to any content which the ClearML server holds references to. This can include:
* Dataset or Hyper-Dataset frame URLs
* ClearML artifact URLs
* Model snapshots
* Debug samples
Artifacts may be stored in any external storage (e.g., AWS S3, MinIO, Google Cloud Storage) or in the ClearML file server.
* If the artifacts are **not** stored in the ClearML file server, they do not need to be moved during the export/import process,
as the URLs registered in ClearML entities pointing to these artifacts will not change.
* If the artifacts are stored in the ClearML file server, then the file server content must also be moved, and the URLs
in the ClearML databases must point to the new location. See instructions [below](#exporting-file-server-data-for-clearml-open-server).
### Exporting File Server Data for ClearML Open Server
Data in the file server is organized by project. For each project, all data referenced by entities in that project is
stored in a folder bearing the name of the project. This folder is located at:
```
/opt/clearml/data/fileserver/<project name>
```
The entire content of the projects' folders should be copied to the target server (see [Importing File Server Data](#importing-file-server-data)).
### Exporting File Server Data for ClearML Enterprise Server
Data in the file server is organized by tenant and project. For each project, all data referenced by entities in that
project is stored in a folder bearing the name of the project. This folder is located at:
```
/opt/allegro/data/fileserver/<company_id>/<project name>
```
The entire content of the projects' folders should be copied to the target server (see [Importing File Server Data](#importing-file-server-data)).
## Importing File Server Data
### Copying the Data
Place the content of the exported project folder(s) in the target file server's storage, under the following folder:
```
/opt/allegro/data/fileserver/<company_id>/<project name>
```
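
The copy step can be scripted. The following is a minimal Python sketch, assuming the exported project folders were
gathered in one local directory; the `copy_project_folders` helper and the demo directory names are hypothetical, and the
demo runs on temporary directories rather than on a real file server.

```python
import shutil
import tempfile
from pathlib import Path


def copy_project_folders(export_dir: Path, fileserver_root: Path, company_id: str) -> list:
    """Copy each exported project folder under <fileserver_root>/<company_id>/."""
    target_root = fileserver_root / company_id
    target_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for project_dir in sorted(p for p in export_dir.iterdir() if p.is_dir()):
        destination = target_root / project_dir.name
        # dirs_exist_ok (Python 3.8+) lets the copy merge into an existing folder.
        shutil.copytree(project_dir, destination, dirs_exist_ok=True)
        copied.append(destination)
    return copied


# Demo on throwaway directories. On a real Enterprise server, fileserver_root
# would be /opt/allegro/data/fileserver and company_id the target company ID.
with tempfile.TemporaryDirectory() as tmp:
    export_dir = Path(tmp) / "exported"
    (export_dir / "my_project").mkdir(parents=True)
    (export_dir / "my_project" / "model.pkl").write_bytes(b"demo")
    fileserver_root = Path(tmp) / "fileserver"
    for destination in copy_project_folders(export_dir, fileserver_root, "demo_company"):
        print(destination.relative_to(fileserver_root))
```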
### Fixing Registered URLs
Since URLs pointing to the file server contain the file server's address, these need to be changed to the address of the
new file server.
Note that this is not required if the new file server is replacing the old file server and can be accessed using the same
exact address.
Once the projects' data has been copied to the target server and the projects themselves have been imported, see
[Changing ClearML Artifacts Links](change_artifact_links.md) for information on how to fix the URLs.


@ -0,0 +1,98 @@
---
title: Multi-Tenant Login Mode
---
In a multi-tenant setup, each external tenant can be represented by an SSO client defined in the customer's identity provider
(Keycloak). Each ClearML tenant can be associated with a particular external tenant. Currently, only one
ClearML tenant can be associated with a particular external tenant.
## Set Up IdP/SSO Client in Identity Provider
1. Add the following URL to "Valid redirect URIs": `<clearml_webapp_address>/callback_<client_id>`
2. Add the following URLs to "Valid post logout redirect URIs":
```
<clearml_webapp_address>/login
<clearml_webapp_address>/login/<external tenant ID>
```
3. Make sure the external tenant ID and groups are returned as claims for each user.
## Configure ClearML to use Multi-Tenant Mode
Set the following environment variables in the ClearML Enterprise Helm chart under the `apiserver` section:
* To turn on the multi-tenant login mode:
```
- name: CLEARML__services__login__sso__tenant_login
  value: "true"
```
* To hide any global IdP/SSO configuration that's not associated with a specific ClearML tenant:
```
- name: CLEARML__services__login__sso__allow_settings_providers
  value: "false"
```
Enable `onlyPasswordLogin` by setting the following environment variable in the Helm chart under the `webserver` section:
```
- name: WEBSERVER__onlyPasswordLogin
  value: "true"
```
## Set Up IdP for a ClearML Tenant
To set an IdP client for a ClearML tenant, you'll need to set the ClearML tenant settings and define an identity provider:
1. Call the following API to set the ClearML tenant settings:
```
curl $APISERVER_URL/system.update_company_sso_config -H "Content-Type: application/json" -u $APISERVER_KEY:$APISERVER_SECRET -d'{
"company": "<company_id>",
"sso": {
"tenant": "<external tenant ID>",
"group_mapping": {
"IDP group name1": "Clearml group name1",
"IDP group name2": "Clearml group name2"
},
"admin_groups": ["IDP admin group name1", "IDP admin group name2"]
}}'
```
2. Call the following API to define the ClearML tenant identity provider:
```
curl $APISERVER_URL/sso.save_provider_configuration -H "Content-Type: application/json" -u $APISERVER_KEY:$APISERVER_SECRET -d'{
"provider": "keycloak",
"company": "<company_id>",
"configuration": {
"id": "<some unique id here, you can use company_id>",
"display_name": "<The text that you want to see on the login button>",
"client_id": "<client_id from IDP>",
"client_secret": "<client secret from IDP>",
"authorization_endpoint": "<authorization_endpoint from IDP OpenID configuration>",
"token_endpoint": "<token_endpoint from IDP OpenID configuration>",
"revocation_endpoint": "<revocation_endpoint from IDP OpenID configuration>",
"end_session_endpoint": "<end_session_endpoint from IDP OpenID configuration>",
"logout_from_provider": true,
"claim_tenant": "tenant_key",
"claim_name": "name",
"group_enabled": true,
"claim_groups": "ad_groups_trusted",
"group_prohibit_user_login_if_not_in_group": true
}}'
```
The above configuration assumes the following:
* On logout from ClearML, the user is also logged out from the Identity Provider
* External tenant ID for the user is returned under the `tenant_key` claim
* User display name is returned under the `name` claim
* User groups list is returned under the `ad_groups_trusted` claim
* Group integration is turned on, and a user is allowed to log in if any of the groups they belong to in the
IdP exists under the corresponding ClearML tenant (after group names are translated according to the ClearML tenant settings)
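
The group rule in the last assumption can be sketched in Python as follows. This only illustrates the described
behavior and is not ClearML's implementation: the `translate_groups` and `is_login_allowed` helpers are hypothetical, and
the sketch assumes that IdP group names without a `group_mapping` entry are kept unchanged.

```python
def translate_groups(idp_groups, group_mapping):
    """Translate IdP group names to ClearML group names using group_mapping."""
    # Assumption: names without a mapping entry pass through unchanged.
    return {group_mapping.get(name, name) for name in idp_groups}


def is_login_allowed(idp_groups, group_mapping, tenant_groups):
    """Allow login if any translated group exists under the ClearML tenant."""
    return bool(translate_groups(idp_groups, group_mapping) & set(tenant_groups))


group_mapping = {
    "IDP group name1": "Clearml group name1",
    "IDP group name2": "Clearml group name2",
}
tenant_groups = {"Clearml group name1"}

print(is_login_allowed(["IDP group name1"], group_mapping, tenant_groups))  # True
print(is_login_allowed(["unrelated group"], group_mapping, tenant_groups))  # False
```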
## Webapp Login
When running in multi-tenant login mode, a user belonging to some external tenant should use the following link to log in:
```
<clearml_webapp_address>/login/<external tenant ID>
```



@ -95,7 +95,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -93,7 +93,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -92,7 +92,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -105,7 +105,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -94,7 +94,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -90,7 +90,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -114,7 +114,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -120,7 +120,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -96,7 +96,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -113,7 +113,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -107,7 +107,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -78,7 +78,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -120,7 +120,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -169,7 +169,8 @@ and shuts down instances as needed, according to a resource budget that you set.
### Cloning, Editing, and Enqueuing
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)
Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:


@ -166,4 +166,5 @@ with the new configuration on a remote machine:
The ClearML Agent executing the task will use the new values to [override any hard coded values](../clearml_agent.md).
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif#light-mode-only)
![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5_dark.gif#dark-mode-only)

2
package-lock.json generated

@ -15,7 +15,7 @@
"@docusaurus/plugin-google-analytics": "^3.6.1",
"@docusaurus/plugin-google-gtag": "^3.6.1",
"@docusaurus/preset-classic": "^3.6.1",
"@easyops-cn/docusaurus-search-local": "^0.48.0",
"@easyops-cn/docusaurus-search-local": "^0.48.5",
"@mdx-js/react": "^3.0.0",
"clsx": "^1.1.1",
"joi": "^17.4.0",


@ -23,7 +23,7 @@
"@docusaurus/plugin-google-analytics": "^3.6.1",
"@docusaurus/plugin-google-gtag": "^3.6.1",
"@docusaurus/preset-classic": "^3.6.1",
"@easyops-cn/docusaurus-search-local": "^0.48.0",
"@easyops-cn/docusaurus-search-local": "^0.48.5",
"@mdx-js/react": "^3.0.0",
"clsx": "^1.1.1",
"medium-zoom": "^1.0.6",


@ -652,6 +652,8 @@ module.exports = {
]
},
'deploying_clearml/enterprise_deploy/delete_tenant',
'deploying_clearml/enterprise_deploy/import_projects',
'deploying_clearml/enterprise_deploy/change_artifact_links',
{
'Enterprise Applications': [
'deploying_clearml/enterprise_deploy/app_install_ubuntu_on_prem',
@ -671,6 +673,7 @@ module.exports = {
label: 'Identity Provider Integration',
link: {type: 'doc', id: 'user_management/identity_providers'},
items: [
'deploying_clearml/enterprise_deploy/sso_multi_tenant_login',
'deploying_clearml/enterprise_deploy/sso_saml_k8s',
'deploying_clearml/enterprise_deploy/sso_keycloak',
'deploying_clearml/enterprise_deploy/sso_active_directory'