Mirror of https://github.com/clearml/clearml-docs, synced 2025-04-05 05:40:54 +00:00
Commit 0090010fd2 (parent fd04894361): Add Enterprise server guides (#1057)
---
title: Changing ClearML Artifact Links
---

This guide describes how to update artifact references in the ClearML Enterprise Server.

By default, artifacts are stored on the file server; however, external storage such as AWS S3, MinIO, or Google Cloud Storage may be used to store artifacts. References to these artifacts may exist in the ClearML databases: MongoDB and ElasticSearch. Use this procedure when external storage is being migrated to a different location or URL.

:::important
This procedure does not deal with the actual migration of the data--only with changing the references in ClearML that point to the data.
:::
## Preparation

### Version Confirmation

To change the links, use the `fix_fileserver_urls.py` script, located inside the `allegro-apiserver` Docker container. The script is executed from within the `apiserver` container. Make sure the `apiserver` version is 3.20 or higher.

### Backup

It is highly recommended to back up the ClearML MongoDB and ElasticSearch databases before running the script, since the script changes values in the databases and cannot be undone.
## Fixing MongoDB Links

1. Access the `apiserver` Docker container:
   * In `docker-compose`:

     ```commandline
     sudo docker exec -it allegro-apiserver /bin/bash
     ```

   * In Kubernetes:

     ```commandline
     kubectl exec -it -n clearml <clearml-apiserver-pod-name> -- bash
     ```

1. Navigate to the script location in the `upgrade` folder:

   ```commandline
   cd /opt/seematics/apiserver/server/upgrade
   ```

1. Run the following command:

   :::important
   Before running the script, verify that this is indeed the correct script version (`apiserver` v3.20 or higher, or that the script provided by ClearML was copied into the container).
   :::

   ```commandline
   python3 fix_fileserver_urls.py \
     --mongo-host mongodb://mongo:27017 \
     --elastic-host elasticsearch:9200 \
     --host-source "<old fileserver host and/or port, as in artifact links>" \
     --host-target "<new fileserver host and/or port>" --datasets
   ```
:::note Notes
* If the MongoDB or ElasticSearch services are accessed from the `apiserver` container using custom addresses, update the `--mongo-host` and `--elastic-host` arguments accordingly.
* If ElasticSearch is set up to require authentication, pass the user and password with the following arguments: `--elastic-user <es_user> --elastic-password <es_pass>`
:::

The script fixes the links in MongoDB, and outputs `cURL` commands for updating the links in ElasticSearch.
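The effect of the `--host-source`/`--host-target` arguments on each stored link can be sketched in a few lines of Python. This is a simplified illustration only, not the actual logic of `fix_fileserver_urls.py`, and the hostnames are made-up examples:

```python
def rewrite_url(url: str, host_source: str, host_target: str) -> str:
    """Replace the old fileserver host with the new one in a stored artifact URL.

    Simplified sketch of the --host-source/--host-target substitution; the real
    script handles matching, datasets, and edge cases itself.
    """
    return url.replace(host_source, host_target, 1)

old_link = "https://old-files.example.com:8081/ExampleProject/task_id/artifact.pkl"
new_link = rewrite_url(old_link, "old-files.example.com:8081", "new-files.example.com:8081")
print(new_link)  # same path, now pointing at the new fileserver host
```

The path portion of the URL is left untouched; only the host (and port) segment passed as `--host-source` is replaced.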
## Fixing the ElasticSearch Links

Copy the `cURL` commands printed by the script in the previous stage, and run them one after the other. Verify that a "success" result is returned from each command. Depending on the amount of data in ElasticSearch, running these commands may take some time.
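Each `cURL` command returns a JSON body; one way to decide whether a run fully succeeded is to check that no failures were reported and every matched document was updated. A minimal sketch, assuming the standard ElasticSearch update-by-query response fields (`total`, `updated`, `failures`); the commands themselves come only from the script output:

```python
import json

def update_succeeded(response_body: str) -> bool:
    """Return True if an ElasticSearch update-by-query response reports
    no failures and that all matched documents were updated."""
    result = json.loads(response_body)
    return not result.get("failures") and result.get("updated") == result.get("total")

# Example response body of a fully successful update-by-query call:
print(update_succeeded('{"took": 147, "total": 119, "updated": 119, "failures": []}'))
```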
docs/deploying_clearml/enterprise_deploy/import_projects.md (new file):
---
title: Exporting and Importing ClearML Projects
---

When migrating from a ClearML Open Server to a ClearML Enterprise Server, you may need to transfer projects. This is done using the `data_tool.py` script. This utility is available in the `apiserver` Docker image, and can be used for exporting and importing ClearML project data in both the open source and Enterprise versions.

This guide covers the following:
* Exporting data from Open Source and Enterprise servers
* Importing data into an Enterprise server
* Handling the artifacts stored on the file server

:::note
Export instructions differ for the ClearML Open and Enterprise servers. Make sure you follow the guidelines that match your server type.
:::
## Exporting Data

The export process runs the `data_tool` script, which generates a zip file containing project and task data. This file should then be copied to the server on which the import will run.

Note that artifacts stored in the ClearML file server should be copied manually if required (see [Handling Artifacts](#handling-artifacts)).
### Exporting Data from ClearML Open Servers

#### Preparation

* Make sure the `apiserver` is at least Open Source server version 1.12.0.
* Note that any `pending` or `running` tasks will not be exported. If you wish to export them, make sure to stop/dequeue them before exporting.

#### Running the Data Tool

Execute the data tool from within the `apiserver` container.

Open a bash session inside the `apiserver` container of the server:
* In `docker-compose`:

  ```commandline
  sudo docker exec -it clearml-apiserver /bin/bash
  ```

* In Kubernetes:

  ```commandline
  kubectl exec -it -n <clearml-namespace> <clearml-apiserver-pod-name> -- bash
  ```
#### Export Commands

**To export specific projects:**

```commandline
python3 -m apiserver.data_tool export \
  --projects <project_id1> <project_id2> \
  --statuses created stopped published failed completed \
  --output <output-file-name>.zip
```

As a result, you should get a `<output-file-name>.zip` file that contains all the data from the specified projects and their children.

**To export all the projects:**

```commandline
python3 -m apiserver.data_tool export \
  --all \
  --statuses created stopped published failed completed \
  --output <output-file-name>.zip
```

#### Optional Parameters

* `--experiments <list of experiment IDs>` - If not specified, all experiments from the specified projects are exported.
* `--statuses <list of task statuses>` - Export tasks of specific statuses. If the parameter is omitted, only `published` tasks are exported.
* `--no-events` - Do not export task events, i.e. logs and metrics (scalars, plots, debug samples).

Make sure to copy the generated zip file containing the exported data.
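Before copying the archive, a quick sanity check that it is a valid, non-empty zip can save a round trip. A small sketch; nothing about the archive's internal layout is assumed here, only that it is a readable zip file:

```python
import zipfile

def archive_entry_count(archive_path: str) -> int:
    """Return the number of entries in an exported archive.

    Raises zipfile.BadZipFile if the file is not a valid zip,
    which would indicate a truncated or failed export.
    """
    with zipfile.ZipFile(archive_path) as zf:
        return len(zf.namelist())
```

For example, calling `archive_entry_count()` on a freshly exported file should return a positive number; zero entries suggests the export matched no tasks.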
### Exporting Data from ClearML Enterprise Servers

#### Preparation

* Make sure the `apiserver` is at least Enterprise Server version 3.18.0.
* Note that any `pending` or `running` tasks will not be exported. If you wish to export them, make sure to stop/dequeue them before exporting.

#### Running the Data Tool

Execute the data tool from within the `apiserver` Docker container.

Open a bash session inside the `apiserver` container of the server:
* In `docker-compose`:

  ```commandline
  sudo docker exec -it allegro-apiserver /bin/bash
  ```

* In Kubernetes:

  ```commandline
  kubectl exec -it -n <clearml-namespace> <clearml-apiserver-pod-name> -- bash
  ```
#### Export Commands

**To export specific projects:**

```commandline
PYTHONPATH=/opt/seematics/apiserver/trains-server-repo python3 data_tool.py \
  export \
  --projects <project_id1> <project_id2> \
  --statuses created stopped published failed completed \
  --output <output-file-name>.zip
```

As a result, you should get a `<output-file-name>.zip` file that contains all the data from the specified projects and their children.

**To export all the projects:**

```commandline
PYTHONPATH=/opt/seematics/apiserver/trains-server-repo python3 data_tool.py \
  export \
  --all \
  --statuses created stopped published failed completed \
  --output <output-file-name>.zip
```

#### Optional Parameters

* `--experiments <list of experiment IDs>` - If not specified, all experiments from the specified projects are exported.
* `--statuses <list of task statuses>` - Export tasks of specific statuses. If the parameter is omitted, only `published` tasks are exported.
* `--no-events` - Do not export task events, i.e. logs and metrics (scalars, plots, debug samples).

Make sure to copy the generated zip file containing the exported data.
## Importing Data

This section explains how to import the exported data into a ClearML Enterprise server.

### Preparation

* It is highly recommended to back up the ClearML databases before importing data, since the import injects data into the databases and cannot be undone.
* Make sure you are working with `apiserver` version 3.22.3 or higher.
* Make the zip file accessible from within the `apiserver` container, by copying the exported data into the `apiserver` container or into a folder on the host that is mounted into the `apiserver`.

### Usage

The data tool should be executed from within the `apiserver` Docker container.

1. Open a bash session inside the `apiserver` container of the server:
   * In `docker-compose`:

     ```commandline
     sudo docker exec -it allegro-apiserver /bin/bash
     ```

   * In Kubernetes:

     ```commandline
     kubectl exec -it -n <clearml-namespace> <clearml-apiserver-pod-name> -- bash
     ```
1. Run the data tool script in *import* mode:

   ```commandline
   PYTHONPATH=/opt/seematics/apiserver/trains-server-repo python3 data_tool.py \
     import \
     <path to zip file> \
     --company <company_id> \
     --user <user_id>
   ```

   * `company_id` - The default company ID used in the target deployment. Inside the `apiserver` container, you can usually get it from the `CLEARML__APISERVER__DEFAULT_COMPANY` environment variable. If you do not specify the `--company` parameter, all the data will be imported as `Examples` (read-only).
   * `user_id` - The ID of the user in the target deployment who will become the owner of the imported data.
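Building the `--company` argument from the container's environment can be sketched as follows. This is a hypothetical helper for illustration only; the environment variable name is the one quoted above, and whether it is set depends on your deployment:

```python
import os

def default_company_arg(env: dict) -> list:
    """Build the --company argument from the apiserver environment, if available.

    If the variable is unset, return an empty list -- in that case the data tool
    imports everything as read-only Examples, as described above.
    """
    company_id = env.get("CLEARML__APISERVER__DEFAULT_COMPANY")
    return ["--company", company_id] if company_id else []

print(default_company_arg(dict(os.environ)))
```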
## Handling Artifacts

***Artifacts*** refers to any content which the ClearML server holds references to. This can include:
* Dataset or Hyper-Dataset frame URLs
* ClearML artifact URLs
* Model snapshots
* Debug samples

Artifacts may be stored in any external storage (e.g., AWS S3, MinIO, Google Cloud Storage) or in the ClearML file server.
* If the artifacts are **not** stored in the ClearML file server, they do not need to be moved during the export/import process, since the URLs registered in ClearML entities pointing to these artifacts do not change.
* If the artifacts are stored in the ClearML file server, then the file server content must also be moved, and the URLs in the ClearML databases must point to the new location. See the instructions [below](#exporting-file-server-data-for-clearml-open-server).
### Exporting File Server Data for ClearML Open Server

Data in the file server is organized by project. For each project, all data referenced by entities in that project is stored in a folder bearing the name of the project. This folder is located in:

```
/opt/clearml/data/fileserver/<project name>
```

The entire contents of the projects' folders should be copied to the target server (see [Importing File Server Data](#importing-file-server-data)).

### Exporting File Server Data for ClearML Enterprise Server

Data in the file server is organized by tenant and project. For each project, all data referenced by entities in that project is stored in a folder bearing the name of the project. This folder is located in:

```
/opt/allegro/data/fileserver/<company_id>/<project name>
```

The entire contents of the projects' folders should be copied to the target server (see [Importing File Server Data](#importing-file-server-data)).
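The mapping between the two layouts can be sketched with the roots quoted above; `company_id` is whatever your target deployment uses, and the function is an illustration rather than part of any ClearML tooling:

```python
from pathlib import PurePosixPath

# Fileserver roots, as quoted in the sections above.
OPEN_ROOT = PurePosixPath("/opt/clearml/data/fileserver")
ENTERPRISE_ROOT = PurePosixPath("/opt/allegro/data/fileserver")

def enterprise_target(project_name: str, company_id: str) -> PurePosixPath:
    """Map an Open Server project folder to its Enterprise Server location,
    which adds a per-tenant <company_id> level under the root."""
    return ENTERPRISE_ROOT / company_id / project_name

print(OPEN_ROOT / "Example Project")
print(enterprise_target("Example Project", "example_company_id"))
```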
## Importing File Server Data

### Copying the Data

Place the exported projects' folder(s) content into the target file server's storage, in the following folder:

```
/opt/allegro/data/fileserver/<company_id>/<project name>
```
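Placing one project folder into the target layout can be sketched with the standard library. This is a hypothetical local-copy helper, assuming both roots are accessible on the same machine; in practice the copy between hosts is typically done with a tool such as `rsync` or `scp`:

```python
import shutil
from pathlib import Path

def copy_project_folder(src_root: str, dst_root: str, company_id: str, project: str) -> Path:
    """Copy one exported project folder into the tenant-scoped target layout
    (<dst_root>/<company_id>/<project>), creating parent folders as needed."""
    src = Path(src_root) / project
    dst = Path(dst_root) / company_id / project
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```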
### Fixing Registered URLs

Since URLs pointing to the file server contain the file server's address, they need to be changed to the address of the new file server.

Note that this is not required if the new file server replaces the old file server and can be accessed using the exact same address.

Once the projects' data has been copied to the target server, and the projects themselves have been imported, see [Changing ClearML Artifact Links](change_artifact_links.md) for information on how to fix the URLs.
The commit also registers the new guides in the docs sidebar:

```javascript
'deploying_clearml/enterprise_deploy/import_projects',
'deploying_clearml/enterprise_deploy/change_artifact_links',
```