mirror of https://github.com/clearml/clearml-docs
synced 2025-03-09 13:42:26 +00:00

Small edits (#861)

This commit is contained in:
parent af1de9f598
commit 7137669f24
@ -734,15 +734,20 @@ CLEARML_API_SECRET_KEY

Build a Docker container that when launched executes a specific experiment, or a clone (copy) of that experiment.

- Build a Docker container that at launch will execute a specific Task:

  ```bash
  clearml-agent build --id <task-id> --docker --target <new-docker-name> --entry-point reuse_task
  ```

- Build a Docker container that at launch will clone a Task specified by Task ID, and will execute the newly cloned Task:

  ```bash
  clearml-agent build --id <task-id> --docker --target <new-docker-name> --entry-point clone_task
  ```

- Run the built Docker image by executing:

  ```bash
  docker run <new-docker-name>
  ```

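The two build commands above differ only in the `--entry-point` value. As an illustrative sketch (the helper name and arguments are hypothetical, not part of the ClearML CLI), the invocation can be assembled programmatically:

```python
def agent_build_cmd(task_id, target, entry_point="reuse_task"):
    """Assemble the `clearml-agent build` command shown above.

    entry_point is either "reuse_task" (run the Task as-is) or
    "clone_task" (clone the Task first, then run the clone).
    """
    assert entry_point in ("reuse_task", "clone_task")
    return [
        "clearml-agent", "build",
        "--id", task_id,
        "--docker",
        "--target", target,
        "--entry-point", entry_point,
    ]
```

The returned list can be passed to `subprocess.run` to perform the build.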
@ -52,11 +52,14 @@ and downloaded in realtime when updated

Spin the Inference Container

- Customize container [Dockerfile](https://github.com/allegroai/clearml-serving/blob/main/clearml_serving/serving/Dockerfile) if needed
- Build container:

  ```bash
  docker build --tag clearml-serving-inference:latest -f clearml_serving/serving/Dockerfile .
  ```

- Spin the inference container:

  ```bash
  docker run -v ~/clearml.conf:/root/clearml.conf -p 8080:8080 -e CLEARML_SERVING_TASK_ID=<service_id> -e CLEARML_SERVING_POLL_FREQ=5 clearml-serving-inference:latest
  ```

@ -97,7 +100,8 @@ or with the `clearml-serving` CLI.

In the [ClearML web UI](../webapp/webapp_overview.md), the new model is listed under the **Models** tab of its project.
You can also download the model file itself directly from the web UI.

1. Register a new endpoint with the new model:

   ```bash
   clearml-serving --id <service_id> model add --engine sklearn --endpoint "test_model_sklearn" --preprocess "examples/sklearn/preprocess.py" --model-id <newly_created_model_id_here>
   ```

@ -131,11 +135,13 @@ deployment process, as a single API automatically deploys (or removes) a model f

- Use the RestAPI (see [details](https://clear.ml/docs/latest/docs/references/api/models#post-modelspublish_many))
- Use the Python interface:

  ```python
  from clearml import Model
  Model(model_id="unique_model_id_here").publish()
  ```

1. The new model is available on a new endpoint version (1). Test it with:

   ```bash
   curl -X POST "http://127.0.0.1:8080/serve/test_model_sklearn_auto/1" -H "accept: application/json" -H "Content-Type: application/json" -d '{"x0": 1, "x1": 2}'
   ```

@ -93,12 +93,14 @@ sudo tar czvf ~/clearml_backup_config.tgz -C /opt/clearml/config .

1. Verify you have the backup files.
1. Replace any existing data with the backup data:

   ```bash
   sudo rm -fR /opt/clearml/data/* /opt/clearml/config/*
   sudo tar -xzf ~/clearml_backup_data.tgz -C /opt/clearml/data
   sudo tar -xzf ~/clearml_backup_config.tgz -C /opt/clearml/config
   ```

1. Grant access to the data:

   ```bash
   sudo chown -R 1000:1000 /opt/clearml
   ```

@ -224,30 +224,39 @@ To open external access to the Elasticsearch, MongoDB, and Redis ports:

1. Shut down ClearML Server. Execute the following command (which assumes the configuration file is in the environment path):

   ```
   docker-compose down
   ```

1. Edit the `docker-compose.yml` file as follows:

   * In the `elasticsearch` section, add the two lines:

     ```
     ports:
       - "9200:9200"
     ```

   * In the `mongo` section, add the two lines:

     ```
     ports:
       - "27017:27017"
     ```

   * In the `redis` section, add the two lines:

     ```
     ports:
       - "6379:6379"
     ```

1. Start up ClearML Server:

   ```
   docker-compose -f docker-compose.yml pull
   docker-compose -f docker-compose.yml up -d
   ```

### Web Login Authentication

@ -71,13 +71,17 @@ and ClearML Server needs to be installed.

1. Download the migration package archive:

   ```
   curl -L -O https://github.com/allegroai/clearml-server/releases/download/0.16.0/trains-server-0.16.0-migration.zip
   ```

   If the file needs to be downloaded manually, use this direct link: [trains-server-0.16.0-migration.zip](https://github.com/allegroai/clearml-server/releases/download/0.16.0/trains-server-0.16.0-migration.zip).

1. Extract the archive:

   ```
   unzip trains-server-0.16.0-migration.zip -d /opt/trains
   ```

1. Migrate the data.

@ -104,37 +108,51 @@ and ClearML Server needs to be installed.

1. Clone the `trains-server-k8s` repository and change to the new `trains-server-k8s/upgrade-elastic` directory:

   ```
   git clone https://github.com/allegroai/clearml-server-k8s.git && cd clearml-server-k8s/upgrade-elastic
   ```

1. Create the `upgrade-elastic` namespace and deployments:

   ```
   kubectl apply -k overlays/current_version
   ```

   Wait for the job to be completed. To check if it's completed, run:

   ```
   kubectl get jobs -n upgrade-elastic
   ```

* **Kubernetes using Helm**

  1. Add the `clearml-server` repository to the Helm client:

     ```
     helm repo add allegroai https://allegroai.github.io/clearml-server-helm/
     ```

     Confirm the `clearml-server` repository is now in the Helm client:

     ```
     helm search clearml
     ```

     The `helm search` results must include `allegroai/upgrade-elastic-helm`.

  1. Install `upgrade-elastic-helm` on the cluster:

     ```
     helm install allegroai/upgrade-elastic-helm --namespace=upgrade-elastic --name upgrade
     ```

     An `upgrade-elastic` namespace is created in the cluster, and the upgrade is deployed in it.

     Wait for the job to complete. To check if it completed, execute the following command:

     ```
     kubectl get jobs -n upgrade-elastic
     ```

### Verifying the Data Migration

@ -70,7 +70,7 @@ By default, ClearML Server launches with unrestricted access. To restrict ClearML

instructions in the [Security](clearml_server_security.md) page.
:::

To launch ClearML Server using a GCP Custom Image, see the [Google Cloud Storage documentation](https://cloud.google.com/compute/docs/import/import-existing-image#overview). For more information about Custom Images, see [Custom Images](https://cloud.google.com/compute/docs/images#custom_images) in the Compute Engine documentation.

The minimum requirements for ClearML Server are:

@ -83,9 +83,10 @@ The minimum requirements for ClearML Server are:

* Stop and then restart the Docker containers by executing the following commands:

  ```
  docker-compose -f /opt/clearml/docker-compose.yml down
  docker-compose -f /opt/clearml/docker-compose.yml up -d
  ```

## Backing Up and Restoring Data and Configuration

@ -98,22 +99,28 @@ The commands in this section are an example of how to back up and restore data and configuration.

If data and configuration folders are in `/opt/clearml`, then archive all data into `~/clearml_backup_data.tgz`, and
configuration into `~/clearml_backup_config.tgz`:

```
sudo tar czvf ~/clearml_backup_data.tgz -C /opt/clearml/data .
sudo tar czvf ~/clearml_backup_config.tgz -C /opt/clearml/config .
```

If the data and the configuration need to be restored:

1. Verify you have the backup files.
1. Replace any existing data with the backup data:

   ```
   sudo rm -fR /opt/clearml/data/* /opt/clearml/config/*
   sudo tar -xzf ~/clearml_backup_data.tgz -C /opt/clearml/data
   sudo tar -xzf ~/clearml_backup_config.tgz -C /opt/clearml/config
   ```

1. Grant access to the data:

   ```
   sudo chown -R 1000:1000 /opt/clearml
   ```

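The archive step above can also be scripted. The following is a minimal Python sketch using the standard `tarfile` module; the helper name is illustrative and the paths mirror the `tar czvf ... -C ...` commands, not any official ClearML tooling:

```python
import tarfile
from pathlib import Path

def backup_dir(src, dest_tgz):
    """Archive the contents of src into dest_tgz, mirroring
    `tar czvf dest_tgz -C src .` from the commands above."""
    dest = Path(dest_tgz).expanduser()
    with tarfile.open(dest, "w:gz") as tar:
        tar.add(Path(src), arcname=".")  # store paths relative to src
    return dest

# backup_dir("/opt/clearml/data", "~/clearml_backup_data.tgz")
# backup_dir("/opt/clearml/config", "~/clearml_backup_config.tgz")
```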
## ClearML Server GCP Custom Image

The following section contains a list of Custom Image URLs (exported in different formats) for each released ClearML Server version.

@ -48,18 +48,21 @@ Deploying the server requires a minimum of 4 GB of memory, 8 GB is recommended.

1. Verify the Docker CE installation. Execute the command:

   ```
   docker run hello-world
   ```

   The expected output is:

   ```
   Hello from Docker!
   This message shows that your installation appears to be working correctly.
   To generate this message, Docker took the following steps:

   1. The Docker client contacted the Docker daemon.
   2. The Docker daemon pulled the "hello-world" image from the Docker Hub. (amd64)
   3. The Docker daemon created a new container from that image which runs the executable that produces the output you are currently reading.
   4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal.
   ```

1. For macOS only, increase the memory allocation in Docker Desktop to `8GB`.

@ -68,39 +71,46 @@ Deploying the server requires a minimum of 4 GB of memory, 8 GB is recommended.

1. Click **Apply**.

1. For Linux only, install `docker-compose`. Execute the following commands (for more information, see [Install Docker Compose](https://docs.docker.com/compose/install/) in the Docker documentation):

   ```
   sudo curl -L "https://github.com/docker/compose/releases/download/1.24.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
   sudo chmod +x /usr/local/bin/docker-compose
   ```

1. Increase `vm.max_map_count` for Elasticsearch in Docker. Execute the following commands, depending upon the operating system:

   * Linux:

     ```
     echo "vm.max_map_count=262144" > /tmp/99-clearml.conf
     sudo mv /tmp/99-clearml.conf /etc/sysctl.d/99-clearml.conf
     sudo sysctl -w vm.max_map_count=262144
     sudo service docker restart
     ```

   * macOS:

     ```
     docker run --net=host --ipc=host --uts=host --pid=host --privileged --security-opt=seccomp=unconfined -it --rm -v /:/host alpine chroot /host
     sysctl -w vm.max_map_count=262144
     ```

1. Remove any previous installation of ClearML Server.

   **This clears all existing ClearML SDK databases.**

   ```
   sudo rm -R /opt/clearml/
   ```

1. Create local directories for the databases and storage:

   ```
   sudo mkdir -p /opt/clearml/data/elastic_7
   sudo mkdir -p /opt/clearml/data/mongo_4/db
   sudo mkdir -p /opt/clearml/data/mongo_4/configdb
   sudo mkdir -p /opt/clearml/data/redis
   sudo mkdir -p /opt/clearml/logs
   sudo mkdir -p /opt/clearml/config
   sudo mkdir -p /opt/clearml/data/fileserver
   ```

1. For macOS only do the following:

@ -114,26 +124,32 @@ Deploying the server requires a minimum of 4 GB of memory, 8 GB is recommended.

   * Linux:

     ```
     sudo chown -R 1000:1000 /opt/clearml
     ```

   * macOS:

     ```
     sudo chown -R $(whoami):staff /opt/clearml
     ```

1. Download the ClearML Server docker-compose YAML file:

   ```
   sudo curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
   ```

1. For Linux only, configure the **ClearML Agent Services**. If `CLEARML_HOST_IP` is not provided, then ClearML Agent Services uses the external public address of the ClearML Server. If `CLEARML_AGENT_GIT_USER` / `CLEARML_AGENT_GIT_PASS` are not provided, then ClearML Agent Services can't access any private repositories for running service tasks.

   ```
   export CLEARML_HOST_IP=server_host_ip_here
   export CLEARML_AGENT_GIT_USER=git_username_here
   export CLEARML_AGENT_GIT_PASS=git_password_here
   ```

1. Run `docker-compose` with the downloaded configuration file:

   ```
   docker-compose -f /opt/clearml/docker-compose.yml up -d
   ```

   The server is now running on [http://localhost:8080](http://localhost:8080).

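The local directory layout created in the installation steps above can be scripted. This is a hedged sketch using only `pathlib`; the sub-directory list mirrors the documented `sudo mkdir -p` commands, and the `root` default mirrors `/opt/clearml`:

```python
from pathlib import Path

# Sub-directories from the `sudo mkdir -p` commands above
CLEARML_SUBDIRS = [
    "data/elastic_7",
    "data/mongo_4/db",
    "data/mongo_4/configdb",
    "data/redis",
    "logs",
    "config",
    "data/fileserver",
]

def create_clearml_dirs(root="/opt/clearml"):
    """Create the ClearML Server data/log/config directories under root."""
    created = []
    for rel in CLEARML_SUBDIRS:
        p = Path(root) / rel
        p.mkdir(parents=True, exist_ok=True)  # idempotent, like mkdir -p
        created.append(p)
    return created
```

Note that the real install runs these as root; a script would need equivalent privileges to write under `/opt`.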
## Port Mapping

@ -150,9 +166,10 @@ After deploying ClearML Server, the services expose the following ports:

* Stop and then restart the Docker containers by executing the following commands:

  ```
  docker-compose -f /opt/clearml/docker-compose.yml down
  docker-compose -f /opt/clearml/docker-compose.yml up -d
  ```

## Backing Up and Restoring Data and Configuration

@ -166,27 +183,36 @@ The commands in this section are an example of how to back up and to restore data and configuration.

If the data and configuration folders are in `/opt/clearml`, then archive all data into `~/clearml_backup_data.tgz`, and
configuration into `~/clearml_backup_config.tgz`:

```
sudo tar czvf ~/clearml_backup_data.tgz -C /opt/clearml/data .
sudo tar czvf ~/clearml_backup_config.tgz -C /opt/clearml/config .
```

If needed, restore data and configuration by doing the following:

1. Verify the existence of backup files.
1. Replace any existing data with the backup data:

   ```
   sudo rm -fR /opt/clearml/data/* /opt/clearml/config/*
   sudo tar -xzf ~/clearml_backup_data.tgz -C /opt/clearml/data
   sudo tar -xzf ~/clearml_backup_config.tgz -C /opt/clearml/config
   ```

1. Grant access to the data, depending upon the operating system:

   * Linux:

     ```
     sudo chown -R 1000:1000 /opt/clearml
     ```

   * macOS:

     ```
     sudo chown -R $(whoami):staff /opt/clearml
     ```

## Next Step

To keep track of your experiments and/or data, the `clearml` package needs to communicate with your server.

@ -42,23 +42,30 @@ Deploying the server requires a minimum of 4 GB of memory, 8 GB is recommended.

   **This clears all existing ClearML SDK databases.**

   ```
   rmdir c:\opt\clearml /s
   ```

1. Create local directories for data and logs. Open PowerShell and execute the following commands:

   ```
   cd c:
   mkdir c:\opt\clearml\data
   mkdir c:\opt\clearml\logs
   ```

1. Save the ClearML Server docker-compose YAML file:

   ```
   curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose-win10.yml -o c:\opt\clearml\docker-compose-win10.yml
   ```

1. Run `docker-compose`. In PowerShell, execute the following commands:

   ```
   docker-compose -f c:\opt\clearml\docker-compose-win10.yml up
   ```

   The server is now running on [http://localhost:8080](http://localhost:8080).

## Port Mapping

@ -74,9 +81,10 @@ After deploying ClearML Server, the services expose the following node ports:

* Stop and then restart the Docker containers by executing the following commands:

  ```
  docker-compose -f c:\opt\clearml\docker-compose-win10.yml down
  docker-compose -f c:\opt\clearml\docker-compose-win10.yml up -d
  ```

## Next Step
@ -20,13 +20,17 @@ Some legacy **Trains Server** AMIs provided an auto-upgrade on restart capability.

**To upgrade your ClearML Server AWS AMI:**

1. Shut down the ClearML Server by executing the following command (which assumes the configuration file is in the environment path):

   ```
   docker-compose -f /opt/clearml/docker-compose.yml down
   ```

   If you are upgrading from **Trains Server**, use this command:

   ```
   docker-compose -f /opt/trains/docker-compose.yml down
   ```

1. [Backing up your data](clearml_server_aws_ec2_ami.md#backing-up-and-restoring-data-and-configuration) is recommended,
and if your configuration folder is not empty, backing up your configuration.

@ -37,12 +41,16 @@ If upgrading from Trains Server version 0.15 or older, a data migration is required.

1. Download the latest `docker-compose.yml` file. Execute the following command:

   ```
   sudo curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
   ```

1. Start up ClearML Server. This automatically pulls the latest ClearML Server build:

   ```
   docker-compose -f /opt/clearml/docker-compose.yml pull
   docker-compose -f /opt/clearml/docker-compose.yml up -d
   ```

### Upgrading and Migrating to a New AWS Instance

@ -52,8 +60,10 @@ This section contains the steps to upgrade ClearML Server on the new AWS instance.

1. Shut down ClearML Server by executing the following command (which assumes the configuration file is in the environment path):

   ```
   docker-compose down
   ```

1. On the old AWS instance, [backup your data](clearml_server_aws_ec2_ami.md#backing-up-and-restoring-data-and-configuration)
and, if your configuration folder is not empty, backup your configuration.

@ -65,5 +75,7 @@ This section contains the steps to upgrade ClearML Server on the new AWS instance.

1. Start up ClearML Server. This automatically pulls the latest ClearML Server build:

   ```
   docker-compose -f docker-compose.yml pull
   docker-compose -f docker-compose.yml up -d
   ```
@ -6,7 +6,9 @@ title: Google Cloud Platform

1. Shut down the Docker containers with the following command:

   ```
   docker-compose -f docker-compose.yml down
   ```

1. If upgrading from **Trains Server** version 0.15 or older to **ClearML Server**, do the following:

@ -15,19 +17,25 @@ title: Google Cloud Platform

1. Rename `/opt/trains` and its subdirectories to `/opt/clearml`:

   ```
   sudo mv /opt/trains /opt/clearml
   ```

1. If upgrading from a ClearML Server version older than 1.2, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).
1. [Backing up data](clearml_server_gcp.md#backing-up-and-restoring-data-and-configuration) is recommended, and if the configuration folder is
not empty, backing up the configuration.

1. Download the latest `docker-compose.yml` file:

   ```
   curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
   ```

1. Start up ClearML Server. This automatically pulls the latest ClearML Server build:

   ```
   docker-compose -f /opt/clearml/docker-compose.yml pull
   docker-compose -f /opt/clearml/docker-compose.yml up -d
   ```

If issues arise during your upgrade, see the FAQ page, [How do I fix Docker upgrade errors?](../faq.md#common-docker-upgrade-errors).

@ -9,11 +9,12 @@ For Linux only, if upgrading from <strong>Trains Server</strong> v0.14 or older:

* If ``CLEARML_HOST_IP`` is not provided, then **ClearML Agent Services** uses the external public address of the ClearML Server.
* If ``CLEARML_AGENT_GIT_USER`` / ``CLEARML_AGENT_GIT_PASS`` are not provided, then **ClearML Agent Services** can't access any private repositories for running service tasks.

```
export CLEARML_HOST_IP=server_host_ip_here
export CLEARML_AGENT_GIT_USER=git_username_here
export CLEARML_AGENT_GIT_PASS=git_password_here
```

:::note
For backwards compatibility, the environment variables ``TRAINS_HOST_IP``, ``TRAINS_AGENT_GIT_USER``, and ``TRAINS_AGENT_GIT_PASS`` are supported.
:::

@ -25,8 +26,10 @@ For backwards compatibility, the environment variables ``TRAINS_HOST_IP``, ``TRAINS_AGENT_GIT_USER``, and ``TRAINS_AGENT_GIT_PASS`` are supported.

**To upgrade ClearML Server Docker deployment:**

1. Shut down ClearML Server. Execute the following command (which assumes the configuration file is in the environment path):

   ```
   docker-compose -f docker-compose.yml down
   ```

1. If upgrading from **Trains Server** version 0.15 or older, a data migration is required before continuing this upgrade. See instructions [here](clearml_server_es7_migration.md).

@ -37,15 +40,21 @@ For backwards compatibility, the environment variables ``TRAINS_HOST_IP``, ``TRAINS_AGENT_GIT_USER``, and ``TRAINS_AGENT_GIT_PASS`` are supported.

1. If upgrading from **Trains Server** to **ClearML Server**, rename `/opt/trains` and its subdirectories to `/opt/clearml`:

   ```
   sudo mv /opt/trains /opt/clearml
   ```

1. Download the latest `docker-compose.yml` file:

   ```
   curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
   ```

1. Start up ClearML Server. This automatically pulls the latest ClearML Server build:

   ```
   docker-compose -f /opt/clearml/docker-compose.yml pull
   docker-compose -f /opt/clearml/docker-compose.yml up -d
   ```

If issues arise during your upgrade, see the FAQ page, [How do I fix Docker upgrade errors?](../faq.md#common-docker-upgrade-errors).

@ -10,12 +10,16 @@ title: Windows

* Upgrading ClearML Server version:

  ```
  docker-compose -f c:\opt\clearml\docker-compose-win10.yml down
  ```

* Upgrading from **Trains Server** to **ClearML Server**:

  ```
  docker-compose -f c:\opt\trains\docker-compose-win10.yml down
  ```

1. If upgrading from **Trains Server** version 0.15 or older, a data migration is required before continuing this upgrade. See instructions [here](clearml_server_es7_migration.md).

1. If upgrading from a ClearML Server version older than 1.2, you need to migrate your data before upgrading your server. See instructions [here](clearml_server_mongo44_migration.md).

@ -31,11 +35,15 @@ title: Windows

1. Download the latest `docker-compose.yml` file:

   ```
   curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose-win10.yml -o c:\opt\clearml\docker-compose-win10.yml
   ```

1. Start up ClearML Server. This automatically pulls the latest ClearML Server build:

   ```
   docker-compose -f c:\opt\clearml\docker-compose-win10.yml pull
   docker-compose -f c:\opt\clearml\docker-compose-win10.yml up -d
   ```

If issues arise during your upgrade, see the FAQ page, [How do I fix Docker upgrade errors?](../faq.md#common-docker-upgrade-errors).

@ -34,12 +34,16 @@ pip install clearml

Use the `--file` option for `clearml-init`:

```
clearml-init --file MyOtherClearML.conf
```

and then specify it using the ``CLEARML_CONFIG_FILE`` environment variable inside the container:

```
CLEARML_CONFIG_FILE = MyOtherClearML.conf
```

For more information about running experiments inside Docker containers, see [ClearML Agent Deployment](../../clearml_agent.md#deployment)
and [ClearML Agent Reference](../../clearml_agent/clearml_agent_ref.md).

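The config-file selection described above can be sketched in Python. This is an illustrative helper (the resolution rule — environment variable first, then the default `~/clearml.conf` — is the assumed behavior described on this page):

```python
import os

def resolve_clearml_config(default="~/clearml.conf"):
    """Pick the config file path, honoring CLEARML_CONFIG_FILE when set."""
    return os.path.expanduser(os.environ.get("CLEARML_CONFIG_FILE", default))
```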
@ -30,7 +30,7 @@ For more information about how autoscalers work, see [Autoscalers Overview](../..

* GCP Subnet Full Path - Available if `Use full subnet path` was selected. The GCP subnetwork where the instances
will be spun up. This allows setting a custom subnet resource path, and allows setting subnets shared from other
projects as well. See [GCP Documentation](https://cloud.google.com/dataflow/docs/guides/specifying-networks).
* GCP Subnet Name - Available if `Use full subnet path` was not selected. The GCP subnetwork where the instances
will be spun up. The GCP setting will be `projects/{project-id}/regions/{region}/subnetworks/{subnetwork}`.
* GCP Credentials - Credentials with which the autoscaler can access your GCP account for spinning VM instances
up/down. See [Generating GCP Credentials](#generating-gcp-credentials).

@ -634,7 +634,7 @@ of resources allocated to jobs in this profile

* <img src="/docs/latest/icons/ico-running-jobs.svg" alt="Running jobs" className="icon size-md space-sm" /> - Number of currently running jobs
* Number of resource policies. Click to open the resource policy list and to order queuing priority.

### Example Workflow

You have GPUs spread across a local H100 and additional bare metal servers, as well as on AWS (managed
by an autoscaler). Assume that currently most of your resources are already assigned to jobs, and only 16 resources are available: 8 in the

@ -648,7 +648,7 @@ Teams' jobs have varying resource requirements of 0.5, 2, 4, and 8 GPUs. Resources

The different jobs will be routed to different resource pools by connecting the profiles to the resource pools. Jobs
enqueued through the profiles will be run in the pools where there are available resources in order of their priority.
For example, the H100 pool will run jobs with the following precedence: 2 GPU jobs first, then 4 GPU ones, then 8 GPU,
and lastly 0.5 GPU.

