mirror of
https://github.com/clearml/clearml-docs
synced 2025-06-26 18:17:44 +00:00
Add ClearML Application Gateway overview and installation instructions
40
docs/deploying_clearml/enterprise_deploy/appgw.md
Normal file
@@ -0,0 +1,40 @@
---
title: AI Application Gateway
---

Services running through a cluster orchestrator such as Kubernetes or a cloud hyperscaler require careful configuration
to make them available, because these environments do not expose their networks to external users.

The ClearML AI Application Gateway facilitates setting up secure, authenticated access to jobs running on your compute
nodes from external networks.

Using the AI Application Gateway, services are allocated externally accessible, SSL-secured network routes that provide
access in accordance with ClearML RBAC privileges. The AI Application Gateway supports HTTP/S as well as raw TCP routing.

The following ClearML UI applications make use of the AI Application Gateway to provide authenticated HTTPS access to
their instances:

* GPUaaS
  * [JupyterLab](../../webapp/applications/apps_jupyter_lab.md)
  * [VS Code](../../webapp/applications/apps_vscode.md)
  * [SSH Session](../../webapp/applications/apps_ssh_session.md)
* UI Dev
  * [Gradio launcher](../../webapp/applications/apps_gradio.md)
  * [Streamlit launcher](../../webapp/applications/apps_streamlit.md)
* Deploy
  * [vLLM Deployment](../../webapp/applications/apps_model_deployment.md)
  * [Embedding Model Deployment](../../webapp/applications/apps_embed_model_deployment.md)
  * [Llama.cpp Model Deployment](../../webapp/applications/apps_llama_deployment.md)

The AI Application Gateway is provided through an additional component to the ClearML Server deployment: the ClearML Task Traffic Router.
If your ClearML deployment does not have the Task Traffic Router properly installed, these application instances may not be accessible.

## Installation

The Task Traffic Router supports two deployment options:

* [Docker Compose](appgw_install_compose.md)
* [Kubernetes](appgw_install_k8s.md)

The deployment configuration specifies the external and internal address and port mappings for routing requests.

@@ -0,0 +1,133 @@
---
title: Docker Compose Installation
---

Use docker-compose to deploy the Task Traffic Router.

## Requirements

* Linux OS (x86) machine
* Root access
* Credentials for the ClearML Docker repository
* Valid ClearML Server installation

## Host Configuration

1. Install Docker (the procedure may vary depending on your operating system). The code below is an example for Amazon Linux:

   ```
   sudo dnf -y install docker
   DOCKER_CONFIG="/usr/local/lib/docker"
   sudo mkdir -p $DOCKER_CONFIG/cli-plugins
   sudo curl -SL https://github.com/docker/compose/releases/download/v2.17.3/docker-compose-linux-x86_64 -o $DOCKER_CONFIG/cli-plugins/docker-compose
   sudo chmod +x $DOCKER_CONFIG/cli-plugins/docker-compose
   sudo systemctl enable docker
   sudo systemctl start docker
   ```

1. Log in with credentials for the ClearML Docker Hub repository:

   ```
   sudo docker login
   ```

## Docker Compose Configuration

1. Create a `docker-compose.yml` file. For example:

   ```
   version: '3.5'
   services:
     task_traffic_webserver:
       image: allegroai/task-traffic-router-webserver:${TASK_TRAFFIC_ROUTER_WEBSERVER_TAG}
       ports:
         - "80:8080"
       restart: unless-stopped
       container_name: task_traffic_webserver
       volumes:
         - ./task_traffic_router/config/nginx:/etc/nginx/conf.d:ro
         - ./task_traffic_router/config/lua:/usr/local/openresty/nginx/lua:ro
     task_traffic_router:
       image: allegroai/task-traffic-router:${TASK_TRAFFIC_ROUTER_TAG}
       restart: unless-stopped
       container_name: task_traffic_router
       volumes:
         - /var/run/docker.sock:/var/run/docker.sock
         - ./task_traffic_router/config/nginx:/etc/nginx/conf.d:rw
         - ./task_traffic_router/config/lua:/usr/local/openresty/nginx/lua:rw
       environment:
         - LOGGER_LEVEL=INFO
         - CLEARML_API_HOST=${CLEARML_API_HOST:?err}
         - CLEARML_API_ACCESS_KEY=${CLEARML_API_ACCESS_KEY:?err}
         - CLEARML_API_SECRET_KEY=${CLEARML_API_SECRET_KEY:?err}
         - ROUTER_URL=${ROUTER_URL:?err}
         - ROUTER_NAME=${ROUTER_NAME:?err}
         - AUTH_ENABLED=${AUTH_ENABLED:?err}
         - SSL_VERIFY=${SSL_VERIFY:?err}
         - AUTH_COOKIE_NAME=${AUTH_COOKIE_NAME:?err}
         - AUTH_BASE64_JWKS_KEY=${AUTH_BASE64_JWKS_KEY:?err}
         - LISTEN_QUEUE_NAME=${LISTEN_QUEUE_NAME}
         - EXTRA_BASH_COMMAND=${EXTRA_BASH_COMMAND}
         - TCP_ROUTER_ADDRESS=${TCP_ROUTER_ADDRESS}
         - TCP_PORT_START=${TCP_PORT_START}
         - TCP_PORT_END=${TCP_PORT_END}
   ```

1. Create a `runtime.env` file with the following entries:

   ```
   TASK_TRAFFIC_ROUTER_WEBSERVER_TAG=
   TASK_TRAFFIC_ROUTER_TAG=
   CLEARML_API_HOST=https://api.
   CLEARML_API_ACCESS_KEY=
   CLEARML_API_SECRET_KEY=
   ROUTER_URL=
   ROUTER_NAME=main-router
   AUTH_ENABLED=true
   SSL_VERIFY=true
   AUTH_COOKIE_NAME=
   AUTH_BASE64_JWKS_KEY=
   LISTEN_QUEUE_NAME=
   EXTRA_BASH_COMMAND=
   TCP_ROUTER_ADDRESS=
   TCP_PORT_START=
   TCP_PORT_END=
   ```

   Edit the `runtime.env` file:
   * `CLEARML_API_HOST`: The URL of your ClearML API Server (starting with `https://api`).
   * `CLEARML_API_ACCESS_KEY`: ClearML Server API key
   * `CLEARML_API_SECRET_KEY`: ClearML Server API secret
   * `ROUTER_URL`: The URL users will use to access the router, starting with `https://`
   * `ROUTER_NAME`: The name for the router. Must be unique across the ClearML control plane scope
   * `AUTH_ENABLED`: Whether to enable authentication of HTTP calls when the router is communicating with the ClearML Server
   * `SSL_VERIFY`: Whether to enable SSL certificate validation when the router is communicating with the ClearML Server
   * `AUTH_COOKIE_NAME`: The cookie used by the ClearML Server to store the ClearML authentication token. This can
     usually be found in the `value_prefix` key starting with `allegro_token` in the `envoy.yaml` file in the ClearML
     Server installation (`/opt/allegro/config/envoy/envoy.yaml`)
   * `AUTH_SECURE_ENABLED`: Whether to enable the Set-Cookie `secure` parameter
   * `AUTH_BASE64_JWKS_KEY`: Value from the `k` key in the `jwks.json` file in the ClearML Server installation (see [JWKS Key](#jwks-key))
   * `LISTEN_QUEUE_NAME`: The ClearML Server queue whose tasks the router will service (useful for setting up more than
     one router in the same deployment, so that different routers can be directed to different tasks). Use `none` to have
     the router service all tasks.
   * `EXTRA_BASH_COMMAND`: Command to be launched before starting the router
   * `TCP_ROUTER_ADDRESS`: The network address users will use for TCP connections to the router: an IP address or hostname
     (for the machine or a load balancer configured in front of it).
   * `TCP_PORT_START` and `TCP_PORT_END`: The range of ports available for TCP connections to the router. Ensure that
     the chosen range is open and accessible in your network configuration to allow proper routing.

1. Start the router:

   ```
   sudo docker compose --env-file runtime.env up -d
   ```
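
The compose file marks several variables with `:?err`, so interpolation fails when any of them is unset or empty. As a minimal sketch, a pre-flight check along the following lines could list the empty entries in `runtime.env` before the router is started. The function name `missing_vars` is hypothetical and not part of ClearML; the variable list is copied from the `:?err` entries in the example compose file.

```shell
# Variables that the example docker-compose.yml marks as required (`:?err`).
required_vars="CLEARML_API_HOST CLEARML_API_ACCESS_KEY CLEARML_API_SECRET_KEY ROUTER_URL ROUTER_NAME AUTH_ENABLED SSL_VERIFY AUTH_COOKIE_NAME AUTH_BASE64_JWKS_KEY"

# missing_vars FILE -> prints, on one line, the required variables that are
# absent from FILE or assigned an empty value. Prints an empty line if none.
# The body runs in a subshell so its variables do not leak into the caller.
missing_vars() (
  file="$1"
  missing=""
  for name in $required_vars; do
    # Take the last NAME=... assignment and keep everything after the first '='.
    value="$(grep -E "^${name}=" "$file" | tail -n 1 | cut -d= -f2-)"
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  printf '%s\n' "$missing"
)

# Example usage (assumes runtime.env is in the current directory):
#   missing_vars runtime.env
```

An empty result suggests that every required entry has a value; it does not confirm that the values themselves are correct.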

## JWKS Key

The **JSON Web Key Set** (JWKS) is a set of keys containing the public keys used to verify any JSON Web Token (JWT).

For the `docker-compose` installation, the JWKS key is the value of the `CLEARML__secure__auth__token_secret` environment
variable in the API server component.

@@ -0,0 +1,91 @@
---
title: Kubernetes Installation
---

Use Kubernetes to deploy the Task Traffic Router.

## Requirements

* Kubernetes cluster: `>= 1.21.0-0` `< 1.32.0-0`
* Helm installed and configured
* Helm token to access the ClearML Helm chart repository
* Credentials for the ClearML Docker repository
* Valid ClearML Server installation

**Optional for HTTPS:**

* Valid DNS entry for the new Task Router instance
* Valid SSL certificate

## Helm Configuration

1. Add the `allegroai-enterprise` Helm repository:

   ```
   helm repo add allegroai-enterprise \
     https://raw.githubusercontent.com/allegroai/clearml-enterprise-helm-charts/gh-pages \
     --username <GITHUB_TOKEN> \
     --password <GITHUB_TOKEN>
   ```

1. Create a `task-traffic-router.values-override.yaml` file:

   ```
   imageCredentials:
     password: "${dockerhub_token}"
   clearml:
     apiServerKey: ""
     apiServerSecret: ""
     apiServerUrlReference: "https://api."
     jwksKey: ""
     authCookieName: ""
   ingress:
     enabled: true
     hostName: "task-router.dev"
   tcpSession:
     routerAddress: ""
     portRange:
       start:
       end:
   ```

   Edit the file according to these guidelines:
   * `clearml.apiServerUrlReference`: The URL of your ClearML API Server, starting with `https://api`.
   * `clearml.apiServerKey`: ClearML Server API key
   * `clearml.apiServerSecret`: ClearML Server API secret
   * `ingress.hostName`: A unique URL users will use to access the router, starting with `https://`
   * `clearml.sslVerify`: Whether to enable SSL certificate validation when the router is communicating with the ClearML
     Server
   * `clearml.authCookieName`: The cookie used by the ClearML Server to store the ClearML authentication token. This
     can usually be found in the `value_prefix` key starting with `allegro_token` in the `envoy.yaml` file in the ClearML
     Server installation (`/opt/allegro/config/envoy/envoy.yaml`) (see [JWKS Key](#jwks-key))
   * `clearml.jwksKey`: Value from the `k` key in the `jwks.json` file in the ClearML Server installation (see [JWKS Key](#jwks-key)).
   * `tcpSession.routerAddress`: The network address users will use for TCP connections to the router. This can be an IP address or hostname (for the machine or a load balancer configured in front of it).
   * `tcpSession.portRange.start` and `tcpSession.portRange.end`: The range of ports available for TCP connections to the router.

   For a complete list of supported configuration options, run:

   ```
   helm show readme allegroai-enterprise/clearml-task-traffic-router
   ```

3. Install the Task Traffic Router component via Helm:

   ```
   helm upgrade --install \
     <RELEASE_NAME> \
     -n <NAME_SPACE> \
     allegroai-enterprise/clearml-task-traffic-router \
     --version <CURRENT_CHART_VERSION> \
     -f task-traffic-router.values-override.yaml
   ```
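
The `tcpSession.portRange.start` and `tcpSession.portRange.end` values must describe a usable range of TCP ports. As a rough sketch, a check like the one below could catch an inverted or out-of-bounds range before the values file is applied. The function name `valid_port_range` and the sample numbers are illustrative assumptions; neither comes from ClearML or the Helm chart.

```shell
# valid_port_range START END -> exit status 0 when both bounds are decimal
# integers, lie within 1..65535, and START is not greater than END.
# The body runs in a subshell, so `exit` only ends the function call.
valid_port_range() (
  start="$1"
  end="$2"
  for n in "$start" "$end"; do
    case "$n" in
      # Reject empty values and anything containing a non-digit character.
      ''|*[!0-9]*) exit 1 ;;
    esac
  done
  [ "$start" -ge 1 ] && [ "$end" -le 65535 ] && [ "$start" -le "$end" ]
)

# Example usage with sample values:
#   valid_port_range 10000 10100 && echo "range looks usable"
```

This only validates the numbers themselves. Whether the ports are open in your firewall or load balancer still has to be confirmed separately.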

## JWKS Key

The **JSON Web Key Set** (JWKS) is a set of keys containing the public keys used to verify any JSON Web Token (JWT).

For the Kubernetes installation, use the following command to retrieve the **JWKS key**:

```
kubectl -n clearml get secret clearml-conf \
  -o jsonpath='{.data.secure_auth_token_secret}' \
  | base64 -d && echo
```
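
Kubernetes stores every value under a Secret's `.data` field in base64-encoded form, which is why the command pipes the `jsonpath` output through `base64 -d`. The sketch below isolates that decoding step. The function name `decode_secret_value` and the sample string are hypothetical and are not taken from a real deployment.

```shell
# decode_secret_value ENCODED -> prints the base64-decoded text, followed by
# a newline so the shell prompt does not end up on the same line.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
  echo
}

# Example usage with a sample value (not a real key):
#   decode_secret_value 'bXktc2VjcmV0'
```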
@@ -635,9 +635,10 @@ module.exports = {
         collapsible: true,
         collapsed: true,
         label: 'ClearML Application Gateway',
+        link: {type: 'doc', id: 'deploying_clearml/enterprise_deploy/appgw'},
         items: [
-            'deploying_clearml/appgw_install_compose',
-            'deploying_clearml/appgw_install_k8s',
+            'deploying_clearml/enterprise_deploy/appgw_install_compose',
+            'deploying_clearml/enterprise_deploy/appgw_install_k8s',
         ]
     },
 ]