Compare commits

...

13 Commits

Author SHA1 Message Date
Valeriano Manassero
1f23bcf7ca 160 fileserver doesnt have an option to be with ephemeral storage (#164)
* Added: fileserver emptyDir support

* Changed: bump up version
2023-02-14 16:31:27 +01:00
Valeriano Manassero
3075f5e280 157 improve documentation (#159)
* Changed: updated installation guide

* Fixed: typo in copy and paste

* Changed: updated install guide

* Fixed: use relative path
2023-02-14 08:44:04 +01:00
Valeriano Manassero
97550c720f Fix cookiename availability (#158)
* Fixed: cookieName availability

* Changed: bump up version
2023-02-14 08:42:26 +01:00
Valeriano Manassero
a29a144119 Changed: redis cluster configuration for production (#156) 2023-02-13 12:22:01 +01:00
Valeriano Manassero
a4f77c624d Create inactive-issues.yaml 2023-02-13 08:58:08 +01:00
Valeriano Manassero
dd1c201eeb Avoid collisions in internal helper variable naming (#154)
* Fixed: helper variable rename to avoid collisions

* Changed: bump version
2023-02-13 08:17:53 +01:00
Valeriano Manassero
7995fc8441 Add external multihost elasticsearch support (#150)
* Changed: elasticsearch connstring creation

* Changed: elasticsearch connstring creation

* Changed: bump up version
2023-02-09 10:29:00 +01:00
Valeriano Manassero
99903085cd Fix existing secret reference (#149)
* Fixed: existingSecret reference

* Changed: bump version

* Changed: bump up version
2023-02-09 10:11:03 +01:00
Valeriano Manassero
9fc2b7ddda Fix existing secret apiserver (#148)
* Fixed: missing brackets

* Changed: bump vesion

* Fixed: trailing space in changelog
2023-02-08 14:20:25 +01:00
Valeriano Manassero
c7b3a28989 146 agentadd affinity config (#147)
* Added: affinity parameter

* Changed: bump version
2023-02-02 12:20:06 +01:00
Valeriano Manassero
12baef0d75 fixed: typos (#145) 2023-02-02 11:50:11 +01:00
Valeriano Manassero
72916e171a Added: specific platform configurations (#144) 2023-01-31 09:25:53 +01:00
Valeriano Manassero
126f313cdf Add agent pod securitycontext (#143)
* Added: securityContext for agent

* Changed: bump up version

* Added: support for k8s 1.26
2023-01-31 09:16:25 +01:00
29 changed files with 319 additions and 131 deletions

22
.github/workflows/inactive-issues.yaml vendored Normal file
View File

@@ -0,0 +1,22 @@
name: Close inactive issues
on:
schedule:
- cron: "30 1 * * *"
jobs:
close-issues:
runs-on: ubuntu-latest
permissions:
issues: write
pull-requests: write
steps:
- uses: actions/stale@v7
with:
days-before-issue-stale: 14
days-before-issue-close: 7
stale-issue-label: "stale"
stale-issue-message: "This issue is stale because it has been open for 14 days with no activity."
close-issue-message: "This issue was closed because it has been inactive for 7 days since being marked as stale."
days-before-pr-stale: -1
days-before-pr-close: -1
repo-token: ${{ secrets.GITHUB_TOKEN }}

57
INSTALL.md Normal file
View File

@@ -0,0 +1,57 @@
# ClearML Helm Charts Installation guide
## Requirements
### Setup a Kubernetes Cluster
For setting up Kubernetes on various platforms refer to the Kubernetes [getting started guide](http://kubernetes.io/docs/getting-started-guides/).
#### Setup a single node LOCAL Kubernetes on laptop/desktop (development)
For setting up Kubernetes on your laptop/desktop we suggest [kind](https://kind.sigs.k8s.io).
#### [Kubernetes Tanzu users only] Additional setup requirements
For setting up Clear.ML on a Tanzu cluster, check [prerequisites](https://github.com/allegroai/clearml-helm-charts/tree/main/platform-specific-configs/tanzu).
#### [Kubernetes Openshift users only] Additional setup requirements
For setting up Clear.ML on a Openshift cluster, check [prerequisites](https://github.com/allegroai/clearml-helm-charts/tree/main/platform-specific-configs/openshift).
### Install Helm
Helm is a tool for managing Kubernetes charts. Charts are packages of pre-configured Kubernetes resources.
To install Helm, refer to the [Helm install guide](https://github.com/helm/helm#install) and ensure that the `helm` binary is in the `PATH` of your shell.
## Helm charts installation
### Helm Repo
```bash
$ helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
$ helm repo update
```
### ClearML server ecosystem
```bash
$ helm install clearml allegroai/clearml
```
### ClearML agent
Agent is always related a ClearML server ecosystem (by default using `app.clear.ml` public service but can be on same or another Kubernetes cluster or a single server installation).
On ClearML UI, Settings -> Workspace and Create new Credentials.
In following Helm chart install command:
* set ACCESSKEY to resuted credentials access_key
* set SECRETKEY to resuted credentials secret_key
* set APIERVERURL to resuted credentials api_server
* set FILESSERVERURL to resuted credentials files_server
* set WEBSERVERURL to resuted credentials web_server
```bash
$ helm install clearml-agent allegroai/clearml-agent --set clearml.agentk8sglueKey=ACCESSKEY --set clearml.agentk8sglueSecret=SECRETKEY --set agentk8sglue.apiServerUrlReference=APISERVERURL --set agentk8sglue.fileServerUrlReference=FILESERVERURL --set agentk8sglue.webServerUrlReference=WEBSERVERURL
```

View File

@@ -1,4 +1,4 @@
# ClearML Helm Charts Library for Kubernetes
# ClearML Helm Charts for Kubernetes
## Auto-Magical Experiment Manager & Version Control for AI
@@ -23,7 +23,11 @@ Use this repository to deploy **clearml-server** on Kubernetes clusters.
## Provided in this repository
### [All around Helm Chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml)
### [ClearML server chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml)
### [ClearML agent chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml-agent)
### [ClearML serving chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml-serving)
## Who We Are
@@ -40,30 +44,9 @@ will always upgrade with you.
Apache License, Version 2.0, (see the [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) for more information)
## Requirements
## Installation guide
### Setup a Kubernetes Cluster
For setting up Kubernetes on various platforms refer to the Kubernetes [getting started guide](http://kubernetes.io/docs/getting-started-guides/).
### Setup a single node LOCAL Kubernetes on laptop/desktop
For setting up Kubernetes on your laptop/desktop we suggest [kind](https://kind.sigs.k8s.io).
### Install Helm
Helm is a tool for managing Kubernetes charts. Charts are packages of pre-configured Kubernetes resources.
To install Helm, refer to the [Helm install guide](https://github.com/helm/helm#install) and ensure that the `helm` binary is in the `PATH` of your shell.
## Usage
```bash
$ helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
$ helm repo update
$ helm search repo allegroai
$ helm install <release-name> allegroai/<chart>
```
For installation instruction, follow related [Installation Guide](INSTALL.md).
## Documentation, Community & Support

View File

@@ -1,10 +1,10 @@
apiVersion: v2
name: clearml-agent
description: MLOps platform
description: MLOps platform Task running agent
type: application
version: "3.1.4"
version: "3.3.2"
appVersion: "1.24"
kubeVersion: ">= 1.19.0-0 < 1.26.0-0"
kubeVersion: ">= 1.21.0-0 < 1.27.0-0"
home: https://clear.ml
icon: https://raw.githubusercontent.com/allegroai/clearml/master/docs/clearml-logo.svg
sources:
@@ -17,3 +17,8 @@ keywords:
- clearml
- "machine learning"
- mlops
- "task agent"
annotations:
artifacthub.io/changes: |
- kind: fixed
description: clearml agent internal helper variable name

View File

@@ -1,8 +1,8 @@
# ClearML Kubernetes Agent
![Version: 3.1.4](https://img.shields.io/badge/Version-3.1.4-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
![Version: 3.3.2](https://img.shields.io/badge/Version-3.3.2-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
MLOps platform
MLOps platform Task running agent
**Homepage:** <https://clear.ml>
@@ -24,16 +24,18 @@ It allows you to schedule distributed experiments on a Kubernetes cluster.
## Requirements
Kubernetes: `>= 1.19.0-0 < 1.26.0-0`
Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentk8sglue | object | `{"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerCustomBashScript":"","customBashScript":"","debugMode":false,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"labels":{},"nodeSelector":{},"queue":"default","replicaCount":1,"serviceExistingAccountName":"","volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue | object | `{"affinity":{},"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerCustomBashScript":"","customBashScript":"","debugMode":false,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"labels":{},"nodeSelector":{},"queue":"default","replicaCount":1,"securityContext":{},"serviceExistingAccountName":"","tolerations":[],"volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue.affinity | object | `{}` | affinity setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.annotations | object | `{}` | annotations setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.apiServerUrlReference | string | `"https://api.clear.ml"` | Reference to Api server url |
| agentk8sglue.basePodTemplate | object | `{"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]}` | base template for pods spawned to consume ClearML Task |
| agentk8sglue.basePodTemplate | object | `{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]}` | base template for pods spawned to consume ClearML Task |
| agentk8sglue.basePodTemplate.affinity | object | `{}` | affinity setup for pods spawned to consume ClearML Task |
| agentk8sglue.basePodTemplate.annotations | object | `{}` | annotations setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.env | list | `[]` | environment variables for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.fileMounts | list | `[]` | file definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
@@ -59,7 +61,9 @@ Kubernetes: `>= 1.19.0-0 < 1.26.0-0`
| agentk8sglue.nodeSelector | object | `{}` | nodeSelector setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.queue | string | `"default"` | ClearML queue this agent will consume |
| agentk8sglue.replicaCount | int | `1` | Glue Agent number of pods |
| agentk8sglue.securityContext | object | `{}` | Web Server pod security context |
| agentk8sglue.serviceExistingAccountName | string | `""` | if set, don't create a serviceAccountName but use defined existing one |
| agentk8sglue.tolerations | list | `[]` | tolerations setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.volumeMounts | list | `[]` | volume mounts definition for Glue Agent (example in values.yaml comments) |
| agentk8sglue.volumes | list | `[]` | volumes definition for Glue Agent (example in values.yaml comments) |
| agentk8sglue.webServerUrlReference | string | `"https://app.clear.ml"` | Reference to Web server url |

View File

@@ -1,23 +1,23 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "clearml.name" -}}
{{- define "clearmlAgent.name" -}}
{{- .Release.Name | trunc 59 | trimSuffix "-" }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "clearml.chart" -}}
{{- define "clearmlAgent.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 59 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "clearml.labels" -}}
helm.sh/chart: {{ include "clearml.chart" . }}
{{ include "clearml.selectorLabels" . }}
{{- define "clearmlAgent.labels" -}}
helm.sh/chart: {{ include "clearmlAgent.chart" . }}
{{ include "clearmlAgent.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
@@ -30,7 +30,7 @@ app.kubernetes.io/managed-by: {{ .Release.Service }}
{{/*
Common annotations
*/}}
{{- define "clearml.annotations" -}}
{{- define "clearmlAgent.annotations" -}}
{{- if $.Values.agentk8sglue.annotations }}
{{ toYaml $.Values.agentk8sglue.annotations }}
{{- end }}
@@ -39,8 +39,8 @@ Common annotations
{{/*
Selector labels
*/}}
{{- define "clearml.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
{{- define "clearmlAgent.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearmlAgent.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
@@ -48,18 +48,18 @@ app.kubernetes.io/instance: {{ .Release.Name }}
Selector labels (agentk8sglue)
*/}}
{{- define "agentk8sglue.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ include "clearml.name" . }}
app.kubernetes.io/name: {{ include "clearmlAgent.name" . }}
app.kubernetes.io/instance: {{ include "clearmlAgent.name" . }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "clearml.serviceAccountName" -}}
{{- define "clearmlAgent.serviceAccountName" -}}
{{- if .Values.agentk8sglue.serviceExistingAccountName }}
{{- .Values.agentk8sglue.serviceExistingAccountName }}
{{- else }}
{{- include "clearml.name" . }}-sa
{{- include "clearmlAgent.name" . }}-sa
{{- end }}
{{- end }}

View File

@@ -1,7 +1,7 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "clearml.name" . }}-pt
name: {{ include "clearmlAgent.name" . }}-pt
data:
{{- if .Values.enterpriseFeatures.enabled }}
template.yaml: |
@@ -28,9 +28,9 @@ data:
{{- if $.Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if $.Values.imageCredentials.existingSecret }}
- name: $.Values.imageCredentials.existingSecret
- name: {{ $.Values.imageCredentials.existingSecret }}
{{- else }}
- name: {{ include "clearml.name" $ }}-ark
- name: {{ include "clearmlAgent.name" $ }}-ark
{{- end }}
{{- end }}
{{- if $value.templateOverrides.schedulerName }}
@@ -66,14 +66,14 @@ data:
{{- if $value.templateOverrides.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearml.name" $ }}-{{ $key }}-fm
secretName: {{ include "clearmlAgent.name" $ }}-{{ $key }}-fm
{{- else if $.Values.agentk8sglue.basePodTemplate.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearml.name" $ }}-fm
secretName: {{ include "clearmlAgent.name" $ }}-fm
{{- end }}
{{- if not $.Values.enterpriseFeatures.serviceAccountClusterAccess }}
serviceAccountName: {{ include "clearml.serviceAccountName" $ }}
serviceAccountName: {{ include "clearmlAgent.serviceAccountName" $ }}
{{- end }}
{{- if $value.templateOverrides.initContainers }}
initContainers:
@@ -126,7 +126,7 @@ data:
{{- if $.Values.clearml.existingAgentk8sglueSecret }}
name: {{ $.Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearml.name" $ }}-ac
name: {{ include "clearmlAgent.name" $ }}-ac
{{- end }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
@@ -135,7 +135,7 @@ data:
{{- if $.Values.clearml.existingAgentk8sglueSecret }}
name: {{ $.Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearml.name" $ }}-ac
name: {{ include "clearmlAgent.name" $ }}-ac
{{- end }}
key: agentk8sglue_secret
{{- end }}
@@ -172,14 +172,25 @@ data:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
{{- if $value.templateOverrides.affinity }}
{{- with $value.templateOverrides.affinity }}
affinity:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.affinity }}
{{- with $.Values.agentk8sglue.basePodTemplate.affinity }}
affinity:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
{{- end }}
secrets.yaml: |
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
{{ $key }}:
{{- if $value.templateOverrides.fileMounts }}
- {{ include "clearml.name" $ }}-{{ $key }}-fm
- {{ include "clearmlAgent.name" $ }}-{{ $key }}-fm
{{- else if $.Values.agentk8sglue.basePodTemplate.fileMounts }}
- {{ include "clearml.name" $ }}-fm
- {{ include "clearmlAgent.name" $ }}-fm
{{- end }}
{{- end }}
{{- else }}
@@ -195,16 +206,16 @@ data:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{.Values.imageCredentials.existingSecret}}
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: name: {{ include "clearml.name" $ }}-ark
- name: name: {{ include "clearmlAgent.name" $ }}-ark
{{- end }}
{{- end }}
{{- with .Values.agentk8sglue.basePodTemplate.volumes }}
volumes:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "clearml.serviceAccountName" $ }}
serviceAccountName: {{ include "clearmlAgent.serviceAccountName" $ }}
containers:
- resources:
{{- toYaml .Values.agentk8sglue.basePodTemplate.resources | nindent 10 }}
@@ -227,7 +238,7 @@ data:
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearml.name" . }}-ac
name: {{ include "clearmlAgent.name" . }}-ac
{{- end }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
@@ -236,7 +247,7 @@ data:
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearml.name" . }}-ac
name: {{ include "clearmlAgent.name" . }}-ac
{{- end }}
key: agentk8sglue_secret
{{- if .Values.agentk8sglue.basePodTemplate.env }}
@@ -250,6 +261,10 @@ data:
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentk8sglue.basePodTemplate.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
{{- if .Values.sessions.portModeEnabled }}
{{- range untilStep 1 ( ( add .Values.sessions.maxServices 1 ) | int ) 1 }}
@@ -259,7 +274,7 @@ data:
metadata:
name: clearml-session-{{ . }}
labels:
{{- include "clearml.labels" $ | nindent 8 }}
{{- include "clearmlAgent.labels" $ | nindent 8 }}
{{- with $.Values.sessions.svcAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}

View File

@@ -1,11 +1,11 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "clearml.name" . }}
name: {{ include "clearmlAgent.name" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- include "clearmlAgent.labels" . | nindent 4 }}
annotations:
{{- include "clearml.annotations" . | nindent 4 }}
{{- include "clearmlAgent.annotations" . | nindent 4 }}
spec:
replicas: {{ .Values.agentk8sglue.replicaCount }}
selector:
@@ -15,19 +15,20 @@ spec:
metadata:
annotations:
checksum/config: {{ printf "%s%s" .Values.clearml .Values.agentk8sglue | sha256sum }}
{{- include "clearml.annotations" . | nindent 8 }}
{{- include "clearmlAgent.annotations" . | nindent 8 }}
labels:
{{- include "clearml.labels" . | nindent 8 }}
{{- include "clearmlAgent.labels" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: {{ include "clearml.name" . }}-ark
- name: {{ include "clearmlAgent.name" . }}-ark
{{- end }}
{{- end }}
serviceAccountName: {{ include "clearml.serviceAccountName" . }}
serviceAccountName: {{ include "clearmlAgent.serviceAccountName" . }}
securityContext: {{ toYaml .Values.agentk8sglue.securityContext | nindent 8 }}
initContainers:
- name: init-k8s-glue
{{- if .Values.enterpriseFeatures.enabled }}
@@ -67,7 +68,7 @@ spec:
export PATH=$PATH:$HOME/bin;
source /root/.bashrc && /root/entrypoint.sh
volumeMounts:
- name: {{ include "clearml.name" . }}-pt
- name: {{ include "clearmlAgent.name" . }}-pt
mountPath: /root/template
{{ if .Values.clearml.clearmlConfig }}
- name: k8sagent-clearml-conf-volume
@@ -121,15 +122,15 @@ spec:
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: {{ include "clearml.name" . }}-ac
name: {{ include "clearmlAgent.name" . }}-ac
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: {{ include "clearml.name" . }}-ac
name: {{ include "clearmlAgent.name" . }}-ac
key: agentk8sglue_secret
- name: CLEARML_WORKER_ID
value: {{ include "clearml.name" . }}
value: {{ include "clearmlAgent.name" . }}
- name: CLEARML_AGENT_UPDATE_REPO
value: ""
- name: FORCE_CLEARML_AGENT_REPO
@@ -176,14 +177,22 @@ spec:
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentk8sglue.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentk8sglue.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
volumes:
- name: {{ include "clearml.name" . }}-pt
- name: {{ include "clearmlAgent.name" . }}-pt
configMap:
name: {{ include "clearml.name" . }}-pt
name: {{ include "clearmlAgent.name" . }}-pt
{{ if .Values.clearml.clearmlConfig }}
- name: k8sagent-clearml-conf-volume
secret:
secretName: {{ include "clearml.name" . }}-ac
secretName: {{ include "clearmlAgent.name" . }}-ac
items:
- key: clearml.conf
path: clearml.conf
@@ -194,5 +203,5 @@ spec:
{{ if .Values.agentk8sglue.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearml.name" . }}-afm
secretName: {{ include "clearmlAgent.name" . }}-afm
{{- end }}

View File

@@ -2,7 +2,7 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "clearml.serviceAccountName" . }}
name: {{ include "clearmlAgent.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
{{- end }}
{{- if .Values.enterpriseFeatures.serviceAccountClusterAccess }}
@@ -10,7 +10,7 @@ metadata:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "clearml.name" . }}-kpa
name: {{ include "clearmlAgent.name" . }}-kpa
rules:
- apiGroups:
- ""
@@ -28,21 +28,21 @@ rules:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "clearml.name" . }}-kpa
name: {{ include "clearmlAgent.name" . }}-kpa
subjects:
- kind: ServiceAccount
name: {{ include "clearml.serviceAccountName" . }}
name: {{ include "clearmlAgent.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "clearml.name" . }}-kpa
name: {{ include "clearmlAgent.name" . }}-kpa
{{- else }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "clearml.name" . }}-kpa
name: {{ include "clearmlAgent.name" . }}-kpa
rules:
- apiGroups:
- ""
@@ -60,13 +60,13 @@ rules:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "clearml.name" . }}-kpa
name: {{ include "clearmlAgent.name" . }}-kpa
subjects:
- kind: ServiceAccount
name: {{ include "clearml.serviceAccountName" . }}
name: {{ include "clearmlAgent.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "clearml.name" . }}-kpa
name: {{ include "clearmlAgent.name" . }}-kpa
{{- end }}

View File

@@ -1,7 +1,7 @@
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" . }}-ac
name: {{ include "clearmlAgent.name" . }}-ac
data:
agentk8sglue_key: {{ .Values.clearml.agentk8sglueKey | b64enc }}
agentk8sglue_secret: {{ .Values.clearml.agentk8sglueSecret | b64enc }}
@@ -12,7 +12,7 @@ data:
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" . }}-ark
name: {{ include "clearmlAgent.name" . }}-ark
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: {{ template "imagePullSecret" . }}

View File

@@ -2,7 +2,7 @@
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" . }}-afm
name: {{ include "clearmlAgent.name" . }}-afm
data:
{{- range .Values.agentk8sglue.fileMounts }}
{{ .name }}: {{ .fileContent | b64enc }}
@@ -14,7 +14,7 @@ data:
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" . }}-fm
name: {{ include "clearmlAgent.name" . }}-fm
data:
{{- range .Values.agentk8sglue.basePodTemplate.fileMounts }}
{{ .name }}: {{ .fileContent | b64enc }}
@@ -26,7 +26,7 @@ data:
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" $ }}-{{ $key }}-fm
name: {{ include "clearmlAgent.name" $ }}-{{ $key }}-fm
data:
{{- range .templateOverrides.fileMounts }}
{{ .name }}: {{ .fileContent | b64enc }}

View File

@@ -7,7 +7,7 @@ kind: Service
metadata:
name: clearml-session-{{ . }}
labels:
{{- include "clearml.labels" $ | nindent 4 }}
{{- include "clearmlAgent.labels" $ | nindent 4 }}
{{- with $.Values.sessions.svcAnnotations }}
annotations:
{{- toYaml . | nindent 4 }}

View File

@@ -73,12 +73,19 @@ agentk8sglue:
containerCustomBashScript: ""
# -- Extra Environment variables for Glue Agent
extraEnvs: []
# - name: PYTHONPATH
# value: "somepath"
# - name: PYTHONPATH
# value: "somepath"
# -- Web Server pod security context
securityContext: {}
# runAsUser: 1001
# fsGroup: 1001
# -- nodeSelector setup for Agent pod (example in values.yaml comments)
nodeSelector: {}
# fleet: agent-nodes
# -- tolerations setup for Agent pod (example in values.yaml comments)
tolerations: []
# -- affinity setup for Agent pod (example in values.yaml comments)
affinity: {}
# -- volumes definition for Glue Agent (example in values.yaml comments)
volumes: []
# - name: "yourvolume"
@@ -159,17 +166,20 @@ agentk8sglue:
resources: {}
# limits:
# nvidia.com/gpu: 1
# -- nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments)
nodeSelector: {}
# fleet: gpu-nodes
# -- tolerations setup for pods spawned to consume ClearML Task (example in values.yaml comments)
tolerations: []
# - key: "nvidia.com/gpu"
# operator: Exists
# effect: "NoSchedule"
# -- nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments)
nodeSelector: {}
# fleet: gpu-nodes
# -- affinity setup for pods spawned to consume ClearML Task
affinity: {}
# -- securityContext setup for pods spawned to consume ClearML Task (example in values.yaml comments)
securityContext: {}
# runAsUser: 1000
# runAsUser: 1001
# fsGroup: 1001
# -- hostAliases setup for pods spawned to consume ClearML Task (example in values.yaml comments)
hostAliases: {}
# - ip: "127.0.0.1"

View File

@@ -2,7 +2,7 @@ apiVersion: v2
name: clearml
description: MLOps platform
type: application
version: "5.5.1"
version: "5.7.0"
appVersion: "1.9.2"
kubeVersion: ">= 1.21.0-0 < 1.27.0-0"
home: https://clear.ml
@@ -32,5 +32,5 @@ dependencies:
condition: elasticsearch.enabled
annotations:
artifacthub.io/changes: |
- kind: changed
description: check redis and mongodb too before starting apiserver
- kind: added
description: fileserver support for emptyDir

View File

@@ -1,6 +1,6 @@
# ClearML Ecosystem for Kubernetes
![Version: 5.5.1](https://img.shields.io/badge/Version-5.5.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.9.2](https://img.shields.io/badge/AppVersion-1.9.2-informational?style=flat-square)
![Version: 5.7.0](https://img.shields.io/badge/Version-5.7.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.9.2](https://img.shields.io/badge/AppVersion-1.9.2-informational?style=flat-square)
MLOps platform
@@ -216,14 +216,13 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
| enterpriseFeatures.overrideReferenceApiUrl | string | `""` | set this value AND overrideReferenceFileUrl if external endpoint exposure is in place (like a LoadBalancer) example: "https://api.clearml.local" |
| enterpriseFeatures.overrideReferenceFileUrl | string | `""` | set this value AND overrideReferenceAPIUrl if external endpoint exposure is in place (like a LoadBalancer) example: "https://files.clearml.local" |
| enterpriseFeatures.webserverImageTagOverride | string | `"3.15.3-801"` | Image tag override for webserver enterprise version |
| externalServices | object | `{"elasticsearchHost":"","elasticsearchPort":9200,"mongodbConnectionStringAuth":"","mongodbConnectionStringBackend":"","redisHost":"","redisPort":6379}` | Definition of external services to use if not enabled as dependency charts here |
| externalServices.elasticsearchHost | string | `""` | Existing ElasticSearch Hostname to use if elasticsearch.enabled is false |
| externalServices.elasticsearchPort | int | `9200` | Existing ElasticSearch Port to use if elasticsearch.enabled is false |
| externalServices | object | `{"elasticsearchConnectionString":"","mongodbConnectionStringAuth":"","mongodbConnectionStringBackend":"","redisHost":"","redisPort":6379}` | Definition of external services to use if not enabled as dependency charts here |
| externalServices.elasticsearchConnectionString | string | `""` | Existing ElasticSearch connectionstring if elasticsearch.enabled is false (example in values.yaml) |
| externalServices.mongodbConnectionStringAuth | string | `""` | Existing MongoDB connection string for BACKEND to use if mongodb.enabled is false |
| externalServices.mongodbConnectionStringBackend | string | `""` | Existing MongoDB connection string for AUTH to use if mongodb.enabled is false |
| externalServices.redisHost | string | `""` | Existing Redis Hostname to use if redis.enabled is false |
| externalServices.redisPort | int | `6379` | Existing Redis Port to use if redis.enabled is false |
| fileserver | object | `{"affinity":{},"enabled":true,"extraEnvs":[],"image":{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"},"ingress":{"annotations":{},"enabled":false,"hostName":"files.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""},"nodeSelector":{},"podAnnotations":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"securityContext":{},"service":{"nodePort":30081,"port":8081,"type":"NodePort"},"storage":{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"}},"tolerations":[]}` | File Server configurations |
| fileserver | object | `{"affinity":{},"enabled":true,"extraEnvs":[],"image":{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"},"ingress":{"annotations":{},"enabled":false,"hostName":"files.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""},"nodeSelector":{},"podAnnotations":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"securityContext":{},"service":{"nodePort":30081,"port":8081,"type":"NodePort"},"storage":{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"},"enabled":true},"tolerations":[]}` | File Server configurations |
| fileserver.affinity | object | `{}` | File Server affinity setup |
| fileserver.enabled | bool | `true` | Enable/Disable component deployment |
| fileserver.extraEnvs | list | `[]` | File Server extra envrinoment variables |
@@ -242,10 +241,11 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
| fileserver.securityContext | object | `{}` | File Server pod security context |
| fileserver.service | object | `{"nodePort":30081,"port":8081,"type":"NodePort"}` | File Server internal service configuration |
| fileserver.service.nodePort | int | `30081` | If service.type set to NodePort, this will be set to service's nodePort field. If service.type is set to others, this field will be ignored |
| fileserver.storage | object | `{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"}}` | File server persistence settings |
| fileserver.storage | object | `{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"},"enabled":true}` | File server persistence settings |
| fileserver.storage.data.accessMode | string | `"ReadWriteOnce"` | Access mode (must be ReadWriteMany if fileserver replica > 1) |
| fileserver.storage.data.class | string | `""` | Storage class (use default if empty) |
| fileserver.storage.data.existingPVC | string | `""` | If set, it uses an already existing PVC instead of dynamic provisioning |
| fileserver.storage.enabled | bool | `true` | If set to false no PVC is created and emptyDir is used |
| fileserver.tolerations | list | `[]` | File Server tolerations setup |
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Container registry configuration |
| imageCredentials.email | string | `"someone@host.com"` | Email |

View File

@@ -141,21 +141,24 @@ Create readiness probe auth token
Elasticsearch Service name
*/}}
{{- define "elasticsearch.servicename" -}}
{{- if .Values.elasticsearch.enabled }}
{{- .Values.elasticsearch.clusterName }}-master
{{- else }}
{{- .Values.externalServices.elasticsearchHost }}
{{- end }}
{{- end }}
{{/*
Elasticsearch Service port
*/}}
{{- define "elasticsearch.serviceport" -}}
{{- if .Values.elasticsearch.enabled }}
{{- .Values.elasticsearch.httpPort }}
{{- end }}
{{/*
Elasticsearch Comnnection string
*/}}
{{- define "elasticsearch.connectionstring" -}}
{{- if .Values.elasticsearch.enabled }}
{{- printf "[{\"host\":\"%s\",\"port\":%s}]" (include "elasticsearch.servicename" .) (include "elasticsearch.serviceport" .) | quote }}
{{- else }}
{{- .Values.externalServices.elasticsearchPort }}
{{- .Values.externalServices.elasticsearchConnectionString | quote }}
{{- end }}
{{- end }}

View File

@@ -22,7 +22,7 @@ spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-registry-key
{{- end }}
@@ -85,10 +85,14 @@ spec:
containerPort: 8008
protocol: TCP
env:
- name: CLEARML_ELASTIC_SERVICE_HOST
value: {{ include "elasticsearch.servicename" . }}
- name: CLEARML_ELASTIC_SERVICE_PORT
value: "{{ include "elasticsearch.serviceport" . }}"
- name: CLEARML__HOSTS__ELASTIC__WORKERS__HOSTS
value: {{ include "elasticsearch.connectionstring" . }}
- name: CLEARML__HOSTS__ELASTIC__EVENTS__HOSTS
value: {{ include "elasticsearch.connectionstring" . }}
- name: CLEARML__HOSTS__ELASTIC__DATASETS__HOSTS
value: {{ include "elasticsearch.connectionstring" . }}
- name: CLEARML__HOSTS__ELASTIC__LOGS__HOSTS
value: {{ include "elasticsearch.connectionstring" . }}
{{- if .Values.mongodb.enabled }}
- name: CLEARML_MONGODB_SERVICE_CONNECTION_STRING
value: {{ include "mongodb.connectionstring" . | quote }}
@@ -106,9 +110,9 @@ spec:
value: /opt/clearml/config
- name: CLEARML__apiserver__default_company_name
value: "{{ .Values.clearml.defaultCompany }}"
{{- if not (eq .Values.clearml.cookieDomain "") }}
- name: CLEARML__APISERVER__AUTH__SESSION_AUTH_COOKIE_NAME
value: {{ .Values.clearml.cookieName }}
{{- if .Values.clearml.cookieDomain }}
- name: CLEARML__APISERVER__AUTH__COOKIES__DOMAIN
value: ".{{ .Values.clearml.cookieDomain }}"
{{- end }}
@@ -138,8 +142,6 @@ spec:
value: "{{ .Values.enterpriseFeatures.defaultCompanyGuid }}"
- name: APPLY_ES_MAPPINGS
value: "false"
- name: CLEARML__HOSTS__ELASTIC__LOGS__HOSTS
value: "[\"http://{{ include "elasticsearch.servicename" . }}:{{ include "elasticsearch.serviceport" . }}\"]"
- name: NUMBER_OF_GUNICORN_WORKERS
value: "{{ .Values.apiserver.processes.count }}"
- name: GUNICORN_TIMEOUT

View File

@@ -12,7 +12,7 @@ data:
{{- if $.Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if $.Values.imageCredentials.existingSecret }}
- name: $.Values.imageCredentials.existingSecret
- name: {{ $.Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-registry-key
{{- end }}

View File

@@ -23,7 +23,7 @@ spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-registry-key
{{- end }}

View File

@@ -22,12 +22,13 @@ spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-registry-key
{{- end }}
{{- end }}
volumes:
{{- if .Values.fileserver.storage.enabled }}
{{- if .Values.fileserver.storage.data.existingPVC }}
- name: fileserver-data
persistentVolumeClaim:
@@ -37,6 +38,10 @@ spec:
persistentVolumeClaim:
claimName: {{ include "fileserver.referenceName" . }}-data
{{- end }}
{{- else }}
- name: fileserver-data
emptyDir: {}
{{- end }}
securityContext: {{ toYaml .Values.fileserver.podSecurityContext | nindent 8 }}
initContainers:
- name: init-fileserver

View File

@@ -1,4 +1,5 @@
{{- if .Values.fileserver.enabled }}
{{- if .Values.fileserver.storage.enabled }}
{{- if not .Values.fileserver.storage.data.existingPVC }}
kind: PersistentVolumeClaim
apiVersion: v1
@@ -17,3 +18,4 @@ spec:
{{- end -}}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -22,7 +22,7 @@ spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-registry-key
{{- end }}

View File

@@ -16,6 +16,28 @@ webserver:
ingress:
enabled: true
hostName: "app.clearml.127-0-0-1.nip.io"
redis:
master:
name: "{{ .Release.Name }}-redis"
persistence:
enabled: true
accessModes:
- ReadWriteOnce
size: 5Gi
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
storageClass: null
slave:
persistence:
enabled: true
accessModes:
- ReadWriteOnce
size: 5Gi
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
storageClass: null
cluster:
enabled: true
sentinel:
enabled: true
mongodb:
enabled: true
architecture: replicaset

View File

@@ -207,6 +207,8 @@ fileserver:
# fsGroup: 1001
# -- File server persistence settings
storage:
# -- If set to false no PVC is created and emptyDir is used
enabled: true
data:
# -- If set, it uses an already existing PVC instead of dynamic provisioning
existingPVC: ""
@@ -276,10 +278,9 @@ webserver:
# -- Definition of external services to use if not enabled as dependency charts here
externalServices:
# -- Existing ElasticSearch Hostname to use if elasticsearch.enabled is false
elasticsearchHost: ""
# -- Existing ElasticSearch Port to use if elasticsearch.enabled is false
elasticsearchPort: 9200
# -- Existing ElasticSearch connectionstring if elasticsearch.enabled is false (example in values.yaml)
elasticsearchConnectionString: ""
# [{"host":"hostname1","port":9200},{"host":"hostname2","port":9200},{"host":"hostname3","port":9200}]
# -- Existing MongoDB connection string for BACKEND to use if mongodb.enabled is false
mongodbConnectionStringAuth: ""
# -- Existing MongoDB connection string for AUTH to use if mongodb.enabled is false

View File

@@ -0,0 +1,3 @@
# Openshift specific configuration
Use override files when deploying ClearML. Proposed files in this folder require setup of `<USER>` and `<FSUSER>` values to uids accepted by specific openshift configuration.

View File

@@ -0,0 +1,6 @@
agentk8sglue:
securityContext:
runAsUser: 0
basePodTemplate:
securityContext:
runAsUser: 0

View File

@@ -0,0 +1,36 @@
apiserver:
podSecurityContext:
fsGroup: <FSUSER>
runAsUser: <USER>
runAsNonRoot: true
fileserver:
podSecurityContext:
fsGroup: <FSUSER>
runAsUser: <USER>
runAsNonRoot: true
webserver:
podSecurityContext:
fsGroup: <FSUSER>
runAsUser: <USER>
runAsNonRoot: true
elasticsearch:
securityContext:
runAsUser: <USER>
podSecurityContext:
fsGroup: <FSUSER>
runAsUser: <USER>
sysctlInitContainer:
enabled: false
volumeClaimTemplate:
redis:
securityContext:
fsGroup: <FSUSER>
runAsUser: <USER>
mongodb:
podSecurityContext:
enabled: true
fsGroup: <FSUSER>
containerSecurityContext:
enabled: true
runAsUser: <USER>
runAsNonRoot: true

View File

@@ -0,0 +1,3 @@
# Tanzu specific configuration
Before installing any ClearML chart, apply `rolebinding.yaml` file after setting needed `<NAMESPACE>` in it.

View File

@@ -2,7 +2,7 @@ kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: clearml-tanzu-rolebinding
namespace: clearml
namespace: <NAMESPACE>
roleRef:
kind: ClusterRole
name: psp:vmware-system-privileged