Compare commits

..

6 Commits

Author SHA1 Message Date
Valeriano Manassero
5540188db1 Add job support for task pod (#162)
* Added: task as job support

* Added: template generator

* Fixed: typo

* Changed: bump version

* Added: changelog reference

* Fixed: include function name

* Fixed: checksum generator

* Added: nindent

* Added: changelog item

* Fixed: job env var switch

* Fixed: double Restart policy removed

* Fixed: job template apiVersion
2023-02-15 15:27:59 +01:00
Valeriano Manassero
1f23bcf7ca 160 fileserver doesnt have an option to be with ephemeral storage (#164)
* Added: fileserver emptyDir support

* Changed: bump up version
2023-02-14 16:31:27 +01:00
Valeriano Manassero
3075f5e280 157 improve documentation (#159)
* Changed: updated installation guide

* Fixed: typo in copy and paste

* Changed: updated install guide

* Fixed: use relative path
2023-02-14 08:44:04 +01:00
Valeriano Manassero
97550c720f Fix cookiename availability (#158)
* Fixed: cookieName availability

* Changed: bump up version
2023-02-14 08:42:26 +01:00
Valeriano Manassero
a29a144119 Changed: redis cluster configuration for production (#156) 2023-02-13 12:22:01 +01:00
Valeriano Manassero
a4f77c624d Create inactive-issues.yaml 2023-02-13 08:58:08 +01:00
17 changed files with 299 additions and 218 deletions

22
.github/workflows/inactive-issues.yaml vendored Normal file
View File

@@ -0,0 +1,22 @@
name: Close inactive issues
on:
schedule:
- cron: "30 1 * * *"
jobs:
close-issues:
runs-on: ubuntu-latest
permissions:
issues: write
pull-requests: write
steps:
- uses: actions/stale@v7
with:
days-before-issue-stale: 14
days-before-issue-close: 7
stale-issue-label: "stale"
stale-issue-message: "This issue is stale because it has been open for 14 days with no activity."
close-issue-message: "This issue was closed because it has been inactive for 7 days since being marked as stale."
days-before-pr-stale: -1
days-before-pr-close: -1
repo-token: ${{ secrets.GITHUB_TOKEN }}

57
INSTALL.md Normal file
View File

@@ -0,0 +1,57 @@
# ClearML Helm Charts Installation guide
## Requirements
### Setup a Kubernetes Cluster
For setting up Kubernetes on various platforms refer to the Kubernetes [getting started guide](http://kubernetes.io/docs/getting-started-guides/).
#### Setup a single node LOCAL Kubernetes on laptop/desktop (development)
For setting up Kubernetes on your laptop/desktop we suggest [kind](https://kind.sigs.k8s.io).
#### [Kubernetes Tanzu users only] Additional setup requirements
For setting up Clear.ML on a Tanzu cluster, check [prerequisites](https://github.com/allegroai/clearml-helm-charts/tree/main/platform-specific-configs/tanzu).
#### [Kubernetes Openshift users only] Additional setup requirements
For setting up Clear.ML on a Openshift cluster, check [prerequisites](https://github.com/allegroai/clearml-helm-charts/tree/main/platform-specific-configs/openshift).
### Install Helm
Helm is a tool for managing Kubernetes charts. Charts are packages of pre-configured Kubernetes resources.
To install Helm, refer to the [Helm install guide](https://github.com/helm/helm#install) and ensure that the `helm` binary is in the `PATH` of your shell.
## Helm charts installation
### Helm Repo
```bash
$ helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
$ helm repo update
```
### ClearML server ecosystem
```bash
$ helm install clearml allegroai/clearml
```
### ClearML agent
Agent is always related a ClearML server ecosystem (by default using `app.clear.ml` public service but can be on same or another Kubernetes cluster or a single server installation).
On ClearML UI, Settings -> Workspace and Create new Credentials.
In following Helm chart install command:
* set ACCESSKEY to resuted credentials access_key
* set SECRETKEY to resuted credentials secret_key
* set APIERVERURL to resuted credentials api_server
* set FILESSERVERURL to resuted credentials files_server
* set WEBSERVERURL to resuted credentials web_server
```bash
$ helm install clearml-agent allegroai/clearml-agent --set clearml.agentk8sglueKey=ACCESSKEY --set clearml.agentk8sglueSecret=SECRETKEY --set agentk8sglue.apiServerUrlReference=APISERVERURL --set agentk8sglue.fileServerUrlReference=FILESERVERURL --set agentk8sglue.webServerUrlReference=WEBSERVERURL
```

View File

@@ -1,4 +1,4 @@
# ClearML Helm Charts Library for Kubernetes
# ClearML Helm Charts for Kubernetes
## Auto-Magical Experiment Manager & Version Control for AI
@@ -23,7 +23,11 @@ Use this repository to deploy **clearml-server** on Kubernetes clusters.
## Provided in this repository
### [All around Helm Chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml)
### [ClearML server chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml)
### [ClearML agent chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml-agent)
### [ClearML serving chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml-serving)
## Who We Are
@@ -40,30 +44,9 @@ will always upgrade with you.
Apache License, Version 2.0, (see the [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) for more information)
## Requirements
## Installation guide
### Setup a Kubernetes Cluster
For setting up Kubernetes on various platforms refer to the Kubernetes [getting started guide](http://kubernetes.io/docs/getting-started-guides/).
### Setup a single node LOCAL Kubernetes on laptop/desktop
For setting up Kubernetes on your laptop/desktop we suggest [kind](https://kind.sigs.k8s.io).
### Install Helm
Helm is a tool for managing Kubernetes charts. Charts are packages of pre-configured Kubernetes resources.
To install Helm, refer to the [Helm install guide](https://github.com/helm/helm#install) and ensure that the `helm` binary is in the `PATH` of your shell.
## Usage
```bash
$ helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
$ helm repo update
$ helm search repo allegroai
$ helm install <release-name> allegroai/<chart>
```
For installation instruction, follow related [Installation Guide](INSTALL.md).
## Documentation, Community & Support

View File

@@ -2,7 +2,7 @@ apiVersion: v2
name: clearml-agent
description: MLOps platform Task running agent
type: application
version: "3.3.2"
version: "3.4.0"
appVersion: "1.24"
kubeVersion: ">= 1.21.0-0 < 1.27.0-0"
home: https://clear.ml
@@ -20,5 +20,9 @@ keywords:
- "task agent"
annotations:
artifacthub.io/changes: |
- kind: added
description: support for parameter Job/Pod task
- kind: fixed
description: clearml agent internal helper variable name
description: empty hostAliases map/array mismatch
- kind: fixed
description: agent deployment checksum

View File

@@ -1,6 +1,6 @@
# ClearML Kubernetes Agent
![Version: 3.3.2](https://img.shields.io/badge/Version-3.3.2-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
![Version: 3.4.0](https://img.shields.io/badge/Version-3.4.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
MLOps platform Task running agent
@@ -30,16 +30,16 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentk8sglue | object | `{"affinity":{},"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerCustomBashScript":"","customBashScript":"","debugMode":false,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"labels":{},"nodeSelector":{},"queue":"default","replicaCount":1,"securityContext":{},"serviceExistingAccountName":"","tolerations":[],"volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue | object | `{"affinity":{},"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":[],"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerCustomBashScript":"","customBashScript":"","debugMode":false,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"labels":{},"nodeSelector":{},"queue":"default","replicaCount":1,"securityContext":{},"serviceExistingAccountName":"","taskAsJob":false,"tolerations":[],"volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue.affinity | object | `{}` | affinity setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.annotations | object | `{}` | annotations setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.apiServerUrlReference | string | `"https://api.clear.ml"` | Reference to Api server url |
| agentk8sglue.basePodTemplate | object | `{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]}` | base template for pods spawned to consume ClearML Task |
| agentk8sglue.basePodTemplate | object | `{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":[],"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]}` | base template for pods spawned to consume ClearML Task |
| agentk8sglue.basePodTemplate.affinity | object | `{}` | affinity setup for pods spawned to consume ClearML Task |
| agentk8sglue.basePodTemplate.annotations | object | `{}` | annotations setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.env | list | `[]` | environment variables for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.fileMounts | list | `[]` | file definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.hostAliases | object | `{}` | hostAliases setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.hostAliases | list | `[]` | hostAliases setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.initContainers | list | `[]` | initContainers definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.labels | object | `{}` | labels setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.nodeSelector | object | `{}` | nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
@@ -63,6 +63,7 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
| agentk8sglue.replicaCount | int | `1` | Glue Agent number of pods |
| agentk8sglue.securityContext | object | `{}` | Web Server pod security context |
| agentk8sglue.serviceExistingAccountName | string | `""` | if set, don't create a serviceAccountName but use defined existing one |
| agentk8sglue.taskAsJob | bool | `false` | ClearML spawn tasks as jobs instead of pods |
| agentk8sglue.tolerations | list | `[]` | tolerations setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.volumeMounts | list | `[]` | volume mounts definition for Glue Agent (example in values.yaml comments) |
| agentk8sglue.volumes | list | `[]` | volumes definition for Glue Agent (example in values.yaml comments) |

View File

@@ -72,7 +72,6 @@ Create secret to access docker registry
{{- end }}
{{- end }}
{{/*
Create a string composed by queue names
*/}}
@@ -83,3 +82,135 @@ Create a string composed by queue names
{{- end }}
{{- join " " $list }}
{{- end }}
{{/*
Create a task container template
*/}}
{{- define "taskContainer.containerTemplate" -}}
{{- if .main.Values.imageCredentials.enabled }}
imagePullSecrets:
- name: {{ .main.Values.imageCredentials.existingSecret | default (printf "%s-ark" (include "clearmlAgent.name" .main )) }}
{{- end }}
schedulerName: {{ .value.templateOverrides.schedulerName | default (.main.Values.agentk8sglue.basePodTemplate.schedulerName) }}
restartPolicy: Never
securityContext:
{{- .value.templateOverrides.securityContext | default .main.Values.agentk8sglue.basePodTemplate.securityContext | toYaml | nindent 2 }}
hostAliases:
{{- .value.templateOverrides.hostAliases | default .main.Values.agentk8sglue.basePodTemplate.hostAliases | toYaml | nindent 2 }}
volumes:
{{ $computedvolumes := (.value.templateOverrides.volumes | default .main.Values.agentk8sglue.basePodTemplate.volumes) }}
{{- if $computedvolumes }}{{- $computedvolumes | toYaml | nindent 2 }}{{- end }}
{{- if .value.templateOverrides.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearmlAgent.name" .main }}-{{ .key }}-fm
{{- else if .main.Values.agentk8sglue.basePodTemplate.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearmlAgent.name" .main }}-fm
{{- end }}
{{- if not .main.Values.enterpriseFeatures.serviceAccountClusterAccess }}
serviceAccountName: {{ include "clearmlAgent.serviceAccountName" .main }}
{{- end }}
initContainers:
{{- .value.templateOverrides.initContainers | default .main.Values.agentk8sglue.basePodTemplate.initContainers | toYaml | nindent 2 }}
containers:
- resources:
{{- .value.templateOverrides.resources | default .main.Values.agentk8sglue.basePodTemplate.resources | toYaml | nindent 4 }}
ports:
- containerPort: 10022
volumeMounts:
{{ $computedvolumemounts := (.value.templateOverrides.volumeMounts | default .main.Values.agentk8sglue.basePodTemplate.volumeMounts) }}
{{- if $computedvolumemounts }}{{- $computedvolumemounts | toYaml | nindent 4 }}{{- end }}
{{- if .value.templateOverrides.fileMounts }}
{{- range .value.templateOverrides.fileMounts }}
- name: filemounts
mountPath: "{{ .folderPath }}/{{ .name }}"
subPath: "{{ .name }}"
readOnly: true
{{- end }}
{{- else if .main.Values.agentk8sglue.basePodTemplate.fileMounts }}
{{- range .main.Values.agentk8sglue.basePodTemplate.fileMounts }}
- name: filemounts
mountPath: "{{ .folderPath }}/{{ .name }}"
subPath: "{{ .name }}"
readOnly: true
{{- end }}
{{- end }}
env:
- name: CLEARML_API_HOST
value: {{ .main.Values.agentk8sglue.apiServerUrlReference }}
- name: CLEARML_WEB_HOST
value: {{ .main.Values.agentk8sglue.webServerUrlReference }}
- name: CLEARML_FILES_HOST
value: {{ .main.Values.agentk8sglue.fileServerUrlReference }}
{{- if not .main.Values.enterpriseFeatures.useOwnerToken }}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: {{ .main.Values.clearml.existingAgentk8sglueSecret | default (printf "%s-ac" (include "clearmlAgent.name" .main )) }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: {{ .main.Values.clearml.existingAgentk8sglueSecret | default (printf "%s-ac" (include "clearmlAgent.name" .main )) }}
key: agentk8sglue_secret
{{- end }}
- name: PYTHONUNBUFFERED
value: "x"
{{- if not .main.Values.agentk8sglue.clearmlcheckCertificate }}
- name: CLEARML_API_HOST_VERIFY_CERT
value: "false"
{{- end }}
{{ $computedenvs := (.value.templateOverrides.env| default .main.Values.agentk8sglue.basePodTemplate.env) }}
{{- if $computedenvs }}{{- $computedenvs | toYaml | nindent 4 }}{{- end }}
nodeSelector:
{{ .value.templateOverrides.nodeSelector | default .main.Values.agentk8sglue.basePodTemplate.nodeSelector | toYaml | nindent 2 }}
tolerations:
{{ .value.templateOverrides.tolerations | default .main.Values.agentk8sglue.basePodTemplate.tolerations | toYaml | nindent 2 }}
affinity:
{{ .value.templateOverrides.affinity | default .main.Values.agentk8sglue.basePodTemplate.affinity | toYaml | nindent 2 }}
{{- end }}
{{/*
Create a task container template
*/}}
{{- define "taskContainer.podTemplate" -}}
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
{{ $key }}:
apiVersion: v1
kind: Pod
metadata:
namespace: {{ $.Release.Namespace }}
labels:
{{ $value.templateOverrides.labels | default $.Values.agentk8sglue.basePodTemplate.labels | toYaml }}
annotations:
{{ $value.templateOverrides.annotations | default $.Values.agentk8sglue.basePodTemplate.annotations | toYaml }}
spec:
{{- $data := dict "main" $ "key" $key "value" $value -}}
{{- include "taskContainer.containerTemplate" $data | nindent 4}}
{{- end }}
{{- end }}
{{/*
Create a task container template
*/}}
{{- define "taskContainer.jobTemplate" -}}
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
{{ $key }}:
apiVersion: batch/v1
kind: Job
metadata:
namespace: {{ $.Release.Namespace }}
labels:
{{ $value.templateOverrides.labels | default $.Values.agentk8sglue.basePodTemplate.labels | toYaml }}
annotations:
{{ $value.templateOverrides.annotations | default $.Values.agentk8sglue.basePodTemplate.annotations | toYaml }}
spec:
template:
spec:
{{- $data := dict "main" $ "key" $key "value" $value -}}
{{- include "taskContainer.containerTemplate" $data | nindent 8 }}
backoffLimit: 0
{{- end }}
{{- end }}

View File

@@ -5,184 +5,10 @@ metadata:
data:
{{- if .Values.enterpriseFeatures.enabled }}
template.yaml: |
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
{{ $key }}:
apiVersion: v1
metadata:
namespace: {{ $.Release.Namespace }}
{{- if $value.templateOverrides.labels }}
labels:
{{- toYaml $value.templateOverrides.labels | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.labels }}
labels:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.labels | nindent 10 }}
{{- end}}
{{- if $value.templateOverrides.annotations }}
annotations:
{{- toYaml $value.templateOverrides.annotations | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.annotations }}
annotations:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.annotations | nindent 10 }}
{{- end}}
spec:
{{- if $.Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if $.Values.imageCredentials.existingSecret }}
- name: {{ $.Values.imageCredentials.existingSecret }}
{{- else }}
- name: {{ include "clearmlAgent.name" $ }}-ark
{{- end }}
{{- end }}
{{- if $value.templateOverrides.schedulerName }}
schedulerName: {{ $value.templateOverrides.schedulerName }}
{{- else if $.Values.agentk8sglue.basePodTemplate.schedulerName }}
schedulerName: {{ $.Values.agentk8sglue.basePodTemplate.schedulerName }}
{{- end}}
restartPolicy: Never
{{- if $value.templateOverrides.securityContext }}
securityContext:
{{- toYaml $value.templateOverrides.securityContext | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.securityContext }}
securityContext:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.securityContext | nindent 10 }}
{{- end}}
{{- if $value.templateOverrides.hostAliases }}
{{- with $value.templateOverrides.hostAliases }}
hostAliases:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.hostAliases }}
{{- with $.Values.agentk8sglue.basePodTemplate.hostAliases }}
hostAliases:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
volumes:
{{- if $value.templateOverrides.volumes }}
{{- toYaml $value.templateOverrides.volumes | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.volumes }}
{{- toYaml $.Values.agentk8sglue.basePodTemplate.volumes | nindent 10 }}
{{- end }}
{{- if $value.templateOverrides.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearmlAgent.name" $ }}-{{ $key }}-fm
{{- else if $.Values.agentk8sglue.basePodTemplate.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearmlAgent.name" $ }}-fm
{{- end }}
{{- if not $.Values.enterpriseFeatures.serviceAccountClusterAccess }}
serviceAccountName: {{ include "clearmlAgent.serviceAccountName" $ }}
{{- end }}
{{- if $value.templateOverrides.initContainers }}
initContainers:
{{- toYaml $value.templateOverrides.initContainers | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.initContainers }}
initContainers:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.initContainers | nindent 10 }}
{{- end }}
containers:
- resources:
{{- if $value.templateOverrides.resources }}
{{- toYaml $value.templateOverrides.resources | nindent 12 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.resources }}
{{- toYaml $.Values.agentk8sglue.basePodTemplate.resources | nindent 12 }}
{{- end}}
ports:
- containerPort: 10022
volumeMounts:
{{- if $value.templateOverrides.volumeMounts }}
{{- toYaml $value.templateOverrides.volumeMounts | nindent 12 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.volumeMounts }}
{{- toYaml $.Values.agentk8sglue.basePodTemplate.volumeMounts | nindent 12 }}
{{- end }}
{{- if $value.templateOverrides.fileMounts }}
{{- range $value.templateOverrides.fileMounts }}
- name: filemounts
mountPath: "{{ .folderPath }}/{{ .name }}"
subPath: "{{ .name }}"
readOnly: true
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.fileMounts }}
{{- range $.Values.agentk8sglue.basePodTemplate.fileMounts }}
- name: filemounts
mountPath: "{{ .folderPath }}/{{ .name }}"
subPath: "{{ .name }}"
readOnly: true
{{- end }}
{{- end }}
env:
- name: CLEARML_API_HOST
value: {{ $.Values.agentk8sglue.apiServerUrlReference }}
- name: CLEARML_WEB_HOST
value: {{ $.Values.agentk8sglue.webServerUrlReference }}
- name: CLEARML_FILES_HOST
value: {{ $.Values.agentk8sglue.fileServerUrlReference }}
{{- if not $.Values.enterpriseFeatures.useOwnerToken }}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
{{- if $.Values.clearml.existingAgentk8sglueSecret }}
name: {{ $.Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearmlAgent.name" $ }}-ac
{{- end }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
{{- if $.Values.clearml.existingAgentk8sglueSecret }}
name: {{ $.Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearmlAgent.name" $ }}-ac
{{- end }}
key: agentk8sglue_secret
{{- end }}
- name: PYTHONUNBUFFERED
value: "x"
{{- if not $.Values.agentk8sglue.clearmlcheckCertificate }}
- name: CLEARML_API_HOST_VERIFY_CERT
value: "false"
{{- end }}
{{- if $value.templateOverrides.env }}
{{- toYaml $value.templateOverrides.env | nindent 12 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.env }}
{{- toYaml $.Values.agentk8sglue.basePodTemplate.env | nindent 12 }}
{{- end }}
{{- if $value.templateOverrides.nodeSelector }}
{{- with $value.templateOverrides.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 12 }}
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.nodeSelector }}
{{- with $.Values.agentk8sglue.basePodTemplate.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
{{- if $value.templateOverrides.tolerations }}
{{- with $value.templateOverrides.tolerations }}
tolerations:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.tolerations }}
{{- with $.Values.agentk8sglue.basePodTemplate.tolerations }}
tolerations:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
{{- if $value.templateOverrides.affinity }}
{{- with $value.templateOverrides.affinity }}
affinity:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.affinity }}
{{- with $.Values.agentk8sglue.basePodTemplate.affinity }}
affinity:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
{{- if .Values.agentk8sglue.taskAsJob }}
{{ include "taskContainer.jobTemplate" . | nindent 4}}
{{- else }}
{{ include "taskContainer.podTemplate" . | nindent 4}}
{{- end }}
secrets.yaml: |
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}

View File

@@ -14,7 +14,7 @@ spec:
template:
metadata:
annotations:
checksum/config: {{ printf "%s%s" .Values.clearml .Values.agentk8sglue | sha256sum }}
checksum/config: {{ printf "%s" .Values | sha256sum }}
{{- include "clearmlAgent.annotations" . | nindent 8 }}
labels:
{{- include "clearmlAgent.labels" . | nindent 8 }}
@@ -158,6 +158,13 @@ spec:
value: "interactive"
{{- end }}
{{- end }}
{{- if .Values.agentk8sglue.taskAsJob }}
- name: "CLEARML_K8S_GLUE_KIND"
value: "job"
{{- else }}
- name: "CLEARML_K8S_GLUE_KIND"
value: "pod"
{{- end }}
{{- if .Values.enterpriseFeatures.enabled }}
- name: K8S_GLUE_QUEUE
value: {{ include "agentk8sglue.queues" . | quote }}

View File

@@ -24,6 +24,14 @@ rules:
resources:
- namespaces
verbs: ["list"]
{{- if .Values.agentk8sglue.taskAsJob }}
- apiGroups:
- batch
- extensions
resources:
- jobs
verbs: ["get", "list", "watch", "create", "patch", "delete"]
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
@@ -56,6 +64,14 @@ rules:
resources:
- namespaces
verbs: ["list"]
{{- if .Values.agentk8sglue.taskAsJob }}
- apiGroups:
- batch
- extensions
resources:
- jobs
verbs: ["get", "list", "watch", "create", "patch", "delete"]
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding

View File

@@ -61,6 +61,8 @@ agentk8sglue:
defaultContainerImage: ubuntu:18.04
# -- ClearML queue this agent will consume
queue: default
# -- ClearML spawn tasks as jobs instead of pods
taskAsJob: false
# -- Custom Bash script for the Glue Agent
# -- labels setup for Agent pod (example in values.yaml comments)
labels: {}
@@ -181,7 +183,7 @@ agentk8sglue:
# runAsUser: 1001
# fsGroup: 1001
# -- hostAliases setup for pods spawned to consume ClearML Task (example in values.yaml comments)
hostAliases: {}
hostAliases: []
# - ip: "127.0.0.1"
# hostnames:
# - "foo.local"

View File

@@ -2,7 +2,7 @@ apiVersion: v2
name: clearml
description: MLOps platform
type: application
version: "5.6.0"
version: "5.7.0"
appVersion: "1.9.2"
kubeVersion: ">= 1.21.0-0 < 1.27.0-0"
home: https://clear.ml
@@ -33,4 +33,4 @@ dependencies:
annotations:
artifacthub.io/changes: |
- kind: added
description: multi host external elasticsearch support
description: fileserver support for emptyDir

View File

@@ -1,6 +1,6 @@
# ClearML Ecosystem for Kubernetes
![Version: 5.6.0](https://img.shields.io/badge/Version-5.6.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.9.2](https://img.shields.io/badge/AppVersion-1.9.2-informational?style=flat-square)
![Version: 5.7.0](https://img.shields.io/badge/Version-5.7.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.9.2](https://img.shields.io/badge/AppVersion-1.9.2-informational?style=flat-square)
MLOps platform
@@ -222,7 +222,7 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
| externalServices.mongodbConnectionStringBackend | string | `""` | Existing MongoDB connection string for AUTH to use if mongodb.enabled is false |
| externalServices.redisHost | string | `""` | Existing Redis Hostname to use if redis.enabled is false |
| externalServices.redisPort | int | `6379` | Existing Redis Port to use if redis.enabled is false |
| fileserver | object | `{"affinity":{},"enabled":true,"extraEnvs":[],"image":{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"},"ingress":{"annotations":{},"enabled":false,"hostName":"files.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""},"nodeSelector":{},"podAnnotations":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"securityContext":{},"service":{"nodePort":30081,"port":8081,"type":"NodePort"},"storage":{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"}},"tolerations":[]}` | File Server configurations |
| fileserver | object | `{"affinity":{},"enabled":true,"extraEnvs":[],"image":{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"},"ingress":{"annotations":{},"enabled":false,"hostName":"files.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""},"nodeSelector":{},"podAnnotations":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"securityContext":{},"service":{"nodePort":30081,"port":8081,"type":"NodePort"},"storage":{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"},"enabled":true},"tolerations":[]}` | File Server configurations |
| fileserver.affinity | object | `{}` | File Server affinity setup |
| fileserver.enabled | bool | `true` | Enable/Disable component deployment |
| fileserver.extraEnvs | list | `[]` | File Server extra envrinoment variables |
@@ -241,10 +241,11 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
| fileserver.securityContext | object | `{}` | File Server pod security context |
| fileserver.service | object | `{"nodePort":30081,"port":8081,"type":"NodePort"}` | File Server internal service configuration |
| fileserver.service.nodePort | int | `30081` | If service.type set to NodePort, this will be set to service's nodePort field. If service.type is set to others, this field will be ignored |
| fileserver.storage | object | `{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"}}` | File server persistence settings |
| fileserver.storage | object | `{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"},"enabled":true}` | File server persistence settings |
| fileserver.storage.data.accessMode | string | `"ReadWriteOnce"` | Access mode (must be ReadWriteMany if fileserver replica > 1) |
| fileserver.storage.data.class | string | `""` | Storage class (use default if empty) |
| fileserver.storage.data.existingPVC | string | `""` | If set, it uses an already existing PVC instead of dynamic provisioning |
| fileserver.storage.enabled | bool | `true` | If set to false no PVC is created and emptyDir is used |
| fileserver.tolerations | list | `[]` | File Server tolerations setup |
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Container registry configuration |
| imageCredentials.email | string | `"someone@host.com"` | Email |

View File

@@ -110,9 +110,9 @@ spec:
value: /opt/clearml/config
- name: CLEARML__apiserver__default_company_name
value: "{{ .Values.clearml.defaultCompany }}"
{{- if not (eq .Values.clearml.cookieDomain "") }}
- name: CLEARML__APISERVER__AUTH__SESSION_AUTH_COOKIE_NAME
value: {{ .Values.clearml.cookieName }}
{{- if .Values.clearml.cookieDomain }}
- name: CLEARML__APISERVER__AUTH__COOKIES__DOMAIN
value: ".{{ .Values.clearml.cookieDomain }}"
{{- end }}

View File

@@ -28,6 +28,7 @@ spec:
{{- end }}
{{- end }}
volumes:
{{- if .Values.fileserver.storage.enabled }}
{{- if .Values.fileserver.storage.data.existingPVC }}
- name: fileserver-data
persistentVolumeClaim:
@@ -37,6 +38,10 @@ spec:
persistentVolumeClaim:
claimName: {{ include "fileserver.referenceName" . }}-data
{{- end }}
{{- else }}
- name: fileserver-data
emptyDir: {}
{{- end }}
securityContext: {{ toYaml .Values.fileserver.podSecurityContext | nindent 8 }}
initContainers:
- name: init-fileserver

View File

@@ -1,4 +1,5 @@
{{- if .Values.fileserver.enabled }}
{{- if .Values.fileserver.storage.enabled }}
{{- if not .Values.fileserver.storage.data.existingPVC }}
kind: PersistentVolumeClaim
apiVersion: v1
@@ -17,3 +18,4 @@ spec:
{{- end -}}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -16,6 +16,28 @@ webserver:
ingress:
enabled: true
hostName: "app.clearml.127-0-0-1.nip.io"
redis:
master:
name: "{{ .Release.Name }}-redis"
persistence:
enabled: true
accessModes:
- ReadWriteOnce
size: 5Gi
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
storageClass: null
slave:
persistence:
enabled: true
accessModes:
- ReadWriteOnce
size: 5Gi
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
storageClass: null
cluster:
enabled: true
sentinel:
enabled: true
mongodb:
enabled: true
architecture: replicaset

View File

@@ -207,6 +207,8 @@ fileserver:
# fsGroup: 1001
# -- File server persistence settings
storage:
# -- If set to false no PVC is created and emptyDir is used
enabled: true
data:
# -- If set, it uses an already existing PVC instead of dynamic provisioning
existingPVC: ""