mirror of
https://github.com/clearml/clearml-helm-charts
synced 2025-04-17 01:31:13 +00:00
Compare commits
6 Commits
clearml-ag
...
clearml-ag
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
5540188db1 | ||
|
|
1f23bcf7ca | ||
|
|
3075f5e280 | ||
|
|
97550c720f | ||
|
|
a29a144119 | ||
|
|
a4f77c624d |
22
.github/workflows/inactive-issues.yaml
vendored
Normal file
22
.github/workflows/inactive-issues.yaml
vendored
Normal file
@@ -0,0 +1,22 @@
|
||||
name: Close inactive issues
|
||||
on:
|
||||
schedule:
|
||||
- cron: "30 1 * * *"
|
||||
|
||||
jobs:
|
||||
close-issues:
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
issues: write
|
||||
pull-requests: write
|
||||
steps:
|
||||
- uses: actions/stale@v7
|
||||
with:
|
||||
days-before-issue-stale: 14
|
||||
days-before-issue-close: 7
|
||||
stale-issue-label: "stale"
|
||||
stale-issue-message: "This issue is stale because it has been open for 14 days with no activity."
|
||||
close-issue-message: "This issue was closed because it has been inactive for 7 days since being marked as stale."
|
||||
days-before-pr-stale: -1
|
||||
days-before-pr-close: -1
|
||||
repo-token: ${{ secrets.GITHUB_TOKEN }}
|
||||
57
INSTALL.md
Normal file
57
INSTALL.md
Normal file
@@ -0,0 +1,57 @@
|
||||
# ClearML Helm Charts Installation guide
|
||||
|
||||
## Requirements
|
||||
|
||||
### Setup a Kubernetes Cluster
|
||||
|
||||
For setting up Kubernetes on various platforms refer to the Kubernetes [getting started guide](http://kubernetes.io/docs/getting-started-guides/).
|
||||
|
||||
#### Setup a single node LOCAL Kubernetes on laptop/desktop (development)
|
||||
|
||||
For setting up Kubernetes on your laptop/desktop we suggest [kind](https://kind.sigs.k8s.io).
|
||||
|
||||
#### [Kubernetes Tanzu users only] Additional setup requirements
|
||||
|
||||
For setting up Clear.ML on a Tanzu cluster, check [prerequisites](https://github.com/allegroai/clearml-helm-charts/tree/main/platform-specific-configs/tanzu).
|
||||
|
||||
#### [Kubernetes Openshift users only] Additional setup requirements
|
||||
|
||||
For setting up Clear.ML on a Openshift cluster, check [prerequisites](https://github.com/allegroai/clearml-helm-charts/tree/main/platform-specific-configs/openshift).
|
||||
|
||||
### Install Helm
|
||||
|
||||
Helm is a tool for managing Kubernetes charts. Charts are packages of pre-configured Kubernetes resources.
|
||||
|
||||
To install Helm, refer to the [Helm install guide](https://github.com/helm/helm#install) and ensure that the `helm` binary is in the `PATH` of your shell.
|
||||
|
||||
## Helm charts installation
|
||||
|
||||
### Helm Repo
|
||||
|
||||
```bash
|
||||
$ helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
|
||||
$ helm repo update
|
||||
```
|
||||
### ClearML server ecosystem
|
||||
|
||||
```bash
|
||||
$ helm install clearml allegroai/clearml
|
||||
```
|
||||
|
||||
### ClearML agent
|
||||
|
||||
Agent is always related a ClearML server ecosystem (by default using `app.clear.ml` public service but can be on same or another Kubernetes cluster or a single server installation).
|
||||
|
||||
On ClearML UI, Settings -> Workspace and Create new Credentials.
|
||||
|
||||
In following Helm chart install command:
|
||||
|
||||
* set ACCESSKEY to resuted credentials access_key
|
||||
* set SECRETKEY to resuted credentials secret_key
|
||||
* set APIERVERURL to resuted credentials api_server
|
||||
* set FILESSERVERURL to resuted credentials files_server
|
||||
* set WEBSERVERURL to resuted credentials web_server
|
||||
|
||||
```bash
|
||||
$ helm install clearml-agent allegroai/clearml-agent --set clearml.agentk8sglueKey=ACCESSKEY --set clearml.agentk8sglueSecret=SECRETKEY --set agentk8sglue.apiServerUrlReference=APISERVERURL --set agentk8sglue.fileServerUrlReference=FILESERVERURL --set agentk8sglue.webServerUrlReference=WEBSERVERURL
|
||||
```
|
||||
33
README.md
33
README.md
@@ -1,4 +1,4 @@
|
||||
# ClearML Helm Charts Library for Kubernetes
|
||||
# ClearML Helm Charts for Kubernetes
|
||||
|
||||
## Auto-Magical Experiment Manager & Version Control for AI
|
||||
|
||||
@@ -23,7 +23,11 @@ Use this repository to deploy **clearml-server** on Kubernetes clusters.
|
||||
|
||||
## Provided in this repository
|
||||
|
||||
### [All around Helm Chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml)
|
||||
### [ClearML server chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml)
|
||||
|
||||
### [ClearML agent chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml-agent)
|
||||
|
||||
### [ClearML serving chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml-serving)
|
||||
|
||||
## Who We Are
|
||||
|
||||
@@ -40,30 +44,9 @@ will always upgrade with you.
|
||||
|
||||
Apache License, Version 2.0, (see the [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) for more information)
|
||||
|
||||
## Requirements
|
||||
## Installation guide
|
||||
|
||||
### Setup a Kubernetes Cluster
|
||||
|
||||
For setting up Kubernetes on various platforms refer to the Kubernetes [getting started guide](http://kubernetes.io/docs/getting-started-guides/).
|
||||
|
||||
### Setup a single node LOCAL Kubernetes on laptop/desktop
|
||||
|
||||
For setting up Kubernetes on your laptop/desktop we suggest [kind](https://kind.sigs.k8s.io).
|
||||
|
||||
### Install Helm
|
||||
|
||||
Helm is a tool for managing Kubernetes charts. Charts are packages of pre-configured Kubernetes resources.
|
||||
|
||||
To install Helm, refer to the [Helm install guide](https://github.com/helm/helm#install) and ensure that the `helm` binary is in the `PATH` of your shell.
|
||||
|
||||
## Usage
|
||||
|
||||
```bash
|
||||
$ helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
|
||||
$ helm repo update
|
||||
$ helm search repo allegroai
|
||||
$ helm install <release-name> allegroai/<chart>
|
||||
```
|
||||
For installation instruction, follow related [Installation Guide](INSTALL.md).
|
||||
|
||||
## Documentation, Community & Support
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@ apiVersion: v2
|
||||
name: clearml-agent
|
||||
description: MLOps platform Task running agent
|
||||
type: application
|
||||
version: "3.3.2"
|
||||
version: "3.4.0"
|
||||
appVersion: "1.24"
|
||||
kubeVersion: ">= 1.21.0-0 < 1.27.0-0"
|
||||
home: https://clear.ml
|
||||
@@ -20,5 +20,9 @@ keywords:
|
||||
- "task agent"
|
||||
annotations:
|
||||
artifacthub.io/changes: |
|
||||
- kind: added
|
||||
description: support for parameter Job/Pod task
|
||||
- kind: fixed
|
||||
description: clearml agent internal helper variable name
|
||||
description: empty hostAliases map/array mismatch
|
||||
- kind: fixed
|
||||
description: agent deployment checksum
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# ClearML Kubernetes Agent
|
||||
|
||||
  
|
||||
  
|
||||
|
||||
MLOps platform Task running agent
|
||||
|
||||
@@ -30,16 +30,16 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
|
||||
|
||||
| Key | Type | Default | Description |
|
||||
|-----|------|---------|-------------|
|
||||
| agentk8sglue | object | `{"affinity":{},"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerCustomBashScript":"","customBashScript":"","debugMode":false,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"labels":{},"nodeSelector":{},"queue":"default","replicaCount":1,"securityContext":{},"serviceExistingAccountName":"","tolerations":[],"volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
|
||||
| agentk8sglue | object | `{"affinity":{},"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":[],"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerCustomBashScript":"","customBashScript":"","debugMode":false,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"labels":{},"nodeSelector":{},"queue":"default","replicaCount":1,"securityContext":{},"serviceExistingAccountName":"","taskAsJob":false,"tolerations":[],"volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
|
||||
| agentk8sglue.affinity | object | `{}` | affinity setup for Agent pod (example in values.yaml comments) |
|
||||
| agentk8sglue.annotations | object | `{}` | annotations setup for Agent pod (example in values.yaml comments) |
|
||||
| agentk8sglue.apiServerUrlReference | string | `"https://api.clear.ml"` | Reference to Api server url |
|
||||
| agentk8sglue.basePodTemplate | object | `{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]}` | base template for pods spawned to consume ClearML Task |
|
||||
| agentk8sglue.basePodTemplate | object | `{"affinity":{},"annotations":{},"env":[],"fileMounts":[],"hostAliases":[],"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]}` | base template for pods spawned to consume ClearML Task |
|
||||
| agentk8sglue.basePodTemplate.affinity | object | `{}` | affinity setup for pods spawned to consume ClearML Task |
|
||||
| agentk8sglue.basePodTemplate.annotations | object | `{}` | annotations setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
|
||||
| agentk8sglue.basePodTemplate.env | list | `[]` | environment variables for pods spawned to consume ClearML Task (example in values.yaml comments) |
|
||||
| agentk8sglue.basePodTemplate.fileMounts | list | `[]` | file definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
|
||||
| agentk8sglue.basePodTemplate.hostAliases | object | `{}` | hostAliases setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
|
||||
| agentk8sglue.basePodTemplate.hostAliases | list | `[]` | hostAliases setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
|
||||
| agentk8sglue.basePodTemplate.initContainers | list | `[]` | initContainers definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
|
||||
| agentk8sglue.basePodTemplate.labels | object | `{}` | labels setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
|
||||
| agentk8sglue.basePodTemplate.nodeSelector | object | `{}` | nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
|
||||
@@ -63,6 +63,7 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
|
||||
| agentk8sglue.replicaCount | int | `1` | Glue Agent number of pods |
|
||||
| agentk8sglue.securityContext | object | `{}` | Web Server pod security context |
|
||||
| agentk8sglue.serviceExistingAccountName | string | `""` | if set, don't create a serviceAccountName but use defined existing one |
|
||||
| agentk8sglue.taskAsJob | bool | `false` | ClearML spawn tasks as jobs instead of pods |
|
||||
| agentk8sglue.tolerations | list | `[]` | tolerations setup for Agent pod (example in values.yaml comments) |
|
||||
| agentk8sglue.volumeMounts | list | `[]` | volume mounts definition for Glue Agent (example in values.yaml comments) |
|
||||
| agentk8sglue.volumes | list | `[]` | volumes definition for Glue Agent (example in values.yaml comments) |
|
||||
|
||||
@@ -72,7 +72,6 @@ Create secret to access docker registry
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
|
||||
|
||||
{{/*
|
||||
Create a string composed by queue names
|
||||
*/}}
|
||||
@@ -83,3 +82,135 @@ Create a string composed by queue names
|
||||
{{- end }}
|
||||
{{- join " " $list }}
|
||||
{{- end }}
|
||||
|
||||
{{/*
|
||||
Create a task container template
|
||||
*/}}
|
||||
{{- define "taskContainer.containerTemplate" -}}
|
||||
{{- if .main.Values.imageCredentials.enabled }}
|
||||
imagePullSecrets:
|
||||
- name: {{ .main.Values.imageCredentials.existingSecret | default (printf "%s-ark" (include "clearmlAgent.name" .main )) }}
|
||||
{{- end }}
|
||||
schedulerName: {{ .value.templateOverrides.schedulerName | default (.main.Values.agentk8sglue.basePodTemplate.schedulerName) }}
|
||||
restartPolicy: Never
|
||||
securityContext:
|
||||
{{- .value.templateOverrides.securityContext | default .main.Values.agentk8sglue.basePodTemplate.securityContext | toYaml | nindent 2 }}
|
||||
hostAliases:
|
||||
{{- .value.templateOverrides.hostAliases | default .main.Values.agentk8sglue.basePodTemplate.hostAliases | toYaml | nindent 2 }}
|
||||
volumes:
|
||||
{{ $computedvolumes := (.value.templateOverrides.volumes | default .main.Values.agentk8sglue.basePodTemplate.volumes) }}
|
||||
{{- if $computedvolumes }}{{- $computedvolumes | toYaml | nindent 2 }}{{- end }}
|
||||
{{- if .value.templateOverrides.fileMounts }}
|
||||
- name: filemounts
|
||||
secret:
|
||||
secretName: {{ include "clearmlAgent.name" .main }}-{{ .key }}-fm
|
||||
{{- else if .main.Values.agentk8sglue.basePodTemplate.fileMounts }}
|
||||
- name: filemounts
|
||||
secret:
|
||||
secretName: {{ include "clearmlAgent.name" .main }}-fm
|
||||
{{- end }}
|
||||
{{- if not .main.Values.enterpriseFeatures.serviceAccountClusterAccess }}
|
||||
serviceAccountName: {{ include "clearmlAgent.serviceAccountName" .main }}
|
||||
{{- end }}
|
||||
initContainers:
|
||||
{{- .value.templateOverrides.initContainers | default .main.Values.agentk8sglue.basePodTemplate.initContainers | toYaml | nindent 2 }}
|
||||
containers:
|
||||
- resources:
|
||||
{{- .value.templateOverrides.resources | default .main.Values.agentk8sglue.basePodTemplate.resources | toYaml | nindent 4 }}
|
||||
ports:
|
||||
- containerPort: 10022
|
||||
volumeMounts:
|
||||
{{ $computedvolumemounts := (.value.templateOverrides.volumeMounts | default .main.Values.agentk8sglue.basePodTemplate.volumeMounts) }}
|
||||
{{- if $computedvolumemounts }}{{- $computedvolumemounts | toYaml | nindent 4 }}{{- end }}
|
||||
{{- if .value.templateOverrides.fileMounts }}
|
||||
{{- range .value.templateOverrides.fileMounts }}
|
||||
- name: filemounts
|
||||
mountPath: "{{ .folderPath }}/{{ .name }}"
|
||||
subPath: "{{ .name }}"
|
||||
readOnly: true
|
||||
{{- end }}
|
||||
{{- else if .main.Values.agentk8sglue.basePodTemplate.fileMounts }}
|
||||
{{- range .main.Values.agentk8sglue.basePodTemplate.fileMounts }}
|
||||
- name: filemounts
|
||||
mountPath: "{{ .folderPath }}/{{ .name }}"
|
||||
subPath: "{{ .name }}"
|
||||
readOnly: true
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
env:
|
||||
- name: CLEARML_API_HOST
|
||||
value: {{ .main.Values.agentk8sglue.apiServerUrlReference }}
|
||||
- name: CLEARML_WEB_HOST
|
||||
value: {{ .main.Values.agentk8sglue.webServerUrlReference }}
|
||||
- name: CLEARML_FILES_HOST
|
||||
value: {{ .main.Values.agentk8sglue.fileServerUrlReference }}
|
||||
{{- if not .main.Values.enterpriseFeatures.useOwnerToken }}
|
||||
- name: CLEARML_API_ACCESS_KEY
|
||||
valueFrom:
|
||||
secretKeyRef:
|
||||
name: {{ .main.Values.clearml.existingAgentk8sglueSecret | default (printf "%s-ac" (include "clearmlAgent.name" .main )) }}
|
||||
key: agentk8sglue_key
|
||||
- name: CLEARML_API_SECRET_KEY
|
||||
valueFrom:
|
||||
secretKeyRef:
|
||||
name: {{ .main.Values.clearml.existingAgentk8sglueSecret | default (printf "%s-ac" (include "clearmlAgent.name" .main )) }}
|
||||
key: agentk8sglue_secret
|
||||
{{- end }}
|
||||
- name: PYTHONUNBUFFERED
|
||||
value: "x"
|
||||
{{- if not .main.Values.agentk8sglue.clearmlcheckCertificate }}
|
||||
- name: CLEARML_API_HOST_VERIFY_CERT
|
||||
value: "false"
|
||||
{{- end }}
|
||||
{{ $computedenvs := (.value.templateOverrides.env| default .main.Values.agentk8sglue.basePodTemplate.env) }}
|
||||
{{- if $computedenvs }}{{- $computedenvs | toYaml | nindent 4 }}{{- end }}
|
||||
nodeSelector:
|
||||
{{ .value.templateOverrides.nodeSelector | default .main.Values.agentk8sglue.basePodTemplate.nodeSelector | toYaml | nindent 2 }}
|
||||
tolerations:
|
||||
{{ .value.templateOverrides.tolerations | default .main.Values.agentk8sglue.basePodTemplate.tolerations | toYaml | nindent 2 }}
|
||||
affinity:
|
||||
{{ .value.templateOverrides.affinity | default .main.Values.agentk8sglue.basePodTemplate.affinity | toYaml | nindent 2 }}
|
||||
{{- end }}
|
||||
|
||||
{{/*
|
||||
Create a task container template
|
||||
*/}}
|
||||
{{- define "taskContainer.podTemplate" -}}
|
||||
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
|
||||
{{ $key }}:
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
namespace: {{ $.Release.Namespace }}
|
||||
labels:
|
||||
{{ $value.templateOverrides.labels | default $.Values.agentk8sglue.basePodTemplate.labels | toYaml }}
|
||||
annotations:
|
||||
{{ $value.templateOverrides.annotations | default $.Values.agentk8sglue.basePodTemplate.annotations | toYaml }}
|
||||
spec:
|
||||
{{- $data := dict "main" $ "key" $key "value" $value -}}
|
||||
{{- include "taskContainer.containerTemplate" $data | nindent 4}}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
|
||||
{{/*
|
||||
Create a task container template
|
||||
*/}}
|
||||
{{- define "taskContainer.jobTemplate" -}}
|
||||
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
|
||||
{{ $key }}:
|
||||
apiVersion: batch/v1
|
||||
kind: Job
|
||||
metadata:
|
||||
namespace: {{ $.Release.Namespace }}
|
||||
labels:
|
||||
{{ $value.templateOverrides.labels | default $.Values.agentk8sglue.basePodTemplate.labels | toYaml }}
|
||||
annotations:
|
||||
{{ $value.templateOverrides.annotations | default $.Values.agentk8sglue.basePodTemplate.annotations | toYaml }}
|
||||
spec:
|
||||
template:
|
||||
spec:
|
||||
{{- $data := dict "main" $ "key" $key "value" $value -}}
|
||||
{{- include "taskContainer.containerTemplate" $data | nindent 8 }}
|
||||
backoffLimit: 0
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
|
||||
@@ -5,184 +5,10 @@ metadata:
|
||||
data:
|
||||
{{- if .Values.enterpriseFeatures.enabled }}
|
||||
template.yaml: |
|
||||
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
|
||||
{{ $key }}:
|
||||
apiVersion: v1
|
||||
metadata:
|
||||
namespace: {{ $.Release.Namespace }}
|
||||
{{- if $value.templateOverrides.labels }}
|
||||
labels:
|
||||
{{- toYaml $value.templateOverrides.labels | nindent 10 }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.labels }}
|
||||
labels:
|
||||
{{- toYaml $.Values.agentk8sglue.basePodTemplate.labels | nindent 10 }}
|
||||
{{- end}}
|
||||
{{- if $value.templateOverrides.annotations }}
|
||||
annotations:
|
||||
{{- toYaml $value.templateOverrides.annotations | nindent 10 }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.annotations }}
|
||||
annotations:
|
||||
{{- toYaml $.Values.agentk8sglue.basePodTemplate.annotations | nindent 10 }}
|
||||
{{- end}}
|
||||
spec:
|
||||
{{- if $.Values.imageCredentials.enabled }}
|
||||
imagePullSecrets:
|
||||
{{- if $.Values.imageCredentials.existingSecret }}
|
||||
- name: {{ $.Values.imageCredentials.existingSecret }}
|
||||
{{- else }}
|
||||
- name: {{ include "clearmlAgent.name" $ }}-ark
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
{{- if $value.templateOverrides.schedulerName }}
|
||||
schedulerName: {{ $value.templateOverrides.schedulerName }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.schedulerName }}
|
||||
schedulerName: {{ $.Values.agentk8sglue.basePodTemplate.schedulerName }}
|
||||
{{- end}}
|
||||
restartPolicy: Never
|
||||
{{- if $value.templateOverrides.securityContext }}
|
||||
securityContext:
|
||||
{{- toYaml $value.templateOverrides.securityContext | nindent 10 }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.securityContext }}
|
||||
securityContext:
|
||||
{{- toYaml $.Values.agentk8sglue.basePodTemplate.securityContext | nindent 10 }}
|
||||
{{- end}}
|
||||
{{- if $value.templateOverrides.hostAliases }}
|
||||
{{- with $value.templateOverrides.hostAliases }}
|
||||
hostAliases:
|
||||
{{- toYaml . | nindent 10 }}
|
||||
{{- end }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.hostAliases }}
|
||||
{{- with $.Values.agentk8sglue.basePodTemplate.hostAliases }}
|
||||
hostAliases:
|
||||
{{- toYaml . | nindent 10 }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
volumes:
|
||||
{{- if $value.templateOverrides.volumes }}
|
||||
{{- toYaml $value.templateOverrides.volumes | nindent 10 }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.volumes }}
|
||||
{{- toYaml $.Values.agentk8sglue.basePodTemplate.volumes | nindent 10 }}
|
||||
{{- end }}
|
||||
{{- if $value.templateOverrides.fileMounts }}
|
||||
- name: filemounts
|
||||
secret:
|
||||
secretName: {{ include "clearmlAgent.name" $ }}-{{ $key }}-fm
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.fileMounts }}
|
||||
- name: filemounts
|
||||
secret:
|
||||
secretName: {{ include "clearmlAgent.name" $ }}-fm
|
||||
{{- end }}
|
||||
{{- if not $.Values.enterpriseFeatures.serviceAccountClusterAccess }}
|
||||
serviceAccountName: {{ include "clearmlAgent.serviceAccountName" $ }}
|
||||
{{- end }}
|
||||
{{- if $value.templateOverrides.initContainers }}
|
||||
initContainers:
|
||||
{{- toYaml $value.templateOverrides.initContainers | nindent 10 }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.initContainers }}
|
||||
initContainers:
|
||||
{{- toYaml $.Values.agentk8sglue.basePodTemplate.initContainers | nindent 10 }}
|
||||
{{- end }}
|
||||
containers:
|
||||
- resources:
|
||||
{{- if $value.templateOverrides.resources }}
|
||||
{{- toYaml $value.templateOverrides.resources | nindent 12 }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.resources }}
|
||||
{{- toYaml $.Values.agentk8sglue.basePodTemplate.resources | nindent 12 }}
|
||||
{{- end}}
|
||||
ports:
|
||||
- containerPort: 10022
|
||||
volumeMounts:
|
||||
{{- if $value.templateOverrides.volumeMounts }}
|
||||
{{- toYaml $value.templateOverrides.volumeMounts | nindent 12 }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.volumeMounts }}
|
||||
{{- toYaml $.Values.agentk8sglue.basePodTemplate.volumeMounts | nindent 12 }}
|
||||
{{- end }}
|
||||
{{- if $value.templateOverrides.fileMounts }}
|
||||
{{- range $value.templateOverrides.fileMounts }}
|
||||
- name: filemounts
|
||||
mountPath: "{{ .folderPath }}/{{ .name }}"
|
||||
subPath: "{{ .name }}"
|
||||
readOnly: true
|
||||
{{- end }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.fileMounts }}
|
||||
{{- range $.Values.agentk8sglue.basePodTemplate.fileMounts }}
|
||||
- name: filemounts
|
||||
mountPath: "{{ .folderPath }}/{{ .name }}"
|
||||
subPath: "{{ .name }}"
|
||||
readOnly: true
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
env:
|
||||
- name: CLEARML_API_HOST
|
||||
value: {{ $.Values.agentk8sglue.apiServerUrlReference }}
|
||||
- name: CLEARML_WEB_HOST
|
||||
value: {{ $.Values.agentk8sglue.webServerUrlReference }}
|
||||
- name: CLEARML_FILES_HOST
|
||||
value: {{ $.Values.agentk8sglue.fileServerUrlReference }}
|
||||
{{- if not $.Values.enterpriseFeatures.useOwnerToken }}
|
||||
- name: CLEARML_API_ACCESS_KEY
|
||||
valueFrom:
|
||||
secretKeyRef:
|
||||
{{- if $.Values.clearml.existingAgentk8sglueSecret }}
|
||||
name: {{ $.Values.clearml.existingAgentk8sglueSecret }}
|
||||
{{- else }}
|
||||
name: {{ include "clearmlAgent.name" $ }}-ac
|
||||
{{- end }}
|
||||
key: agentk8sglue_key
|
||||
- name: CLEARML_API_SECRET_KEY
|
||||
valueFrom:
|
||||
secretKeyRef:
|
||||
{{- if $.Values.clearml.existingAgentk8sglueSecret }}
|
||||
name: {{ $.Values.clearml.existingAgentk8sglueSecret }}
|
||||
{{- else }}
|
||||
name: {{ include "clearmlAgent.name" $ }}-ac
|
||||
{{- end }}
|
||||
key: agentk8sglue_secret
|
||||
{{- end }}
|
||||
- name: PYTHONUNBUFFERED
|
||||
value: "x"
|
||||
{{- if not $.Values.agentk8sglue.clearmlcheckCertificate }}
|
||||
- name: CLEARML_API_HOST_VERIFY_CERT
|
||||
value: "false"
|
||||
{{- end }}
|
||||
{{- if $value.templateOverrides.env }}
|
||||
{{- toYaml $value.templateOverrides.env | nindent 12 }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.env }}
|
||||
{{- toYaml $.Values.agentk8sglue.basePodTemplate.env | nindent 12 }}
|
||||
{{- end }}
|
||||
{{- if $value.templateOverrides.nodeSelector }}
|
||||
{{- with $value.templateOverrides.nodeSelector }}
|
||||
nodeSelector:
|
||||
{{- toYaml . | nindent 12 }}
|
||||
{{- end }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.nodeSelector }}
|
||||
{{- with $.Values.agentk8sglue.basePodTemplate.nodeSelector }}
|
||||
nodeSelector:
|
||||
{{- toYaml . | nindent 10 }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
{{- if $value.templateOverrides.tolerations }}
|
||||
{{- with $value.templateOverrides.tolerations }}
|
||||
tolerations:
|
||||
{{- toYaml . | nindent 10 }}
|
||||
{{- end }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.tolerations }}
|
||||
{{- with $.Values.agentk8sglue.basePodTemplate.tolerations }}
|
||||
tolerations:
|
||||
{{- toYaml . | nindent 10 }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
{{- if $value.templateOverrides.affinity }}
|
||||
{{- with $value.templateOverrides.affinity }}
|
||||
affinity:
|
||||
{{- toYaml . | nindent 10 }}
|
||||
{{- end }}
|
||||
{{- else if $.Values.agentk8sglue.basePodTemplate.affinity }}
|
||||
{{- with $.Values.agentk8sglue.basePodTemplate.affinity }}
|
||||
affinity:
|
||||
{{- toYaml . | nindent 10 }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
{{- if .Values.agentk8sglue.taskAsJob }}
|
||||
{{ include "taskContainer.jobTemplate" . | nindent 4}}
|
||||
{{- else }}
|
||||
{{ include "taskContainer.podTemplate" . | nindent 4}}
|
||||
{{- end }}
|
||||
secrets.yaml: |
|
||||
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
|
||||
|
||||
@@ -14,7 +14,7 @@ spec:
|
||||
template:
|
||||
metadata:
|
||||
annotations:
|
||||
checksum/config: {{ printf "%s%s" .Values.clearml .Values.agentk8sglue | sha256sum }}
|
||||
checksum/config: {{ printf "%s" .Values | sha256sum }}
|
||||
{{- include "clearmlAgent.annotations" . | nindent 8 }}
|
||||
labels:
|
||||
{{- include "clearmlAgent.labels" . | nindent 8 }}
|
||||
@@ -158,6 +158,13 @@ spec:
|
||||
value: "interactive"
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
{{- if .Values.agentk8sglue.taskAsJob }}
|
||||
- name: "CLEARML_K8S_GLUE_KIND"
|
||||
value: "job"
|
||||
{{- else }}
|
||||
- name: "CLEARML_K8S_GLUE_KIND"
|
||||
value: "pod"
|
||||
{{- end }}
|
||||
{{- if .Values.enterpriseFeatures.enabled }}
|
||||
- name: K8S_GLUE_QUEUE
|
||||
value: {{ include "agentk8sglue.queues" . | quote }}
|
||||
|
||||
@@ -24,6 +24,14 @@ rules:
|
||||
resources:
|
||||
- namespaces
|
||||
verbs: ["list"]
|
||||
{{- if .Values.agentk8sglue.taskAsJob }}
|
||||
- apiGroups:
|
||||
- batch
|
||||
- extensions
|
||||
resources:
|
||||
- jobs
|
||||
verbs: ["get", "list", "watch", "create", "patch", "delete"]
|
||||
{{- end }}
|
||||
---
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: ClusterRoleBinding
|
||||
@@ -56,6 +64,14 @@ rules:
|
||||
resources:
|
||||
- namespaces
|
||||
verbs: ["list"]
|
||||
{{- if .Values.agentk8sglue.taskAsJob }}
|
||||
- apiGroups:
|
||||
- batch
|
||||
- extensions
|
||||
resources:
|
||||
- jobs
|
||||
verbs: ["get", "list", "watch", "create", "patch", "delete"]
|
||||
{{- end }}
|
||||
---
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: RoleBinding
|
||||
|
||||
@@ -61,6 +61,8 @@ agentk8sglue:
|
||||
defaultContainerImage: ubuntu:18.04
|
||||
# -- ClearML queue this agent will consume
|
||||
queue: default
|
||||
# -- ClearML spawn tasks as jobs instead of pods
|
||||
taskAsJob: false
|
||||
# -- Custom Bash script for the Glue Agent
|
||||
# -- labels setup for Agent pod (example in values.yaml comments)
|
||||
labels: {}
|
||||
@@ -181,7 +183,7 @@ agentk8sglue:
|
||||
# runAsUser: 1001
|
||||
# fsGroup: 1001
|
||||
# -- hostAliases setup for pods spawned to consume ClearML Task (example in values.yaml comments)
|
||||
hostAliases: {}
|
||||
hostAliases: []
|
||||
# - ip: "127.0.0.1"
|
||||
# hostnames:
|
||||
# - "foo.local"
|
||||
|
||||
@@ -2,7 +2,7 @@ apiVersion: v2
|
||||
name: clearml
|
||||
description: MLOps platform
|
||||
type: application
|
||||
version: "5.6.0"
|
||||
version: "5.7.0"
|
||||
appVersion: "1.9.2"
|
||||
kubeVersion: ">= 1.21.0-0 < 1.27.0-0"
|
||||
home: https://clear.ml
|
||||
@@ -33,4 +33,4 @@ dependencies:
|
||||
annotations:
|
||||
artifacthub.io/changes: |
|
||||
- kind: added
|
||||
description: multi host external elasticsearch support
|
||||
description: fileserver support for emptyDir
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# ClearML Ecosystem for Kubernetes
|
||||
|
||||
  
|
||||
  
|
||||
|
||||
MLOps platform
|
||||
|
||||
@@ -222,7 +222,7 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
|
||||
| externalServices.mongodbConnectionStringBackend | string | `""` | Existing MongoDB connection string for AUTH to use if mongodb.enabled is false |
|
||||
| externalServices.redisHost | string | `""` | Existing Redis Hostname to use if redis.enabled is false |
|
||||
| externalServices.redisPort | int | `6379` | Existing Redis Port to use if redis.enabled is false |
|
||||
| fileserver | object | `{"affinity":{},"enabled":true,"extraEnvs":[],"image":{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"},"ingress":{"annotations":{},"enabled":false,"hostName":"files.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""},"nodeSelector":{},"podAnnotations":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"securityContext":{},"service":{"nodePort":30081,"port":8081,"type":"NodePort"},"storage":{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"}},"tolerations":[]}` | File Server configurations |
|
||||
| fileserver | object | `{"affinity":{},"enabled":true,"extraEnvs":[],"image":{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"},"ingress":{"annotations":{},"enabled":false,"hostName":"files.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""},"nodeSelector":{},"podAnnotations":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"securityContext":{},"service":{"nodePort":30081,"port":8081,"type":"NodePort"},"storage":{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"},"enabled":true},"tolerations":[]}` | File Server configurations |
|
||||
| fileserver.affinity | object | `{}` | File Server affinity setup |
|
||||
| fileserver.enabled | bool | `true` | Enable/Disable component deployment |
|
||||
| fileserver.extraEnvs | list | `[]` | File Server extra envrinoment variables |
|
||||
@@ -241,10 +241,11 @@ Kubernetes: `>= 1.21.0-0 < 1.27.0-0`
|
||||
| fileserver.securityContext | object | `{}` | File Server pod security context |
|
||||
| fileserver.service | object | `{"nodePort":30081,"port":8081,"type":"NodePort"}` | File Server internal service configuration |
|
||||
| fileserver.service.nodePort | int | `30081` | If service.type set to NodePort, this will be set to service's nodePort field. If service.type is set to others, this field will be ignored |
|
||||
| fileserver.storage | object | `{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"}}` | File server persistence settings |
|
||||
| fileserver.storage | object | `{"data":{"accessMode":"ReadWriteOnce","class":"","existingPVC":"","size":"50Gi"},"enabled":true}` | File server persistence settings |
|
||||
| fileserver.storage.data.accessMode | string | `"ReadWriteOnce"` | Access mode (must be ReadWriteMany if fileserver replica > 1) |
|
||||
| fileserver.storage.data.class | string | `""` | Storage class (use default if empty) |
|
||||
| fileserver.storage.data.existingPVC | string | `""` | If set, it uses an already existing PVC instead of dynamic provisioning |
|
||||
| fileserver.storage.enabled | bool | `true` | If set to false no PVC is created and emptyDir is used |
|
||||
| fileserver.tolerations | list | `[]` | File Server tolerations setup |
|
||||
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Container registry configuration |
|
||||
| imageCredentials.email | string | `"someone@host.com"` | Email |
|
||||
|
||||
@@ -110,9 +110,9 @@ spec:
|
||||
value: /opt/clearml/config
|
||||
- name: CLEARML__apiserver__default_company_name
|
||||
value: "{{ .Values.clearml.defaultCompany }}"
|
||||
{{- if not (eq .Values.clearml.cookieDomain "") }}
|
||||
- name: CLEARML__APISERVER__AUTH__SESSION_AUTH_COOKIE_NAME
|
||||
value: {{ .Values.clearml.cookieName }}
|
||||
{{- if .Values.clearml.cookieDomain }}
|
||||
- name: CLEARML__APISERVER__AUTH__COOKIES__DOMAIN
|
||||
value: ".{{ .Values.clearml.cookieDomain }}"
|
||||
{{- end }}
|
||||
|
||||
@@ -28,6 +28,7 @@ spec:
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
volumes:
|
||||
{{- if .Values.fileserver.storage.enabled }}
|
||||
{{- if .Values.fileserver.storage.data.existingPVC }}
|
||||
- name: fileserver-data
|
||||
persistentVolumeClaim:
|
||||
@@ -37,6 +38,10 @@ spec:
|
||||
persistentVolumeClaim:
|
||||
claimName: {{ include "fileserver.referenceName" . }}-data
|
||||
{{- end }}
|
||||
{{- else }}
|
||||
- name: fileserver-data
|
||||
emptyDir: {}
|
||||
{{- end }}
|
||||
securityContext: {{ toYaml .Values.fileserver.podSecurityContext | nindent 8 }}
|
||||
initContainers:
|
||||
- name: init-fileserver
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
{{- if .Values.fileserver.enabled }}
|
||||
{{- if .Values.fileserver.storage.enabled }}
|
||||
{{- if not .Values.fileserver.storage.data.existingPVC }}
|
||||
kind: PersistentVolumeClaim
|
||||
apiVersion: v1
|
||||
@@ -17,3 +18,4 @@ spec:
|
||||
{{- end -}}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
|
||||
@@ -16,6 +16,28 @@ webserver:
|
||||
ingress:
|
||||
enabled: true
|
||||
hostName: "app.clearml.127-0-0-1.nip.io"
|
||||
redis:
|
||||
master:
|
||||
name: "{{ .Release.Name }}-redis"
|
||||
persistence:
|
||||
enabled: true
|
||||
accessModes:
|
||||
- ReadWriteOnce
|
||||
size: 5Gi
|
||||
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
|
||||
storageClass: null
|
||||
slave:
|
||||
persistence:
|
||||
enabled: true
|
||||
accessModes:
|
||||
- ReadWriteOnce
|
||||
size: 5Gi
|
||||
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
|
||||
storageClass: null
|
||||
cluster:
|
||||
enabled: true
|
||||
sentinel:
|
||||
enabled: true
|
||||
mongodb:
|
||||
enabled: true
|
||||
architecture: replicaset
|
||||
|
||||
@@ -207,6 +207,8 @@ fileserver:
|
||||
# fsGroup: 1001
|
||||
# -- File server persistence settings
|
||||
storage:
|
||||
# -- If set to false no PVC is created and emptyDir is used
|
||||
enabled: true
|
||||
data:
|
||||
# -- If set, it uses an already existing PVC instead of dynamic provisioning
|
||||
existingPVC: ""
|
||||
|
||||
Reference in New Issue
Block a user