Compare commits

...

2 Commits

Author SHA1 Message Date
Niels ten Boom
a90b91f024 feat: expand volumemount capabilities for agent (#104)
* upgrade

* add upgrade instruction

* fix readme for agent

* Added newline at the end

* Try to fix CI

* Edited type added

* Update README.md

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-09-13 14:53:44 +02:00
Nikolay Shamanovich
19a6785a03 Allowing auth secrets to be optional #2 (#100)
* Allowing auth secrets to be optional

* Add value secret.existingSecret for clearml chart.

* Add value clearml.existingAgentk8sglueSecret for clearml-agent chart.

* Add value clearml.existingClearmlConfigSecret for clearml-agent chart.

* Split Secret clearml-agent-conf in clearml-agent chart into two
  Secrets: clearml-agent-conf (agent.conf file) and clearml-agent-k8sglue
  (environment variables).

* Update helm-docs
2022-08-22 10:35:47 +02:00
13 changed files with 151 additions and 53 deletions

View File

@@ -1,7 +1,8 @@
name: Lint and Test Charts
on:
pull_request:
pull_request_target:
types: [opened, synchronize, edited, reopened]
paths:
- 'charts/**'

View File

@@ -2,7 +2,7 @@ apiVersion: v2
name: clearml-agent
description: MLOps platform
type: application
version: "1.2.3"
version: "2.0.0"
appVersion: "1.24"
kubeVersion: ">= 1.19.0-0 < 1.25.0-0"
home: https://clear.ml

View File

@@ -1,6 +1,6 @@
# clearml-agent
# ClearML Kubernetes Agent
![Version: 1.2.3](https://img.shields.io/badge/Version-1.2.3-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
![Version: 2.0.0](https://img.shields.io/badge/Version-2.0.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
MLOps platform
@@ -12,6 +12,11 @@ MLOps platform
| ---- | ------ | --- |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
## Introduction
The **clearml-agent** is the Kubernetes agent for for [ClearML](https://github.com/allegroai/clearml).
It allows you to schedule distributed experiments on a Kubernetes cluster.
## Source Code
* <https://github.com/allegroai/clearml-helm-charts>
@@ -25,7 +30,7 @@ Kubernetes: `>= 1.19.0-0 < 1.25.0-0`
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentk8sglue | object | `{"apiServerUrlReference":"https://api.clear.ml","clearmlcheckCertificate":true,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileServerUrlReference":"https://files.clear.ml","id":"k8s-agent","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-18"},"maxPods":10,"podTemplate":{"env":[],"nodeSelector":{},"resources":{},"tolerations":[],"volumes":[]},"queue":"default","replicaCount":1,"serviceAccountName":"default","webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue | object | `{"apiServerUrlReference":"https://api.clear.ml","clearmlcheckCertificate":true,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileServerUrlReference":"https://files.clear.ml","id":"k8s-agent","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-18"},"maxPods":10,"podTemplate":{"env":[],"nodeSelector":{},"resources":{},"tolerations":[],"volumeMounts":[],"volumes":[]},"queue":"default","replicaCount":1,"serviceAccountName":"default","webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue.apiServerUrlReference | string | `"https://api.clear.ml"` | Reference to Api server url |
| agentk8sglue.clearmlcheckCertificate | bool | `true` | Check certificates validity for evefry UrlReference below. |
| agentk8sglue.defaultContainerImage | string | `"ubuntu:18.04"` | default container image for ClearML Task pod |
@@ -34,20 +39,23 @@ Kubernetes: `>= 1.19.0-0 < 1.25.0-0`
| agentk8sglue.id | string | `"k8s-agent"` | ClearML worker ID (must be unique across the entire ClearMLenvironment) |
| agentk8sglue.image | object | `{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-18"}` | Glue Agent image configuration |
| agentk8sglue.maxPods | int | `10` | maximum concurrent consume ClearML Task pod |
| agentk8sglue.podTemplate | object | `{"env":[],"nodeSelector":{},"resources":{},"tolerations":[],"volumes":[]}` | template for pods spawned to consume ClearML Task |
| agentk8sglue.podTemplate | object | `{"env":[],"nodeSelector":{},"resources":{},"tolerations":[],"volumeMounts":[],"volumes":[]}` | template for pods spawned to consume ClearML Task |
| agentk8sglue.podTemplate.env | list | `[]` | environment variables for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.nodeSelector | object | `{}` | nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.resources | object | `{}` | resources declaration for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.tolerations | list | `[]` | tolerations setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.volumeMounts | list | `[]` | volumeMounts definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.volumes | list | `[]` | volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.queue | string | `"default"` | ClearML queue this agent will consume |
| agentk8sglue.replicaCount | int | `1` | Glue Agent number of pods |
| agentk8sglue.serviceAccountName | string | `"default"` | serviceAccountName for pods spawned to consume ClearML Task |
| agentk8sglue.webServerUrlReference | string | `"https://app.clear.ml"` | Reference to Web server url |
| clearml | object | `{"agentk8sglueKey":"ACCESSKEY","agentk8sglueSecret":"SECRETKEY","clearmlConfig":"sdk {\n}"}` | ClearMl generic configurations |
| clearml | object | `{"agentk8sglueKey":"ACCESSKEY","agentk8sglueSecret":"SECRETKEY","clearmlConfig":"sdk {\n}","existingAgentk8sglueSecret":"","existingClearmlConfigSecret":""}` | ClearMl generic configurations |
| clearml.agentk8sglueKey | string | `"ACCESSKEY"` | Agent k8s Glue basic auth key |
| clearml.agentk8sglueSecret | string | `"SECRETKEY"` | Agent k8s Glue basic auth secret |
| clearml.clearmlConfig | string | `"sdk {\n}"` | ClearML configuration file |
| clearml.existingAgentk8sglueSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| clearml.existingClearmlConfigSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Private image registry configuration |
| imageCredentials.email | string | `"someone@host.com"` | Email |
| imageCredentials.enabled | bool | `false` | Use private authentication mode |
@@ -56,5 +64,26 @@ Kubernetes: `>= 1.19.0-0 < 1.25.0-0`
| imageCredentials.registry | string | `"docker.io"` | Registry name |
| imageCredentials.username | string | `"someone"` | Registry username |
----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.11.0](https://github.com/norwoodj/helm-docs/releases/v1.11.0)
# Upgrading Chart
### From v1.x to v2.x
Chart 1.x was under the assumption that all mounted volumes would be PVC's. Version > 2.x allows for more flexibility and will inject the yaml from podTemplate.volumes and podtemplate.volumeMounts directly.
v1.x
```
volumes:
- name: "yourvolume"
path: "/yourpath"
```
v2.x
```
volumes:
- name: "yourvolume"
persistentVolumeClaim:
claimName: "yourvolume"
volumeMounts:
- name: "yourvolume"
mountPath: "/yourpath"
```

View File

@@ -0,0 +1,45 @@
# ClearML Kubernetes Agent
{{ template "chart.deprecationWarning" . }}
{{ template "chart.badgesSection" . }}
{{ template "chart.description" . }}
{{ template "chart.homepageLine" . }}
{{ template "chart.maintainersSection" . }}
## Introduction
The **clearml-agent** is the Kubernetes agent for for [ClearML](https://github.com/allegroai/clearml).
It allows you to schedule distributed experiments on a Kubernetes cluster.
{{ template "chart.sourcesSection" . }}
{{ template "chart.requirementsSection" . }}
{{ template "chart.valuesSection" . }}
# Upgrading Chart
### From v1.x to v2.x
Chart 1.x was under the assumption that all mounted volumes would be PVC's. Version > 2.x allows for more flexibility and will inject the yaml from podTemplate.volumes and podtemplate.volumeMounts directly.
v1.x
```
volumes:
- name: "yourvolume"
path: "/yourpath"
```
v2.x
```
volumes:
- name: "yourvolume"
persistentVolumeClaim:
claimName: "yourvolume"
volumeMounts:
- name: "yourvolume"
mountPath: "/yourpath"
```

View File

@@ -17,21 +17,18 @@ data:
{{- end }}
{{- end }}
serviceAccountName: {{ .Values.agentk8sglue.serviceAccountName }}
{{- with .Values.agentk8sglue.podTemplate.volumes }}
volumes:
{{- range .Values.agentk8sglue.podTemplate.volumes }}
- name: {{ .name }}
persistentVolumeClaim:
claimName: {{ .name }}
{{- toYaml . | nindent 8 }}
{{- end }}
containers:
- resources:
{{- toYaml .Values.agentk8sglue.podTemplate.resources | nindent 10 }}
ports:
- containerPort: 10022
{{- with .Values.agentk8sglue.podTemplate.volumeMounts }}
volumeMounts:
{{- range .Values.agentk8sglue.podTemplate.volumes }}
- mountPath: {{ .path }}
name: {{ .name }}
{{- toYaml . | nindent 10 }}
{{- end }}
env:
- name: CLEARML_API_HOST
@@ -43,12 +40,20 @@ data:
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-conf
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
{{- end }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-conf
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
{{- end }}
key: agentk8sglue_secret
{{- if .Values.agentk8sglue.podTemplate.env }}
{{ toYaml .Values.agentk8sglue.podTemplate.env | nindent 8 }}

View File

@@ -24,26 +24,6 @@ spec:
- name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-registry-key
{{- end }}
{{- end }}
initContainers:
- name: init-k8s-glue
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.agentk8sglue.image.tag }}"
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.apiServerUrlReference}}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done;
while [[ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.fileServerUrlReference}}/" -o /dev/null) =~ 403|405 ]] ; do
echo "waiting for fileserver" ;
sleep 5 ;
done;
while [ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.webServerUrlReference}}/" -o /dev/null) -ne 200 ] ; do
echo "waiting for webserver" ;
sleep 5 ;
done
containers:
- name: k8s-glue
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.agentk8sglue.image.tag }}"
@@ -52,7 +32,7 @@ spec:
volumeMounts:
- name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pod-template
mountPath: /root/template
{{ if .Values.clearml.clearmlConfig }}
{{- if or .Values.clearml.clearmlConfig .Values.clearml.existingClearmlConfigSecret }}
- name: k8sagent-clearml-conf-volume
mountPath: /root/clearml.conf
subPath: clearml.conf
@@ -76,12 +56,20 @@ spec:
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-conf
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
{{- end }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-conf
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
{{- end }}
key: agentk8sglue_secret
- name: CLEARML_WORKER_ID
value: "{{.Values.agentk8sglue.id}}"
@@ -98,10 +86,14 @@ spec:
- name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pod-template
configMap:
name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pod-template
{{ if .Values.clearml.clearmlConfig }}
{{- if or .Values.clearml.clearmlConfig .Values.clearml.existingClearmlConfigSecret }}
- name: k8sagent-clearml-conf-volume
secret:
{{- if .Values.clearml.existingClearmlConfigSecret }}
secretName: {{ .Values.clearml.existingClearmlConfigSecret }}
{{- else }}
secretName: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-conf
{{- end }}
items:
- key: clearml.conf
path: clearml.conf

View File

@@ -1,12 +1,22 @@
{{- if not .Values.clearml.existingAgentk8sglueSecret }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
data:
agentk8sglue_key: {{ .Values.clearml.agentk8sglueKey | b64enc }}
agentk8sglue_secret: {{ .Values.clearml.agentk8sglueSecret | b64enc }}
{{- end }}
---
{{- if not .Values.clearml.existingClearmlConfigSecret }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-conf
data:
agentk8sglue_key: {{ .Values.clearml.agentk8sglueKey | b64enc }}
agentk8sglue_secret: {{ .Values.clearml.agentk8sglueSecret | b64enc }}
clearml.conf: {{ .Values.clearml.clearmlConfig | b64enc }}
---
{{- end }}
{{- if .Values.imageCredentials.enabled }}
{{- if not .Values.imageCredentials.existingSecret }}
apiVersion: v1

View File

@@ -15,10 +15,15 @@ imageCredentials:
# -- ClearMl generic configurations
clearml:
# -- If this is set, chart will not generate a secret but will use what is defined here
existingAgentk8sglueSecret: ""
# -- Agent k8s Glue basic auth key
agentk8sglueKey: "ACCESSKEY"
# -- Agent k8s Glue basic auth secret
agentk8sglueSecret: "SECRETKEY"
# -- If this is set, chart will not generate a secret but will use what is defined here
existingClearmlConfigSecret: ""
# -- ClearML configuration file
clearmlConfig: |-
sdk {
@@ -66,7 +71,12 @@ agentk8sglue:
# -- volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments)
volumes: []
# - name: "yourvolume"
# path: "/yourpath"
# persistentVolumeClaim:
# claimName: "yourvolume"
# -- volumeMounts definition for pods spawned to consume ClearML Task (example in values.yaml comments)
volumeMounts: []
# - name: "yourvolume"
# mountPath: "/yourpath"
# -- environment variables for pods spawned to consume ClearML Task (example in values.yaml comments)
env: []
# # to setup access to private repo, setup secret with git credentials:

View File

@@ -2,7 +2,7 @@ apiVersion: v2
name: clearml
description: MLOps platform
type: application
version: "4.1.3"
version: "4.2.0"
appVersion: "1.6.0"
home: https://clear.ml
icon: https://raw.githubusercontent.com/allegroai/clearml/master/docs/clearml-logo.svg

View File

@@ -1,6 +1,6 @@
# ClearML Ecosystem for Kubernetes
![Version: 4.1.3](https://img.shields.io/badge/Version-4.1.3-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.6.0](https://img.shields.io/badge/AppVersion-1.6.0-informational?style=flat-square)
![Version: 4.2.0](https://img.shields.io/badge/Version-4.2.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.6.0](https://img.shields.io/badge/AppVersion-1.6.0-informational?style=flat-square)
MLOps platform
@@ -256,6 +256,7 @@ For detailed instructions, see the [Optional Configuration](https://github.com/a
| secret.credentials.apiserver.secretKey | string | `"BxapIRo9ZINi8x25CRxz8Wdmr2pQjzuWVB4PNASZqCtTyWgWVQ"` | Set for apiserver_secret field |
| secret.credentials.tests.accessKey | string | `"ENP39EQM4SLACGD5FXB7"` | Set for tests_user_key field |
| secret.credentials.tests.secretKey | string | `"lPcm0imbcBZ8mwgO7tpadutiS3gnJD05x9j7afwXPS35IKbpiQ"` | Set for tests_user_secret field |
| secret.existingSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| secret.httpSession | string | `"9Tw20RbhJ1bLBiHEOWXvhplKGUbTgLzAtwFN2oLQvWwS0uRpD5"` | Set for http_session field |
| webserver.additionalConfigs | object | `{}` | |
| webserver.affinity | object | `{}` | |

View File

@@ -87,32 +87,32 @@ spec:
- name: CLEARML__SECURE__HTTP__SESSION_SECRET__APISERVER
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: http_session
- name: CLEARML__SECURE__AUTH__TOKEN_SECRET
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: auth_token
- name: CLEARML__SECURE__CREDENTIALS__APISERVER__USER_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: apiserver_key
- name: CLEARML__SECURE__CREDENTIALS__APISERVER__USER_SECRET
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: apiserver_secret
- name: CLEARML__SECURE__CREDENTIALS__TESTS__USER_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: tests_user_key
- name: CLEARML__SECURE__CREDENTIALS__TESTS__USER_SECRET
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: tests_user_secret
{{- if .Values.apiserver.extraEnvs }}
{{ toYaml .Values.apiserver.extraEnvs | nindent 10 }}

View File

@@ -1,3 +1,4 @@
{{- if not .Values.secret.existingSecret }}
apiVersion: v1
kind: Secret
metadata:
@@ -9,3 +10,4 @@ stringData:
auth_token: {{ .Values.secret.authToken }}
tests_user_key: {{ .Values.secret.credentials.tests.accessKey }}
tests_user_secret: {{ .Values.secret.credentials.tests.secretKey }}
{{- end }}

View File

@@ -39,6 +39,9 @@ ingress:
path: "/"
secret:
# -- If this is set, chart will not generate a secret but will use what is defined here
existingSecret: ""
# -- Set for http_session field
httpSession: "9Tw20RbhJ1bLBiHEOWXvhplKGUbTgLzAtwFN2oLQvWwS0uRpD5"
# -- Set for auth_token field