Compare commits

50 Commits

Author SHA1 Message Date
Valeriano Manassero
1b164c2906 Add configurable default base serve url (#83)
* Added: configurable default base serving url

* Changed: chart version bump
2022-06-23 10:46:18 +02:00
Valeriano Manassero
43806b8e21 Add insecure cert check flag (#85)
* Added: clearmlcheckCertificate flag

* Changed: bump chart
2022-06-23 10:43:39 +02:00
Valeriano Manassero
80072c0654 Add editable config for k8s Agent (#84)
* Added: editable configuration

* Changed: bump up version
2022-06-23 09:52:19 +02:00
Valeriano Manassero
e22bd30764 Upgrade to app version 1.5.0 (#81)
* Changed: upgrade to 1.5.0

* Fixed: inject after ct check

* Fixed: list changed

* Fixed: typo
2022-06-23 07:49:45 +02:00
Valeriano Manassero
84a003b7bc Fix fileserver check on Agent (#82)
* Fixed: fileserver check

* Changed: version bump
2022-06-22 16:52:51 +02:00
Valeriano Manassero
1d95f0c27f Pullsecrets pod template (#80)
* Added: pullsecrets management for pod template

* Changed: version bump
2022-06-22 15:52:14 +02:00
Valeriano Manassero
562815e97a ClearML standalone agent chart (#79)
* Added: agent chart

* Changed: base image tag

* Changed: updated helm-docs

* Fixed: maintainers

* Changed: updated readme

* Fixed: http code to check

* Fixed: default values to check

* Changed: updated helm-docs

* Added: default values to be substituted by GH action

* Added: sed on the fly for testing

* Changed: updated CI images for Kind
2022-06-08 10:01:33 +02:00
Leon Rotim
e317610397 Fix linebreak formatting (#77)
* fix wrong line break delete

* version, bugfix inc

* regenerate README.md

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-06-08 08:16:52 +02:00
Luca Cerone
16f172fc1c Allowing extraEnv to be added to agentservices and agent deployments. (#76)
* Allowing extraEnv to be added to agentservices and agent deployments.

* Bumped version and generated documentation chart

* Lint fix

* Chart version update

* helm-docs update

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-06-08 08:06:20 +02:00
Valeriano Manassero
69048b5c96 Fix glue agent image (#78)
* Changed: avoid latest image

* Changed: version bump

* Fixed: pull policy

* Removed: specific ci for glue since now it's on by default

* Fixed: don't refresh dependencies

* Changed: testing chart action version update

* Fixed: action

* Changed: dependency updates required

* Fixed: lint and install

* Revert "Changed: dependency updates required"

This reverts commit 34ee22d7d0.

* Changed: use copy of dep charts because they may become unavailable

* Changed: updated readme
2022-06-02 21:20:00 +02:00
wasabipeas
9cf2868738 Clearml serving add support for triton (#74)
* added triton deployment and service, added triton block to values file, added value for CLEARML_DEFAULT_TRITON_GRPC_ADDR env variable in the serving-inference deployment

* re-generated README

* fixed yaml

* added condition to enable triton support

* changed chart version

* changed chart version

* bumped version to 0.3.0

* added conditional extraPythonPackages variable to clearml_serving_triton deployment

* added conditional extraPythonPackages to all the relevant deployments

* bumped version to 0.3.0
2022-05-24 13:12:15 +02:00
Valeriano Manassero
8098fd82df Add extra packages (#72)
* Added: extra python packages

* Changed: chart version
2022-05-23 13:06:48 +02:00
wasabipeas
4422cf433d added clearml-serving chart (#69)
* added clearml-serving chart

* fixed typo, added autogenerated README.md

* removed trailing space from values.yaml

* removed namespace definition from the values file and all the templates

* fixed typo

* re-run helm-docs
2022-05-19 07:35:30 +02:00
Valeriano Manassero
10296ac979 Update helm-docs.sh (#70) 2022-05-18 13:21:43 +02:00
Valeriano Manassero
06070a5c20 Default storageclass (#66)
* Changed: use default storageclass if not declared

* Changed: chart version
2022-05-02 18:00:46 +02:00
Niels ten Boom
5972fd8e5f fix: k8sagent indentation (#65) 2022-04-27 22:55:27 +02:00
Valeriano Manassero
7a7bd930f8 Fix glue namespace handling (#63)
* Changed: namespace handling for glue

* Changed: set glue as default agent system

* Changed: bump up version
2022-04-22 10:19:03 +02:00
Valeriano Manassero
25dfbd12d6 Changed: bump up versions (#62)
* Changed: bump up versions

* Changed: helm-docs to 1.8.1
2022-04-19 08:22:08 +02:00
Valeriano Manassero
d7c3b9d5d9 Added: upgrade procedures (#61)
* Added: upgrade procedures

* Changed: template

* Changed: updated chart version
2022-04-04 10:32:51 +02:00
Valeriano Manassero
e16060f2ad Fix empty glue configs (#59)
* Added: use empty values without breaking glue agent

* Added: release namespace

* Changed: bump up version
2022-03-30 16:33:06 +02:00
Valeriano Manassero
27a666d2ae ClearML app 1.3.0 (#57)
* Changed: ClearML app version

* Changed: chart version bump

* Added: comment on additional configs
2022-03-28 09:29:04 +02:00
Valeriano Manassero
d7bef0ff9d Add authentication example (#56)
* Added: auth enabled example in additionalConfigs

* Changed: bump up version

* Fixed: remove trailing spaces
2022-03-25 10:27:40 +01:00
Zied ANDOLSI
049e609ce0 add image pull secret + add ingress path (#55) 2022-03-16 18:04:56 +01:00
Niels ten Boom
fa3739b643 Improvements k8sagent (#54) 2022-03-01 17:48:33 +01:00
Valeriano Manassero
018348bc1d Fix image versions (#53)
* Fixed: image versions

* Changed: chart version

* Changed: readme update by helm-docs
2022-02-22 11:42:23 +01:00
Valeriano Manassero
57b85cbfce Update clearml image 1.2.0 (#52) 2022-02-17 15:33:30 +01:00
Niels ten Boom
9c15a8a348 fix: faulty service values references in k8s agent (#50)
* add k8s glue deployment

* more docs

* bump

* disabled by default

* run helm-docs

* fix service references

* fix readme

* add values file where k8sagent enabled

* empty files

* newline

* fix linter

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-01-21 16:15:09 +01:00
Niels ten Boom
cd7f22f7d8 feat: Add k8s glue agent deployment (#49) 2022-01-18 23:27:12 +01:00
Shaun Howell
078e394e24 update ingress templates to accept per-service annotation overrides (#48)
* update ingress yamls to accept annotation overrides

* bump version to 3.3.1

* update readme via helm-docs
2022-01-18 18:06:01 +01:00
Valeriano Manassero
70b07c637a Update Elasticsearch (#47)
* update elasticsearch

* update elasticsearch reference

* bump up chart version
2022-01-05 08:26:57 +01:00
Valeriano Manassero
7b8e40c626 Agent foreground mode (#46)
* use foreground to push output on console

* bump up version
2021-12-13 09:04:02 +01:00
Valeriano Manassero
d8117eeb0d add k8s 1.22.1 to ci procedure (#44)
After some tests I found that 1.22.1 doesn't have the ulimit issue, so I can include it in the CI process.
2021-12-09 11:47:13 +01:00
Valeriano Manassero
4c09ae2c92 Fix env typo (#39)
* typo fix

* bump up version
2021-12-09 11:39:04 +01:00
Valeriano Manassero
478eecd5f2 remove k8s 1.22 from ci (#43)
It looks like the 1.22 k8s image from kind has a very low ulimit that prevents Elasticsearch from installing; removing it while waiting for a fix.
2021-12-09 11:30:15 +01:00
Valeriano Manassero
43f4c44219 test one single kind cluster at time to avoid pressure fails (#42) 2021-12-09 11:00:57 +01:00
Valeriano Manassero
b83c8cd0e8 indentation fix (#41) 2021-12-09 10:32:42 +01:00
Valeriano Manassero
97f219228d update kind k8s versions (#40) 2021-12-09 10:31:17 +01:00
Valeriano Manassero
1b5b9407f6 Configurable auth cookies age (#38)
* configurable auth cookies age

* version bump up
2021-12-09 08:14:09 +01:00
Valeriano Manassero
b494a8c0cf External services (#36)
* use external services switch

* bump up version

* readme update
2021-11-26 08:11:55 +01:00
Weixiao Huang
266a1e3c41 feat: make service nodePort configurable and add some doc descriptions (#33)
* feat: make service nodePort configurable

* feat: bump version to 3.0.6

* docs: add descriptions for secret and service fields

* feat: add comments in clearml-kind.yaml of README.md

Co-authored-by: 黄维啸 <huangweixiao@megvii.com>
2021-11-08 14:23:10 +01:00
Weixiao Huang
bba5c0769f feat: make secret configurable and add secret annotations to deployment (#32) 2021-11-04 20:36:21 +01:00
Valeriano Manassero
b7f73e3bd9 Switch enabler agentservices (#31)
* switch to enable/disable agentservices

* bump up version
2021-09-21 14:16:30 +02:00
Valeriano Manassero
d3f6f3e50d Fix helper typo (#30)
* fix helper typo on api service name

* bump up version
2021-09-16 11:21:25 +02:00
Valeriano Manassero
979e73fe3d Fix ingress compat (#29)
* fix ingress compatibility with different k8s version

* bump up version
2021-09-16 10:54:25 +02:00
Valeriano Manassero
7352f35836 Helpers fix (#28)
* fix wrong service names

* bump up version
2021-09-16 09:11:58 +02:00
Valeriano Manassero
82ad17860d New ingress style (#27)
* new ingress style

* bump up version

* hostName fix

* helm-docs update
2021-09-16 08:51:07 +02:00
Valeriano Manassero
aa761dd450 Agent enable switch (#26)
* enable/disable switch

* bump up chart
2021-09-15 08:13:01 +02:00
Valeriano Manassero
7ff2f94d1a Apiserver configmap (#25)
* metadata name fix

* use toString

* use configmap for apiserver configs

* bump up version

* indentation fix

* fix trailing whitespaces
2021-09-14 15:43:10 +02:00
Valeriano Manassero
618a269c97 Fix service url generation (#21)
* service url generation functions

* use generation functions

* bump up version
2021-08-26 10:58:06 +02:00
Valeriano Manassero
3f215d2d90 Use many ingresses (#20)
* use many ingresses

* bump up version
2021-08-25 14:49:43 +02:00
217 changed files with 16306 additions and 587 deletions

View File

@@ -1,7 +1,7 @@
#!/bin/bash
CHART_DIRS="$(git diff --find-renames --name-only "$(git rev-parse --abbrev-ref HEAD)" remotes/origin/main -- 'charts' | grep '[cC]hart.yaml' | sed -e 's#/[Cc]hart.yaml##g')"
-HELM_DOCS_VERSION="1.5.0"
+HELM_DOCS_VERSION="1.10.0"
curl --silent --show-error --fail --location --output /tmp/helm-docs.tar.gz https://github.com/norwoodj/helm-docs/releases/download/v"${HELM_DOCS_VERSION}"/helm-docs_"${HELM_DOCS_VERSION}"_Linux_x86_64.tar.gz
tar -xf /tmp/helm-docs.tar.gz helm-docs
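
The `CHART_DIRS` pipeline in this script can be tried in isolation. The following is a minimal sketch, assuming a hypothetical list of changed files in place of the real `git diff --name-only` output; the sample paths are illustrative and not taken from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical sample of `git diff --name-only` output (illustrative paths only).
changed_files=$'charts/clearml/Chart.yaml\ncharts/clearml/values.yaml\ncharts/clearml-agent/Chart.yaml'

# Keep only the Chart.yaml entries, then strip the file name so that
# only the chart directory remains, as the script above does.
chart_dirs="$(printf '%s\n' "$changed_files" | grep '[cC]hart.yaml' | sed -e 's#/[Cc]hart.yaml##g')"

printf '%s\n' "$chart_dirs"
```

With this sample input, `values.yaml` is filtered out and each remaining entry is reduced to its chart directory.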

View File

@@ -21,28 +21,32 @@ jobs:
strategy:
matrix:
k8s:
- v1.20.7
- v1.21.1
- v1.22.7
- v1.23.6
- v1.24.0
steps:
- name: Checkout
uses: actions/checkout@v1
- name: Create kind ${{ matrix.k8s }} cluster
uses: helm/kind-action@v1.1.0
uses: helm/kind-action@v1.2.0
with:
version: v0.11.1
version: v0.13.0
node_image: kindest/node:${{ matrix.k8s }}
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.0.1
- name: Add bitnami repo
run: helm repo add bitnami https://charts.bitnami.com/bitnami
- name: Add elastic repo
run: helm repo add elastic https://helm.elastic.co
uses: helm/chart-testing-action@v2.2.1
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed --chart-dirs=charts --target-branch=main)
if [[ -n "$changed" ]]; then
echo "::set-output name=changed::true"
echo "::set-output name=changed_charts::\"${changed//$'\n'/,}\""
fi
- name: Inject secrets
run: |
find ./charts/*/ci/*.yaml -type f -exec sed -i "s/AGENTK8SGLUEKEY/${{ secrets.agentk8sglueKey }}/g" {} \;
find ./charts/*/ci/*.yaml -type f -exec sed -i "s/AGENTK8SGLUESECRET/${{ secrets.agentk8sglueSecret }}/g" {} \;
if: steps.list-changed.outputs.changed == 'true'
- name: Run chart-testing (lint and install)
run: ct lint-and-install --chart-dirs=charts --target-branch=main --helm-extra-args="--timeout=15m" --debug=true
run: ct lint-and-install --chart-dirs=charts --target-branch=main --helm-extra-args="--timeout=15m" --charts=${{steps.list-changed.outputs.changed_charts}} --debug=true
if: steps.list-changed.outputs.changed == 'true'
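
The `list-changed` step passes the changed charts to `ct lint-and-install` as a comma-separated list. A minimal sketch of that conversion, assuming a hypothetical two-line result from `ct list-changed` (the chart paths are illustrative):

```bash
#!/usr/bin/env bash
# Hypothetical output of `ct list-changed`: one chart directory per line.
changed=$'charts/clearml\ncharts/clearml-agent'

# Replace every newline with a comma using bash pattern substitution,
# the same expansion the workflow step uses.
changed_charts="${changed//$'\n'/,}"

echo "$changed_charts"
```

The double slash in `${changed//.../,}` replaces all occurrences rather than only the first, which matters once more than two charts change.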

View File

@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

View File

@@ -0,0 +1,19 @@
apiVersion: v2
name: clearml-agent
description: MLOps platform
type: application
version: "1.1.1"
appVersion: "1.24"
kubeVersion: ">= 1.19.0-0 < 1.25.0-0"
home: https://clear.ml
icon: https://raw.githubusercontent.com/allegroai/clearml/master/docs/clearml-logo.svg
sources:
- https://github.com/allegroai/clearml-helm-charts
- https://github.com/allegroai/clearml
maintainers:
- name: valeriano-manassero
url: https://github.com/valeriano-manassero
keywords:
- clearml
- "machine learning"
- mlops

View File

@@ -0,0 +1,59 @@
# clearml-agent
![Version: 1.1.1](https://img.shields.io/badge/Version-1.1.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
MLOps platform
**Homepage:** <https://clear.ml>
## Maintainers
| Name | Email | Url |
| ---- | ------ | --- |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
## Source Code
* <https://github.com/allegroai/clearml-helm-charts>
* <https://github.com/allegroai/clearml>
## Requirements
Kubernetes: `>= 1.19.0-0 < 1.25.0-0`
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentk8sglue | object | `{"apiServerUrlReference":"https://api.clear.ml","clearmlcheckCertificate":true,"defaultContainerImage":"ubuntu:18.04","fileServerUrlReference":"https://files.clear.ml","id":"k8s-agent","image":{"repository":"allegroai/clearml-agent-k8s","tag":"base-1.21"},"maxPods":10,"podTemplate":{"env":[],"nodeSelector":{},"resources":{},"tolerations":[],"volumes":[]},"queue":"default","replicaCount":1,"serviceAccountName":"default","webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue.apiServerUrlReference | string | `"https://api.clear.ml"` | Reference to Api server url |
| agentk8sglue.clearmlcheckCertificate | bool | `true` | Check certificates validity for every UrlReference below. |
| agentk8sglue.defaultContainerImage | string | `"ubuntu:18.04"` | default container image for ClearML Task pod |
| agentk8sglue.fileServerUrlReference | string | `"https://files.clear.ml"` | Reference to File server url |
| agentk8sglue.id | string | `"k8s-agent"` | ClearML worker ID (must be unique across the entire ClearML environment) |
| agentk8sglue.image | object | `{"repository":"allegroai/clearml-agent-k8s","tag":"base-1.21"}` | Glue Agent image configuration |
| agentk8sglue.maxPods | int | `10` | maximum number of concurrent pods consuming ClearML Tasks |
| agentk8sglue.podTemplate | object | `{"env":[],"nodeSelector":{},"resources":{},"tolerations":[],"volumes":[]}` | template for pods spawned to consume ClearML Task |
| agentk8sglue.podTemplate.env | list | `[]` | environment variables for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.nodeSelector | object | `{}` | nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.resources | object | `{}` | resources declaration for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.tolerations | list | `[]` | tolerations setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.volumes | list | `[]` | volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.queue | string | `"default"` | ClearML queue this agent will consume |
| agentk8sglue.replicaCount | int | `1` | Glue Agent number of pods |
| agentk8sglue.serviceAccountName | string | `"default"` | serviceAccountName for pods spawned to consume ClearML Task |
| agentk8sglue.webServerUrlReference | string | `"https://app.clear.ml"` | Reference to Web server url |
| clearml | object | `{"agentk8sglueKey":"ACCESSKEY","agentk8sglueSecret":"SECRETKEY","clearmlConfig":"sdk {\n}"}` | ClearML generic configurations |
| clearml.agentk8sglueKey | string | `"ACCESSKEY"` | Agent k8s Glue basic auth key |
| clearml.agentk8sglueSecret | string | `"SECRETKEY"` | Agent k8s Glue basic auth secret |
| clearml.clearmlConfig | string | `"sdk {\n}"` | ClearML configuration file |
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Private image registry configuration |
| imageCredentials.email | string | `"someone@host.com"` | Email |
| imageCredentials.enabled | bool | `false` | Use private authentication mode |
| imageCredentials.existingSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| imageCredentials.password | string | `"pwd"` | Registry password |
| imageCredentials.registry | string | `"docker.io"` | Registry name |
| imageCredentials.username | string | `"someone"` | Registry username |
----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.10.0](https://github.com/norwoodj/helm-docs/releases/v1.10.0)

View File

@@ -0,0 +1,3 @@
clearml:
agentk8sglueKey: "AGENTK8SGLUEKEY"
agentk8sglueSecret: "AGENTK8SGLUESECRET"

View File

@@ -0,0 +1 @@
Glue Agent deployed.

View File

@@ -0,0 +1,86 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "clearml.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "clearml.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "clearml.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "clearml.labels" -}}
helm.sh/chart: {{ include "clearml.chart" . }}
{{ include "clearml.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "clearml.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Reference Name (agentk8sglue)
*/}}
{{- define "agentk8sglue.referenceName" -}}
{{- include "clearml.name" . }}-agentk8sglue
{{- end }}
{{/*
Selector labels (agentk8sglue)
*/}}
{{- define "agentk8sglue.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ include "agentk8sglue.referenceName" . }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "clearml.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "clearml.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
{{/*
Create secret to access docker registry
*/}}
{{- define "imagePullSecret" }}
{{- with .Values.imageCredentials }}
{{- printf "{\"auths\":{\"%s\":{\"username\":\"%s\",\"password\":\"%s\",\"email\":\"%s\",\"auth\":\"%s\"}}}" .registry .username .password .email (printf "%s:%s" .username .password | b64enc) | b64enc }}
{{- end }}
{{- end }}
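
The `imagePullSecret` helper encodes twice: `username:password` is base64-encoded into the `auth` field, and the resulting JSON document is base64-encoded again for the Secret's `data` section. The following is a rough shell equivalent, offered as a sketch only; the function name `make_dockerconfigjson` is hypothetical, and it assumes the inputs contain no characters that need JSON escaping.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the template logic; not part of the chart.
# Assumes the arguments contain no quotes or backslashes (no JSON escaping is done).
make_dockerconfigjson() {
  local registry="$1" username="$2" password="$3" email="$4"
  local auth
  # Inner encoding: "username:password" becomes the "auth" field.
  auth="$(printf '%s:%s' "$username" "$password" | base64 | tr -d '\n')"
  # Outer encoding: the whole JSON document, with line wraps removed.
  printf '{"auths":{"%s":{"username":"%s","password":"%s","email":"%s","auth":"%s"}}}' \
    "$registry" "$username" "$password" "$email" "$auth" | base64 | tr -d '\n'
}

# Example with the chart's placeholder defaults:
make_dockerconfigjson docker.io someone pwd someone@host.com
echo
```

Decoding the output once should give back the JSON document, which is the form Kubernetes expects under `.dockerconfigjson`.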

View File

@@ -0,0 +1,63 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: k8sagent-pod-template
data:
template.yaml: |
apiVersion: v1
metadata:
namespace: {{ .Release.Namespace }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-agent-registry-key
{{- end }}
{{- end }}
serviceAccountName: {{ .Values.agentk8sglue.serviceAccountName }}
volumes:
{{- range .Values.agentk8sglue.podTemplate.volumes }}
- name: {{ .name }}
persistentVolumeClaim:
claimName: {{ .name }}
{{- end }}
containers:
- resources:
{{- toYaml .Values.agentk8sglue.podTemplate.resources | nindent 10 }}
ports:
- containerPort: 10022
volumeMounts:
{{- range .Values.agentk8sglue.podTemplate.volumes }}
- mountPath: {{ .path }}
name: {{ .name }}
{{- end }}
env:
- name: CLEARML_API_HOST
value: {{.Values.agentk8sglue.apiServerUrlReference}}
- name: CLEARML_WEB_HOST
value: {{.Values.agentk8sglue.webServerUrlReference}}
- name: CLEARML_FILES_HOST
value: {{.Values.agentk8sglue.fileServerUrlReference}}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: clearml-agent-conf
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: clearml-agent-conf
key: agentk8sglue_secret
{{- if .Values.agentk8sglue.podTemplate.env }}
{{ toYaml .Values.agentk8sglue.podTemplate.env | nindent 8 }}
{{- end }}
{{- with .Values.agentk8sglue.podTemplate.nodeSelector}}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentk8sglue.podTemplate.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}

View File

@@ -0,0 +1,105 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "agentk8sglue.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.agentk8sglue.replicaCount }}
selector:
matchLabels:
{{- include "agentk8sglue.selectorLabels" . | nindent 6 }}
template:
metadata:
annotations:
checksum/config: {{ printf "%s" .Values.clearml | sha256sum }}
labels:
{{- include "agentk8sglue.selectorLabels" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-agent-registry-key
{{- end }}
{{- end }}
initContainers:
- name: init-k8s-glue
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.agentk8sglue.image.tag }}"
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.apiServerUrlReference}}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done;
while [[ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.fileServerUrlReference}}/" -o /dev/null) =~ 403|405 ]] ; do
echo "waiting for fileserver" ;
sleep 5 ;
done;
while [ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.webServerUrlReference}}/" -o /dev/null) -ne 200 ] ; do
echo "waiting for webserver" ;
sleep 5 ;
done
containers:
- name: k8s-glue
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.agentk8sglue.image.tag }}"
imagePullPolicy: IfNotPresent
command: ["/bin/bash", "-c", "export PATH=$PATH:$HOME/bin; source /root/.bashrc && /root/entrypoint.sh"]
volumeMounts:
- name: k8sagent-pod-template
mountPath: /root/template
{{ if .Values.clearml.clearmlConfig }}
- name: k8sagent-clearml-conf-volume
mountPath: /root/clearml.conf
subPath: clearml.conf
readOnly: true
{{- end }}
env:
- name: CLEARML_API_HOST
value: "{{.Values.agentk8sglue.apiServerUrlReference}}"
- name: CLEARML_WEB_HOST
value: "{{.Values.agentk8sglue.webServerUrlReference}}"
- name: CLEARML_FILES_HOST
value: "{{.Values.agentk8sglue.fileServerUrlReference}}"
- name: K8S_GLUE_MAX_PODS
value: "{{.Values.agentk8sglue.maxPods}}"
- name: K8S_GLUE_QUEUE
value: "{{.Values.agentk8sglue.queue}}"
- name: K8S_GLUE_EXTRA_ARGS
value: "--namespace {{ .Release.Namespace }} --template-yaml /root/template/template.yaml"
- name: K8S_DEFAULT_NAMESPACE
value: "{{ .Release.Namespace }}"
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: clearml-agent-conf
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: clearml-agent-conf
key: agentk8sglue_secret
- name: CLEARML_WORKER_ID
value: "{{.Values.agentk8sglue.id}}"
- name: CLEARML_AGENT_UPDATE_REPO
value: ""
- name: FORCE_CLEARML_AGENT_REPO
value: ""
- name: CLEARML_DOCKER_IMAGE
value: "{{.Values.agentk8sglue.defaultContainerImage}}"
volumes:
- name: k8sagent-pod-template
configMap:
name: k8sagent-pod-template
{{ if .Values.clearml.clearmlConfig }}
- name: k8sagent-clearml-conf-volume
secret:
secretName: clearml-agent-conf
items:
- key: clearml.conf
path: clearml.conf
{{ end }}
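
The init container above polls each server with `curl` and sleeps five seconds between attempts. The sketch below shows the same polling pattern as a reusable function. It is not the chart's code: the names `probe_status` and `wait_for_status` are hypothetical, and unlike the init container it gives up after a bounded number of attempts.

```bash
#!/usr/bin/env bash
# Print only the HTTP status code for a URL (curl prints 000 if the request fails).
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Poll until the URL returns the expected status code.
# Arguments: url, expected code, max attempts (default 60), delay in seconds (default 5).
# The attempt limit is an addition in this sketch; the init container loops forever.
wait_for_status() {
  local url="$1" expected="$2" attempts="${3:-60}" delay="${4:-5}"
  local i code
  for ((i = 1; i <= attempts; i++)); do
    code="$(probe_status "$url" || true)"
    if [ "$code" = "$expected" ]; then
      return 0
    fi
    echo "waiting for $url (got $code)" >&2
    sleep "$delay"
  done
  return 1
}

# Usage, assuming the default API server URL from values.yaml:
#   wait_for_status "https://api.clear.ml/debug.ping" 200
```

Keeping the probe in its own function makes it possible to substitute a stub when testing the loop without network access.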

View File

@@ -0,0 +1,23 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: k8sagent-pods-access
rules:
- apiGroups:
- ""
resources:
- pods
verbs: ["get", "list", "watch", "create", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: k8sagent-pods-access
subjects:
- kind: ServiceAccount
name: default
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: k8sagent-pods-access

View File

@@ -0,0 +1,20 @@
apiVersion: v1
kind: Secret
metadata:
name: clearml-agent-conf
data:
agentk8sglue_key: {{ .Values.clearml.agentk8sglueKey | b64enc }}
agentk8sglue_secret: {{ .Values.clearml.agentk8sglueSecret | b64enc }}
clearml.conf: {{ .Values.clearml.clearmlConfig | b64enc }}
---
{{- if .Values.imageCredentials.enabled }}
{{- if not .Values.imageCredentials.existingSecret }}
apiVersion: v1
kind: Secret
metadata:
name: clearml-agent-registry-key
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: {{ template "imagePullSecret" . }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,88 @@
# -- Private image registry configuration
imageCredentials:
# -- Use private authentication mode
enabled: false
# -- If this is set, chart will not generate a secret but will use what is defined here
existingSecret: ""
# -- Registry name
registry: docker.io
# -- Registry username
username: someone
# -- Registry password
password: pwd
# -- Email
email: someone@host.com
# -- ClearML generic configurations
clearml:
# -- Agent k8s Glue basic auth key
agentk8sglueKey: "ACCESSKEY"
# -- Agent k8s Glue basic auth secret
agentk8sglueSecret: "SECRETKEY"
# -- ClearML configuration file
clearmlConfig: |-
sdk {
}
# -- This agent will spawn queued experiments in new pods, a good use case is to combine this with
# GPU autoscaling nodes.
# https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue
agentk8sglue:
# -- Glue Agent image configuration
image:
repository: "allegroai/clearml-agent-k8s"
tag: "base-1.21"
# -- Glue Agent number of pods
replicaCount: 1
# -- Check certificates validity for every UrlReference below.
clearmlcheckCertificate: true
# -- Reference to Api server url
apiServerUrlReference: "https://api.clear.ml"
# -- Reference to File server url
fileServerUrlReference: "https://files.clear.ml"
# -- Reference to Web server url
webServerUrlReference: "https://app.clear.ml"
# -- serviceAccountName for pods spawned to consume ClearML Task
serviceAccountName: default
# -- maximum number of concurrent pods consuming ClearML Tasks
maxPods: 10
# -- default container image for ClearML Task pod
defaultContainerImage: ubuntu:18.04
# -- ClearML queue this agent will consume
queue: default
# -- ClearML worker ID (must be unique across the entire ClearML environment)
id: k8s-agent
# -- template for pods spawned to consume ClearML Task
podTemplate:
# -- volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments)
volumes: []
# - name: "yourvolume"
# path: "/yourpath"
# -- environment variables for pods spawned to consume ClearML Task (example in values.yaml comments)
env: []
# # to setup access to private repo, setup secret with git credentials:
# - name: CLEARML_AGENT_GIT_USER
# value: mygitusername
# - name: CLEARML_AGENT_GIT_PASS
# valueFrom:
# secretKeyRef:
# name: git-password
# key: git-password
# -- resources declaration for pods spawned to consume ClearML Task (example in values.yaml comments)
resources: {}
# limits:
# nvidia.com/gpu: 1
# -- tolerations setup for pods spawned to consume ClearML Task (example in values.yaml comments)
tolerations: []
# - key: "nvidia.com/gpu"
# operator: Exists
# effect: "NoSchedule"
# -- nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments)
nodeSelector: {}
# fleet: gpu-nodes
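
The commented examples in this values file can be combined into an override file for GPU workloads. The following is a sketch under stated assumptions: the function name `write_agent_overrides`, the `gpu` queue name, the `maxPods` value and the chart path in the usage comment are all illustrative, while the keys themselves come from the values file above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write an override file whose keys mirror values.yaml above.
write_agent_overrides() {
  local out="$1"
  cat > "$out" <<'EOF'
agentk8sglue:
  queue: gpu
  maxPods: 4
  podTemplate:
    resources:
      limits:
        nvidia.com/gpu: 1
    tolerations:
      - key: "nvidia.com/gpu"
        operator: Exists
        effect: "NoSchedule"
EOF
}

# Usage, assuming the chart is checked out at charts/clearml-agent:
#   write_agent_overrides ./agent-overrides.yaml
#   helm upgrade --install clearml-agent charts/clearml-agent -f ./agent-overrides.yaml
```

Only the keys being changed need to appear in the override file; Helm merges them over the chart defaults.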

View File

@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

View File

@@ -0,0 +1,16 @@
apiVersion: v2
name: clearml-serving
description: ClearML Serving Helm Chart
type: application
version: 0.4.0
appVersion: "0.9.0"
maintainers:
- name: valeriano-manassero
url: https://github.com/valeriano-manassero
- name: stefano-cherchi
url: https://github.com/stefano-cherchi
keywords:
- clearml
- "machine learning"
- mlops
- "model serving"

View File

@@ -0,0 +1,71 @@
# clearml-serving
![Version: 0.4.0](https://img.shields.io/badge/Version-0.4.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 0.9.0](https://img.shields.io/badge/AppVersion-0.9.0-informational?style=flat-square)
ClearML Serving Helm Chart
## Maintainers
| Name | Email | Url |
| ---- | ------ | --- |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
| stefano-cherchi | | <https://github.com/stefano-cherchi> |
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| alertmanager.affinity | object | `{}` | |
| alertmanager.image | string | `"prom/alertmanager:v0.23.0"` | |
| alertmanager.nodeSelector | object | `{}` | |
| alertmanager.resources | object | `{}` | |
| alertmanager.tolerations | list | `[]` | |
| clearml.apiAccessKey | string | `"ClearML API Access Key"` | |
| clearml.apiHost | string | `"http://clearml-server-apiserver:8008"` | |
| clearml.apiSecretKey | string | `"ClearML API Secret Key"` | |
| clearml.defaultBaseServeUrl | string | `"http://127.0.0.1:8080/serve"` | |
| clearml.filesHost | string | `"http://clearml-server-fileserver:8081"` | |
| clearml.servingTaskId | string | `"ClearML Serving Task ID"` | |
| clearml.webHost | string | `"http://clearml-server-webserver:80"` | |
| clearml_serving_inference.affinity | object | `{}` | |
| clearml_serving_inference.extraPythonPackages | list | `[]` | Extra Python Packages to be installed in running pods |
| clearml_serving_inference.image | string | `"allegroai/clearml-serving-inference"` | |
| clearml_serving_inference.nodeSelector | object | `{}` | |
| clearml_serving_inference.resources | object | `{}` | |
| clearml_serving_inference.tolerations | list | `[]` | |
| clearml_serving_statistics.affinity | object | `{}` | |
| clearml_serving_statistics.extraPythonPackages | list | `[]` | Extra Python Packages to be installed in running pods |
| clearml_serving_statistics.image | string | `"allegroai/clearml-serving-statistics"` | |
| clearml_serving_statistics.nodeSelector | object | `{}` | |
| clearml_serving_statistics.resources | object | `{}` | |
| clearml_serving_statistics.tolerations | list | `[]` | |
| clearml_serving_triton.affinity | object | `{}` | |
| clearml_serving_triton.enabled | bool | `true` | |
| clearml_serving_triton.extraPythonPackages | list | `[]` | Extra Python Packages to be installed in running pods |
| clearml_serving_triton.image | string | `"allegroai/clearml-serving-triton"` | |
| clearml_serving_triton.nodeSelector | object | `{}` | |
| clearml_serving_triton.resources | object | `{}` | |
| clearml_serving_triton.tolerations | list | `[]` | |
| grafana.affinity | object | `{}` | |
| grafana.image | string | `"grafana/grafana:8.4.4-ubuntu"` | |
| grafana.nodeSelector | object | `{}` | |
| grafana.resources | object | `{}` | |
| grafana.tolerations | list | `[]` | |
| kafka.affinity | object | `{}` | |
| kafka.image | string | `"bitnami/kafka:3.1.0"` | |
| kafka.nodeSelector | object | `{}` | |
| kafka.resources | object | `{}` | |
| kafka.tolerations | list | `[]` | |
| prometheus.affinity | object | `{}` | |
| prometheus.image | string | `"prom/prometheus:v2.34.0"` | |
| prometheus.nodeSelector | object | `{}` | |
| prometheus.resources | object | `{}` | |
| prometheus.tolerations | list | `[]` | |
| zookeeper.affinity | object | `{}` | |
| zookeeper.image | string | `"bitnami/zookeeper:3.7.0"` | |
| zookeeper.nodeSelector | object | `{}` | |
| zookeeper.resources | object | `{}` | |
| zookeeper.tolerations | list | `[]` | |
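
The keys listed above can be overridden with a custom values file. The following is a minimal sketch rather than an official procedure: the file name `serving-values.yaml`, the release name `clearml-serving`, and the chart reference `allegroai/clearml-serving` are assumptions, and every credential is a placeholder.

```shell
# Write a hypothetical override file; all values below are placeholders.
cat > serving-values.yaml <<'EOF'
clearml:
  apiAccessKey: "<your access key>"
  apiSecretKey: "<your secret key>"
  servingTaskId: "<your serving task id>"
clearml_serving_triton:
  enabled: false
clearml_serving_inference:
  extraPythonPackages:
    - numpy==1.22.4
EOF

# Assumed release and chart names; printed rather than executed so the
# sketch does not need a cluster.
echo "helm install clearml-serving allegroai/clearml-serving -f serving-values.yaml"
```

Keeping overrides in a file instead of repeated `--set` flags makes it easier to reuse the same values on later upgrades.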
----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.10.0](https://github.com/norwoodj/helm-docs/releases/v1.10.0)

@@ -0,0 +1,62 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "clearml-serving.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "clearml-serving.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "clearml-serving.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "clearml-serving.labels" -}}
helm.sh/chart: {{ include "clearml-serving.chart" . }}
{{ include "clearml-serving.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "clearml-serving.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml-serving.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "clearml-serving.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "clearml-serving.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
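
The naming rule in `clearml-serving.fullname` can be mirrored outside Helm to see what the helper would return for a given release. The shell function below is an illustrative re-implementation, not part of the chart: `fullname` is a hypothetical name, and the `fullnameOverride` branch is deliberately left out.

```shell
# Hypothetical helper mirroring the template logic (without fullnameOverride):
#   $1 = release name, $2 = chart name (or nameOverride)
fullname() {
  release="$1"
  name="$2"
  case "$release" in
    # Release name already contains the chart name: use it as is.
    *"$name"*) full="$release" ;;
    # Otherwise join release and chart name with a dash.
    *) full="$release-$name" ;;
  esac
  # Truncate to 63 characters, then drop a single trailing dash.
  full=$(printf '%s' "$full" | cut -c1-63)
  printf '%s\n' "${full%-}"
}

fullname "prod" "clearml-serving"
fullname "my-clearml-serving" "clearml-serving"
```

The first call joins the two names, while the second returns the release name unchanged because it already contains the chart name.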

@@ -0,0 +1,28 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    clearml.serving.service: alertmanager
  name: alertmanager
spec:
  replicas: 1
  selector:
    matchLabels:
      clearml.serving.service: alertmanager
  strategy: {}
  template:
    metadata:
      annotations: {}
      labels:
        clearml.serving.network/clearml-serving-backend: "true"
        clearml.serving.service: alertmanager
    spec:
      containers:
        - image: {{ .Values.alertmanager.image }}
          name: clearml-serving-alertmanager
          ports:
            - containerPort: 9093
          resources: {}
      restartPolicy: Always
status: {}

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    clearml.serving.service: alertmanager
  name: clearml-serving-alertmanager
spec:
  ports:
    - name: "9093"
      port: 9093
      targetPort: 9093
  selector:
    clearml.serving.service: alertmanager
status:
  loadBalancer: {}

@@ -0,0 +1,13 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: clearml-serving-backend
spec:
  ingress:
    - from:
        - podSelector:
            matchLabels:
              clearml.serving.network/clearml-serving-backend: "true"
  podSelector:
    matchLabels:
      clearml.serving.network/clearml-serving-backend: "true"

@@ -0,0 +1,63 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    clearml.serving.service: clearml-serving-inference
  name: clearml-serving-inference
spec:
  replicas: 1
  selector:
    matchLabels:
      clearml.serving.service: clearml-serving-inference
  strategy: {}
  template:
    metadata:
      annotations: {}
      labels:
        clearml.serving.network/clearml-serving-backend: "true"
        clearml.serving.service: clearml-serving-inference
    spec:
      containers:
        - env:
            - name: CLEARML_API_ACCESS_KEY
              value: "{{ .Values.clearml.apiAccessKey }}"
            - name: CLEARML_API_SECRET_KEY
              value: "{{ .Values.clearml.apiSecretKey }}"
            - name: CLEARML_API_HOST
              value: "{{ .Values.clearml.apiHost }}"
            - name: CLEARML_FILES_HOST
              value: "{{ .Values.clearml.filesHost }}"
            - name: CLEARML_WEB_HOST
              value: "{{ .Values.clearml.webHost }}"
            - name: CLEARML_DEFAULT_KAFKA_SERVE_URL
              value: clearml-serving-kafka:9092
            - name: CLEARML_SERVING_POLL_FREQ
              value: "1.0"
            - name: CLEARML_DEFAULT_BASE_SERVE_URL
              value: "{{ .Values.clearml.defaultBaseServeUrl }}"
            - name: CLEARML_DEFAULT_TRITON_GRPC_ADDR
              {{- if .Values.clearml_serving_triton.enabled }}
              value: "clearml-serving-triton:8001"
              {{- else }}
              value: ""
              {{- end }}
            - name: CLEARML_SERVING_NUM_PROCESS
              value: "2"
            - name: CLEARML_SERVING_PORT
              value: "8080"
            - name: CLEARML_SERVING_TASK_ID
              value: "{{ .Values.clearml.servingTaskId }}"
            - name: CLEARML_USE_GUNICORN
              value: "true"
            {{- if .Values.clearml_serving_inference.extraPythonPackages }}
            - name: EXTRA_PYTHON_PACKAGES
              value: '{{ join " " .Values.clearml_serving_inference.extraPythonPackages }}'
            {{- end }}
          image: "{{ .Values.clearml_serving_inference.image }}:{{ .Chart.AppVersion }}"
          name: clearml-serving-inference
          ports:
            - containerPort: 8080
          resources: {}
      restartPolicy: Always
status: {}
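
The `EXTRA_PYTHON_PACKAGES` variable above is rendered by joining the `extraPythonPackages` list with single spaces. The sketch below illustrates why that format is convenient for a shell consumer. How the container image actually reads the variable is not shown in this diff, so `install_cmd` is a hypothetical stand-in and it only prints the command it would run.

```shell
# Value as the template would render it for a two-entry list.
EXTRA_PYTHON_PACKAGES="numpy==1.22.4 pandas==1.4.2"

# Hypothetical consumer: split the string into one argument per package.
install_cmd() {
  # Unquoted expansion is intentional here: it word-splits on whitespace.
  set -- $EXTRA_PYTHON_PACKAGES
  printf 'pip install'
  for pkg in "$@"; do
    printf ' %s' "$pkg"
  done
  printf '\n'
}

install_cmd
```

Because the separator is a plain space, package specifiers that themselves contain spaces would not survive this kind of splitting.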

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    clearml.serving.service: clearml-serving-inference
  name: clearml-serving-inference
spec:
  ports:
    - name: "8080"
      port: 8080
      targetPort: 8080
  selector:
    clearml.serving.service: clearml-serving-inference
status:
  loadBalancer: {}

@@ -0,0 +1,49 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    clearml.serving.service: clearml-serving-statistics
  name: clearml-serving-statistics
spec:
  replicas: 1
  selector:
    matchLabels:
      clearml.serving.service: clearml-serving-statistics
  strategy: {}
  template:
    metadata:
      annotations: {}
      labels:
        clearml.serving.network/clearml-serving-backend: "true"
        clearml.serving.service: clearml-serving-statistics
    spec:
      containers:
        - env:
            - name: CLEARML_API_ACCESS_KEY
              value: "{{ .Values.clearml.apiAccessKey }}"
            - name: CLEARML_API_SECRET_KEY
              value: "{{ .Values.clearml.apiSecretKey }}"
            - name: CLEARML_API_HOST
              value: "{{ .Values.clearml.apiHost }}"
            - name: CLEARML_FILES_HOST
              value: "{{ .Values.clearml.filesHost }}"
            - name: CLEARML_WEB_HOST
              value: "{{ .Values.clearml.webHost }}"
            - name: CLEARML_DEFAULT_KAFKA_SERVE_URL
              value: clearml-serving-kafka:9092
            - name: CLEARML_SERVING_POLL_FREQ
              value: "1.0"
            - name: CLEARML_SERVING_TASK_ID
              value: "{{ .Values.clearml.servingTaskId }}"
            {{- if .Values.clearml_serving_statistics.extraPythonPackages }}
            - name: EXTRA_PYTHON_PACKAGES
              value: '{{ join " " .Values.clearml_serving_statistics.extraPythonPackages }}'
            {{- end }}
          image: "{{ .Values.clearml_serving_statistics.image }}:{{ .Chart.AppVersion }}"
          name: clearml-serving-statistics
          ports:
            - containerPort: 9999
          resources: {}
      restartPolicy: Always
status: {}

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    clearml.serving.service: clearml-serving-statistics
  name: clearml-serving-statistics
spec:
  ports:
    - name: "9999"
      port: 9999
      targetPort: 9999
  selector:
    clearml.serving.service: clearml-serving-statistics
status:
  loadBalancer: {}

@@ -0,0 +1,52 @@
{{ if .Values.clearml_serving_triton.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    clearml.serving.service: clearml-serving-triton
  name: clearml-serving-triton
spec:
  replicas: 1
  selector:
    matchLabels:
      clearml.serving.service: clearml-serving-triton
  strategy: {}
  template:
    metadata:
      annotations: {}
      labels:
        clearml.serving.network/clearml-serving-backend: "true"
        clearml.serving.service: clearml-serving-triton
    spec:
      containers:
        - env:
            - name: CLEARML_API_ACCESS_KEY
              value: "{{ .Values.clearml.apiAccessKey }}"
            - name: CLEARML_API_SECRET_KEY
              value: "{{ .Values.clearml.apiSecretKey }}"
            - name: CLEARML_API_HOST
              value: "{{ .Values.clearml.apiHost }}"
            - name: CLEARML_FILES_HOST
              value: "{{ .Values.clearml.filesHost }}"
            - name: CLEARML_WEB_HOST
              value: "{{ .Values.clearml.webHost }}"
            - name: CLEARML_SERVING_TASK_ID
              value: "{{ .Values.clearml.servingTaskId }}"
            - name: CLEARML_TRITON_POLL_FREQ
              value: "1.0"
            - name: CLEARML_TRITON_METRIC_FREQ
              value: "1.0"
            {{- if .Values.clearml_serving_triton.extraPythonPackages }}
            - name: EXTRA_PYTHON_PACKAGES
              value: '{{ join " " .Values.clearml_serving_triton.extraPythonPackages }}'
            {{- end }}
          image: "{{ .Values.clearml_serving_triton.image }}:{{ .Chart.AppVersion }}"
          name: clearml-serving-triton
          ports:
            - containerPort: 8001
          resources: {}
      restartPolicy: Always
status: {}
{{ end }}

@@ -0,0 +1,18 @@
{{ if .Values.clearml_serving_triton.enabled }}
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    clearml.serving.service: clearml-serving-triton
  name: clearml-serving-triton
spec:
  ports:
    - name: "8001"
      port: 8001
      targetPort: 8001
  selector:
    clearml.serving.service: clearml-serving-triton
status:
  loadBalancer: {}
{{ end }}

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Secret
metadata:
  name: grafana-config
stringData:
  datasource.yaml: |-
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        # Access mode - proxy (server in the UI) or direct (browser in the UI).
        access: proxy
        url: http://clearml-serving-prometheus:9090

@@ -0,0 +1,36 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    clearml.serving.service: grafana
  name: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      clearml.serving.service: grafana
  strategy:
    type: Recreate
  template:
    metadata:
      annotations: {}
      labels:
        clearml.serving.network/clearml-serving-backend: "true"
        clearml.serving.service: grafana
    spec:
      containers:
        - image: {{ .Values.grafana.image }}
          name: clearml-serving-grafana
          ports:
            - containerPort: 3000
          resources: {}
          volumeMounts:
            - mountPath: /etc/grafana/provisioning/datasources/
              name: grafana-conf
      restartPolicy: Always
      volumes:
        - name: grafana-conf
          secret:
            secretName: grafana-config
status: {}

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    clearml.serving.service: grafana
  name: clearml-serving-grafana
spec:
  ports:
    - name: "3000"
      port: 3000
      targetPort: 3000
  selector:
    clearml.serving.service: grafana
status:
  loadBalancer: {}

@@ -0,0 +1,41 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    clearml.serving.service: kafka
  name: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      clearml.serving.service: kafka
  strategy: {}
  template:
    metadata:
      annotations: {}
      labels:
        clearml.serving.network/clearml-serving-backend: "true"
        clearml.serving.service: kafka
    spec:
      containers:
        - env:
            - name: ALLOW_PLAINTEXT_LISTENER
              value: "yes"
            - name: KAFKA_BROKER_ID
              value: "1"
            - name: KAFKA_CFG_ADVERTISED_LISTENERS
              value: PLAINTEXT://clearml-serving-kafka:9092
            - name: KAFKA_CFG_LISTENERS
              value: PLAINTEXT://0.0.0.0:9092
            - name: KAFKA_CFG_ZOOKEEPER_CONNECT
              value: clearml-serving-zookeeper:2181
            - name: KAFKA_CREATE_TOPICS
              value: '"topic_test:1:1"'
          image: {{ .Values.kafka.image }}
          name: clearml-serving-kafka
          ports:
            - containerPort: 9092
          resources: {}
      restartPolicy: Always
status: {}

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    clearml.serving.service: kafka
  name: clearml-serving-kafka
spec:
  ports:
    - name: "9092"
      port: 9092
      targetPort: 9092
  selector:
    clearml.serving.service: kafka
status:
  loadBalancer: {}

@@ -0,0 +1,28 @@
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-config
stringData:
  prometheus.yml: |-
    global:
      scrape_interval: "15s" # By default, scrape targets every 15 seconds.
      evaluation_interval: 15s # Evaluate rules every 15 seconds.
      external_labels:
        monitor: 'clearml-serving'
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'prometheus'
        scrape_interval: 5s
        static_configs:
          - targets: ['localhost:9090']
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'clearml-inference-stats'
        scrape_interval: 5s
        static_configs:
          - targets: ['clearml-serving-statistics:9999']

@@ -0,0 +1,43 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    clearml.serving.service: prometheus
  name: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      clearml.serving.service: prometheus
  strategy:
    type: Recreate
  template:
    metadata:
      annotations: {}
      labels:
        clearml.serving.network/clearml-serving-backend: "true"
        clearml.serving.service: prometheus
    spec:
      containers:
        - args:
            - --config.file=/mnt/prometheus.yml
            - --storage.tsdb.path=/prometheus
            - --web.console.libraries=/etc/prometheus/console_libraries
            - --web.console.templates=/etc/prometheus/consoles
            - --storage.tsdb.retention.time=200h
            - --web.enable-lifecycle
          image: {{ .Values.prometheus.image }}
          name: clearml-serving-prometheus
          ports:
            - containerPort: 9090
          resources: {}
          volumeMounts:
            - mountPath: /mnt
              name: prometheus-conf
      restartPolicy: Always
      volumes:
        - name: prometheus-conf
          secret:
            secretName: prometheus-config
status: {}

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    clearml.serving.service: prometheus
  name: clearml-serving-prometheus
spec:
  ports:
    - name: "9090"
      port: 9090
      targetPort: 9090
  selector:
    clearml.serving.service: prometheus
status:
  loadBalancer: {}

@@ -0,0 +1,31 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    clearml.serving.service: zookeeper
  name: zookeeper
spec:
  replicas: 1
  selector:
    matchLabels:
      clearml.serving.service: zookeeper
  strategy: {}
  template:
    metadata:
      annotations: {}
      labels:
        clearml.serving.network/clearml-serving-backend: "true"
        clearml.serving.service: zookeeper
    spec:
      containers:
        - env:
            - name: ALLOW_ANONYMOUS_LOGIN
              value: "yes"
          image: {{ .Values.zookeeper.image }}
          name: clearml-serving-zookeeper
          ports:
            - containerPort: 2181
          resources: {}
      restartPolicy: Always
status: {}

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    clearml.serving.service: zookeeper
  name: clearml-serving-zookeeper
spec:
  ports:
    - name: "2181"
      port: 2181
      targetPort: 2181
  selector:
    clearml.serving.service: zookeeper
status:
  loadBalancer: {}

@@ -0,0 +1,79 @@
# Default values for clearml-serving.
clearml:
  apiAccessKey: "ClearML API Access Key"
  apiSecretKey: "ClearML API Secret Key"
  apiHost: http://clearml-server-apiserver:8008
  filesHost: http://clearml-server-fileserver:8081
  webHost: http://clearml-server-webserver:80
  defaultBaseServeUrl: http://127.0.0.1:8080/serve
  servingTaskId: "ClearML Serving Task ID"
zookeeper:
  image: bitnami/zookeeper:3.7.0
  nodeSelector: {}
  tolerations: []
  affinity: {}
  resources: {}
kafka:
  image: bitnami/kafka:3.1.0
  nodeSelector: {}
  tolerations: []
  affinity: {}
  resources: {}
prometheus:
  image: prom/prometheus:v2.34.0
  nodeSelector: {}
  tolerations: []
  affinity: {}
  resources: {}
grafana:
  image: grafana/grafana:8.4.4-ubuntu
  nodeSelector: {}
  tolerations: []
  affinity: {}
  resources: {}
alertmanager:
  image: prom/alertmanager:v0.23.0
  nodeSelector: {}
  tolerations: []
  affinity: {}
  resources: {}
clearml_serving_statistics:
  image: allegroai/clearml-serving-statistics
  nodeSelector: {}
  tolerations: []
  affinity: {}
  resources: {}
  # -- Extra Python Packages to be installed in running pods
  extraPythonPackages: []
  # - numpy==1.22.4
  # - pandas==1.4.2
clearml_serving_inference:
  image: allegroai/clearml-serving-inference
  nodeSelector: {}
  tolerations: []
  affinity: {}
  resources: {}
  # -- Extra Python Packages to be installed in running pods
  extraPythonPackages: []
  # - numpy==1.22.4
  # - pandas==1.4.2
clearml_serving_triton:
  enabled: true
  image: allegroai/clearml-serving-triton
  nodeSelector: {}
  tolerations: []
  affinity: {}
  resources: {}
  # -- Extra Python Packages to be installed in running pods
  extraPythonPackages: []
  # - numpy==1.22.4
  # - pandas==1.4.2

@@ -1,12 +1,12 @@
dependencies:
- name: redis
repository: https://charts.bitnami.com/bitnami
repository: file://../../dependency_charts/redis
version: 10.9.0
- name: mongodb
repository: https://charts.bitnami.com/bitnami
repository: file://../../dependency_charts/mongodb
version: 10.3.4
- name: elasticsearch
repository: https://helm.elastic.co
version: 7.10.1
digest: sha256:aefd3992b2ab085161e4cca35c6f73dd33f8d19272a9405b5ee4e8c2a0e79bba
generated: "2021-01-05T14:26:33.629164+01:00"
repository: file://../../dependency_charts/elasticsearch
version: 7.16.2
digest: sha256:149b5a49382d280b1e083f3c193d014d3d2eb7fcdf3ec1402008996960cc173a
generated: "2022-06-02T21:09:00.961174+02:00"

@@ -2,8 +2,8 @@ apiVersion: v2
name: clearml
description: MLOps platform
type: application
version: "2.2.0"
appVersion: "1.1.1"
version: "4.0.0"
appVersion: "1.5.0"
home: https://clear.ml
icon: https://raw.githubusercontent.com/allegroai/clearml/master/docs/clearml-logo.svg
sources:
@@ -18,14 +18,14 @@ keywords:
- mlops
dependencies:
- name: redis
version: "~10.9.0"
repository: "https://charts.bitnami.com/bitnami"
version: "10.9.0"
repository: "file://../../dependency_charts/redis"
condition: redis.enabled
- name: mongodb
version: "~10.3.2"
repository: "https://charts.bitnami.com/bitnami"
version: "10.3.4"
repository: "file://../../dependency_charts/mongodb"
condition: mongodb.enabled
- name: elasticsearch
version: "~7.10.1"
repository: "https://helm.elastic.co"
version: "7.16.2"
repository: "file://../../dependency_charts/elasticsearch"
condition: elasticsearch.enabled

@@ -1,6 +1,6 @@
# ClearML Ecosystem for Kubernetes
![Version: 2.2.0](https://img.shields.io/badge/Version-2.2.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.1.1](https://img.shields.io/badge/AppVersion-1.1.1-informational?style=flat-square)
![Version: 4.0.0](https://img.shields.io/badge/Version-4.0.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.5.0](https://img.shields.io/badge/AppVersion-1.5.0-informational?style=flat-square)
MLOps platform
@@ -10,7 +10,7 @@ MLOps platform
| Name | Email | Url |
| ---- | ------ | --- |
| valeriano-manassero | | https://github.com/valeriano-manassero |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
## Introduction
@@ -31,22 +31,26 @@ For development/evaluation it's possible to use [kind](https://kind.sigs.k8s.io)
After installation, the following commands will create a complete ClearML installation:
```
mkdir -pm 777 /tmp/clearml-kind
cat <<EOF > /tmp/clearml-kind.yaml
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraPortMappings:
  # API server's default nodePort is 30008. If you customize it in helm values by
  # `apiserver.service.nodePort`, `containerPort` should match it
  - containerPort: 30008
    hostPort: 30008
    listenAddress: "127.0.0.1"
    protocol: TCP
  # Web server's default nodePort is 30080. If you customize it in helm values by
  # `webserver.service.nodePort`, `containerPort` should match it
  - containerPort: 30080
    hostPort: 30080
    listenAddress: "127.0.0.1"
    protocol: TCP
  # File server's default nodePort is 30081. If you customize it in helm values by
  # `fileserver.service.nodePort`, `containerPort` should match it
  - containerPort: 30081
    hostPort: 30081
    listenAddress: "127.0.0.1"
@@ -56,8 +60,6 @@ nodes:
containerPath: /var/local-path-provisioner
EOF
kind create cluster --config /tmp/clearml-kind.yaml
helm install clearml allegroai/clearml
```
@@ -83,6 +85,24 @@ This will create 3 ingress rules:
Pointing the domain records to the IP address where the ingress controller responds completes the deployment process.
## Upgrades/ Values upgrades
Updating to the latest version of this chart takes two steps:
```
helm repo update
helm upgrade clearml allegroai/clearml
```
Values on an existing installation can be changed with:
```
helm upgrade clearml allegroai/clearml --version <CURRENT CHART VERSION> -f custom_values.yaml
```
Please note: when only values are being changed, always set the chart version explicitly to avoid an unintended chart update.
Keeping version updates and value updates as separate procedures is good practice, because it separates potential concerns.
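
To find the version to pin, the chart currently deployed can be read from `helm list`, which reports it as a `<name>-<version>` string. The snippet below is a small sketch under that assumption. `chart_version` is a hypothetical helper, not part of Helm or this chart, and the Helm calls are left as comments because they need a live release.

```shell
# Hypothetical helper: strip the "<name>-" prefix from a "<name>-<version>" string.
#   $1 = chart string as reported by Helm, $2 = chart name
chart_version() {
  chart="$1"
  name="$2"
  printf '%s\n' "${chart#"$name"-}"
}

chart_version "clearml-4.0.0" "clearml"

# With a live release, the chart string could be looked up with, for example:
#   helm list --filter '^clearml$' --output json
# and the extracted version passed to:
#   helm upgrade clearml allegroai/clearml --version <CURRENT CHART VERSION> -f custom_values.yaml
```

Stripping the known chart name, rather than cutting at the last dash, keeps versions with a pre-release suffix intact.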
## Additional Configuration for ClearML Server
You can also configure the **clearml-server** for:
@@ -101,91 +121,22 @@ For detailed instructions, see the [Optional Configuration](https://github.com/a
| Repository | Name | Version |
|------------|------|---------|
| https://charts.bitnami.com/bitnami | mongodb | ~10.3.2 |
| https://charts.bitnami.com/bitnami | redis | ~10.9.0 |
| https://helm.elastic.co | elasticsearch | ~7.10.1 |
| file://../../dependency_charts/elasticsearch | elasticsearch | 7.16.2 |
| file://../../dependency_charts/mongodb | mongodb | 10.3.4 |
| file://../../dependency_charts/redis | redis | 10.9.0 |
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentGroups.agent-group-cpu.affinity | object | `{}` | |
| agentGroups.agent-group-cpu.agentVersion | string | `""` | |
| agentGroups.agent-group-cpu.awsAccessKeyId | string | `nil` | |
| agentGroups.agent-group-cpu.awsDefaultRegion | string | `nil` | |
| agentGroups.agent-group-cpu.awsSecretAccessKey | string | `nil` | |
| agentGroups.agent-group-cpu.azureStorageAccount | string | `nil` | |
| agentGroups.agent-group-cpu.azureStorageKey | string | `nil` | |
| agentGroups.agent-group-cpu.clearmlAccessKey | string | `nil` | |
| agentGroups.agent-group-cpu.clearmlConfig | string | `"sdk {\n}"` | |
| agentGroups.agent-group-cpu.clearmlGitPassword | string | `nil` | |
| agentGroups.agent-group-cpu.clearmlGitUser | string | `nil` | |
| agentGroups.agent-group-cpu.clearmlSecretKey | string | `nil` | |
| agentGroups.agent-group-cpu.image.pullPolicy | string | `"IfNotPresent"` | |
| agentGroups.agent-group-cpu.image.repository | string | `"ubuntu"` | |
| agentGroups.agent-group-cpu.image.tag | string | `"18.04"` | |
| agentGroups.agent-group-cpu.name | string | `"agent-group-cpu"` | |
| agentGroups.agent-group-cpu.nodeSelector | object | `{}` | |
| agentGroups.agent-group-cpu.nvidiaGpusPerAgent | int | `0` | |
| agentGroups.agent-group-cpu.podAnnotations | object | `{}` | |
| agentGroups.agent-group-cpu.queues | string | `"default"` | |
| agentGroups.agent-group-cpu.replicaCount | int | `1` | |
| agentGroups.agent-group-cpu.tolerations | list | `[]` | |
| agentGroups.agent-group-cpu.updateStrategy | string | `"Recreate"` | |
| agentGroups.agent-group-gpu.affinity | object | `{}` | |
| agentGroups.agent-group-gpu.agentVersion | string | `""` | |
| agentGroups.agent-group-gpu.awsAccessKeyId | string | `nil` | |
| agentGroups.agent-group-gpu.awsDefaultRegion | string | `nil` | |
| agentGroups.agent-group-gpu.awsSecretAccessKey | string | `nil` | |
| agentGroups.agent-group-gpu.azureStorageAccount | string | `nil` | |
| agentGroups.agent-group-gpu.azureStorageKey | string | `nil` | |
| agentGroups.agent-group-gpu.clearmlAccessKey | string | `nil` | |
| agentGroups.agent-group-gpu.clearmlConfig | string | `"sdk {\n}"` | |
| agentGroups.agent-group-gpu.clearmlGitPassword | string | `nil` | |
| agentGroups.agent-group-gpu.clearmlGitUser | string | `nil` | |
| agentGroups.agent-group-gpu.clearmlSecretKey | string | `nil` | |
| agentGroups.agent-group-gpu.image.pullPolicy | string | `"IfNotPresent"` | |
| agentGroups.agent-group-gpu.image.repository | string | `"nvidia/cuda"` | |
| agentGroups.agent-group-gpu.image.tag | string | `"11.0-base-ubuntu18.04"` | |
| agentGroups.agent-group-gpu.name | string | `"agent-group-gpu"` | |
| agentGroups.agent-group-gpu.nodeSelector | object | `{}` | |
| agentGroups.agent-group-gpu.nvidiaGpusPerAgent | int | `1` | |
| agentGroups.agent-group-gpu.podAnnotations | object | `{}` | |
| agentGroups.agent-group-gpu.queues | string | `"default"` | |
| agentGroups.agent-group-gpu.replicaCount | int | `0` | |
| agentGroups.agent-group-gpu.tolerations | list | `[]` | |
| agentGroups.agent-group-gpu.updateStrategy | string | `"Recreate"` | |
| agentservices.affinity | object | `{}` | |
| agentservices.agentVersion | string | `""` | |
| agentservices.awsAccessKeyId | string | `nil` | |
| agentservices.awsDefaultRegion | string | `nil` | |
| agentservices.awsSecretAccessKey | string | `nil` | |
| agentservices.azureStorageAccount | string | `nil` | |
| agentservices.azureStorageKey | string | `nil` | |
| agentservices.clearmlFilesHost | string | `nil` | |
| agentservices.clearmlGitPassword | string | `nil` | |
| agentservices.clearmlGitUser | string | `nil` | |
| agentservices.clearmlHostIp | string | `nil` | |
| agentservices.clearmlWebHost | string | `nil` | |
| agentservices.clearmlWorkerId | string | `"clearml-services"` | |
| agentservices.extraEnvs | list | `[]` | |
| agentservices.googleCredentials | string | `nil` | |
| agentservices.image.pullPolicy | string | `"IfNotPresent"` | |
| agentservices.image.repository | string | `"allegroai/clearml-agent-services"` | |
| agentservices.image.tag | string | `"latest"` | |
| agentservices.nodeSelector | object | `{}` | |
| agentservices.podAnnotations | object | `{}` | |
| agentservices.replicaCount | int | `1` | |
| agentservices.resources | object | `{}` | |
| agentservices.storage.data.class | string | `"standard"` | |
| agentservices.storage.data.size | string | `"50Gi"` | |
| agentservices.tolerations | list | `[]` | |
| apiserver.additionalConfigs | object | `{}` | additional configurations that can be used by api server; check examples in values.yaml file |
| apiserver.affinity | object | `{}` | |
| apiserver.authCookiesMaxAge | int | `864000` | Number of seconds the authorization cookie lasts in the user's browser |
| apiserver.configDir | string | `"/opt/clearml/config"` | |
| apiserver.extraEnvs | list | `[]` | |
| apiserver.image.pullPolicy | string | `"IfNotPresent"` | |
| apiserver.image.repository | string | `"allegroai/clearml"` | |
| apiserver.image.tag | string | `"1.1.1"` | |
| apiserver.image.tag | string | `"1.5.0"` | |
| apiserver.livenessDelay | int | `60` | |
| apiserver.nodeSelector | object | `{}` | |
| apiserver.podAnnotations | object | `{}` | |
@@ -195,13 +146,11 @@ For detailed instructions, see the [Optional Configuration](https://github.com/a
| apiserver.readinessDelay | int | `60` | |
| apiserver.replicaCount | int | `1` | |
| apiserver.resources | object | `{}` | |
| apiserver.service.nodePort | int | `30008` | If service.type is set to NodePort, this value is used for the service's nodePort field; for any other service.type it is ignored |
| apiserver.service.port | int | `8008` | |
| apiserver.service.type | string | `"NodePort"` | |
| apiserver.storage.config.class | string | `"standard"` | |
| apiserver.storage.config.size | string | `"1Gi"` | |
| apiserver.storage.enableConfigVolume | bool | `false` | |
| apiserver.service.type | string | `"NodePort"` | Sets the service's spec.type field |
| apiserver.tolerations | list | `[]` | |
| clearml.defaultCompany | string | `"d1bd92a3b039400cbafc60a7a5b1e52b"` | |
| clearml | object | `{"defaultCompany":"d1bd92a3b039400cbafc60a7a5b1e52b"}` | ClearML generic configurations |
| elasticsearch.clusterHealthCheckParams | string | `"wait_for_status=yellow&timeout=1s"` | |
| elasticsearch.clusterName | string | `"clearml-elastic"` | |
| elasticsearch.enabled | bool | `true` | |
@@ -237,28 +186,51 @@ For detailed instructions, see the [Optional Configuration](https://github.com/a
| elasticsearch.roles.remote_cluster_client | string | `"true"` | |
| elasticsearch.volumeClaimTemplate.accessModes[0] | string | `"ReadWriteOnce"` | |
| elasticsearch.volumeClaimTemplate.resources.requests.storage | string | `"50Gi"` | |
| externalServices.elasticsearchHost | string | `""` | Existing ElasticSearch Hostname to use if elasticsearch.enabled is false |
| externalServices.elasticsearchPort | int | `9200` | Existing ElasticSearch Port to use if elasticsearch.enabled is false |
| externalServices.mongodbHost | string | `""` | Existing MongoDB Hostname to use if mongodb.enabled is false |
| externalServices.mongodbPort | int | `27017` | Existing MongoDB Port to use if mongodb.enabled is false |
| externalServices.redisHost | string | `""` | Existing Redis Hostname to use if redis.enabled is false |
| externalServices.redisPort | int | `6379` | Existing Redis Port to use if redis.enabled is false |
| fileserver.affinity | object | `{}` | |
| fileserver.extraEnvs | list | `[]` | |
| fileserver.image.pullPolicy | string | `"IfNotPresent"` | |
| fileserver.image.repository | string | `"allegroai/clearml"` | |
| fileserver.image.tag | string | `"1.1.1"` | |
| fileserver.image.tag | string | `"1.5.0"` | |
| fileserver.nodeSelector | object | `{}` | |
| fileserver.podAnnotations | object | `{}` | |
| fileserver.replicaCount | int | `1` | |
| fileserver.resources | object | `{}` | |
| fileserver.service.nodePort | int | `30081` | If service.type is set to NodePort, this value is used for the service's nodePort field; for any other service.type it is ignored |
| fileserver.service.port | int | `8081` | |
| fileserver.service.type | string | `"NodePort"` | |
| fileserver.storage.data.class | string | `"standard"` | |
| fileserver.service.type | string | `"NodePort"` | Sets the service's spec.type field |
| fileserver.storage.data.class | string | `""` | |
| fileserver.storage.data.size | string | `"50Gi"` | |
| fileserver.tolerations | list | `[]` | |
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Private image registry configuration |
| imageCredentials.email | string | `"someone@host.com"` | Email |
| imageCredentials.enabled | bool | `false` | Use private authentication mode |
| imageCredentials.existingSecret | string | `""` | If set, the chart does not generate a secret and uses the one named here instead |
| imageCredentials.password | string | `"pwd"` | Registry password |
| imageCredentials.registry | string | `"docker.io"` | Registry name |
| imageCredentials.username | string | `"someone"` | Registry username |
| ingress.annotations | object | `{}` | |
| ingress.enabled | bool | `false` | |
| ingress.host | string | `""` | |
| ingress.hostPrefixApi | string | `"api."` | |
| ingress.hostPrefixApp | string | `"app."` | |
| ingress.hostPrefixFiles | string | `"files."` | |
| ingress.api.annotations | object | `{}` | |
| ingress.api.enabled | bool | `false` | |
| ingress.api.hostName | string | `"api.clearml.127-0-0-1.nip.io"` | |
| ingress.api.path | string | `"/"` | |
| ingress.api.tlsSecretName | string | `""` | |
| ingress.app.annotations | object | `{}` | |
| ingress.app.enabled | bool | `false` | |
| ingress.app.hostName | string | `"app.clearml.127-0-0-1.nip.io"` | |
| ingress.app.path | string | `"/"` | |
| ingress.app.tlsSecretName | string | `""` | |
| ingress.files.annotations | object | `{}` | |
| ingress.files.enabled | bool | `false` | |
| ingress.files.hostName | string | `"files.clearml.127-0-0-1.nip.io"` | |
| ingress.files.path | string | `"/"` | |
| ingress.files.tlsSecretName | string | `""` | |
| ingress.name | string | `"clearml-server-ingress"` | |
| ingress.tls.secretName | string | `""` | |
| mongodb.architecture | string | `"standalone"` | |
| mongodb.auth.enabled | bool | `false` | |
| mongodb.enabled | bool | `true` | |
@@ -279,15 +251,23 @@ For detailed instructions, see the [Optional Configuration](https://github.com/a
| redis.master.persistence.size | string | `"5Gi"` | |
| redis.master.port | int | `6379` | |
| redis.usePassword | bool | `false` | |
| secret.authToken | string | `"1SCf0ov3Nm544Td2oZ0gXSrsNx5XhMWdVlKz1tOgcx158bD5RV"` | Set for auth_token field |
| secret.credentials.apiserver.accessKey | string | `"5442F3443MJMORWZA3ZH"` | Set for apiserver_key field |
| secret.credentials.apiserver.secretKey | string | `"BxapIRo9ZINi8x25CRxz8Wdmr2pQjzuWVB4PNASZqCtTyWgWVQ"` | Set for apiserver_secret field |
| secret.credentials.tests.accessKey | string | `"ENP39EQM4SLACGD5FXB7"` | Set for tests_user_key field |
| secret.credentials.tests.secretKey | string | `"lPcm0imbcBZ8mwgO7tpadutiS3gnJD05x9j7afwXPS35IKbpiQ"` | Set for tests_user_secret field |
| secret.httpSession | string | `"9Tw20RbhJ1bLBiHEOWXvhplKGUbTgLzAtwFN2oLQvWwS0uRpD5"` | Set for http_session field |
| webserver.additionalConfigs | object | `{}` | |
| webserver.affinity | object | `{}` | |
| webserver.extraEnvs | list | `[]` | |
| webserver.image.pullPolicy | string | `"IfNotPresent"` | |
| webserver.image.repository | string | `"allegroai/clearml"` | |
| webserver.image.tag | string | `"1.1.1"` | |
| webserver.image.tag | string | `"1.5.0"` | |
| webserver.nodeSelector | object | `{}` | |
| webserver.podAnnotations | object | `{}` | |
| webserver.replicaCount | int | `1` | |
| webserver.resources | object | `{}` | |
| webserver.service.nodePort | int | `30080` | If service.type is NodePort, this value is used for the service's nodePort field. For any other service type, this field is ignored |
| webserver.service.port | int | `80` | |
| webserver.service.type | string | `"NodePort"` | |
| webserver.service.type | string | `"NodePort"` | Sets the service's spec.type field |
| webserver.tolerations | list | `[]` | |

@@ -28,22 +28,26 @@ For development/evaluation it's possible to use [kind](https://kind.sigs.k8s.io)
After installation, the following commands will create a complete ClearML installation:
```
mkdir -pm 777 /tmp/clearml-kind
cat <<EOF > /tmp/clearml-kind.yaml
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraPortMappings:
# API server's default nodePort is 30008. If you customize it in helm values by
# `apiserver.service.nodePort`, `containerPort` should match it
- containerPort: 30008
hostPort: 30008
listenAddress: "127.0.0.1"
protocol: TCP
# Web server's default nodePort is 30080. If you customize it in helm values by
# `webserver.service.nodePort`, `containerPort` should match it
- containerPort: 30080
hostPort: 30080
listenAddress: "127.0.0.1"
protocol: TCP
# File server's default nodePort is 30081. If you customize it in helm values by
# `fileserver.service.nodePort`, `containerPort` should match it
- containerPort: 30081
hostPort: 30081
listenAddress: "127.0.0.1"
@@ -53,8 +57,6 @@ nodes:
containerPath: /var/local-path-provisioner
EOF
kind create cluster --config /tmp/clearml-kind.yaml
helm install clearml allegroai/clearml
```
@@ -80,6 +82,24 @@ This will create 3 ingress rules:
Just pointing the domain records to the IP where ingress controller is responding will complete the deployment process.
## Upgrades/ Values upgrades
Updating to latest version of this chart can be done in two steps:
```
helm repo update
helm upgrade clearml allegroai/clearml
```
Changing values on existing installation can be done with:
```
helm upgrade clearml allegroai/clearml --version <CURRENT CHART VERSION> -f custom_values.yaml
```
Please note: when updating values only, always set an explicit chart version to avoid an unintended chart upgrade.
Keeping version upgrades and values changes as separate procedures is good practice, because it separates potential concerns.
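As a minimal sketch of what the `custom_values.yaml` passed to the command above might contain, the following overrides two settings that exist in this chart's `values.yaml`. The specific values chosen here are assumptions for illustration, not recommendations:
```yaml
# custom_values.yaml: illustrative overrides only; adjust to your environment.
apiserver:
  # Assumed example: shorten the auth cookie lifetime (chart default is 864000 seconds).
  authCookiesMaxAge: 432000
  service:
    type: NodePort
    # Chart default; if you change it, the kind port mapping must match.
    nodePort: 30008
```
Only the keys you want to change need to appear in this file; everything else falls back to the chart defaults.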
## Additional Configuration for ClearML Server
You can also configure the **clearml-server** for:

Binary file not shown.

@@ -0,0 +1,7 @@
Place values files with different values in this directory to ensure these cases are tested by the CI as well.
https://github.com/helm/chart-testing/blob/main/doc/ct_install.md
```
"Charts may have multiple custom values files matching the glob pattern '*-values.yaml' in a directory named 'ci' in the root of the chart's directory. The chart is installed and tested for each of these files. If no custom values file is present, the chart is installed and tested with defaults."
```
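A minimal sketch of such a file, assuming a hypothetical name `ci/storageclass-values.yaml` (any name matching `*-values.yaml` should be picked up) and assuming the CI cluster provides a storage class called `standard`:
```yaml
# ci/storageclass-values.yaml (hypothetical file name, for illustration only)
# Exercises the code path where an explicit storage class is rendered into the PVC.
fileserver:
  storage:
    data:
      # Assumption: the test cluster offers a "standard" storage class.
      class: "standard"
      size: 50Gi
```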

@@ -0,0 +1 @@
# empty so default values.yaml gets tested

@@ -95,3 +95,62 @@ Create the name of the service account to use
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
{{/*
Create the name of the App service to use
*/}}
{{- define "clearml.serviceApp" -}}
{{- if .Values.ingress.enabled }}
{{- if .Values.ingress.app.tlsSecretName }}
{{- printf "%s%s" "https://" .Values.ingress.app.hostName }}
{{- else }}
{{- printf "%s%s" "http://" .Values.ingress.app.hostName }}
{{- end }}
{{- else }}
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-webserver:" (.Values.webserver.service.port | toString) }}
{{- end }}
{{- end }}
{{/*
Create the name of the Api service to use
*/}}
{{- define "clearml.serviceApi" -}}
{{- if .Values.ingress.enabled }}
{{- if .Values.ingress.api.tlsSecretName }}
{{- printf "%s%s" "https://" .Values.ingress.api.hostName }}
{{- else }}
{{- printf "%s%s" "http://" .Values.ingress.api.hostName }}
{{- end }}
{{- else }}
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-apiserver:" (.Values.apiserver.service.port | toString) }}
{{- end }}
{{- end }}
{{/*
Create the name of the Files service to use
*/}}
{{- define "clearml.serviceFiles" -}}
{{- if .Values.ingress.enabled }}
{{- if .Values.ingress.files.tlsSecretName }}
{{- printf "%s%s" "https://" .Values.ingress.files.hostName }}
{{- else }}
{{- printf "%s%s" "http://" .Values.ingress.files.hostName }}
{{- end }}
{{- else }}
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-fileserver:" (.Values.fileserver.service.port | toString) }}
{{- end }}
{{- end }}
{{/*
Return the proper Docker Image Registry Secret Names
*/}}
{{- define "clearml.imagePullSecrets" -}}
{{- if .Values.global }}
{{- if .Values.global.imagePullSecrets }}
imagePullSecrets:
{{- range .Values.global.imagePullSecrets }}
- name: {{ . }}
{{- end }}
{{- end -}}
{{- end -}}
{{- end -}}

@@ -0,0 +1,13 @@
{{- if .Values.apiserver.additionalConfigs -}}
apiVersion: v1
kind: ConfigMap
metadata:
name: "{{ include "clearml.fullname" . }}-apiserver-configmap"
labels:
{{- include "clearml.labels" . | nindent 4 }}
data:
{{- range $key, $val := .Values.apiserver.additionalConfigs }}
{{ $key }}: |
{{- $val | nindent 4 }}
{{- end }}
{{- end -}}

@@ -0,0 +1,13 @@
{{- if .Values.webserver.additionalConfigs -}}
apiVersion: v1
kind: ConfigMap
metadata:
name: "{{ include "clearml.fullname" . }}-webserver-configmap"
labels:
{{- include "clearml.labels" . | nindent 4 }}
data:
{{- range $key, $val := .Values.webserver.additionalConfigs }}
{{ $key }}: |
{{- $val | nindent 4 }}
{{- end }}
{{- end -}}

@@ -1,116 +0,0 @@
{{- range $key, $value := .Values.agentGroups }}
{{- with $value }}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "clearml.fullname" $ }}-{{ .name }}-agent
labels:
{{- include "clearml.labels" $ | nindent 4 }}
spec:
replicas: {{ .replicaCount }}
strategy:
type: {{ .updateStrategy }}
selector:
matchLabels:
{{- include "clearml.selectorLabelsAgent" $ | nindent 6 }}
template:
metadata:
{{- with .podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "clearml.selectorLabelsAgent" $ | nindent 8 }}
spec:
volumes:
{{ if .clearmlConfig }}
- name: agent-clearml-conf-volume
secret:
secretName: {{ .name }}-conf
items:
- key: clearml.conf
path: clearml.conf
{{ end }}
initContainers:
- name: init-agent-{{ .name }}
image: "{{ .image.repository }}:{{ .image.tag | default $.Chart.AppVersion }}"
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl -sw '%{http_code}' "http://{{ include "clearml.fullname" $ }}-apiserver:{{ $.Values.apiserver.service.port }}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done
containers:
- name: {{ $.Chart.Name }}-{{ .name }}
image: "{{ .image.repository }}:{{ .image.tag }}"
imagePullPolicy: {{ .image.pullPolicy }}
securityContext:
privileged: true
resources:
limits:
nvidia.com/gpu:
{{ .nvidiaGpusPerAgent }}
env:
- name: CLEARML_API_HOST
value: 'http://{{ include "clearml.fullname" $ }}-apiserver:{{ $.Values.apiserver.service.port }}'
- name: CLEARML_WEB_HOST
value: 'http://{{ include "clearml.fullname" $ }}-webserver:{{ $.Values.webserver.service.port }}'
- name: CLEARML_FILES_HOST
value: 'http://{{ include "clearml.fullname" $ }}-fileserver:{{ $.Values.fileserver.service.port }}'
- name: CLEARML_AGENT_GIT_USER
value: {{ .clearmlGitUser}}
- name: CLEARML_AGENT_GIT_PASS
value: {{ .clearmlGitPassword}}
- name: AWS_ACCESS_KEY_ID
value: {{ .awsAccessKeyId}}
- name: AWS_SECRET_ACCESS_KEY
value: {{ .awsSecretAccessKey}}
- name: AWS_DEFAULT_REGION
value: {{ .awsDefaultRegion}}
- name: AZURE_STORAGE_ACCOUNT
value: {{ .azureStorageAccount}}
- name: AZURE_STORAGE_KEY
value: {{ .azureStorageKey}}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: tests_user_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: tests_user_secret
command:
- /bin/sh
- -c
- "apt-get update ;
apt-get install -y curl python3-pip git;
python3 -m pip install -U pip ;
python3 -m pip install clearml-agent{{ .agentVersion}} ;
CLEARML_AGENT_K8S_HOST_MOUNT=/root/.clearml:/root/.clearml clearml-agent daemon --queue {{ .queues}}"
{{ if .clearmlConfig }}
volumeMounts:
- name: agent-clearml-conf-volume
mountPath: /root/clearml.conf
subPath: clearml.conf
readOnly: true
{{- end }}
{{- with .nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
{{- end }}

@@ -1,100 +0,0 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "clearml.fullname" . }}-agentservices
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.agentservices.replicaCount }}
selector:
matchLabels:
{{- include "clearml.selectorLabelsAgentServices" . | nindent 6 }}
template:
metadata:
{{- with .Values.agentservices.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "clearml.selectorLabelsAgentServices" . | nindent 8 }}
spec:
volumes:
- name: agentservices-data
persistentVolumeClaim:
claimName: {{ include "clearml.fullname" . }}-agentservices-data
initContainers:
- name: init-agentservices
image: "{{ .Values.agentservices.image.repository }}:{{ .Values.agentservices.image.tag | default .Chart.AppVersion }}"
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl -sw '%{http_code}' "http://{{ include "clearml.fullname" . }}-apiserver:{{ .Values.apiserver.service.port }}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.agentservices.image.repository }}:{{ .Values.agentservices.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.agentservices.image.pullPolicy }}
env:
- name: CLEARML_HOST_IP
value: {{ .Values.agentservices.clearmlHostIp }}
- name: CLEARML_API_HOST
value: "http://{{ include "clearml.fullname" . }}-apiserver:{{ .Values.apiserver.service.port }}"
- name: CLEARML_WEB_HOST
value: {{ .Values.agentservices.clearmlWebHost }}
- name: CLEARML_FILES_HOST
value: {{ .Values.agentservices.clearmlFilesHost }}
- name: CLEARML_AGENT_GIT_USER
value: {{ .Values.agentservices.clearmlGitUser }}
- name: CLEARML_AGENT_GIT_PASS
value: {{ .Values.agentservices.clearmlGitPassword }}
- name: CLEARML_AGENT_UPDATE_VERSION
value: {{ .Values.agentservices.agentVersion }}
- name: CLEARML_AGENT_DEFAULT_BASE_DOCKER
value: {{ .Values.agentservices.defaultBaseDocker }}
- name: AWS_ACCESS_KEY_ID
value: {{ .Values.agentservices.awsAccessKeyId }}
- name: AWS_SECRET_ACCESS_KEY
value: {{ .Values.agentservices.awsSecretAccessKey }}
- name: AWS_DEFAULT_REGION
value: {{ .Values.agentservices.awsDefaultRegion }}
- name: AZURE_STORAGE_ACCOUNT
value: {{ .Values.agentservices.azureStorageAccount }}
- name: AZURE_STORAGE_KEY
value: {{ .Values.agentservices.azureStorageKey }}
- name: GOOGLE_APPLICATION_CREDENTIALS
value: {{ .Values.agentservices.googleCredentials }}
- name: CLEARML_WORKER_ID
value: {{ .Values.agentservices.clearmlWorkerId }}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: tests_user_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: tests_user_secret
args:
- agentservices
volumeMounts:
- name: agentservices-data
mountPath: /root/.clearml
resources:
{{- toYaml .Values.agentservices.resources | nindent 12 }}
{{- with .Values.agentservices.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentservices.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentservices.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}

@@ -11,18 +11,21 @@ spec:
{{- include "clearml.selectorLabelsApiServer" . | nindent 6 }}
template:
metadata:
{{- with .Values.apiserver.podAnnotations }}
annotations:
checksum/secret: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
{{- with .Values.apiserver.podAnnotations }}
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "clearml.selectorLabelsApiServer" . | nindent 8 }}
spec:
{{- if .Values.apiserver.storage.enableConfigVolume }}
volumes:
- name: apiserver-config
persistentVolumeClaim:
claimName: {{ include "clearml.fullname" . }}-apiserver-config
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-agent-registry-key
{{- end }}
{{- end }}
containers:
- name: {{ .Chart.Name }}
@@ -34,23 +37,49 @@ spec:
protocol: TCP
env:
- name: CLEARML_ELASTIC_SERVICE_HOST
{{- if .Values.elasticsearch.enabled }}
value: "{{ .Values.elasticsearch.clusterName }}-master"
{{- else }}
value: "{{ .Values.externalServices.elasticsearchHost }}"
{{- end }}
- name: CLEARML_ELASTIC_SERVICE_PORT
{{- if .Values.elasticsearch.enabled }}
value: "{{ .Values.elasticsearch.httpPort }}"
{{- else }}
value: "{{ .Values.externalServices.elasticsearchPort }}"
{{- end }}
- name: CLEARML_MONGODB_SERVICE_HOST
{{- if .Values.mongodb.enabled }}
value: "{{ tpl .Values.mongodb.service.name . }}"
{{- else }}
value: "{{ .Values.externalServices.mongodbHost }}"
{{- end }}
- name: CLEARML_MONGODB_SERVICE_PORT
{{- if .Values.mongodb.enabled }}
value: "{{ .Values.mongodb.service.port }}"
{{- else }}
value: "{{ .Values.externalServices.mongodbPort }}"
{{- end }}
- name: CLEARML_REDIS_SERVICE_HOST
{{- if .Values.redis.enabled }}
value: "{{ tpl .Values.redis.master.name . }}"
{{- else }}
value: "{{ .Values.externalServices.redisHost }}"
{{- end }}
- name: CLEARML_REDIS_SERVICE_PORT
{{- if .Values.redis.enabled }}
value: "{{ .Values.redis.master.port }}"
{{- else }}
value: "{{ .Values.externalServices.redisPort }}"
{{- end }}
- name: CLEARML__APISERVER__PRE_POPULATE__ENABLED
value: "{{ .Values.apiserver.prepopulateEnabled }}"
- name: CLEARML__APISERVER__PRE_POPULATE__ZIP_FILES
value: "{{ .Values.apiserver.prepopulateZipFiles }}"
- name: CLEARML_SERVER_DEPLOYMENT_TYPE
value: "helm-cloud"
- name: CLEARML__APISERVER__AUTH__COOKIES__MAX_AGE
value: "{{ .Values.apiserver.authCookiesMaxAge }}"
- name: CLEARML_CONFIG_DIR
value: /opt/clearml/config
- name: CLEARML__APISERVER__DEFAULT_COMPANY
@@ -101,13 +130,19 @@ spec:
httpGet:
path: /debug.ping
port: 8008
{{- if .Values.apiserver.storage.enableConfigVolume }}
{{- if .Values.apiserver.additionalConfigs }}
volumeMounts:
- name: apiserver-config
mountPath: /opt/clearml/config
{{- end }}
resources:
{{- toYaml .Values.apiserver.resources | nindent 12 }}
{{- if .Values.apiserver.additionalConfigs }}
volumes:
- name: apiserver-config
configMap:
name: "{{ include "clearml.fullname" . }}-apiserver-configmap"
{{- end }}
{{- with .Values.apiserver.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}

@@ -22,6 +22,14 @@ spec:
- name: fileserver-data
persistentVolumeClaim:
claimName: {{ include "clearml.fullname" . }}-fileserver-data
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-agent-registry-key
{{- end }}
{{- end }}
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.fileserver.image.repository }}:{{ .Values.fileserver.image.tag | default .Chart.AppVersion }}"

@@ -18,6 +18,14 @@ spec:
labels:
{{- include "clearml.selectorLabelsWebServer" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-agent-registry-key
{{- end }}
{{- end }}
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.webserver.image.repository }}:{{ .Values.webserver.image.tag | default .Chart.AppVersion }}"
@@ -38,6 +46,11 @@ spec:
- curl
- -X OPTIONS
- http://0.0.0.0:80/
{{- if .Values.webserver.additionalConfigs }}
volumeMounts:
- name: webserver-config
mountPath: /opt/clearml/config
{{- end }}
env:
- name: NGINX_APISERVER_ADDRESS
value: "http://{{ include "clearml.fullname" . }}-apiserver:{{ .Values.apiserver.service.port }}"
@@ -50,6 +63,12 @@ spec:
- webserver
resources:
{{- toYaml .Values.webserver.resources | nindent 12 }}
{{- if .Values.webserver.additionalConfigs }}
volumes:
- name: webserver-config
configMap:
name: "{{ include "clearml.fullname" . }}-webserver-configmap"
{{- end }}
{{- with .Values.webserver.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
@@ -61,4 +80,4 @@ spec:
{{- with .Values.webserver.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}

@@ -0,0 +1,45 @@
{{- if .Values.ingress.api.enabled -}}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ include "clearml.fullname" . }}-api
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- $annotations := .Values.ingress.annotations }}
{{- if .Values.ingress.api.annotations }}
{{- $annotations = mergeOverwrite $annotations .Values.ingress.api.annotations }}
{{- end }}
annotations:
{{- toYaml $annotations | nindent 4 }}
spec:
{{- if .Values.ingress.api.tlsSecretName }}
tls:
- hosts:
- {{ .Values.ingress.api.hostName }}
secretName: {{ .Values.ingress.api.tlsSecretName }}
{{- end }}
rules:
- host: {{ .Values.ingress.api.hostName }}
http:
paths:
- path: {{ .Values.ingress.api.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: {{ include "clearml.fullname" . }}-apiserver
port:
number: {{ .Values.apiserver.service.port }}
{{ else }}
backend:
serviceName: {{ include "clearml.fullname" . }}-apiserver
servicePort: {{ .Values.apiserver.service.port }}
{{ end }}
{{- end }}

@@ -0,0 +1,44 @@
{{- if .Values.ingress.app.enabled -}}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ include "clearml.fullname" . }}-app
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- $annotations := .Values.ingress.annotations }}
{{- if .Values.ingress.app.annotations }}
{{- $annotations = mergeOverwrite $annotations .Values.ingress.app.annotations }}
{{- end }}
annotations:
{{- toYaml $annotations | nindent 4 }}
spec:
{{- if .Values.ingress.app.tlsSecretName }}
tls:
- hosts:
- {{ .Values.ingress.app.hostName }}
secretName: {{ .Values.ingress.app.tlsSecretName }}
{{- end }}
rules:
- host: {{ .Values.ingress.app.hostName }}
http:
paths:
- path: {{ .Values.ingress.app.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: {{ include "clearml.fullname" . }}-webserver
port:
number: {{ .Values.webserver.service.port }}
{{ else }}
backend:
serviceName: {{ include "clearml.fullname" . }}-webserver
servicePort: {{ .Values.webserver.service.port }}
{{ end }}
{{- end }}

@@ -0,0 +1,44 @@
{{- if .Values.ingress.files.enabled -}}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ include "clearml.fullname" . }}-files
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- $annotations := .Values.ingress.annotations }}
{{- if .Values.ingress.files.annotations }}
{{- $annotations = mergeOverwrite $annotations .Values.ingress.files.annotations }}
{{- end }}
annotations:
{{- toYaml $annotations | nindent 4 }}
spec:
{{- if .Values.ingress.files.tlsSecretName }}
tls:
- hosts:
- {{ .Values.ingress.files.hostName }}
secretName: {{ .Values.ingress.files.tlsSecretName }}
{{- end }}
rules:
- host: {{ .Values.ingress.files.hostName }}
http:
paths:
- path: {{ .Values.ingress.files.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: {{ include "clearml.fullname" . }}-fileserver
port:
number: {{ .Values.fileserver.service.port }}
{{ else }}
backend:
serviceName: {{ include "clearml.fullname" . }}-fileserver
servicePort: {{ .Values.fileserver.service.port }}
{{ end }}
{{- end }}

@@ -1,51 +0,0 @@
{{- if .Values.ingress.enabled -}}
{{- $fullName := include "clearml.fullname" . -}}
{{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ $fullName }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- with .Values.ingress.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
{{- if .Values.ingress.tls.secretName }}
tls:
- hosts:
- "{{ .Values.ingress.hostPrefixApp }}{{ .Values.ingress.host }}"
- "{{ .Values.ingress.hostPrefixFiles }}{{ .Values.ingress.host }}"
- "{{ .Values.ingress.hostPrefixApi }}{{ .Values.ingress.host }}"
secretName: {{ .Values.ingress.tls.secretName }}
{{- end }}
rules:
- host: "{{ .Values.ingress.hostPrefixApp }}{{ .Values.ingress.host }}"
http:
paths:
- path: "/"
pathType: Prefix
backend:
serviceName: {{ include "clearml.fullname" . }}-webserver
servicePort: {{ .Values.webserver.service.port }}
- host: "{{ .Values.ingress.hostPrefixApi }}{{ .Values.ingress.host }}"
http:
paths:
- path: "/"
pathType: Prefix
backend:
serviceName: {{ include "clearml.fullname" . }}-apiserver
servicePort: {{ .Values.apiserver.service.port }}
- host: "{{ .Values.ingress.hostPrefixFiles }}{{ .Values.ingress.host }}"
http:
paths:
- path: "/"
pathType: Prefix
backend:
serviceName: {{ include "clearml.fullname" . }}-fileserver
servicePort: {{ .Values.fileserver.service.port }}
{{- end }}

@@ -1,13 +0,0 @@
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: {{ include "clearml.fullname" . }}-agentservices-data
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: {{ .Values.agentservices.storage.data.size | quote }}
storageClassName: {{ .Values.agentservices.storage.data.class | quote }}

@@ -1,15 +0,0 @@
{{- if .Values.apiserver.storage.enableConfigVolume }}
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: {{ include "clearml.fullname" . }}-apiserver-config
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: {{ .Values.apiserver.storage.config.size | quote }}
storageClassName: {{ .Values.apiserver.storage.config.class | quote }}
{{- end }}

@@ -10,4 +10,7 @@ spec:
resources:
requests:
storage: {{ .Values.fileserver.storage.data.size | quote }}
storageClassName: {{ .Values.fileserver.storage.data.class | quote }}
{{- if .Values.fileserver.storage.data.class }}
storageClassName: {{ .Values.fileserver.storage.data.class | quote }}
{{- end -}}

@@ -1,13 +0,0 @@
{{- range $key, $value := .Values.agentGroups }}
{{- with $value }}
---
{{ if .clearmlConfig }}
apiVersion: v1
kind: Secret
metadata:
name: {{ .name }}-conf
data:
clearml.conf: {{ .clearmlConfig | b64enc }}
{{ end }}
{{- end }}
{{- end }}

@@ -2,10 +2,10 @@ apiVersion: v1
kind: Secret
metadata:
name: clearml-conf
data:
apiserver_key: NTQ0MkYzNDQzTUpNT1JXWkEzWkg=
apiserver_secret: QnhhcElSbzlaSU5pOHgyNUNSeHo4V2RtcjJwUWp6dVdWQjRQTkFTWnFDdFR5V2dXVlE=
http_session: OVR3MjBSYmhKMWJMQmlIRU9XWHZocGxLR1ViVGdMekF0d0ZOMm9MUXZXd1MwdVJwRDU=
auth_token: MVNDZjBvdjNObTU0NFRkMm9aMGdYU3JzTng1WGhNV2RWbEt6MXRPZ2N4MTU4YkQ1UlY=
tests_user_key: RU5QMzlFUU00U0xBQ0dENUZYQjc=
tests_user_secret: bFBjbTBpbWJjQlo4bXdnTzd0cGFkdXRpUzNnbkpEMDV4OWo3YWZ3WFBTMzVJS2JwaVE=
stringData:
apiserver_key: {{ .Values.secret.credentials.apiserver.accessKey }}
apiserver_secret: {{ .Values.secret.credentials.apiserver.secretKey }}
http_session: {{ .Values.secret.httpSession }}
auth_token: {{ .Values.secret.authToken }}
tests_user_key: {{ .Values.secret.credentials.tests.accessKey }}
tests_user_secret: {{ .Values.secret.credentials.tests.secretKey }}

@@ -9,7 +9,9 @@ spec:
ports:
- port: {{ .Values.apiserver.service.port }}
targetPort: {{ .Values.apiserver.service.port }}
nodePort: 30008
{{- if eq .Values.apiserver.service.type "NodePort" }}
nodePort: {{ .Values.apiserver.service.nodePort }}
{{- end }}
protocol: TCP
selector:
{{- include "clearml.selectorLabelsApiServer" . | nindent 4 }}

@@ -9,7 +9,9 @@ spec:
ports:
- port: {{ .Values.fileserver.service.port }}
targetPort: {{ .Values.fileserver.service.port }}
nodePort: 30081
{{- if eq .Values.fileserver.service.type "NodePort" }}
nodePort: {{ .Values.fileserver.service.nodePort }}
{{- end }}
protocol: TCP
selector:
{{- include "clearml.selectorLabelsFileServer" . | nindent 4 }}

@@ -9,7 +9,9 @@ spec:
ports:
- port: {{ .Values.webserver.service.port }}
targetPort: {{ .Values.webserver.service.port }}
nodePort: 30080
{{- if eq .Values.webserver.service.type "NodePort" }}
nodePort: {{ .Values.webserver.service.nodePort }}
{{- end }}
protocol: TCP
selector:
{{- include "clearml.selectorLabelsWebServer" . | nindent 4 }}

charts/clearml/values.yaml Normal file → Executable file

@@ -1,15 +1,59 @@
# -- Private image registry configuration
imageCredentials:
# -- Use private authentication mode
enabled: false
# -- If this is set, chart will not generate a secret but will use what is defined here
existingSecret: ""
# -- Registry name
registry: docker.io
# -- Registry username
username: someone
# -- Registry password
password: pwd
# -- Email
email: someone@host.com
# -- ClearML generic configurations
clearml:
defaultCompany: "d1bd92a3b039400cbafc60a7a5b1e52b"
ingress:
enabled: false
name: clearml-server-ingress
annotations: {}
host: ""
hostPrefixApp: "app."
hostPrefixApi: "api."
hostPrefixFiles: "files."
tls:
secretName: ""
app:
enabled: false
hostName: "app.clearml.127-0-0-1.nip.io"
tlsSecretName: ""
annotations: {}
path: "/"
api:
enabled: false
hostName: "api.clearml.127-0-0-1.nip.io"
tlsSecretName: ""
annotations: {}
path: "/"
files:
enabled: false
hostName: "files.clearml.127-0-0-1.nip.io"
tlsSecretName: ""
annotations: {}
path: "/"
secret:
# -- Set for http_session field
httpSession: "9Tw20RbhJ1bLBiHEOWXvhplKGUbTgLzAtwFN2oLQvWwS0uRpD5"
# -- Set for auth_token field
authToken: "1SCf0ov3Nm544Td2oZ0gXSrsNx5XhMWdVlKz1tOgcx158bD5RV"
credentials:
apiserver:
# -- Set for apiserver_key field
accessKey: "5442F3443MJMORWZA3ZH"
# -- Set for apiserver_secret field
secretKey: "BxapIRo9ZINi8x25CRxz8Wdmr2pQjzuWVB4PNASZqCtTyWgWVQ"
tests:
# -- Set for tests_user_key field
accessKey: "ENP39EQM4SLACGD5FXB7"
# -- Set for tests_user_secret field
secretKey: "lPcm0imbcBZ8mwgO7tpadutiS3gnJD05x9j7afwXPS35IKbpiQ"
apiserver:
prepopulateEnabled: "true"
@@ -17,9 +61,16 @@ apiserver:
prepopulateArtifactsPath: "/mnt/fileserver"
configDir: /opt/clearml/config
# -- Number of seconds the authorization cookie lasts in the user's browser
authCookiesMaxAge: 864000
service:
# -- Sets the service's spec.type field
type: NodePort
port: 8008
# -- If service.type is NodePort, this value is used for the service's nodePort field.
# For any other service type, this field is ignored
nodePort: 30008
livenessDelay: 60
readinessDelay: 60
@@ -29,7 +80,7 @@ apiserver:
image:
repository: "allegroai/clearml"
pullPolicy: IfNotPresent
tag: "1.1.1"
tag: "1.5.0"
extraEnvs: []
@@ -53,24 +104,52 @@ apiserver:
affinity: {}
# Optional: used in pvc-apiserver containing optional server configuration files
storage:
enableConfigVolume: false
config:
class: "standard"
size: 1Gi
# -- additional configurations that can be used by api server; check examples in values.yaml file
additionalConfigs: {}
# services.conf: |
# tasks {
# non_responsive_tasks_watchdog {
# # In-progress tasks that haven't been updated for at least 'value' seconds will be stopped by the watchdog
# threshold_sec: 21000
# # Watchdog will sleep for this number of seconds after each cycle
# watch_interval_sec: 900
# }
# }
# apiserver.conf: |
# auth {
# fixed_users {
# enabled: true
# pass_hashed: false
# users: [
# {
# username: "jane"
# password: "12345678"
# name: "Jane Doe"
# },
# {
# username: "john"
# password: "12345678"
# name: "John Doe"
# },
# ]
# }
# }
fileserver:
service:
# -- Sets the service's spec.type field
type: NodePort
port: 8081
# -- If service.type is NodePort, this value is used for the service's nodePort field.
# For any other service type, this field is ignored
nodePort: 30081
replicaCount: 1
image:
repository: "allegroai/clearml"
pullPolicy: IfNotPresent
tag: "1.1.1"
tag: "1.5.0"
extraEnvs: []
@@ -96,22 +175,26 @@ fileserver:
storage:
data:
class: "standard"
class: ""
size: 50Gi
webserver:
extraEnvs: []
service:
# -- Sets the service's spec.type field
type: NodePort
port: 80
# -- If service.type is NodePort, this value is used for the service's nodePort field.
# For any other service type, this field is ignored
nodePort: 30080
replicaCount: 1
image:
repository: "allegroai/clearml"
pullPolicy: IfNotPresent
tag: "1.1.1"
tag: "1.5.0"
podAnnotations: {}
@@ -133,121 +216,21 @@ webserver:
affinity: {}
agentservices:
clearmlHostIp: null
agentVersion: ""
clearmlWebHost: null
clearmlFilesHost: null
clearmlGitUser: null
clearmlGitPassword: null
awsAccessKeyId: null
awsSecretAccessKey: null
awsDefaultRegion: null
azureStorageAccount: null
azureStorageKey: null
googleCredentials: null
clearmlWorkerId: "clearml-services"
additionalConfigs: {}
replicaCount: 1
image:
repository: "allegroai/clearml-agent-services"
pullPolicy: IfNotPresent
tag: "latest"
extraEnvs: []
podAnnotations: {}
resources: {}
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
nodeSelector: {}
tolerations: []
affinity: {}
storage:
data:
class: "standard"
size: 50Gi
agentGroups:
agent-group-cpu:
name: agent-group-cpu
replicaCount: 1
updateStrategy: Recreate
nvidiaGpusPerAgent: 0
agentVersion: "" # if set, it *MUST* include comparison operator (e.g. ">=0.16.1")
queues: "default" # multiple queues can be specified separated by a space (e.g. "important_jobs default")
clearmlGitUser: null
clearmlGitPassword: null
clearmlAccessKey: null
clearmlSecretKey: null
awsAccessKeyId: null
awsSecretAccessKey: null
awsDefaultRegion: null
azureStorageAccount: null
azureStorageKey: null
clearmlConfig: |-
sdk {
}
image:
repository: "ubuntu"
pullPolicy: IfNotPresent
tag: "18.04"
podAnnotations: {}
nodeSelector: {}
tolerations: []
affinity: {}
agent-group-gpu:
name: agent-group-gpu
replicaCount: 0
updateStrategy: Recreate
nvidiaGpusPerAgent: 1
agentVersion: "" # if set, it *MUST* include comparison operator (e.g. ">=0.16.1")
queues: "default" # multiple queues can be specified separated by a space (e.g. "important_jobs default")
clearmlGitUser: null
clearmlGitPassword: null
clearmlAccessKey: null
clearmlSecretKey: null
awsAccessKeyId: null
awsSecretAccessKey: null
awsDefaultRegion: null
azureStorageAccount: null
azureStorageKey: null
clearmlConfig: |-
sdk {
}
image:
repository: "nvidia/cuda"
pullPolicy: IfNotPresent
tag: "11.0-base-ubuntu18.04"
podAnnotations: {}
nodeSelector: {}
tolerations: []
affinity: {}
externalServices:
# -- Existing ElasticSearch Hostname to use if elasticsearch.enabled is false
elasticsearchHost: ""
# -- Existing ElasticSearch Port to use if elasticsearch.enabled is false
elasticsearchPort: 9200
# -- Existing MongoDB Hostname to use if mongodb.enabled is false
mongodbHost: ""
# -- Existing MongoDB Port to use if mongodb.enabled is false
mongodbPort: 27017
# -- Existing Redis Hostname to use if redis.enabled is false
redisHost: ""
# -- Existing Redis Port to use if redis.enabled is false
redisPort: 6379
redis: # configuration from https://github.com/bitnami/charts/blob/master/bitnami/redis/values.yaml
enabled: true
@@ -281,7 +264,7 @@ mongodb: # configuration from https://github.com/bitnami/charts/blob/master/bit
port: 27017
portName: mongo-service
elasticsearch: # configuration from https://github.com/elastic/helm-charts/blob/7.10/elasticsearch/values.yaml
elasticsearch: # configuration from https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/values.yaml
enabled: true
httpPort: 9200
roles:


@@ -0,0 +1,2 @@
tests/
.pytest_cache/


@@ -0,0 +1,12 @@
apiVersion: v1
appVersion: 7.16.2
description: Official Elastic helm chart for Elasticsearch
home: https://github.com/elastic/helm-charts
icon: https://helm.elastic.co/icons/elasticsearch.png
maintainers:
- email: helm-charts@elastic.co
  name: Elastic
name: elasticsearch
sources:
- https://github.com/elastic/elasticsearch
version: 7.16.2


@@ -0,0 +1 @@
include ../helpers/common.mk


@@ -0,0 +1,457 @@
# Elasticsearch Helm Chart
[![Build Status](https://img.shields.io/jenkins/s/https/devops-ci.elastic.co/job/elastic+helm-charts+master.svg)](https://devops-ci.elastic.co/job/elastic+helm-charts+master/) [![Artifact HUB](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/elastic)](https://artifacthub.io/packages/search?repo=elastic)
This Helm chart is a lightweight way to configure and run our official
[Elasticsearch Docker image][].
<!-- development warning placeholder -->
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
- [Requirements](#requirements)
- [Installing](#installing)
- [Install released version using Helm repository](#install-released-version-using-helm-repository)
- [Install development version from a branch](#install-development-version-from-a-branch)
- [Upgrading](#upgrading)
- [Usage notes](#usage-notes)
- [Configuration](#configuration)
- [Deprecated](#deprecated)
- [FAQ](#faq)
- [How to deploy this chart on a specific K8S distribution?](#how-to-deploy-this-chart-on-a-specific-k8s-distribution)
- [How to deploy dedicated nodes types?](#how-to-deploy-dedicated-nodes-types)
- [Clustering and Node Discovery](#clustering-and-node-discovery)
- [How to deploy clusters with security (authentication and TLS) enabled?](#how-to-deploy-clusters-with-security-authentication-and-tls-enabled)
- [How to migrate from helm/charts stable chart?](#how-to-migrate-from-helmcharts-stable-chart)
- [How to install plugins?](#how-to-install-plugins)
- [How to use the keystore?](#how-to-use-the-keystore)
- [Basic example](#basic-example)
- [Multiple keys](#multiple-keys)
- [Custom paths and keys](#custom-paths-and-keys)
- [How to enable snapshotting?](#how-to-enable-snapshotting)
- [How to configure templates post-deployment?](#how-to-configure-templates-post-deployment)
- [Contributing](#contributing)
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
<!-- Use this to update TOC: -->
<!-- docker run --rm -it -v $(pwd):/usr/src jorgeandrada/doctoc --github -->
## Requirements
* Kubernetes >= 1.14
* [Helm][] >= 2.17.0
* Minimum cluster requirements include the following to run this chart with
default settings. All of these settings are configurable.
* Three Kubernetes nodes to respect the default "hard" affinity settings
* 1GB of RAM for the JVM heap
See [supported configurations][] for more details.
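On clusters with fewer than three Kubernetes nodes, the "hard" affinity requirement above can be relaxed through chart values. A minimal sketch, using only the `antiAffinity` parameter from the Configuration table; whether "soft" scheduling is acceptable for your workload is an assumption to verify:
```yaml
# Hypothetical values override for a small test cluster.
# "soft" applies the anti-affinity rules on a best-effort basis,
# so several Elasticsearch pods may land on the same Kubernetes node.
antiAffinity: "soft"
```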
## Installing
This chart is tested with the latest 7.16.2 version.
### Install released version using Helm repository
* Add the Elastic Helm charts repo:
`helm repo add elastic https://helm.elastic.co`
* Install it:
- with Helm 3: `helm install elasticsearch --version <version> elastic/elasticsearch`
- with Helm 2 (deprecated): `helm install --name elasticsearch --version <version> elastic/elasticsearch`
### Install development version from a branch
* Clone the git repo: `git clone git@github.com:elastic/helm-charts.git`
* Check out the branch: `git checkout 7.16`
* Install it:
- with Helm 3: `helm install elasticsearch ./helm-charts/elasticsearch --set imageTag=7.16.2`
- with Helm 2 (deprecated): `helm install --name elasticsearch ./helm-charts/elasticsearch --set imageTag=7.16.2`
## Upgrading
Please always check [CHANGELOG.md][] and [BREAKING_CHANGES.md][] before
upgrading to a new chart version.
## Usage notes
* This repo includes a number of example configurations (see [examples][]) which
can be used as a reference. They are also used in the automated testing of this chart.
* Automated testing of this chart is currently only run against GKE (Google
Kubernetes Engine).
* The chart deploys a StatefulSet and by default will do an automated rolling
update of your cluster. It does this by waiting for the cluster health to become
green after each instance is updated. If you prefer to update manually you can
set `OnDelete` [updateStrategy][].
* It is important to verify the JVM heap size in `esJavaOpts` and to set
the CPU/memory `resources` to something suitable for your cluster.
* To simplify the chart and its maintenance, each node group is deployed as a
separate Helm release. Take a look at the [multi][] example to get an idea of
how this works. Without doing this it isn't possible to resize persistent
volumes in a StatefulSet. Setting it up this way makes it possible to add
more nodes with a new storage size and then drain the old ones. It also lets
the user determine which node groups to update first when
doing upgrades or changes.
* We have designed this chart to be very un-opinionated about how to configure
Elasticsearch. It exposes ways to set environment variables and mount secrets
inside of the container. Doing this makes it much easier for this chart to
support multiple versions with minimal changes.
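The heap-size and update-strategy notes above could translate into a values override like the following. This is a sketch under stated assumptions: the parameter names (`esJavaOpts`, `resources`, `updateStrategy`) come from the Configuration table, while the sizes are illustrative and need tuning for your cluster.
```yaml
# Illustrative sizing only. The heap is set explicitly and kept at
# half of the container memory limit, leaving room for off-heap usage.
esJavaOpts: "-Xmx1g -Xms1g"
resources:
  requests:
    cpu: "1000m"
    memory: "2Gi"
  limits:
    cpu: "1000m"
    memory: "2Gi"
# Optional: turn off the automated rolling update so that pods are
# only replaced when you delete them manually.
updateStrategy: OnDelete
```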
## Configuration
| Parameter | Description | Default |
|------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------|
| `antiAffinityTopologyKey` | The [anti-affinity][] topology key. By default this will prevent multiple Elasticsearch nodes from running on the same Kubernetes node | `kubernetes.io/hostname` |
| `antiAffinity` | Setting this to hard enforces the [anti-affinity][] rules. If it is set to soft it will be done "best effort". Other values will be ignored | `hard` |
| `clusterHealthCheckParams` | The [Elasticsearch cluster health status params][] that will be used by readiness [probe][] command | `wait_for_status=green&timeout=1s` |
| `clusterName` | This will be used as the Elasticsearch [cluster.name][] and should be unique per cluster in the namespace | `elasticsearch` |
| `clusterDeprecationIndexing` | Enable or disable deprecation logs to be indexed (should be disabled when deploying master only node groups) | `false` |
| `enableServiceLinks` | Set to false to disable service links, which can cause slow pod startup times when there are many services in the current namespace. | `true` |
| `envFrom` | Templatable string to be passed to the [environment from variables][] which will be appended to the `envFrom:` definition for the container | `[]` |
| `esConfig` | Allows you to add any config files in `/usr/share/elasticsearch/config/` such as `elasticsearch.yml` and `log4j2.properties`. See [values.yaml][] for an example of the formatting | `{}` |
| `esJavaOpts` | [Java options][] for Elasticsearch. This is where you could configure the [jvm heap size][] | `""` |
| `esMajorVersion` | Deprecated. Instead, use the version of the chart corresponding to your ES minor version. Used to set major version specific configuration. If you are using a custom image and not running the default Elasticsearch version you will need to set this to the version you are running (e.g. `esMajorVersion: 6`) | `""` |
| `extraContainers` | Templatable string of additional `containers` to be passed to the `tpl` function | `""` |
| `extraEnvs` | Extra [environment variables][] which will be appended to the `env:` definition for the container | `[]` |
| `extraInitContainers` | Templatable string of additional `initContainers` to be passed to the `tpl` function | `""` |
| `extraVolumeMounts` | Templatable string of additional `volumeMounts` to be passed to the `tpl` function | `""` |
| `extraVolumes` | Templatable string of additional `volumes` to be passed to the `tpl` function | `""` |
| `fullnameOverride` | Overrides the `clusterName` and `nodeGroup` when used in the naming of resources. This should only be used when using a single `nodeGroup`, otherwise you will have name conflicts | `""` |
| `healthNameOverride` | Overrides `test-elasticsearch-health` pod name | `""` |
| `hostAliases` | Configurable [hostAliases][] | `[]` |
| `httpPort` | The http port that Kubernetes will use for the healthchecks and the service. If you change this you will also need to set [http.port][] in `extraEnvs` | `9200` |
| `imagePullPolicy` | The Kubernetes [imagePullPolicy][] value | `IfNotPresent` |
| `imagePullSecrets` | Configuration for [imagePullSecrets][] so that you can use a private registry for your image | `[]` |
| `imageTag` | The Elasticsearch Docker image tag | `7.16.2` |
| `image` | The Elasticsearch Docker image | `docker.elastic.co/elasticsearch/elasticsearch` |
| `ingress` | Configurable [ingress][] to expose the Elasticsearch service. See [values.yaml][] for an example | see [values.yaml][] |
| `initResources` | Allows you to set the [resources][] for the `initContainer` in the StatefulSet | `{}` |
| `keystore` | Allows you to map Kubernetes secrets into the keystore. See the [config example][] and [how to use the keystore][] | `[]` |
| `labels` | Configurable [labels][] applied to all Elasticsearch pods | `{}` |
| `lifecycle` | Allows you to add [lifecycle hooks][]. See [values.yaml][] for an example of the formatting | `{}` |
| `masterService` | The service name used to connect to the masters. You only need to set this if your master `nodeGroup` is set to something other than `master`. See [Clustering and Node Discovery][] for more information | `""` |
| `maxUnavailable` | The [maxUnavailable][] value for the pod disruption budget. By default this will prevent Kubernetes from having more than 1 unhealthy pod in the node group | `1` |
| `minimumMasterNodes` | The value for [discovery.zen.minimum_master_nodes][]. Should be set to `(master_eligible_nodes / 2) + 1`. Ignored in Elasticsearch versions >= 7 | `2` |
| `nameOverride` | Overrides the `clusterName` when used in the naming of resources | `""` |
| `networkHost` | Value for the [network.host Elasticsearch setting][] | `0.0.0.0` |
| `networkPolicy` | The [NetworkPolicy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) to set. See [`values.yaml`](./values.yaml) for an example | `{http.enabled: false,transport.enabled: false}` |
| `nodeAffinity` | Value for the [node affinity settings][] | `{}` |
| `nodeGroup` | This is the name that will be used for each group of nodes in the cluster. The name will be `clusterName-nodeGroup-X` , `nameOverride-nodeGroup-X` if a `nameOverride` is specified, and `fullnameOverride-X` if a `fullnameOverride` is specified | `master` |
| `nodeSelector` | Configurable [nodeSelector][] so that you can target specific nodes for your Elasticsearch cluster | `{}` |
| `persistence` | Enables a persistent volume for Elasticsearch data. Can be disabled for nodes that only have [roles][] which don't require persistent data | see [values.yaml][] |
| `podAnnotations` | Configurable [annotations][] applied to all Elasticsearch pods | `{}` |
| `podManagementPolicy` | By default Kubernetes [deploys StatefulSets serially][]. This deploys them in parallel so that they can discover each other | `Parallel` |
| `podSecurityContext` | Allows you to set the [securityContext][] for the pod | see [values.yaml][] |
| `podSecurityPolicy` | Configuration for creating a pod security policy with minimal permissions to run this Helm chart with `create: true`. Also can be used to reference an external pod security policy with `name: "externalPodSecurityPolicy"` | see [values.yaml][] |
| `priorityClassName` | The name of the [PriorityClass][]. No default is supplied as the PriorityClass must be created first | `""` |
| `protocol` | The protocol that will be used for the readiness [probe][]. Change this to `https` if you have `xpack.security.http.ssl.enabled` set | `http` |
| `rbac` | Configuration for creating a role, role binding and ServiceAccount as part of this Helm chart with `create: true`. Also can be used to reference an external ServiceAccount with `serviceAccountName: "externalServiceAccountName"`, or automount the service account token | see [values.yaml][] |
| `readinessProbe` | Configuration fields for the readiness [probe][] | see [values.yaml][] |
| `replicas` | Kubernetes replica count for the StatefulSet (i.e. how many pods) | `3` |
| `resources` | Allows you to set the [resources][] for the StatefulSet | see [values.yaml][] |
| `roles` | A hash map with the specific [roles][] for the `nodeGroup` | see [values.yaml][] |
| `schedulerName` | Name of the [alternate scheduler][] | `""` |
| `secretMounts` | Allows you to easily mount a secret as a file inside the StatefulSet. Useful for mounting certificates and other secrets. See [values.yaml][] for an example | `[]` |
| `securityContext` | Allows you to set the [securityContext][] for the container | see [values.yaml][] |
| `service.annotations` | [LoadBalancer annotations][] that Kubernetes will use for the service. This will configure load balancer if `service.type` is `LoadBalancer` | `{}` |
| `service.enabled` | Enable non-headless service | `true` |
| `service.externalTrafficPolicy` | Some cloud providers allow you to specify the [LoadBalancer externalTrafficPolicy][]. Kubernetes will use this to preserve the client source IP. This will configure load balancer if `service.type` is `LoadBalancer` | `""` |
| `service.httpPortName` | The name of the http port within the service | `http` |
| `service.labelsHeadless` | Labels to be added to headless service | `{}` |
| `service.labels` | Labels to be added to non-headless service | `{}` |
| `service.loadBalancerIP` | Some cloud providers allow you to specify the [loadBalancer][] IP. If the `loadBalancerIP` field is not specified, the IP is dynamically assigned. If you specify a `loadBalancerIP` but your cloud provider does not support the feature, it is ignored. | `""` |
| `service.loadBalancerSourceRanges` | The IP ranges that are allowed to access | `[]` |
| `service.nodePort` | Custom [nodePort][] port that can be set if you are using `service.type: NodePort` | `""` |
| `service.transportPortName` | The name of the transport port within the service | `transport` |
| `service.type` | Elasticsearch [Service Types][] | `ClusterIP` |
| `sysctlInitContainer` | Allows you to disable the `sysctlInitContainer` if you are setting [sysctl vm.max_map_count][] with another method | `enabled: true` |
| `sysctlVmMaxMapCount` | Sets the [sysctl vm.max_map_count][] needed for Elasticsearch | `262144` |
| `terminationGracePeriod` | The [terminationGracePeriod][] in seconds used when trying to stop the pod | `120` |
| `tests.enabled` | Enable creating test related resources when running `helm template` or `helm test` | `true` |
| `tolerations` | Configurable [tolerations][] | `[]` |
| `transportPort` | The transport port that Kubernetes will use for the service. If you change this you will also need to set [transport port configuration][] in `extraEnvs` | `9300` |
| `updateStrategy` | The [updateStrategy][] for the StatefulSet. By default Kubernetes will wait for the cluster to be green after upgrading each pod. Setting this to `OnDelete` will allow you to manually delete each pod during upgrades | `RollingUpdate` |
| `volumeClaimTemplate` | Configuration for the [volumeClaimTemplate for StatefulSets][]. You will want to adjust the storage (default `30Gi` ) and the `storageClassName` if you are using a different storage class | see [values.yaml][] |
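As the `httpPort` row notes, changing the port requires setting it in two places. A minimal sketch, assuming the Elasticsearch `http.port` setting is passed as an environment variable of the same name through `extraEnvs`; the port number 9201 is an arbitrary example:
```yaml
# Port used by Kubernetes for the health checks and the service.
httpPort: 9201
# Matching Elasticsearch setting, passed to the container as an
# environment variable (assumed mechanism, see the httpPort row).
extraEnvs:
  - name: http.port
    value: "9201"
```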
### Deprecated
| Parameter | Description | Default |
|-----------|---------------------------------------------------------------------------------------------------------------|---------|
| `fsGroup` | The Group ID (GID) for [securityContext][] so that the Elasticsearch user can read from the persistent volume | `""` |
## FAQ
### How to deploy this chart on a specific K8S distribution?
This chart is designed to run on production scale Kubernetes clusters with
multiple nodes, lots of memory and persistent storage. For that reason it can be
a bit tricky to run it against local Kubernetes environments such as
[Minikube][].
This chart is tested extensively on [GKE][], but some K8S distributions
require specific configuration.
We provide examples of configuration for the following K8S providers:
- [Docker for Mac][]
- [KIND][]
- [Minikube][]
- [MicroK8S][]
- [OpenShift][]
### How to deploy dedicated nodes types?
All the Elasticsearch pods deployed share the same configuration. If you need to
deploy dedicated [nodes types][] (for example dedicated master and data nodes),
you can deploy multiple releases of this chart with different configurations
while they share the same `clusterName` value.
For each Helm release, the node types can then be defined using the `roles` value.
An example of an Elasticsearch cluster using 2 different Helm releases for master
and data nodes can be found in [examples/multi][].
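The two-release layout described above can be sketched as two values files, shown here as two YAML documents. The file names and the cluster name `multi` are hypothetical, and the role keys are assumed to match the chart's default `roles` map; check [values.yaml][] for the exact keys in your chart version.
```yaml
# master.yaml (hypothetical file name): dedicated master nodes
clusterName: "multi"
nodeGroup: "master"
roles:
  master: "true"
  ingest: "false"
  data: "false"
  ml: "false"
  remote_cluster_client: "false"
---
# data.yaml (hypothetical file name): data and ingest nodes
# The same clusterName makes both releases join one cluster.
clusterName: "multi"
nodeGroup: "data"
roles:
  master: "false"
  ingest: "true"
  data: "true"
  ml: "false"
  remote_cluster_client: "false"
```
Each file would be installed as its own Helm release.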
#### Clustering and Node Discovery
This chart facilitates Elasticsearch node discovery and services by creating two
`Service` definitions in Kubernetes, one with the name `$clusterName-$nodeGroup`
and another named `$clusterName-$nodeGroup-headless`.
Only `Ready` pods are a part of the `$clusterName-$nodeGroup` service, while all
pods (`Ready` or not) are a part of `$clusterName-$nodeGroup-headless`.
If your group of master nodes has the default `nodeGroup: master` then you can
just add new groups of nodes with a different `nodeGroup` and they will
automatically discover the correct master. If your master nodes have a different
`nodeGroup` name then you will need to set `masterService` to
`$clusterName-$masterNodeGroup`.
The chart value for `masterService` is used to populate
`discovery.zen.ping.unicast.hosts`, which Elasticsearch nodes will use to
contact master nodes and form a cluster.
Therefore, to add a group of nodes to an existing cluster, setting
`masterService` to the desired `Service` name of the related cluster is
sufficient.
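For instance, if the master nodes were deployed with a non-default `nodeGroup`, a second node group could point at them as follows. This is a sketch; the group names `main` and `data` are hypothetical.
```yaml
# Values for an additional node group joining an existing cluster whose
# masters were deployed with clusterName "elasticsearch" and nodeGroup "main".
clusterName: "elasticsearch"
nodeGroup: "data"
# Follows the $clusterName-$masterNodeGroup naming described above.
masterService: "elasticsearch-main"
```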
### How to deploy clusters with security (authentication and TLS) enabled?
This Helm chart can use existing [Kubernetes secrets][] to set up
credentials or certificates, for example. These secrets should be created
outside of this chart and accessed using [environment variables][] and volumes.
An example of an Elasticsearch cluster using security can be found in
[examples/security][].
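A rough sketch of such a configuration is shown below. It assumes two pre-existing secrets with the hypothetical names `elastic-credentials` and `elastic-certificates`, and a hypothetical certificate file name; the chart parameters (`protocol`, `esConfig`, `extraEnvs`, `secretMounts`) are the ones listed in the Configuration table.
```yaml
# Readiness probe must use https once HTTP TLS is enabled.
protocol: https
esConfig:
  elasticsearch.yml: |
    xpack.security.enabled: true
    xpack.security.transport.ssl.enabled: true
    xpack.security.transport.ssl.verification_mode: certificate
    xpack.security.transport.ssl.keystore.path: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
    xpack.security.transport.ssl.truststore.path: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
    xpack.security.http.ssl.enabled: true
    xpack.security.http.ssl.keystore.path: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
    xpack.security.http.ssl.truststore.path: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
# Credentials are read from a secret created outside of this chart.
extraEnvs:
  - name: ELASTIC_USERNAME
    valueFrom:
      secretKeyRef:
        name: elastic-credentials   # hypothetical secret name
        key: username
  - name: ELASTIC_PASSWORD
    valueFrom:
      secretKeyRef:
        name: elastic-credentials   # hypothetical secret name
        key: password
# Certificates are mounted as files from a second secret.
secretMounts:
  - name: elastic-certificates
    secretName: elastic-certificates   # hypothetical secret name
    path: /usr/share/elasticsearch/config/certs
```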
### How to migrate from helm/charts stable chart?
If you currently have a cluster deployed with the [helm/charts stable][] chart
you can follow the [migration guide][].
### How to install plugins?
The recommended way to install plugins into our Docker images is to create a
[custom Docker image][].
The Dockerfile would look something like:
```
ARG elasticsearch_version
FROM docker.elastic.co/elasticsearch/elasticsearch:${elasticsearch_version}
RUN bin/elasticsearch-plugin install --batch repository-gcs
```
Then update the `image` in your values to point to your custom image.
There are a couple of reasons we recommend this.
1. Tying the availability of Elasticsearch to the download service to install
plugins is not a great idea or something that we recommend. Especially in
Kubernetes where it is normal and expected for a container to be moved to
another host at random times.
2. Mutating the state of a running Docker image (by installing plugins) goes
against best practices of containers and immutable infrastructure.
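Pointing the chart at a custom image, as described above, might look like the following. The registry, repository, tag and pull secret names are hypothetical placeholders; only the parameter names come from the Configuration table.
```yaml
# Custom image built from the Dockerfile above (hypothetical location).
image: "registry.example.com/elasticsearch-with-plugins"
imageTag: "7.16.2-gcs"
# Only needed if the registry is private.
imagePullSecrets:
  - name: registry-credentials
```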
### How to use the keystore?
#### Basic example
Create the secret; the key name needs to be the keystore key path. In this
example we will create a secret from a file and from a literal string.
```
kubectl create secret generic encryption-key --from-file=xpack.watcher.encryption_key=./watcher_encryption_key
kubectl create secret generic slack-hook --from-literal=xpack.notification.slack.account.monitoring.secure_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
```
To add these secrets to the keystore:
```
keystore:
- secretName: encryption-key
- secretName: slack-hook
```
#### Multiple keys
All keys in the secret will be added to the keystore. To create the previous
example in one secret you could also do:
```
kubectl create secret generic keystore-secrets --from-file=xpack.watcher.encryption_key=./watcher_encryption_key --from-literal=xpack.notification.slack.account.monitoring.secure_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
```
```
keystore:
- secretName: keystore-secrets
```
#### Custom paths and keys
If you are using these secrets for other applications (besides the Elasticsearch
keystore) then it is also possible to specify the keystore path and which keys
you want to add. Everything specified under each `keystore` item will be passed
through to the `volumeMounts` section for mounting the [secret][]. In this
example we will only add the `slack_hook` key from a secret that also has other
keys. Our secret looks like this:
```
kubectl create secret generic slack-secrets --from-literal=slack_channel='#general' --from-literal=slack_hook='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
```
We only want to add the `slack_hook` key to the keystore at path
`xpack.notification.slack.account.monitoring.secure_url`:
```
keystore:
  - secretName: slack-secrets
    items:
    - key: slack_hook
      path: xpack.notification.slack.account.monitoring.secure_url
```
You can also take a look at the [config example][] which is used as part of the
automated testing pipeline.
### How to enable snapshotting?
1. Install your [snapshot plugin][] into a custom Docker image following the
[how to install plugins guide][].
2. Add any required secrets or credentials into an Elasticsearch keystore
following the [how to use the keystore][] guide.
3. Configure the [snapshot repository][] as you normally would.
4. To automate snapshots you can use [Snapshot Lifecycle Management][] or a tool
like [curator][].
### How to configure templates post-deployment?
You can use `postStart` [lifecycle hooks][] to run code triggered after a
container is created.
Here is an example of `postStart` hook to configure templates:
```yaml
lifecycle:
  postStart:
    exec:
      command:
        - bash
        - -c
        - |
          #!/bin/bash
          # Add a template to adjust number of shards/replicas
          TEMPLATE_NAME=my_template
          INDEX_PATTERN="logstash-*"
          SHARD_COUNT=8
          REPLICA_COUNT=1
          ES_URL=http://localhost:9200
          while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
          curl -XPUT "$ES_URL/_template/$TEMPLATE_NAME" -H 'Content-Type: application/json' -d'{"index_patterns":['\""$INDEX_PATTERN"\"'],"settings":{"number_of_shards":'$SHARD_COUNT',"number_of_replicas":'$REPLICA_COUNT'}}'
```
## Contributing
Please check [CONTRIBUTING.md][] before any contribution or for any questions
about our development and testing process.
[7.16]: https://github.com/elastic/helm-charts/releases
[#63]: https://github.com/elastic/helm-charts/issues/63
[BREAKING_CHANGES.md]: https://github.com/elastic/helm-charts/blob/master/BREAKING_CHANGES.md
[CHANGELOG.md]: https://github.com/elastic/helm-charts/blob/master/CHANGELOG.md
[CONTRIBUTING.md]: https://github.com/elastic/helm-charts/blob/master/CONTRIBUTING.md
[alternate scheduler]: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/#specify-schedulers-for-pods
[annotations]: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/
[anti-affinity]: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
[cluster.name]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/important-settings.html#cluster-name
[clustering and node discovery]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/README.md#clustering-and-node-discovery
[config example]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/config/values.yaml
[curator]: https://www.elastic.co/guide/en/elasticsearch/client/curator/7.9/snapshot.html
[custom docker image]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/docker.html#_c_customized_image
[deploys statefulsets serially]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-management-policies
[discovery.zen.minimum_master_nodes]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/discovery-settings.html#minimum_master_nodes
[docker for mac]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/docker-for-mac
[elasticsearch cluster health status params]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/cluster-health.html#request-params
[elasticsearch docker image]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/docker.html
[environment variables]: https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/#using-environment-variables-inside-of-your-config
[environment from variables]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#configure-all-key-value-pairs-in-a-configmap-as-container-environment-variables
[examples]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/
[examples/multi]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/multi
[examples/security]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/security
[gke]: https://cloud.google.com/kubernetes-engine
[helm]: https://helm.sh
[helm/charts stable]: https://github.com/helm/charts/tree/master/stable/elasticsearch/
[how to install plugins guide]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/README.md#how-to-install-plugins
[how to use the keystore]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/README.md#how-to-use-the-keystore
[http.port]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-http.html#_settings
[imagePullPolicy]: https://kubernetes.io/docs/concepts/containers/images/#updating-images
[imagePullSecrets]: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/#create-a-pod-that-uses-your-secret
[ingress]: https://kubernetes.io/docs/concepts/services-networking/ingress/
[java options]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/jvm-options.html
[jvm heap size]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/heap-size.html
[hostAliases]: https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
[kind]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/kubernetes-kind
[kubernetes secrets]: https://kubernetes.io/docs/concepts/configuration/secret/
[labels]: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
[lifecycle hooks]: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
[loadBalancer annotations]: https://kubernetes.io/docs/concepts/services-networking/service/#ssl-support-on-aws
[loadBalancer externalTrafficPolicy]: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
[loadBalancer]: https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer
[maxUnavailable]: https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
[migration guide]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/migration/README.md
[minikube]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/minikube
[microk8s]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/microk8s
[multi]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/multi/
[network.host elasticsearch setting]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/network.host.html
[node affinity settings]: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
[node-certificates]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/configuring-tls.html#node-certificates
[nodePort]: https://kubernetes.io/docs/concepts/services-networking/service/#nodeport
[nodes types]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-node.html
[nodeSelector]: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
[openshift]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/openshift
[priorityClass]: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
[probe]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
[resources]: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
[roles]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-node.html
[secret]: https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets
[securityContext]: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
[service types]: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
[snapshot lifecycle management]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/snapshot-lifecycle-management.html
[snapshot plugin]: https://www.elastic.co/guide/en/elasticsearch/plugins/7.16/repository.html
[snapshot repository]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-snapshots.html
[supported configurations]: https://github.com/elastic/helm-charts/tree/7.16/README.md#supported-configurations
[sysctl vm.max_map_count]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/vm-max-map-count.html#vm-max-map-count
[terminationGracePeriod]: https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
[tolerations]: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
[transport port configuration]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-transport.html#_transport_settings
[updateStrategy]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
[values.yaml]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/values.yaml
[volumeClaimTemplate for statefulsets]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#stable-storage

View File

@@ -0,0 +1,21 @@
default: test

include ../../../helpers/examples.mk

RELEASE := helm-es-config
TIMEOUT := 1200s

install:
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../

secrets:
	kubectl delete secret elastic-config-credentials elastic-config-secret elastic-config-slack elastic-config-custom-path || true
	kubectl create secret generic elastic-config-credentials --from-literal=password=changeme --from-literal=username=elastic
	kubectl create secret generic elastic-config-slack --from-literal=xpack.notification.slack.account.monitoring.secure_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
	kubectl create secret generic elastic-config-secret --from-file=xpack.watcher.encryption_key=./watcher_encryption_key
	kubectl create secret generic elastic-config-custom-path --from-literal=slack_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd' --from-literal=thing_i_don_tcare_about=test

test: secrets install goss

purge:
	helm del $(RELEASE)

View File

@@ -0,0 +1,27 @@
# Config
This example deploys a single-node Elasticsearch 7.16.2 with authentication and
custom [values][].
## Usage
* Create the required secrets: `make secrets`
* Deploy the Elasticsearch chart with the custom values: `make install`
* You can now set up a port forward to query the Elasticsearch API:
```
kubectl port-forward svc/config-master 9200
curl -u elastic:changeme http://localhost:9200/_cat/indices
```
## Testing
You can also run [goss integration tests][] using `make test`
[goss integration tests]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/config/test/goss.yaml
[values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/config/values.yaml

View File

@@ -0,0 +1,29 @@
http:
  http://localhost:9200/_cluster/health:
    status: 200
    timeout: 2000
    username: elastic
    password: "{{ .Env.ELASTIC_PASSWORD }}"
    body:
      - "green"
      - '"number_of_nodes":1'
      - '"number_of_data_nodes":1'

  http://localhost:9200:
    status: 200
    timeout: 2000
    username: elastic
    password: "{{ .Env.ELASTIC_PASSWORD }}"
    body:
      - '"cluster_name" : "config"'
      - "You Know, for Search"

command:
  "elasticsearch-keystore list":
    exit-status: 0
    stdout:
      - keystore.seed
      - bootstrap.password
      - xpack.notification.slack.account.monitoring.secure_url
      - xpack.notification.slack.account.otheraccount.secure_url
      - xpack.watcher.encryption_key

View File

@@ -0,0 +1,27 @@
---

clusterName: "config"
replicas: 1

extraEnvs:
  - name: ELASTIC_PASSWORD
    valueFrom:
      secretKeyRef:
        name: elastic-config-credentials
        key: password

# This is just a dummy file to make sure that
# the keystore can be mounted at the same time
# as a custom elasticsearch.yml
esConfig:
  elasticsearch.yml: |
    xpack.security.enabled: true
    path.data: /usr/share/elasticsearch/data

keystore:
  - secretName: elastic-config-secret
  - secretName: elastic-config-slack
  - secretName: elastic-config-custom-path
    items:
      - key: slack_url
        path: xpack.notification.slack.account.otheraccount.secure_url

View File

@@ -0,0 +1 @@
supersecret

View File

@@ -0,0 +1,14 @@
default: test

include ../../../helpers/examples.mk

RELEASE := helm-es-default
TIMEOUT := 1200s

install:
	helm upgrade --wait --timeout=$(TIMEOUT) --install $(RELEASE) ../../

test: install goss

purge:
	helm del $(RELEASE)

View File

@@ -0,0 +1,25 @@
# Default
This example deploys a 3-node Elasticsearch 7.16.2 cluster using
[default values][].
## Usage
* Deploy the Elasticsearch chart with the default values: `make install`
* You can now set up a port forward to query the Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
## Testing
You can also run [goss integration tests][] using `make test`
[goss integration tests]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/default/test/goss.yaml
[default values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/values.yaml

View File

@@ -0,0 +1,19 @@
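#!/usr/bin/env bash
# Trace every command. `-x` is set here because `env` does not reliably
# split "bash -x" into two arguments when it appears on the shebang line.
set -x

# Expose the Kubernetes API on localhost:8001 in the background.
kubectl proxy || true &

# Run the default make target in the background and remember its PID.
make &
PROC_ID=$!

# While make is still running, the cluster must keep answering searches.
while kill -0 "$PROC_ID" >/dev/null 2>&1; do
    echo "PROCESS IS RUNNING"
    if curl --fail 'http://localhost:8001/api/v1/proxy/namespaces/default/services/elasticsearch-master:9200/_search' ; then
        echo "cluster is healthy"
    else
        echo "cluster not healthy!"
        exit 1
    fi
    sleep 1
done
echo "PROCESS TERMINATED"
exit 0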

View File

@@ -0,0 +1,38 @@
kernel-param:
  vm.max_map_count:
    value: "262144"

http:
  http://elasticsearch-master:9200/_cluster/health:
    status: 200
    timeout: 2000
    body:
      - "green"
      - '"number_of_nodes":3'
      - '"number_of_data_nodes":3'

  http://localhost:9200:
    status: 200
    timeout: 2000
    body:
      - '"number" : "7.16.2"'
      - '"cluster_name" : "elasticsearch"'
      - "You Know, for Search"

file:
  /usr/share/elasticsearch/data:
    exists: true
    mode: "2775"
    owner: root
    group: elasticsearch
    filetype: directory

mount:
  /usr/share/elasticsearch/data:
    exists: true

user:
  elasticsearch:
    exists: true
    uid: 1000
    gid: 1000

View File

@@ -0,0 +1,13 @@
default: test

RELEASE := helm-es-docker-for-mac
TIMEOUT := 1200s

install:
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../

test: install
	helm test $(RELEASE)

purge:
	helm del $(RELEASE)

View File

@@ -0,0 +1,23 @@
# Docker for Mac
This example deploys a 3-node Elasticsearch 7.16.2 cluster on [Docker for Mac][]
using [custom values][].
Note that this configuration should be used for testing only and isn't recommended
for production.
## Usage
* Deploy the Elasticsearch chart with the custom values: `make install`
* You can now set up a port forward to query the Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
[custom values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/docker-for-mac/values.yaml
[docker for mac]: https://docs.docker.com/docker-for-mac/kubernetes/

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"

# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"

# Allocate smaller chunks of memory per pod.
resources:
  requests:
    cpu: "100m"
    memory: "512M"
  limits:
    cpu: "1000m"
    memory: "512M"

# Request smaller persistent volumes.
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: "hostpath"
  resources:
    requests:
      storage: 100M

View File

@@ -0,0 +1,17 @@
default: test

RELEASE := helm-es-kind
TIMEOUT := 1200s

install:
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../

install-local-path:
	kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values values-local-path.yaml $(RELEASE) ../../

test: install
	helm test $(RELEASE)

purge:
	helm del $(RELEASE)

View File

@@ -0,0 +1,36 @@
# KIND
This example deploys a 3-node Elasticsearch 7.16.2 cluster on [Kind][]
using [custom values][].
Note that this configuration should be used for testing only and isn't recommended
for production.
Note that Kind versions before 0.7.0 are affected by a [kind issue][]: mount points
created from PVCs are not writable by non-root users. [kubernetes-sigs/kind#1157][]
fixes it in Kind 0.7.0.
The workaround for Kind < 0.7.0 is to manually install the
[Rancher Local Path Provisioner][] and use the `local-path` storage class for
Elasticsearch volumes (see [Makefile][] instructions).
## Usage
* For Kind >= 0.7.0: Deploy the Elasticsearch chart with the custom values: `make install`
* For Kind < 0.7.0: Deploy the Elasticsearch chart with the `local-path` storage class: `make install-local-path`
* You can now set up a port forward to query the Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
[custom values]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/kubernetes-kind/values.yaml
[kind]: https://kind.sigs.k8s.io/
[kind issue]: https://github.com/kubernetes-sigs/kind/issues/830
[kubernetes-sigs/kind#1157]: https://github.com/kubernetes-sigs/kind/pull/1157
[rancher local path provisioner]: https://github.com/rancher/local-path-provisioner
[Makefile]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/kubernetes-kind/Makefile#L5

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"

# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"

# Allocate smaller chunks of memory per pod.
resources:
  requests:
    cpu: "100m"
    memory: "512M"
  limits:
    cpu: "1000m"
    memory: "512M"

# Request smaller persistent volumes.
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: "local-path"
  resources:
    requests:
      storage: 100M

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"

# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"

# Allocate smaller chunks of memory per pod.
resources:
  requests:
    cpu: "100m"
    memory: "512M"
  limits:
    cpu: "1000m"
    memory: "512M"

# Request smaller persistent volumes.
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: "local-path"
  resources:
    requests:
      storage: 100M

View File

@@ -0,0 +1,13 @@
default: test

RELEASE := helm-es-microk8s
TIMEOUT := 1200s

install:
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../

test: install
	helm test $(RELEASE)

purge:
	helm del $(RELEASE)

View File

@@ -0,0 +1,32 @@
# MicroK8S
This example deploys a 3-node Elasticsearch 7.16.2 cluster on [MicroK8S][]
using [custom values][].
Note that this configuration should be used for testing only and isn't recommended
for production.
## Requirements
The following MicroK8S [addons][] need to be enabled:
- `dns`
- `helm`
- `storage`
## Usage
* Deploy the Elasticsearch chart with the custom values: `make install`
* You can now set up a port forward to query the Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
[addons]: https://microk8s.io/docs/addons
[custom values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/microk8s/values.yaml
[MicroK8S]: https://microk8s.io

View File

@@ -0,0 +1,32 @@
---
# Disable privileged init Container creation.
sysctlInitContainer:
  enabled: false

# Restrict the use of the memory-mapping when sysctlInitContainer is disabled.
esConfig:
  elasticsearch.yml: |
    node.store.allow_mmap: false

# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"

# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"

# Allocate smaller chunks of memory per pod.
resources:
  requests:
    cpu: "100m"
    memory: "512M"
  limits:
    cpu: "1000m"
    memory: "512M"

# Request smaller persistent volumes.
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: "microk8s-hostpath"
  resources:
    requests:
      storage: 100M

View File

@@ -0,0 +1,10 @@
PREFIX := helm-es-migration
# TIMEOUT was referenced below but never defined; use the same value as the other examples.
TIMEOUT := 1200s

data:
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values data.yaml $(PREFIX)-data ../../

master:
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values master.yaml $(PREFIX)-master ../../

client:
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values client.yaml $(PREFIX)-client ../../

View File

@@ -0,0 +1,167 @@
# Migration Guide from helm/charts
There are two viable options for migrating from the community Elasticsearch Helm
chart in the [helm/charts][] repo:
1. Restoring from a snapshot to a fresh cluster.
2. Live migration by joining a new cluster to the existing cluster.
## Restoring from Snapshot
This is the recommended and preferred option. The downside is that it will
involve a period of write downtime during the migration. If you have a way to
temporarily stop writes to your cluster then this is the way to go. This is also
a lot simpler as it just involves launching a fresh cluster and restoring a
snapshot following the [restoring to a different cluster guide][].
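As a rough sketch of the restore step, the snapshot REST API can be called with `curl` once the snapshot repository is registered on the new cluster. The repository name `my_backup`, the snapshot name `snapshot_1` and the `ES_URL` variable are placeholders for illustration, not values taken from this chart:

```shell
# Placeholder endpoint: point this at the NEW cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the restore endpoint for a repository ($1) and snapshot ($2).
restore_endpoint() {
  printf '%s/_snapshot/%s/%s/_restore' "$ES_URL" "$1" "$2"
}

# Ask the cluster to restore the snapshot and block until it has finished.
restore_snapshot() {
  curl -s -X POST "$(restore_endpoint "$1" "$2")?wait_for_completion=true"
}

# Example (hypothetical names):
#   restore_snapshot my_backup snapshot_1
```

Writes to the old cluster should already be stopped before the snapshot is taken, otherwise data written afterwards will not be part of the restore.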
## Live migration
If restoring from a snapshot is not possible due to the write downtime then a
live migration is also possible. It is very important to first test this in a
testing environment to make sure you are comfortable with the process and fully
understand what is happening.
This process will involve joining a new set of master, data and client nodes to
an existing cluster that has been deployed using the [helm/charts][] community
chart. Nodes will then be replaced one by one in a controlled fashion to
decommission the old cluster.
This example will be using the default values for the existing helm/charts
release and for the Elastic helm-charts release. If you have changed any of the
default values then you will need to first make sure that your values are
configured in a compatible way before starting the migration.
The process will involve a re-sync and a rolling restart of all of your data
nodes. Therefore it is important to disable shard allocation and perform a synced
flush like you normally would during any other rolling upgrade. See the
[rolling upgrades guide][] for more information.
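A minimal sketch of that preparation, assuming a 6.x cluster reachable at `ES_URL` (an assumed variable) and run from within one of the pods. The helper names are hypothetical; the endpoints are the cluster settings and synced flush APIs described in the rolling upgrades guide:

```shell
# Assumed endpoint; adjust to wherever the existing cluster is reachable.
ES_URL="${ES_URL:-http://localhost:9200}"

# JSON body that sets cluster.routing.allocation.enable to the given value.
allocation_body() {
  printf '{"persistent":{"cluster.routing.allocation.enable":"%s"}}' "$1"
}

# Restrict allocation to primaries, then perform a synced flush.
prepare_rolling_restart() {
  curl -s -X PUT "$ES_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(allocation_body primaries)"
  curl -s -X POST "$ES_URL/_flush/synced"
}

# Re-enable allocation for all shards once the nodes have rejoined.
resume_allocation() {
  curl -s -X PUT "$ES_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(allocation_body all)"
}
```

Call `prepare_rolling_restart` before touching the data nodes and `resume_allocation` afterwards, then wait for the cluster to return to green.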
* The default image for this chart is
`docker.elastic.co/elasticsearch/elasticsearch` which contains the default
distribution of Elasticsearch with a [basic license][]. Make sure to update the
`image` and `imageTag` values to the correct Docker image and Elasticsearch
version that you currently have deployed.
* Convert your current helm/charts configuration into something that is
compatible with this chart.
* Take a fresh snapshot of your cluster. If something goes wrong you want to be
able to restore your data no matter what.
* Check that your cluster's health is green. If not, abort and make sure your
cluster is healthy before continuing:
```
curl localhost:9200/_cluster/health
```
* Deploy new data nodes which will join the existing cluster. Take a look at the
configuration in [data.yaml][]:
```
make data
```
* Check that the new nodes have joined the cluster (run this and any other curl
commands from within one of your pods):
```
curl localhost:9200/_cat/nodes
```
* Check that your cluster is still green. If so, we can now start to scale down
the existing data nodes. Assuming you have the default number of data nodes (2),
we now want to scale them down to 1:
```
kubectl scale statefulsets my-release-elasticsearch-data --replicas=1
```
* Wait for your cluster to become green again:
```
watch 'curl -s localhost:9200/_cluster/health'
```
* Once the cluster is green we can scale down again:
```
kubectl scale statefulsets my-release-elasticsearch-data --replicas=0
```
* Wait for the cluster to be green again.
* OK. We now have all data nodes running in the new cluster. Time to replace the
masters, first by scaling down the old masters from 3 to 2. Between each step make
sure to wait for the cluster to become green again, and check with
`curl localhost:9200/_cat/nodes` that you see the correct number of master
nodes. During this process we will always make sure to keep at least 2 master
nodes so as not to lose quorum:
```
kubectl scale statefulsets my-release-elasticsearch-master --replicas=2
```
* Now deploy a single new master so that we have 3 masters again. See
[master.yaml][] for the configuration:
```
make master
```
* Scale down old masters to 1:
```
kubectl scale statefulsets my-release-elasticsearch-master --replicas=1
```
* Edit `replicas` in [master.yaml][] to 2 and redeploy:
```
make master
```
* Scale down the old masters to 0:
```
kubectl scale statefulsets my-release-elasticsearch-master --replicas=0
```
* Edit [master.yaml][] to have 3 replicas and remove the
`discovery.zen.ping.unicast.hosts` entry from `extraEnvs`, then redeploy the
masters. This will make sure all 3 masters are running in the new cluster and
are pointing at each other for discovery:
```
make master
```
* Remove the `discovery.zen.ping.unicast.hosts` entry from `extraEnvs` in
[data.yaml][], then redeploy the data nodes to make sure they are pointing at the new masters:
```
make data
```
* Deploy the client nodes:
```
make client
```
* Update any processes that are talking to the existing client nodes and point
them to the new client nodes. Once this is done you can scale down the old
client nodes:
```
kubectl scale deployment my-release-elasticsearch-client --replicas=0
```
* The migration should now be complete. After verifying that everything is
working correctly you can clean up leftover resources from your old cluster.
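The repeated "wait for the cluster to be green" checks in the steps above could be wrapped in a small helper instead of watching the output by hand. This is only a sketch: `ES_URL`, the function names, the 5 second poll interval and the default of 60 attempts are all assumptions, and it must run somewhere that can reach the cluster (for example inside one of the pods):

```shell
# Assumed endpoint; adjust to wherever the cluster is reachable.
ES_URL="${ES_URL:-http://localhost:9200}"

# Read a _cluster/health JSON document on stdin and print its "status" value.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Poll the health API until it reports green.
# $1 is the maximum number of attempts (default 60). Returns 1 on timeout.
wait_for_green() {
  attempts="${1:-60}"
  while [ "$attempts" -gt 0 ]; do
    if [ "$(curl -s "$ES_URL/_cluster/health" | health_status)" = "green" ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 5
  done
  return 1
}
```

A step then becomes, for example, `kubectl scale statefulsets my-release-elasticsearch-data --replicas=1 && wait_for_green`, so the next scale-down only happens once the cluster has recovered.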
[basic license]: https://www.elastic.co/subscriptions
[data.yaml]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/migration/data.yaml
[helm/charts]: https://github.com/helm/charts/tree/master/stable/elasticsearch
[master.yaml]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/migration/master.yaml
[restoring to a different cluster guide]: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/modules-snapshots.html#_restoring_to_a_different_cluster
[rolling upgrades guide]: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/rolling-upgrades.html

View File

@@ -0,0 +1,23 @@
---

replicas: 2

clusterName: "elasticsearch"
nodeGroup: "client"

esMajorVersion: 6

roles:
  master: "false"
  ingest: "false"
  data: "false"

volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: "standard"
  resources:
    requests:
      storage: 1Gi # Currently needed till pvcs are made optional

persistence:
  enabled: false

View File

@@ -0,0 +1,17 @@
---

replicas: 2

esMajorVersion: 6

extraEnvs:
  - name: discovery.zen.ping.unicast.hosts
    value: "my-release-elasticsearch-discovery"

clusterName: "elasticsearch"
nodeGroup: "data"

roles:
  master: "false"
  ingest: "false"
  data: "true"

View File

@@ -0,0 +1,26 @@
---

# Temporarily set below the final 3 replicas so we can scale the old and new
# clusters up/down one at a time whilst always keeping 3 masters running
replicas: 1

esMajorVersion: 6

extraEnvs:
  - name: discovery.zen.ping.unicast.hosts
    value: "my-release-elasticsearch-discovery"

clusterName: "elasticsearch"
nodeGroup: "master"

roles:
  master: "true"
  ingest: "false"
  data: "false"

volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: "standard"
  resources:
    requests:
      storage: 4Gi

View File

@@ -0,0 +1,13 @@
default: test

RELEASE := helm-es-minikube
TIMEOUT := 1200s

install:
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../

test: install
	helm test $(RELEASE)

purge:
	helm del $(RELEASE)

View File

@@ -0,0 +1,38 @@
# Minikube
This example deploys a 3-node Elasticsearch 7.16.2 cluster on [Minikube][]
using [custom values][].
If helm or kubectl timeouts occur, consider creating a minikube VM with
more CPU cores or memory allocated.
Note that this configuration should be used for testing only and isn't recommended
for production.
## Requirements
In order to properly support the required persistent volume claims for the
Elasticsearch StatefulSet, the `default-storageclass` and `storage-provisioner`
minikube addons must be enabled.
```
minikube addons enable default-storageclass
minikube addons enable storage-provisioner
```
## Usage
* Deploy the Elasticsearch chart with the custom values: `make install`
* You can now set up a port forward to query the Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
[custom values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/minikube/values.yaml
[minikube]: https://minikube.sigs.k8s.io/docs/

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"

# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"

# Allocate smaller chunks of memory per pod.
resources:
  requests:
    cpu: "100m"
    memory: "512M"
  limits:
    cpu: "1000m"
    memory: "512M"

# Request smaller persistent volumes.
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: "standard"
  resources:
    requests:
      storage: 100M

View File

@@ -0,0 +1,19 @@
default: test

include ../../../helpers/examples.mk

PREFIX := helm-es-multi
RELEASE := helm-es-multi-master
TIMEOUT := 1200s

install:
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values master.yaml $(PREFIX)-master ../../
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values data.yaml $(PREFIX)-data ../../
	helm upgrade --wait --timeout=$(TIMEOUT) --install --values client.yaml $(PREFIX)-client ../../

test: install goss

purge:
	helm del $(PREFIX)-master
	helm del $(PREFIX)-data
	helm del $(PREFIX)-client

Some files were not shown because too many files have changed in this diff