Compare commits

...

101 Commits

Author SHA1 Message Date
Valeriano Manassero
7c3ed7eb72 Fix external mongodb connstring (#135)
* Changed: maongodb.enabled check not needed

* Changed: external MongoDB connection string

* Changed: bump up version

* Added: artifacthub changelog annotation
2023-01-24 09:27:42 +01:00
Valeriano Manassero
67d4b5b95d Enterprise apps sa (#134)
* Changed: don't use cluster wide access

* Changed: bump version
2023-01-20 10:24:34 +01:00
Valeriano Manassero
832090a791 Configurable securitycontext (#133)
* Added: configurable securityContext

* Changed: bump up version

* Changed: bump up version
2023-01-19 15:00:22 +01:00
Valeriano Manassero
e1049fa0ab Ingressclassname (#132)
* Added: ingressclassname

* Changed: bump up version
2023-01-19 07:48:30 +01:00
Valeriano Manassero
5f62daac0f Existing resource for additionalconfigs (#130)
* Added: additionalConfigs reference for existing resurce

* Changed: version bump
2023-01-18 13:34:29 +01:00
Valeriano Manassero
cdcd35c224 Enterprise 3.15.3 (#129)
* Changed: enterprise version bump

* Changed: version bump
2023-01-16 16:46:26 +01:00
Valeriano Manassero
3fd3f30030 Enterprise override tag (#127)
* Added: override for enterprise  image tag

* Changed: version bump

* Added: enterprise image tage overrides

* Changed: bump up version
2023-01-12 09:12:19 +01:00
Valeriano Manassero
bdea0e778b Fix nodeport (#126)
* fixed: agent nodeSelector

* Changed: version bump
2023-01-12 08:21:38 +01:00
Valeriano Manassero
1ea09e63e5 Fix fileserver pvc class (#125)
* Fixed: fileserver custom storageclass

* Changed: version bump
2023-01-10 16:48:06 +01:00
Valeriano Manassero
1cc3018ef3 Fix enterprise secret generation (#124)
* Fixed: secret reference

* Changed: bump up version
2023-01-09 16:05:08 +01:00
Valeriano Manassero
3b689bf051 Various fixes after major releases (#123)
* Fixed: env vars

* Changed: version bump

* Fixed: config path

* Fixed: queues generation

* Fixed: typo

* Fixed: no default queue set

* Fixed: enterprise only sec creds

* Fixed: typo
2023-01-05 11:52:53 +01:00
Valeriano Manassero
622ec331ac Agent chart annotations, labels and sa improvements (#122)
* Added: sa reference name in task pod

* Changed: version bump

* Added: annotations generator

* Added: annotations

* Aded: labels and annotations

* Added: annotations and labels

* Added: agent node-selector

* Fixed: annotations generation
2023-01-04 12:01:24 +01:00
Valeriano Manassero
7041c62f44 Clearml agent enterprise features (#121)
* Added: enterprise features alignment

* Changed: version bump

* Fixed: trailing spaces

* Fixed: comment starting space

* Changed: owner-token feature

* Fixed: secret reference name

* Changed: owner-token enterprise reference
2023-01-04 09:45:23 +01:00
Valeriano Manassero
cb98ae9a19 Clearml enterprise features (#120)
* Changed: rename alignment

* Changed: general refactoring

* Changed: version bump

* Added: enterprise company guid

* Added: tanzu rolebinding

* Changed: updated parames

* Changed: bump to 1.9

* Fixed: whitespaces

* Added: fake values for apps git user

* Changed: updated deps

* Changed: app version reference

* Changed: enterprise parameters name

* Changed: image version bump

* Changed: extra index url config for enterprise
2023-01-04 09:32:01 +01:00
Valeriano Manassero
874f1cf0ce Upgrade agent 124 21 (#119)
* Changed: bump up agent version

* Changed: bump up chart version
2022-11-30 08:25:12 +01:00
Valeriano Manassero
c54e6ef44a Upgrade 180 (#118)
* Changed: bump up app to version 1.8.0

* Changed: bump up chart version
2022-11-30 08:12:52 +01:00
Valeriano Manassero
ca54cb570f Serving refactoring (#116)
* Changed: remove unused status

* Changed: image tags refactoring

* Changed: Images references refactoring

* Added: serving ingresses

* Added: grafana ingress

* Added: grafana ingress

* Changed: chart version bump

* Changed: maintainers

* Added: value comments

* Fixed: typo

* Fixed: typos
2022-11-04 15:10:11 +01:00
Valeriano Manassero
5035814ed9 Changed: serving app version to 1.2.0 (#114) 2022-10-11 10:34:52 +02:00
Valeriano Manassero
462a8da239 Serving hpa (#113)
* Added: basic hpa

* Changed: version bump
2022-10-10 09:17:05 +02:00
Valeriano Manassero
8747bceb4e 1.7.0 upgrade (#112)
* Changed: mage update

* Changed: version update
2022-10-05 14:34:47 +02:00
Valeriano Manassero
6aea682b0d Fix: agent release (#109)
* Fix: agent release

* Changed: version bump
2022-09-16 08:42:34 +02:00
Valeriano Manassero
4704415662 Make PDR compatible with k8s 1.25 (#108)
* Changed: pdr version

* Changed: dependency update

* Changed: removed eol k8s

* Changed: kind versions update

* Removed: incompatible version with GH actions

* Changed: updated action
2022-09-16 08:28:41 +02:00
Brett Cullen
8374ece563 Added missing brackets around .Values.imageCredentials.existingSecret (#107) 2022-09-16 00:12:03 +03:00
Brett Cullen
0871e73831 Fixed missing brackets for k8 secret (docker config) (#106) 2022-09-15 23:35:36 +03:00
Niels ten Boom
a90b91f024 feat: expand volumemount capabilities for agent (#104)
* upgrade

* add upgrade instruction

* fix readme for agent

* Added newline at the end

* Try to fix CI

* Edited type added

* Update README.md

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-09-13 14:53:44 +02:00
Nikolay Shamanovich
19a6785a03 Allowing auth secrets to be optional #2 (#100)
* Allowing auth secrets to be optional

* Add value secret.existingSecret for clearml chart.

* Add value clearml.existingAgentk8sglueSecret for clearml-agent chart.

* Add value clearml.existingClearmlConfigSecret for clearml-agent chart.

* Split Secret clearml-agent-conf in clearml-agent chart into two
  Secrets: clearml-agent-conf (agent.conf file) and clearml-agent-k8sglue
  (environment variables).

* Update helm-docs
2022-08-22 10:35:47 +02:00
Valeriano Manassero
fdea0c4a3f Use fullname (#97)
* Changed: use fullname to generate resources

* Changed: version bump
2022-08-11 11:23:02 +02:00
zandolsi-psee
ace37019a8 Use existing config map or secret for api-server config (#95)
* adapt ingress

* update docs

* remove idea

* fix imagepullsecret error

* add apiserver configuration

* add apiserver configuration

* add apiserver configuration
2022-08-09 07:08:12 +02:00
Luca Cerone
c3b5198dc9 Clearml agentk8s extra envs (#94)
* Added "extraEnvs" to agentk8s deployment definition

* added extraEnvs to values.yaml

* Bumped version

* Fixed linting issue

* updated docs

* Regenerated README.md

* Update values.yaml

Co-authored-by: Luca Cerone <luca.cerone@team.bumble.com>
Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-08-04 14:17:47 +02:00
Valeriano Manassero
9fd65b68f7 Apache2 License (#93)
* Added: reference to Apache2 license

* Changed: bump up version

* Fixed: License reference
2022-07-21 11:12:10 +02:00
Valeriano Manassero
56880de8bb Enable multiple agents installations (#92)
* Changed: dynamic names

* Changed: bump up version
2022-07-15 08:28:40 +02:00
Valeriano Manassero
d778d0ef93 Changed: update helm chart version (#90) 2022-07-13 10:23:34 +02:00
zandolsi-psee
e28a2f991b Fix imagepullsecret error (#89)
* adapt ingress

* update docs

* remove idea

* fix imagepullsecret error
2022-07-13 10:14:16 +02:00
Valeriano Manassero
dc30518c26 ClearML Version upgrade to 1.6.0 (#88)
* Changed: version upgrade

* Changed: helm docs version update

* Changed: helm-docs update
2022-07-12 10:48:16 +02:00
Valeriano Manassero
50237dcb9d Update agent 1.24 (#86)
* Changed: image update to 1.24

* Changed: bump up version
2022-06-23 11:19:03 +02:00
Valeriano Manassero
1b164c2906 Add configurable default base serve url (#83)
* Added: configurable default base serving url

* Changed: chart version bump
2022-06-23 10:46:18 +02:00
Valeriano Manassero
43806b8e21 Add insecure cert check flag (#85)
* Added: clearmlcheckCertificate flag

* Changed: bump chart
2022-06-23 10:43:39 +02:00
Valeriano Manassero
80072c0654 Add editable config for k8s Agent (#84)
* Added: editable configuration

* Changed: bump up version
2022-06-23 09:52:19 +02:00
Valeriano Manassero
e22bd30764 Upgrade to app version 1.5.0 (#81)
* Changed: upgrade to 1.5.0

* Fixed: inject after ct check

* Fixed: list changd

* Fixed: typo
2022-06-23 07:49:45 +02:00
Valeriano Manassero
84a003b7bc Fix fileserver check on Agent (#82)
* Fixed: fileserver check

* Changed: version bump
2022-06-22 16:52:51 +02:00
Valeriano Manassero
1d95f0c27f Pullsecrets pod template (#80)
* Added: pullsecrets management for pod template

* Changed: version bump
2022-06-22 15:52:14 +02:00
Valeriano Manassero
562815e97a ClearML standalone agent chart (#79)
* Added: agent chart

* Changed: base image tag

* Changed: updated helm-docs

* Fixed: maintainers

* Changed: updated radme

* Fixed: http code to check

* Fixed: default values to check

* Changed: updated helm-docs

* Added: default values to be substituted by GH action

* Added: sed on the fly for testing

* Changed: updated CI images for Kind
2022-06-08 10:01:33 +02:00
Leon Rotim
e317610397 Fix linebreak formatting (#77)
* fix wrong line break delete

* version, bugfix inc

* regenerate README.md

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-06-08 08:16:52 +02:00
Luca Cerone
16f172fc1c Allowing extraEnv to be added to agentservices and agent deployments. (#76)
* Allowing extraEnv to be added to agentservices and agent deployments.

* Bumped version and generated documentation chart

* Lint fix

* Chart version update

* helm-docs update

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-06-08 08:06:20 +02:00
Valeriano Manassero
69048b5c96 Fix glue agent image (#78)
* Changed: avoid latest image

* Changed: version bump

* Fixed: pull policy

* Removed: specific ci for glue since now it's on by default

* Fixed: don't refresh dependencies

* Changed: testing chart action version update

* Fixed: action

* Changed: dependency updates required

* Fixed: lint and install

* Revert "Changed: dependency updates required"

This reverts commit 34ee22d7d0.

* Changed: use copy of dep charts because ththey may become unavailable

* Changed: updated readme
2022-06-02 21:20:00 +02:00
wasabipeas
9cf2868738 Clearml serving add support for triton (#74)
* added triton deployment and service, added triton block to values file, added value for CLEARML_DEFAULT_TRITON_GRPC_ADDR env variable in the serving-inference deployment

* re-generated README

* fixed yaml

* added condition to enable triton support

* changed chart version

* changed chart version

* bumped version to 0.3.0

* added conditional extraPythonPackages variable to clearml_serving_triton deploymnent

* added conditional extraPythonPackages to all the relevant deployments

* bumped version to 0.3.0
2022-05-24 13:12:15 +02:00
Valeriano Manassero
8098fd82df Add extra packages (#72)
* Added: extra python packages

* Changed: chart version
2022-05-23 13:06:48 +02:00
wasabipeas
4422cf433d added clearml-serving chart (#69)
* added clearml-serving chart

* fixed typo, added autogenerated README.md

* removed trailing space from values.yaml

* removed namespace definition from the values file and all the templates

* fixed typo

* re-run helm-docs
2022-05-19 07:35:30 +02:00
Valeriano Manassero
10296ac979 Update helm-docs.sh (#70) 2022-05-18 13:21:43 +02:00
Valeriano Manassero
06070a5c20 Default storageclass (#66)
* Changed: use deffault storageclass if not declared

* Changed: chart version
2022-05-02 18:00:46 +02:00
Niels ten Boom
5972fd8e5f fix: k8sagent indentation (#65) 2022-04-27 22:55:27 +02:00
Valeriano Manassero
7a7bd930f8 Fix glue namespace handling (#63)
* Changed: namespace handling for glue

* Changed: set glue as default agent system

* Changed: bump up version
2022-04-22 10:19:03 +02:00
Valeriano Manassero
25dfbd12d6 Changed: bump up versions (#62)
* Changed: bump up versions

* Changed: helmpdocs to 1.8.1
2022-04-19 08:22:08 +02:00
Valeriano Manassero
d7c3b9d5d9 Added: upgrade procedures (#61)
* Added: upgrade procedures

* Changed: template

* Changed: updated chart version
2022-04-04 10:32:51 +02:00
Valeriano Manassero
e16060f2ad Fix empty glue configs (#59)
* Added: use empty values without breaking glue agent

* Added: release namespace

* Changed: bump up version
2022-03-30 16:33:06 +02:00
Valeriano Manassero
27a666d2ae Clarml app 1.3.0 (#57)
* Changed: clarml app version

* Changed: chart version bump

* Added: comment on additional configs
2022-03-28 09:29:04 +02:00
Valeriano Manassero
d7bef0ff9d Add authentication example (#56)
* Added: auth enabled example in additionalConfigs

* Changed: bump up version

* Fixed: remove trailing spaces
2022-03-25 10:27:40 +01:00
Zied ANDOLSI
049e609ce0 add image pull secret + add ingress path (#55) 2022-03-16 18:04:56 +01:00
Niels ten Boom
fa3739b643 Improvements k8sagent (#54) 2022-03-01 17:48:33 +01:00
Valeriano Manassero
018348bc1d Fix image versions (#53)
* Fixed: image versions

* Changed: chart version

* Changed: readme update by helm-docs
2022-02-22 11:42:23 +01:00
Valeriano Manassero
57b85cbfce Update clearml image 1.2.0 (#52) 2022-02-17 15:33:30 +01:00
Niels ten Boom
9c15a8a348 fix: faulty service values references in k8s agent (#50)
* add k8s glue deployment

* more docs

* bump

* disabled by default

* run helm-docs

* fix service references

* fix readme

* add values file where k8sagent enabled

* empty files

* newline

* fix linter

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-01-21 16:15:09 +01:00
Niels ten Boom
cd7f22f7d8 feat: Add k8s glue agent deployment (#49) 2022-01-18 23:27:12 +01:00
Shaun Howell
078e394e24 update ingress templates to accept per-service annotation overrides (#48)
* update ingress yamls to accept annotation overrides

* bump version to 3.3.1

* update readme via helm-docs
2022-01-18 18:06:01 +01:00
Valeriano Manassero
70b07c637a Update Elasticsearch (#47)
* update elasticsearch

* update elasticsearch reference

* bump up chart version
2022-01-05 08:26:57 +01:00
Valeriano Manassero
7b8e40c626 Agent foreground mode (#46)
* use foreground to push output on console

* bump up version
2021-12-13 09:04:02 +01:00
Valeriano Manassero
d8117eeb0d add k8s 1.22.1 to ci procedure (#44)
After some tests I found 1.22.1 doesn't have ulimit issue so I can include it into the ci process
2021-12-09 11:47:13 +01:00
Valeriano Manassero
4c09ae2c92 Fix env typo (#39)
* typo fix

* bump up version
2021-12-09 11:39:04 +01:00
Valeriano Manassero
478eecd5f2 remove k8s 1.22 from ci (#43)
It looks 1.22 k8s image from kind has a very low ulimit preventing elastic search from installing, removing it waiting for a fix.
2021-12-09 11:30:15 +01:00
Valeriano Manassero
43f4c44219 test one single kind cluster at time to avoid pressure fails (#42) 2021-12-09 11:00:57 +01:00
Valeriano Manassero
b83c8cd0e8 indentation fix (#41) 2021-12-09 10:32:42 +01:00
Valeriano Manassero
97f219228d update kind k8s versions (#40) 2021-12-09 10:31:17 +01:00
Valeriano Manassero
1b5b9407f6 Configurable auth cookies age (#38)
* configurable auth cookies age

* version bump up
2021-12-09 08:14:09 +01:00
Valeriano Manassero
b494a8c0cf External services (#36)
* use external services switch

* bump up version

* readme update
2021-11-26 08:11:55 +01:00
Weixiao Huang
266a1e3c41 feat: make service nodePort configurable and add some doc descriptions (#33)
* feat: make service nodePort configurable

* feat: bump version to 3.0.6

* docs: add descriptions for secret and service fields

* feat: add comments in clearml-kind.yaml of README.md

Co-authored-by: 黄维啸 <huangweixiao@megvii.com>
2021-11-08 14:23:10 +01:00
Weixiao Huang
bba5c0769f feat: make secret configurable and add secret annotations to deployment (#32) 2021-11-04 20:36:21 +01:00
Valeriano Manassero
b7f73e3bd9 Switch enabler agentservices (#31)
* switch to enable/disable agentservices

* bump up version
2021-09-21 14:16:30 +02:00
Valeriano Manassero
d3f6f3e50d Fix helper typo (#30)
* fix helper typo on api service name

* bump up version
2021-09-16 11:21:25 +02:00
Valeriano Manassero
979e73fe3d Fix ingress compat (#29)
* fix ingress compatibility with different k8s version

* bump up version
2021-09-16 10:54:25 +02:00
Valeriano Manassero
7352f35836 Helpers fix (#28)
* fix wrong service names

* bump up version
2021-09-16 09:11:58 +02:00
Valeriano Manassero
82ad17860d New ingress style (#27)
* new ingress style

* bump up version

* hostName fix

* helm-docs update
2021-09-16 08:51:07 +02:00
Valeriano Manassero
aa761dd450 Agent enable switch (#26)
* enable/disable switch

* bump up chart
2021-09-15 08:13:01 +02:00
Valeriano Manassero
7ff2f94d1a Apiserver configmap (#25)
* metadata name fix

* use toString

* use configmap for apiserver configs

* bump up version

* indentation fix

* fix trailing whitespaces
2021-09-14 15:43:10 +02:00
Valeriano Manassero
618a269c97 Fix service url generation (#21)
* service url generation functions

* use generation functions

* bump up version
2021-08-26 10:58:06 +02:00
Valeriano Manassero
3f215d2d90 Use many ingresses (#20)
* use many ingresses

* bump up version
2021-08-25 14:49:43 +02:00
Valeriano Manassero
03223fc1c1 Use Recreate as strategy (#19)
* use Recreate as strategybump up version

* fix strategy indentation and position

* updatesStrategy configurable

* updateStrategy parameter

* use 2.2.0 instead of patch release
2021-08-17 14:59:13 +02:00
Valeriano Manassero
898089b7fb Fix agent clearml conf (#18)
* fix agent mount clearml.conf

* bump up version
2021-08-14 12:00:23 +02:00
Valeriano Manassero
732bb970aa Configurable prefix ingress (#17)
* ingress configurable prefixes

* chart version bump up

* fix version number
2021-08-12 12:02:26 +03:00
Valeriano Manassero
91d45281fa bump up app version to 1.1.1 (#16) 2021-08-06 14:18:53 +02:00
Valeriano Manassero
28b6e9f4e4 Remove badges from root README since we have them in chart one (#14) 2021-07-27 16:18:02 +02:00
Valeriano Manassero
7f6df85ec5 Bump up version (#13)
* bump up to 1.1.0

* helm-docs update
2021-07-27 15:26:51 +02:00
Valeriano Manassero
97f1077072 Readme improvements (#12)
* better contributing guidelins

* added more info on repo itself
2021-07-27 13:30:55 +02:00
Valeriano Manassero
189de106c9 Kind data folder (#11)
* explain data folder for kind

* bump up version
2021-07-16 07:30:09 +02:00
Valeriano Manassero
d269374a49 One default agent (#10)
* one cpu only agent by default

* helm-docs update

* suggest kind for single done cluster

* bump up version

* fix trailing space
2021-07-15 17:34:29 +02:00
Valeriano Manassero
cc8789d71f Clearml chart readme improvements (#7)
* clearml chart LICENSE

* bump up version

* improved readme

* clearml chart name fix
2021-07-07 11:44:21 +02:00
Valeriano Manassero
6a2e3ed47e typo fixes (#6) 2021-07-07 09:39:04 +02:00
Valeriano Manassero
873fb6f7f0 added repo update reference (#5) 2021-07-07 09:28:30 +02:00
Valeriano Manassero
d6e967c9f5 filter release trigger (#4) 2021-07-07 09:20:48 +02:00
Valeriano Manassero
a98a0804ed force a trigger for every change until I got it working (#3) 2021-07-07 09:18:09 +02:00
Valeriano Manassero
2190f568fb Fix release action (#2)
* typo fix

* add dependencies repos
2021-07-07 09:15:53 +02:00
Valeriano Manassero
cf5a2e5fe6 Initial load (#1)
* imported chart

* ci

* docs

* added myself as maintainer

* helm-docs update

* bump up action version

* typo fix

* fix official webste

* fix allegro ai website
2021-07-07 09:04:15 +02:00
230 changed files with 19733 additions and 1 deletions

12
.github/helm-docs.sh vendored Executable file
View File

@@ -0,0 +1,12 @@
#!/bin/bash
CHART_DIRS="$(git diff --find-renames --name-only "$(git rev-parse --abbrev-ref HEAD)" remotes/origin/main -- 'charts' | grep '[cC]hart.yaml' | sed -e 's#/[Cc]hart.yaml##g')"
HELM_DOCS_VERSION="1.11.0"
curl --silent --show-error --fail --location --output /tmp/helm-docs.tar.gz https://github.com/norwoodj/helm-docs/releases/download/v"${HELM_DOCS_VERSION}"/helm-docs_"${HELM_DOCS_VERSION}"_Linux_x86_64.tar.gz
tar -xf /tmp/helm-docs.tar.gz helm-docs
for CHART_DIR in ${CHART_DIRS}; do
./helm-docs -c ${CHART_DIR}
git diff --exit-code
done

48
.github/workflows/ci.yaml vendored Normal file
View File

@@ -0,0 +1,48 @@
name: Lint and Test Charts
on:
pull_request:
types: [opened, synchronize, edited, reopened]
paths:
- 'charts/**'
jobs:
lint-docs:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v1
- name: Run helm-docs
run: .github/helm-docs.sh
install-chart:
name: install-chart
runs-on: ubuntu-latest
needs:
- lint-docs
strategy:
matrix:
k8s:
- v1.22.13
- v1.23.10
- v1.24.4
- v1.25.0
steps:
- name: Checkout
uses: actions/checkout@v1
- name: Create kind ${{ matrix.k8s }} cluster
uses: helm/kind-action@v1.3.0
with:
node_image: kindest/node:${{ matrix.k8s }}
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.2.1
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed --chart-dirs=charts --target-branch=main)
if [[ -n "$changed" ]]; then
echo "::set-output name=changed::true"
echo "::set-output name=changed_charts::\"${changed//$'\n'/,}\""
fi
- name: Run chart-testing (lint and install)
run: ct lint-and-install --chart-dirs=charts --target-branch=main --helm-extra-args="--timeout=15m" --charts=${{steps.list-changed.outputs.changed_charts}} --debug=true
if: steps.list-changed.outputs.changed == 'true'

29
.github/workflows/release.yaml vendored Normal file
View File

@@ -0,0 +1,29 @@
name: Release Charts
on:
push:
branches:
- main
paths:
- 'charts/**'
jobs:
release:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v1
- name: Add bitnami repo
run: helm repo add bitnami https://charts.bitnami.com/bitnami
- name: Add elastic repo
run: helm repo add elastic https://helm.elastic.co
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Run chart-releaser
uses: helm/chart-releaser-action@v1.2.1
env:
CR_TOKEN: '${{ secrets.CR_TOKEN }}'
with:
charts_dir: charts

60
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,60 @@
# Guidelines for Contributing
:+1::tada: Firstly, we thank you for taking the time to contribute! :tada::+1:
Contribution comes in many forms:
* Reporting [issues](https://github.com/allegroai/clearml-helm-charts/issues) you've come upon
* Participating in issue discussions in the [issue tracker](https://github.com/allegroai/clearml-helm-charts/issues) and the [ClearML community slack space](https://join.slack.com/t/allegroai-trains/shared_invite/enQtOTQyMTI1MzQxMzE4LTY5NTUxOTY1NmQ1MzQ5MjRhMGRhZmM4ODE5NTNjMTg2NTBlZGQzZGVkMWU3ZDg1MGE1MjQxNDEzMWU2NmVjZmY)
* Suggesting new features or enhancements
* Implementing new features or fixing outstanding issues
The following is a set of guidelines for contributing to ClearML.
These are primarily guidelines, not rules.
Use your best judgment and feel free to propose changes to this document in a pull request.
## Reporting Issues
By following these guidelines, you help maintainers and the community understand your report, reproduce the behavior, and find related reports.
Before reporting an issue, please check whether it already appears [here](https://github.com/allegroai/clearml-helm-charts/issues).
If it does, join the on-going discussion instead.
**Note**: If you find a **Closed** issue that may be the same issue which you are currently experiencing,
then open a **New** issue and include a link to the original (Closed) issue in the body of your new one.
When reporting an issue, please include as much detail as possible: explain the problem and include additional details to help maintainers reproduce the problem:
* **Use a clear and descriptive title** for the issue to identify the problem.
* **Describe the exact steps necessary to reproduce the problem** in as much detail as possible. Please do not just summarize what you did. Make sure to explain how you did it.
* **Provide the specific environment setup.** Include the `pip freeze` output, specific environment variables, Python version, and other relevant information.
* **Provide specific examples to demonstrate the steps.** Include links to files or GitHub projects, or copy/paste snippets which you use in those examples.
* **If you are reporting any ClearML crash,** include a crash report with a stack trace from the operating system. Make sure to add the crash report in the issue and place it in a [code block](https://help.github.com/en/articles/getting-started-with-writing-and-formatting-on-github#multiple-lines),
a [file attachment](https://help.github.com/articles/file-attachments-on-issues-and-pull-requests/), or just put it in a [gist](https://gist.github.com/) (and provide link to that gist).
* **Describe the behavior you observed after following the steps** and the exact problem with that behavior.
* **Explain which behavior you expected to see and why.**
* **For Web-App issues, please include screenshots and animated GIFs** which recreate the described steps and clearly demonstrate the problem. You can use [LICEcap](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [silentcast](https://github.com/colinkeenan/silentcast) or [byzanz](https://github.com/threedaymonk/byzanz) on Linux.
## Suggesting New Features and Enhancements
By following these guidelines, you help maintainers and the community understand your suggestion and find related suggestions.
Enhancement suggestions are tracked as GitHub issues. After you determine which repository your enhancement suggestion is related to, create an issue on that repository and provide the following:
* **A clear and descriptive title** for the issue to identify the suggestion.
* **A step-by-step description of the suggested enhancement** in as much detail as possible.
* **Specific examples to demonstrate the steps.** Include copy/pasteable snippets which you use in those examples as [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
* **Describe the current behavior and explain which behavior you expected to see instead and why.**
* **Include screenshots or animated GIFs** which help you demonstrate the steps or point out the part of ClearML which the suggestion is related to. You can use [LICEcap](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [silentcast](https://github.com/colinkeenan/silentcast) or [byzanz](https://github.com/threedaymonk/byzanz) on Linux.
## Pull Requests
Before you submit a new PR:
* Verify the work you plan to merge addresses an existing [issue](https://github.com/allegroai/clearml-helm-charts/issues) (If not, open a new one)
* Check related discussions in the [ClearML slack community](https://join.slack.com/t/allegroai-trains/shared_invite/enQtOTQyMTI1MzQxMzE4LTY5NTUxOTY1NmQ1MzQ5MjRhMGRhZmM4ODE5NTNjMTg2NTBlZGQzZGVkMWU3ZDg1MGE1MjQxNDEzMWU2NmVjZmY) (Or start your own discussion on the `#clearml-dev` channel)
* Make sure your code conforms to the ClearML coding standards by running:
`flake8 --max-line-length=120 --statistics --show-source --extend-ignore=E501 ./clearml*`
In your PR include:
* A reference to the issue it addresses
* A brief description of the approach you've taken for implementing

View File

@@ -1 +1,83 @@
# clearml-helm-charts
# ClearML Helm Charts Library for Kubernetes
## Auto-Magical Experiment Manager & Version Control for AI
Helm charts provided by [Allegro AI](https://clear.ml), ready to launch on Kubernetes using [Kubernetes Helm](https://github.com/helm/helm).
## Introduction
The **clearml-server** is the backend service infrastructure for [ClearML](https://github.com/allegroai/clearml).
It allows multiple users to collaborate and manage their experiments.
By default, **ClearML** is set up to work with the **ClearML** demo server, which is open to anyone and resets periodically.
In order to host your own server, you will need to install **clearml-server** and point **ClearML** to it.
**clearml-server** contains the following components:
* The **ClearML** Web-App, a single-page UI for experiment management and browsing
* RESTful API for:
* Documenting and logging experiment information, statistics and results
* Querying experiments history, logs and results
* Locally-hosted file server for storing images and models making them easily accessible using the Web-App
Use this repository to deploy **clearml-server** on Kubernetes clusters.
## Provided in this repository
### [All around Helm Chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml)
## Who We Are
ClearML is supported by the team behind *allegro.ai*,
where we build deep learning pipelines and infrastructure for enterprise companies.
We built ClearML to track and control the glorious but messy process of training production-grade deep learning models.
We are committed to vigorously supporting and expanding the capabilities of ClearML.
We promise to always be backwardly compatible, making sure all your logs, data and pipelines
will always upgrade with you.
## License
Apache License, Version 2.0, (see the [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) for more information)
## Requirements
### Setup a Kubernetes Cluster
For setting up Kubernetes on various platforms refer to the Kubernetes [getting started guide](http://kubernetes.io/docs/getting-started-guides/).
### Setup a single node LOCAL Kubernetes on laptop/desktop
For setting up Kubernetes on your laptop/desktop we suggest [kind](https://kind.sigs.k8s.io).
### Install Helm
Helm is a tool for managing Kubernetes charts. Charts are packages of pre-configured Kubernetes resources.
To install Helm, refer to the [Helm install guide](https://github.com/helm/helm#install) and ensure that the `helm` binary is in the `PATH` of your shell.
## Usage
```bash
$ helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
$ helm repo update
$ helm search repo allegroai
$ helm install <release-name> allegroai/<chart>
```
## Documentation, Community & Support
More information in the [official documentation](https://allegro.ai/clearml/docs) and [on YouTube](https://www.youtube.com/c/ClearML).
If you have any questions: post on our [Slack Channel](https://join.slack.com/t/clearml/shared_invite/zt-c0t13pty-aVUZZW1TSSSg2vyIGVPBhg), or tag your questions on [stackoverflow](https://stackoverflow.com/questions/tagged/clearml) with '**[clearml](https://stackoverflow.com/questions/tagged/clearml)**' tag (*previously [trains](https://stackoverflow.com/questions/tagged/trains) tag*).
For feature requests or bug reports, please use [GitHub issues](https://github.com/allegroai/clearml-helm-charts/issues).
Additionally, you can always find us at *clearml@allegro.ai*
## Contributing
**PRs are always welcomed** :heart: See more details in the ClearML [Guidelines for Contributing](https://github.com/allegroai/clearml-helm-charts/blob/main/CONTRIBUTING.md).
_May the force (and the goddess of learning rates) be with you!_

View File

@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

View File

@@ -0,0 +1,19 @@
apiVersion: v2
name: clearml-agent
description: MLOps platform
type: application
version: "3.1.4"
appVersion: "1.24"
kubeVersion: ">= 1.19.0-0 < 1.26.0-0"
home: https://clear.ml
icon: https://raw.githubusercontent.com/allegroai/clearml/master/docs/clearml-logo.svg
sources:
- https://github.com/allegroai/clearml-helm-charts
- https://github.com/allegroai/clearml
maintainers:
- name: valeriano-manassero
url: https://github.com/valeriano-manassero
keywords:
- clearml
- "machine learning"
- mlops

View File

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -0,0 +1,123 @@
# ClearML Kubernetes Agent
![Version: 3.1.4](https://img.shields.io/badge/Version-3.1.4-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
MLOps platform
**Homepage:** <https://clear.ml>
## Maintainers
| Name | Email | Url |
| ---- | ------ | --- |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
## Introduction
The **clearml-agent** is the Kubernetes agent for for [ClearML](https://github.com/allegroai/clearml).
It allows you to schedule distributed experiments on a Kubernetes cluster.
## Source Code
* <https://github.com/allegroai/clearml-helm-charts>
* <https://github.com/allegroai/clearml>
## Requirements
Kubernetes: `>= 1.19.0-0 < 1.26.0-0`
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentk8sglue | object | `{"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerCustomBashScript":"","customBashScript":"","debugMode":false,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"labels":{},"nodeSelector":{},"queue":"default","replicaCount":1,"serviceExistingAccountName":"","volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue.annotations | object | `{}` | annotations setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.apiServerUrlReference | string | `"https://api.clear.ml"` | Reference to Api server url |
| agentk8sglue.basePodTemplate | object | `{"annotations":{},"env":[],"fileMounts":[],"hostAliases":{},"initContainers":[],"labels":{},"nodeSelector":{},"resources":{},"schedulerName":"","securityContext":{},"tolerations":[],"volumeMounts":[],"volumes":[]}` | base template for pods spawned to consume ClearML Task |
| agentk8sglue.basePodTemplate.annotations | object | `{}` | annotations setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.env | list | `[]` | environment variables for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.fileMounts | list | `[]` | file definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.hostAliases | object | `{}` | hostAliases setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.initContainers | list | `[]` | initContainers definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.labels | object | `{}` | labels setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.nodeSelector | object | `{}` | nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.resources | object | `{}` | resources declaration for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.schedulerName | string | `""` | schedulerName setup for pods spawned to consume ClearML Task |
| agentk8sglue.basePodTemplate.securityContext | object | `{}` | securityContext setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.tolerations | list | `[]` | tolerations setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.volumeMounts | list | `[]` | volume mounts definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.volumes | list | `[]` | volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.clearmlcheckCertificate | bool | `true` | Check certificates validity for evefry UrlReference below. |
| agentk8sglue.containerCustomBashScript | string | `""` | Custom Bash script for the Task Pods ran by Glue Agent |
| agentk8sglue.debugMode | bool | `false` | Enable Debugging logs for Agent pod |
| agentk8sglue.defaultContainerImage | string | `"ubuntu:18.04"` | default container image for ClearML Task pod |
| agentk8sglue.extraEnvs | list | `[]` | Extra Environment variables for Glue Agent |
| agentk8sglue.fileMounts | list | `[]` | file definition for Glue Agent (example in values.yaml comments) |
| agentk8sglue.fileServerUrlReference | string | `"https://files.clear.ml"` | Reference to File server url |
| agentk8sglue.image | object | `{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"}` | Glue Agent image configuration |
| agentk8sglue.labels | object | `{}` | labels setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.nodeSelector | object | `{}` | nodeSelector setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.queue | string | `"default"` | ClearML queue this agent will consume |
| agentk8sglue.replicaCount | int | `1` | Glue Agent number of pods |
| agentk8sglue.serviceExistingAccountName | string | `""` | if set, don't create a serviceAccountName but use defined existing one |
| agentk8sglue.volumeMounts | list | `[]` | volume mounts definition for Glue Agent (example in values.yaml comments) |
| agentk8sglue.volumes | list | `[]` | volumes definition for Glue Agent (example in values.yaml comments) |
| agentk8sglue.webServerUrlReference | string | `"https://app.clear.ml"` | Reference to Web server url |
| clearml | object | `{"agentk8sglueKey":"ACCESSKEY","agentk8sglueSecret":"SECRETKEY","clearmlConfig":"sdk {\n}","existingAgentk8sglueSecret":"","existingClearmlConfigSecret":""}` | ClearMl generic configurations |
| clearml.agentk8sglueKey | string | `"ACCESSKEY"` | Agent k8s Glue basic auth key |
| clearml.agentk8sglueSecret | string | `"SECRETKEY"` | Agent k8s Glue basic auth secret |
| clearml.clearmlConfig | string | `"sdk {\n}"` | ClearML configuration file |
| clearml.existingAgentk8sglueSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| clearml.existingClearmlConfigSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| enterpriseFeatures | object | `{"agentImageTagOverride":"1.24-57","applyVaultEnvVars":true,"enabled":false,"maxPods":10,"monitoredResources":{"maxResources":0,"maxResourcesFieldName":"resources|limits|nvidia.com/gpu","minResourcesFieldName":"resources|limits|nvidia.com/gpu"},"queues":null,"serviceAccountClusterAccess":false,"useOwnerToken":true}` | Enterprise features (work only with an Enterprise license) |
| enterpriseFeatures.agentImageTagOverride | string | `"1.24-57"` | Image tag override for enterprise version |
| enterpriseFeatures.applyVaultEnvVars | bool | `true` | push env vars from Clear.ML Vault to task pods |
| enterpriseFeatures.enabled | bool | `false` | Enable/Disable Enterprise features |
| enterpriseFeatures.maxPods | int | `10` | maximum concurrent consume ClearML Task pod |
| enterpriseFeatures.monitoredResources | object | `{"maxResources":0,"maxResourcesFieldName":"resources|limits|nvidia.com/gpu","minResourcesFieldName":"resources|limits|nvidia.com/gpu"}` | GPU resource general counters |
| enterpriseFeatures.monitoredResources.maxResources | int | `0` | Maximum resources counter |
| enterpriseFeatures.monitoredResources.maxResourcesFieldName | string | `"resources|limits|nvidia.com/gpu"` | Field name used by Agent to count maximum resources |
| enterpriseFeatures.monitoredResources.minResourcesFieldName | string | `"resources|limits|nvidia.com/gpu"` | Field name used by Agent to count minimum resources |
| enterpriseFeatures.queues | string | `nil` | ClearML queues and related template OVERRIDES used this agent will consume |
| enterpriseFeatures.serviceAccountClusterAccess | bool | `false` | service account access every namespace flag |
| enterpriseFeatures.useOwnerToken | bool | `true` | Agent must use owner Token |
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Private image registry configuration |
| imageCredentials.email | string | `"someone@host.com"` | Email |
| imageCredentials.enabled | bool | `false` | Use private authentication mode |
| imageCredentials.existingSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| imageCredentials.password | string | `"pwd"` | Registry password |
| imageCredentials.registry | string | `"docker.io"` | Registry name |
| imageCredentials.username | string | `"someone"` | Registry username |
| sessions | object | `{"dynamicSvcs":false,"externalIP":"0.0.0.0","maxServices":20,"portModeEnabled":false,"setInteractiveQueuesTag":true,"startingPort":30000,"svcAnnotations":{},"svcType":"NodePort"}` | Sessions internal service configuration |
| sessions.dynamicSvcs | bool | `false` | Enable/Disable dynamic svc for sessions pods |
| sessions.externalIP | string | `"0.0.0.0"` | External IP sessions clients can connect to |
| sessions.maxServices | int | `20` | maximum number of NodePorts exposed |
| sessions.portModeEnabled | bool | `false` | Enable/Disable sessions portmode WARNING: only one Agent deployment can have this set to true |
| sessions.setInteractiveQueuesTag | bool | `true` | set interactive queue tags |
| sessions.startingPort | int | `30000` | starting range of exposed NodePorts |
| sessions.svcAnnotations | object | `{}` | specific annotations for session services |
| sessions.svcType | string | `"NodePort"` | service type ("NodePort" or "ClusterIP" or "LoadBalancer") |
# Upgrading Chart
### From v1.x to v2.x
Chart 1.x was under the assumption that all mounted volumes would be PVC's. Version > 2.x allows for more flexibility and will inject the yaml from podTemplate.volumes and podtemplate.volumeMounts directly.
v1.x
```
volumes:
- name: "yourvolume"
path: "/yourpath"
```
v2.x
```
volumes:
- name: "yourvolume"
persistentVolumeClaim:
claimName: "yourvolume"
volumeMounts:
- name: "yourvolume"
mountPath: "/yourpath"
```

View File

@@ -0,0 +1,45 @@
# ClearML Kubernetes Agent
{{ template "chart.deprecationWarning" . }}
{{ template "chart.badgesSection" . }}
{{ template "chart.description" . }}
{{ template "chart.homepageLine" . }}
{{ template "chart.maintainersSection" . }}
## Introduction
The **clearml-agent** is the Kubernetes agent for for [ClearML](https://github.com/allegroai/clearml).
It allows you to schedule distributed experiments on a Kubernetes cluster.
{{ template "chart.sourcesSection" . }}
{{ template "chart.requirementsSection" . }}
{{ template "chart.valuesSection" . }}
# Upgrading Chart
### From v1.x to v2.x
Chart 1.x was under the assumption that all mounted volumes would be PVC's. Version > 2.x allows for more flexibility and will inject the yaml from podTemplate.volumes and podtemplate.volumeMounts directly.
v1.x
```
volumes:
- name: "yourvolume"
path: "/yourpath"
```
v2.x
```
volumes:
- name: "yourvolume"
persistentVolumeClaim:
claimName: "yourvolume"
volumeMounts:
- name: "yourvolume"
mountPath: "/yourpath"
```

View File

@@ -0,0 +1,3 @@
clearml:
agentk8sglueKey: "AGENTK8SGLUEKEY"
agentk8sglueSecret: "AGENTK8SGLUESECRET"

View File

@@ -0,0 +1 @@
Glue Agent deployed.

View File

@@ -0,0 +1,85 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "clearml.name" -}}
{{- .Release.Name | trunc 59 | trimSuffix "-" }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "clearml.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 59 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "clearml.labels" -}}
helm.sh/chart: {{ include "clearml.chart" . }}
{{ include "clearml.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- if $.Values.agentk8sglue.labels }}
{{ toYaml $.Values.agentk8sglue.labels }}
{{- end }}
{{- end }}
{{/*
Common annotations
*/}}
{{- define "clearml.annotations" -}}
{{- if $.Values.agentk8sglue.annotations }}
{{ toYaml $.Values.agentk8sglue.annotations }}
{{- end }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "clearml.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Selector labels (agentk8sglue)
*/}}
{{- define "agentk8sglue.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ include "clearml.name" . }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "clearml.serviceAccountName" -}}
{{- if .Values.agentk8sglue.serviceExistingAccountName }}
{{- .Values.agentk8sglue.serviceExistingAccountName }}
{{- else }}
{{- include "clearml.name" . }}-sa
{{- end }}
{{- end }}
{{/*
Create secret to access docker registry
*/}}
{{- define "imagePullSecret" }}
{{- with .Values.imageCredentials }}
{{- printf "{\"auths\":{\"%s\":{\"username\":\"%s\",\"password\":\"%s\",\"email\":\"%s\",\"auth\":\"%s\"}}}" .registry .username .password .email (printf "%s:%s" .username .password | b64enc) | b64enc }}
{{- end }}
{{- end }}
{{/*
Create a string composed by queue names
*/}}
{{- define "agentk8sglue.queues" -}}
{{- $list := list }}
{{- range $key, $value := .Values.enterpriseFeatures.queues }}
{{- $list = append $list (printf "%s" $key) }}
{{- end }}
{{- join " " $list }}
{{- end }}

View File

@@ -0,0 +1,283 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "clearml.name" . }}-pt
data:
{{- if .Values.enterpriseFeatures.enabled }}
template.yaml: |
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
{{ $key }}:
apiVersion: v1
metadata:
namespace: {{ $.Release.Namespace }}
{{- if $value.templateOverrides.labels }}
labels:
{{- toYaml $value.templateOverrides.labels | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.labels }}
labels:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.labels | nindent 10 }}
{{- end}}
{{- if $value.templateOverrides.annotations }}
annotations:
{{- toYaml $value.templateOverrides.annotations | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.annotations }}
annotations:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.annotations | nindent 10 }}
{{- end}}
spec:
{{- if $.Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if $.Values.imageCredentials.existingSecret }}
- name: $.Values.imageCredentials.existingSecret
{{- else }}
- name: {{ include "clearml.name" $ }}-ark
{{- end }}
{{- end }}
{{- if $value.templateOverrides.schedulerName }}
schedulerName: {{ $value.templateOverrides.schedulerName }}
{{- else if $.Values.agentk8sglue.basePodTemplate.schedulerName }}
schedulerName: {{ $.Values.agentk8sglue.basePodTemplate.schedulerName }}
{{- end}}
restartPolicy: Never
{{- if $value.templateOverrides.securityContext }}
securityContext:
{{- toYaml $value.templateOverrides.securityContext | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.securityContext }}
securityContext:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.securityContext | nindent 10 }}
{{- end}}
{{- if $value.templateOverrides.hostAliases }}
{{- with $value.templateOverrides.hostAliases }}
hostAliases:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.hostAliases }}
{{- with $.Values.agentk8sglue.basePodTemplate.hostAliases }}
hostAliases:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
volumes:
{{- if $value.templateOverrides.volumes }}
{{- toYaml $value.templateOverrides.volumes | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.volumes }}
{{- toYaml $.Values.agentk8sglue.basePodTemplate.volumes | nindent 10 }}
{{- end }}
{{- if $value.templateOverrides.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearml.name" $ }}-{{ $key }}-fm
{{- else if $.Values.agentk8sglue.basePodTemplate.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearml.name" $ }}-fm
{{- end }}
{{- if not $.Values.enterpriseFeatures.serviceAccountClusterAccess }}
serviceAccountName: {{ include "clearml.serviceAccountName" $ }}
{{- end }}
{{- if $value.templateOverrides.initContainers }}
initContainers:
{{- toYaml $value.templateOverrides.initContainers | nindent 10 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.initContainers }}
initContainers:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.initContainers | nindent 10 }}
{{- end }}
containers:
- resources:
{{- if $value.templateOverrides.resources }}
{{- toYaml $value.templateOverrides.resources | nindent 12 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.resources }}
{{- toYaml $.Values.agentk8sglue.basePodTemplate.resources | nindent 12 }}
{{- end}}
ports:
- containerPort: 10022
volumeMounts:
{{- if $value.templateOverrides.volumeMounts }}
{{- toYaml $value.templateOverrides.volumeMounts | nindent 12 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.volumeMounts }}
{{- toYaml $.Values.agentk8sglue.basePodTemplate.volumeMounts | nindent 12 }}
{{- end }}
{{- if $value.templateOverrides.fileMounts }}
{{- range $value.templateOverrides.fileMounts }}
- name: filemounts
mountPath: "{{ .folderPath }}/{{ .name }}"
subPath: "{{ .name }}"
readOnly: true
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.fileMounts }}
{{- range $.Values.agentk8sglue.basePodTemplate.fileMounts }}
- name: filemounts
mountPath: "{{ .folderPath }}/{{ .name }}"
subPath: "{{ .name }}"
readOnly: true
{{- end }}
{{- end }}
env:
- name: CLEARML_API_HOST
value: {{ $.Values.agentk8sglue.apiServerUrlReference }}
- name: CLEARML_WEB_HOST
value: {{ $.Values.agentk8sglue.webServerUrlReference }}
- name: CLEARML_FILES_HOST
value: {{ $.Values.agentk8sglue.fileServerUrlReference }}
{{- if not $.Values.enterpriseFeatures.useOwnerToken }}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
{{- if $.Values.clearml.existingAgentk8sglueSecret }}
name: {{ $.Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearml.name" $ }}-ac
{{- end }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
{{- if $.Values.clearml.existingAgentk8sglueSecret }}
name: {{ $.Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearml.name" $ }}-ac
{{- end }}
key: agentk8sglue_secret
{{- end }}
- name: PYTHONUNBUFFERED
value: "x"
{{- if not $.Values.agentk8sglue.clearmlcheckCertificate }}
- name: CLEARML_API_HOST_VERIFY_CERT
value: "false"
{{- end }}
{{- if $value.templateOverrides.env }}
{{- toYaml $value.templateOverrides.env | nindent 12 }}
{{- else if $.Values.agentk8sglue.basePodTemplate.env }}
{{- toYaml $.Values.agentk8sglue.basePodTemplate.env | nindent 12 }}
{{- end }}
{{- if $value.templateOverrides.nodeSelector }}
{{- with $value.templateOverrides.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 12 }}
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.nodeSelector }}
{{- with $.Values.agentk8sglue.basePodTemplate.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
{{- if $value.templateOverrides.tolerations }}
{{- with $value.templateOverrides.tolerations }}
tolerations:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- else if $.Values.agentk8sglue.basePodTemplate.tolerations }}
{{- with $.Values.agentk8sglue.basePodTemplate.tolerations }}
tolerations:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
{{- end }}
secrets.yaml: |
{{- range $key, $value := $.Values.enterpriseFeatures.queues }}
{{ $key }}:
{{- if $value.templateOverrides.fileMounts }}
- {{ include "clearml.name" $ }}-{{ $key }}-fm
{{- else if $.Values.agentk8sglue.basePodTemplate.fileMounts }}
- {{ include "clearml.name" $ }}-fm
{{- end }}
{{- end }}
{{- else }}
template.yaml: |
apiVersion: v1
metadata:
namespace: {{ .Release.Namespace }}
labels:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.labels | nindent 8 }}
annotations:
{{- toYaml $.Values.agentk8sglue.basePodTemplate.annotations | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{.Values.imageCredentials.existingSecret}}
{{- else }}
- name: name: {{ include "clearml.name" $ }}-ark
{{- end }}
{{- end }}
{{- with .Values.agentk8sglue.basePodTemplate.volumes }}
volumes:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "clearml.serviceAccountName" $ }}
containers:
- resources:
{{- toYaml .Values.agentk8sglue.basePodTemplate.resources | nindent 10 }}
ports:
- containerPort: 10022
{{- with .Values.agentk8sglue.basePodTemplate.volumeMounts }}
volumeMounts:
{{- toYaml . | nindent 10 }}
{{- end }}
env:
- name: CLEARML_API_HOST
value: {{.Values.agentk8sglue.apiServerUrlReference}}
- name: CLEARML_WEB_HOST
value: {{.Values.agentk8sglue.webServerUrlReference}}
- name: CLEARML_FILES_HOST
value: {{.Values.agentk8sglue.fileServerUrlReference}}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearml.name" . }}-ac
{{- end }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "clearml.name" . }}-ac
{{- end }}
key: agentk8sglue_secret
{{- if .Values.agentk8sglue.basePodTemplate.env }}
{{ toYaml .Values.agentk8sglue.basePodTemplate.env | nindent 8 }}
{{- end }}
{{- with .Values.agentk8sglue.basePodTemplate.nodeSelector}}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentk8sglue.basePodTemplate.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
{{- if .Values.sessions.portModeEnabled }}
{{- range untilStep 1 ( ( add .Values.sessions.maxServices 1 ) | int ) 1 }}
services-{{ . }}.yaml: |
apiVersion: v1
kind: Service
metadata:
name: clearml-session-{{ . }}
labels:
{{- include "clearml.labels" $ | nindent 8 }}
{{- with $.Values.sessions.svcAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
spec:
type: {{ $.Values.sessions.svcType }}
ports:
- targetPort: 10022
{{- if eq $.Values.sessions.svcType "NodePort" }}
port: 10022
{{- else }}
port: {{ add $.Values.sessions.startingPort . }}
{{- end }}
protocol: TCP
{{- if eq $.Values.sessions.svcType "NodePort" }}
nodePort: {{ add $.Values.sessions.startingPort . }}
{{- end }}
selector:
ai.allegro.agent.serial: pod-{{ . }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,198 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "clearml.name" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
annotations:
{{- include "clearml.annotations" . | nindent 4 }}
spec:
replicas: {{ .Values.agentk8sglue.replicaCount }}
selector:
matchLabels:
{{- include "agentk8sglue.selectorLabels" . | nindent 6 }}
template:
metadata:
annotations:
checksum/config: {{ printf "%s%s" .Values.clearml .Values.agentk8sglue | sha256sum }}
{{- include "clearml.annotations" . | nindent 8 }}
labels:
{{- include "clearml.labels" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
{{- else }}
- name: {{ include "clearml.name" . }}-ark
{{- end }}
{{- end }}
serviceAccountName: {{ include "clearml.serviceAccountName" . }}
initContainers:
- name: init-k8s-glue
{{- if .Values.enterpriseFeatures.enabled }}
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.enterpriseFeatures.agentImageTagOverride }}"
{{- else }}
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.agentk8sglue.image.tag }}"
{{- end }}
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.apiServerUrlReference}}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done;
while [[ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.fileServerUrlReference}}/" -o /dev/null) =~ 403|405 ]] ; do
echo "waiting for fileserver" ;
sleep 5 ;
done;
while [ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.webServerUrlReference}}/" -o /dev/null) -ne 200 ] ; do
echo "waiting for webserver" ;
sleep 5 ;
done
containers:
- name: k8s-glue
{{- if .Values.enterpriseFeatures.enabled }}
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.enterpriseFeatures.agentImageTagOverride }}"
{{- else }}
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.agentk8sglue.image.tag }}"
{{- end }}
imagePullPolicy: IfNotPresent
command:
- /bin/bash
- -c
- >
export PATH=$PATH:$HOME/bin;
source /root/.bashrc && /root/entrypoint.sh
volumeMounts:
- name: {{ include "clearml.name" . }}-pt
mountPath: /root/template
{{ if .Values.clearml.clearmlConfig }}
- name: k8sagent-clearml-conf-volume
mountPath: /root/clearml.conf
subPath: clearml.conf
readOnly: true
{{- end }}
{{- if .Values.agentk8sglue.volumeMounts }}
{{- toYaml .Values.agentk8sglue.volumeMounts | nindent 10 }}
{{- end }}
{{- range .Values.agentk8sglue.fileMounts }}
- name: filemounts
mountPath: "{{ .folderPath }}/{{ .name }}"
subPath: "{{ .name }}"
readOnly: true
{{- end }}
env:
- name: CLEARML_API_HOST
value: "{{.Values.agentk8sglue.apiServerUrlReference}}"
- name: CLEARML_WEB_HOST
value: "{{.Values.agentk8sglue.webServerUrlReference}}"
- name: CLEARML_FILES_HOST
value: "{{.Values.agentk8sglue.fileServerUrlReference}}"
{{- if not .Values.agentk8sglue.clearmlcheckCertificate }}
- name: CLEARML_API_HOST_VERIFY_CERT
value: "false"
{{- end }}
{{- if .Values.sessions.portModeEnabled }}
- name: K8S_GLUE_EXTRA_ARGS
value: "--namespace {{ .Release.Namespace }} --template-yaml /root/template/template.yaml \
--ports-mode --num-of-services {{ .Values.sessions.maxServices }} \
--base-port {{ .Values.sessions.startingPort }} \
--gateway-address {{ .Values.sessions.externalIP }}{{ if .Values.enterpriseFeatures.enabled }}{{ if .Values.enterpriseFeatures.useOwnerToken }} --use-owner-token{{ end }}{{ end }}"
{{- if .Values.sessions.dynamicSvcs }}
- name: CLEARML_K8S_GLUE_POD_POST_APPLY_CMD
value: "kubectl -n {namespace} apply -f ~/template/services-{pod_number}.yaml ; kubectl -n {namespace} label svc clearml-session-{pod_number} service-for={pod_name}"
- name: CLEARML_K8S_GLUE_POD_POST_DELETE_CMD
value: "kubectl -n {namespace} delete svc -l service-for={pod_name}"
{{- end }}
{{- else}}
- name: K8S_GLUE_EXTRA_ARGS
value: "--namespace {{ .Release.Namespace }} --template-yaml /root/template/template.yaml \
--max-pods {{.Values.enterpriseFeatures.maxPods}}{{ if .Values.enterpriseFeatures.enabled }}{{ if .Values.enterpriseFeatures.useOwnerToken }} --use-owner-token{{ end }}{{ end }}"
{{- end }}
- name: CLEARML_K8S_GLUE_LIMIT_POD_LABEL
value: "ai.allegro.agent.serial=pod-{pod_number}"
- name: CLEARML_K8S_SECRETS_LIST_FILE
value: /root/template/secrets.yaml
- name: K8S_DEFAULT_NAMESPACE
value: "{{ .Release.Namespace }}"
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: {{ include "clearml.name" . }}-ac
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: {{ include "clearml.name" . }}-ac
key: agentk8sglue_secret
- name: CLEARML_WORKER_ID
value: {{ include "clearml.name" . }}
- name: CLEARML_AGENT_UPDATE_REPO
value: ""
- name: FORCE_CLEARML_AGENT_REPO
value: ""
- name: CLEARML_DOCKER_IMAGE
value: "{{.Values.agentk8sglue.defaultContainerImage}}"
{{ if .Values.agentk8sglue.customBashScript }}
- name: CLEARML_K8S_GLUE_EXTRA_BASH_SCRIPT
value: "{{.Values.agentk8sglue.customBashScript}}"
{{- end }}
{{ if .Values.agentk8sglue.containerCustomBashScript }}
- name: CLEARML_K8S_GLUE_POD_BASH_SCRIPT
value: "{{.Values.agentk8sglue.containerCustomBashScript}}"
{{- end }}
{{- if .Values.agentk8sglue.debugMode }}
- name: "CLEARML_K8S_GLUE_DEBUG"
value: "1"
{{- end }}
{{- if .Values.agentk8sglue.extraEnvs }}
{{ toYaml .Values.agentk8sglue.extraEnvs | nindent 10 }}
{{- end }}
{{- if .Values.sessions.portModeEnabled }}
{{- if .Values.sessions.setInteractiveQueuesTag }}
- name: "CLEARML_K8S_GLUE_SET_QUEUE_SYSTEM_TAGS"
value: "interactive"
{{- end }}
{{- end }}
{{- if .Values.enterpriseFeatures.enabled }}
- name: K8S_GLUE_QUEUE
value: {{ include "agentk8sglue.queues" . | quote }}
- name: CLEARML_K8S_GLUE_APPLY_VAULT_ENV_VARS
value: {{ .Values.enterpriseFeatures.applyVaultEnvVars | quote }}
- name: "CLEARML_K8S_GLUE_POD_MIN_RES_FIELD"
value: {{.Values.enterpriseFeatures.monitoredResources.minResourcesFieldName}}
- name: "CLEARML_K8S_GLUE_MAX_RESOURCES"
value: "{{.Values.enterpriseFeatures.monitoredResources.maxResources}}"
- name: "CLEARML_K8S_GLUE_POD_MAX_RES_FIELD"
value: {{.Values.enterpriseFeatures.monitoredResources.maxResourcesFieldName}}
{{- else }}
- name: K8S_GLUE_QUEUE
value: {{ .Values.agentk8sglue.queue }}
{{- end }}
{{- with .Values.agentk8sglue.nodeSelector}}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
volumes:
- name: {{ include "clearml.name" . }}-pt
configMap:
name: {{ include "clearml.name" . }}-pt
{{ if .Values.clearml.clearmlConfig }}
- name: k8sagent-clearml-conf-volume
secret:
secretName: {{ include "clearml.name" . }}-ac
items:
- key: clearml.conf
path: clearml.conf
{{ end }}
{{- if .Values.agentk8sglue.volumes }}
{{- toYaml .Values.agentk8sglue.volumes | nindent 8 }}
{{- end }}
{{ if .Values.agentk8sglue.fileMounts }}
- name: filemounts
secret:
secretName: {{ include "clearml.name" . }}-afm
{{- end }}

View File

@@ -0,0 +1,72 @@
{{- if not .Values.agentk8sglue.serviceExistingAccountName }}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "clearml.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
{{- end }}
{{- if .Values.enterpriseFeatures.serviceAccountClusterAccess }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "clearml.name" . }}-kpa
rules:
- apiGroups:
- ""
resources:
- pods
- secrets
- services
verbs: ["get", "list", "watch", "create", "patch", "delete"]
- apiGroups:
- ""
resources:
- namespaces
verbs: ["list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "clearml.name" . }}-kpa
subjects:
- kind: ServiceAccount
name: {{ include "clearml.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "clearml.name" . }}-kpa
{{- else }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "clearml.name" . }}-kpa
rules:
- apiGroups:
- ""
resources:
- pods
- secrets
- services
verbs: ["get", "list", "watch", "create", "patch", "delete"]
- apiGroups:
- ""
resources:
- namespaces
verbs: ["list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "clearml.name" . }}-kpa
subjects:
- kind: ServiceAccount
name: {{ include "clearml.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "clearml.name" . }}-kpa
{{- end }}

View File

@@ -0,0 +1,20 @@
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" . }}-ac
data:
agentk8sglue_key: {{ .Values.clearml.agentk8sglueKey | b64enc }}
agentk8sglue_secret: {{ .Values.clearml.agentk8sglueSecret | b64enc }}
clearml.conf: {{ .Values.clearml.clearmlConfig | b64enc }}
---
{{- if .Values.imageCredentials.enabled }}
{{- if not .Values.imageCredentials.existingSecret }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" . }}-ark
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: {{ template "imagePullSecret" . }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,37 @@
{{ if .Values.agentk8sglue.fileMounts }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" . }}-afm
data:
{{- range .Values.agentk8sglue.fileMounts }}
{{ .name }}: {{ .fileContent | b64enc }}
{{- end }}
{{ end }}
---
{{- if .Values.enterpriseFeatures.enabled }}
{{ if .Values.agentk8sglue.basePodTemplate.fileMounts }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" . }}-fm
data:
{{- range .Values.agentk8sglue.basePodTemplate.fileMounts }}
{{ .name }}: {{ .fileContent | b64enc }}
{{- end }}
{{ end }}
---
{{- range $key, $value := $.Values.agentk8sglue.queues }}
{{ if .templateOverrides.fileMounts }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "clearml.name" $ }}-{{ $key }}-fm
data:
{{- range .templateOverrides.fileMounts }}
{{ .name }}: {{ .fileContent | b64enc }}
{{- end }}
{{ end }}
---
{{- end }}
{{- end }}

View File

@@ -0,0 +1,32 @@
{{- if .Values.sessions.portModeEnabled }}
{{- if not .Values.sessions.dynamicSvcs }}
{{- range untilStep 1 ( ( add .Values.sessions.maxServices 1 ) | int ) 1 }}
---
apiVersion: v1
kind: Service
metadata:
name: clearml-session-{{ . }}
labels:
{{- include "clearml.labels" $ | nindent 4 }}
{{- with $.Values.sessions.svcAnnotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
type: {{ $.Values.sessions.svcType }}
ports:
- targetPort: 10022
{{- if eq $.Values.sessions.svcType "NodePort" }}
port: 10022
{{- else }}
port: {{ add $.Values.sessions.startingPort . }}
{{- end }}
protocol: TCP
{{- if eq $.Values.sessions.svcType "NodePort" }}
nodePort: {{ add $.Values.sessions.startingPort . }}
{{- end }}
selector:
ai.allegro.agent.serial: pod-{{ . }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,234 @@
# -- Private image registry configuration
imageCredentials:
# -- Use private authentication mode
enabled: false
# -- If this is set, chart will not generate a secret but will use what is defined here
existingSecret: ""
# -- Registry name
registry: docker.io
# -- Registry username
username: someone
# -- Registry password
password: pwd
# -- Email
email: someone@host.com
# -- ClearMl generic configurations
clearml:
# -- If this is set, chart will not generate a secret but will use what is defined here
existingAgentk8sglueSecret: ""
# -- Agent k8s Glue basic auth key
agentk8sglueKey: "ACCESSKEY"
# -- Agent k8s Glue basic auth secret
agentk8sglueSecret: "SECRETKEY"
# -- If this is set, chart will not generate a secret but will use what is defined here
existingClearmlConfigSecret: ""
# -- ClearML configuration file
clearmlConfig: |-
sdk {
}
# -- This agent will spawn queued experiments in new pods, a good use case is to combine this with
# GPU autoscaling nodes.
# https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue
agentk8sglue:
# -- Glue Agent image configuration
image:
repository: "allegroai/clearml-agent-k8s-base"
tag: "1.24-21"
# -- Glue Agent number of pods
replicaCount: 1
# -- if set, don't create a serviceAccountName but use defined existing one
serviceExistingAccountName: ""
# -- Check certificates validity for evefry UrlReference below.
clearmlcheckCertificate: true
# -- Enable Debugging logs for Agent pod
debugMode: false
# -- Reference to Api server url
apiServerUrlReference: "https://api.clear.ml"
# -- Reference to File server url
fileServerUrlReference: "https://files.clear.ml"
# -- Reference to Web server url
webServerUrlReference: "https://app.clear.ml"
# -- default container image for ClearML Task pod
defaultContainerImage: ubuntu:18.04
# -- ClearML queue this agent will consume
queue: default
# -- Custom Bash script for the Glue Agent
# -- labels setup for Agent pod (example in values.yaml comments)
labels: {}
# schedulerName: scheduler
# -- annotations setup for Agent pod (example in values.yaml comments)
annotations: {}
# key1: value1
customBashScript: ""
# -- Custom Bash script for the Task Pods ran by Glue Agent
containerCustomBashScript: ""
# -- Extra Environment variables for Glue Agent
extraEnvs: []
# - name: PYTHONPATH
# value: "somepath"
# -- nodeSelector setup for Agent pod (example in values.yaml comments)
nodeSelector: {}
# fleet: agent-nodes
# -- volumes definition for Glue Agent (example in values.yaml comments)
volumes: []
# - name: "yourvolume"
# nfs:
# server: 192.168.0.1
# path: /var/nfs/mount
# -- volume mounts definition for Glue Agent (example in values.yaml comments)
volumeMounts: []
# - name: yourvolume
# mountPath: /yourpath
# subPath: userfolder
# -- file definition for Glue Agent (example in values.yaml comments)
fileMounts: []
# - name: "integration.py"
# folderPath: "/mnt/python"
# fileContent: |-
# def get_template(*args, **kwargs):
# print("args: {}".format(args))
# print("kwargs: {}".format(kwargs))
# return {
# "template": {
# }
# }
# -- base template for pods spawned to consume ClearML Task
basePodTemplate:
# -- labels setup for pods spawned to consume ClearML Task (example in values.yaml comments)
labels: {}
# schedulerName: scheduler
# -- annotations setup for pods spawned to consume ClearML Task (example in values.yaml comments)
annotations: {}
# key1: value1
# -- initContainers definition for pods spawned to consume ClearML Task (example in values.yaml comments)
initContainers: []
# - name: volume-dirs-init-cntr
# image: busybox:1.35
# command:
# - /bin/bash
# - -c
# - >
# /bin/echo "this is an init";
# -- schedulerName setup for pods spawned to consume ClearML Task
schedulerName: ""
# -- volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments)
volumes: []
# - name: "yourvolume"
# nfs:
# server: 192.168.0.1
# path: /var/nfs/mount
# -- volume mounts definition for pods spawned to consume ClearML Task (example in values.yaml comments)
volumeMounts: []
# - name: yourvolume
# mountPath: /yourpath
# subPath: userfolder
# -- file definition for pods spawned to consume ClearML Task (example in values.yaml comments)
fileMounts: []
# - name: "mounted-file.txt"
# folderPath: "/mnt/"
# fileContent: |-
# this is a test file
# with test content
# -- environment variables for pods spawned to consume ClearML Task (example in values.yaml comments)
env: []
# # to setup access to private repo, setup secret with git credentials:
# - name: CLEARML_AGENT_GIT_USER
# value: mygitusername
# - name: CLEARML_AGENT_GIT_PASS
# valueFrom:
# secretKeyRef:
# name: git-password
# key: git-password
# - name: CURL_CA_BUNDLE
# value: ""
# - name: PYTHONWARNINGS
# value: "=\"ignore:Unverified HTTPS request\""
# -- resources declaration for pods spawned to consume ClearML Task (example in values.yaml comments)
resources: {}
# limits:
# nvidia.com/gpu: 1
# -- tolerations setup for pods spawned to consume ClearML Task (example in values.yaml comments)
tolerations: []
# - key: "nvidia.com/gpu"
# operator: Exists
# effect: "NoSchedule"
# -- nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments)
nodeSelector: {}
# fleet: gpu-nodes
# -- securityContext setup for pods spawned to consume ClearML Task (example in values.yaml comments)
securityContext: {}
# runAsUser: 1000
# -- hostAliases setup for pods spawned to consume ClearML Task (example in values.yaml comments)
hostAliases: {}
# - ip: "127.0.0.1"
# hostnames:
# - "foo.local"
# - "bar.local"
# -- Sessions internal service configuration
sessions:
# -- Enable/Disable sessions portmode WARNING: only one Agent deployment can have this set to true
portModeEnabled: false
# -- Enable/Disable dynamic svc for sessions pods
dynamicSvcs: false
# -- specific annotations for session services
svcAnnotations: {}
# -- service type ("NodePort" or "ClusterIP" or "LoadBalancer")
svcType: "NodePort"
# -- External IP sessions clients can connect to
externalIP: 0.0.0.0
# -- starting range of exposed NodePorts
startingPort: 30000
# -- maximum number of NodePorts exposed
maxServices: 20
# -- set interactive queue tags
setInteractiveQueuesTag: true
# -- Enterprise features (work only with an Enterprise license)
enterpriseFeatures:
# -- Enable/Disable Enterprise features
enabled: false
# -- Image tag override for enterprise version
agentImageTagOverride: "1.24-57"
# -- service account access every namespace flag
serviceAccountClusterAccess: false
# -- push env vars from Clear.ML Vault to task pods
applyVaultEnvVars: true
# -- GPU resource general counters
monitoredResources:
# -- Field name used by Agent to count minimum resources
minResourcesFieldName: "resources|limits|nvidia.com/gpu"
# -- Maximum resources counter
maxResources: 0
# -- Field name used by Agent to count maximum resources
maxResourcesFieldName: "resources|limits|nvidia.com/gpu"
# -- maximum concurrent consume ClearML Task pod
maxPods: 10
# -- Agent must use owner Token
useOwnerToken: true
# -- ClearML queues and related template OVERRIDES used this agent will consume
queues:
# -- name of the queue will be used for this template
# default:
# -- overrides of the base template for this queue (must be declared even if empty!)
# templateOverrides: {}
## -- name of the queue will be used for this template
# default-gpu:
# # -- overrides of the base template for this queue
# templateOverrides:
# # -- resources declaration for pods spawned to consume ClearML Task
# resources:
# limits:
# nvidia.com/gpu: 1

View File

@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

View File

@@ -0,0 +1,15 @@
apiVersion: v2
name: clearml-serving
description: ClearML Serving Helm Chart
type: application
version: 0.7.0
appVersion: "1.2.0"
kubeVersion: ">= 1.19.0-0 < 1.26.0-0"
maintainers:
- name: valeriano-manassero
url: https://github.com/valeriano-manassero
keywords:
- clearml
- "machine learning"
- mlops
- "model serving"

View File

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -0,0 +1,55 @@
# clearml-serving
![Version: 0.7.0](https://img.shields.io/badge/Version-0.7.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.2.0](https://img.shields.io/badge/AppVersion-1.2.0-informational?style=flat-square)
ClearML Serving Helm Chart
## Maintainers
| Name | Email | Url |
| ---- | ------ | --- |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
## Requirements
Kubernetes: `>= 1.19.0-0 < 1.26.0-0`
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| alertmanager | object | `{"affinity":{},"image":{"repository":"prom/alertmanager","tag":"v0.23.0"},"nodeSelector":{},"resources":{},"tolerations":[]}` | Alertmanager generic configigurations |
| clearml | object | `{"apiAccessKey":"ClearML API Access Key","apiHost":"http://clearml-server-apiserver:8008","apiSecretKey":"ClearML API Secret Key","defaultBaseServeUrl":"http://127.0.0.1:8080/serve","filesHost":"http://clearml-server-fileserver:8081","servingTaskId":"ClearML Serving Task ID","webHost":"http://clearml-server-webserver:80"}` | ClearMl generic configurations |
| clearml_serving_inference | object | `{"affinity":{},"autoscaling":{"enabled":false,"maxReplicas":11,"minReplicas":1,"targetCPU":50,"targetMemory":50},"extraPythonPackages":[],"image":{"repository":"allegroai/clearml-serving-inference","tag":"1.2.0"},"ingress":{"annotations":{},"enabled":false,"hostName":"serving.clearml.127-0-0-1.nip.io","path":"/","tlsSecretName":""},"nodeSelector":{},"resources":{},"tolerations":[]}` | ClearML serving inference configurations |
| clearml_serving_inference.affinity | object | `{}` | Affinity configuration |
| clearml_serving_inference.autoscaling | object | `{"enabled":false,"maxReplicas":11,"minReplicas":1,"targetCPU":50,"targetMemory":50}` | Autoscaling configuration |
| clearml_serving_inference.extraPythonPackages | list | `[]` | Extra Python Packages to be installed in running pods |
| clearml_serving_inference.image | object | `{"repository":"allegroai/clearml-serving-inference","tag":"1.2.0"}` | Container Image |
| clearml_serving_inference.ingress | object | `{"annotations":{},"enabled":false,"hostName":"serving.clearml.127-0-0-1.nip.io","path":"/","tlsSecretName":""}` | Ingress exposing configurations |
| clearml_serving_inference.nodeSelector | object | `{}` | Node Selector configuration |
| clearml_serving_inference.resources | object | `{}` | Pod resources definition |
| clearml_serving_inference.tolerations | list | `[]` | Tolerations configuration |
| clearml_serving_statistics | object | `{"affinity":{},"extraPythonPackages":[],"image":{"repository":"allegroai/clearml-serving-statistics","tag":"1.2.0"},"nodeSelector":{},"resources":{},"tolerations":[]}` | ClearML serving statistics configurations |
| clearml_serving_statistics.affinity | object | `{}` | Affinity configuration |
| clearml_serving_statistics.extraPythonPackages | list | `[]` | Extra Python Packages to be installed in running pods |
| clearml_serving_statistics.image | object | `{"repository":"allegroai/clearml-serving-statistics","tag":"1.2.0"}` | Container Image |
| clearml_serving_statistics.nodeSelector | object | `{}` | Node Selector configuration |
| clearml_serving_statistics.resources | object | `{}` | Pod resources definition |
| clearml_serving_statistics.tolerations | list | `[]` | Tolerations configuration |
| clearml_serving_triton | object | `{"affinity":{},"autoscaling":{"enabled":false,"maxReplicas":11,"minReplicas":1,"targetCPU":50,"targetMemory":50},"enabled":true,"extraPythonPackages":[],"image":{"repository":"allegroai/clearml-serving-triton","tag":"1.2.0-22.07"},"ingress":{"annotations":{},"enabled":false,"hostName":"serving-grpc.clearml.127-0-0-1.nip.io","path":"/","tlsSecretName":""},"nodeSelector":{},"resources":{},"tolerations":[]}` | ClearML serving Triton configurations |
| clearml_serving_triton.affinity | object | `{}` | Affinity configuration |
| clearml_serving_triton.autoscaling | object | `{"enabled":false,"maxReplicas":11,"minReplicas":1,"targetCPU":50,"targetMemory":50}` | Autoscaling configuration |
| clearml_serving_triton.enabled | bool | `true` | Triton pod creation enable/disable |
| clearml_serving_triton.extraPythonPackages | list | `[]` | Extra Python Packages to be installed in running pods |
| clearml_serving_triton.image | object | `{"repository":"allegroai/clearml-serving-triton","tag":"1.2.0-22.07"}` | Container Image |
| clearml_serving_triton.ingress | object | `{"annotations":{},"enabled":false,"hostName":"serving-grpc.clearml.127-0-0-1.nip.io","path":"/","tlsSecretName":""}` | Ingress exposing configurations |
| clearml_serving_triton.nodeSelector | object | `{}` | Node Selector configuration |
| clearml_serving_triton.resources | object | `{}` | Pod resources definition |
| clearml_serving_triton.tolerations | list | `[]` | Tolerations configuration |
| grafana | object | `{"affinity":{},"image":{"repository":"grafana/grafana","tag":"8.4.4-ubuntu"},"ingress":{"annotations":{},"enabled":false,"hostName":"serving-grafana.clearml.127-0-0-1.nip.io","path":"/","tlsSecretName":""},"nodeSelector":{},"resources":{},"tolerations":[]}` | Grafana generic configigurations |
| kafka | object | `{"affinity":{},"image":{"repository":"bitnami/kafka","tag":"3.1.0"},"nodeSelector":{},"resources":{},"tolerations":[]}` | Kafka generic configigurations |
| prometheus | object | `{"affinity":{},"image":{"repository":"prom/prometheus","tag":"v2.34.0"},"nodeSelector":{},"resources":{},"tolerations":[]}` | Prometheus generic configigurations |
| zookeeper | object | `{"affinity":{},"image":{"repository":"bitnami/zookeeper","tag":"3.7.0"},"nodeSelector":{},"resources":{},"tolerations":[]}` | Zookeeper generic configigurations |
----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.11.0](https://github.com/norwoodj/helm-docs/releases/v1.11.0)

View File

@@ -0,0 +1,92 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "clearml-serving.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "clearml-serving.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "clearml-serving.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "clearml-serving.labels" -}}
helm.sh/chart: {{ include "clearml-serving.chart" . }}
{{ include "clearml-serving.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "clearml-serving.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml-serving.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "clearml-serving.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "clearml-serving.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
{{/*
Return the target Kubernetes version
*/}}
{{- define "common.capabilities.kubeVersion" -}}
{{- if .Values.global }}
{{- if .Values.global.kubeVersion }}
{{- .Values.global.kubeVersion -}}
{{- else }}
{{- default .Capabilities.KubeVersion.Version .Values.kubeVersion -}}
{{- end -}}
{{- else }}
{{- default .Capabilities.KubeVersion.Version .Values.kubeVersion -}}
{{- end -}}
{{- end -}}
{{/*
Return the appropriate apiVersion for Horizontal Pod Autoscaler.
*/}}
{{- define "common.capabilities.hpa.apiVersion" -}}
{{- if semverCompare "<1.23-0" (include "common.capabilities.kubeVersion" .context) -}}
{{- if .beta2 -}}
{{- print "autoscaling/v2beta2" -}}
{{- else -}}
{{- print "autoscaling/v2beta1" -}}
{{- end -}}
{{- else -}}
{{- print "autoscaling/v2" -}}
{{- end -}}
{{- end -}}

View File

@@ -0,0 +1,27 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: alertmanager
name: alertmanager
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: alertmanager
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: alertmanager
spec:
containers:
- image: "{{ .Values.alertmanager.image.repository }}:{{ .Values.alertmanager.image.tag }}"
name: clearml-serving-alertmanager
ports:
- containerPort: 9093
resources: {}
restartPolicy: Always

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: alertmanager
name: clearml-serving-alertmanager
spec:
ports:
- name: "9093"
port: 9093
targetPort: 9093
selector:
clearml.serving.service: alertmanager

View File

@@ -0,0 +1,13 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: clearml-serving-backend
spec:
ingress:
- from:
- podSelector:
matchLabels:
clearml.serving.network/clearml-serving-backend: "true"
podSelector:
matchLabels:
clearml.serving.network/clearml-serving-backend: "true"

View File

@@ -0,0 +1,62 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-inference
name: clearml-serving-inference
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: clearml-serving-inference
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: clearml-serving-inference
spec:
containers:
- env:
- name: CLEARML_API_ACCESS_KEY
value: "{{ .Values.clearml.apiAccessKey }}"
- name: CLEARML_API_SECRET_KEY
value: "{{ .Values.clearml.apiSecretKey }}"
- name: CLEARML_API_HOST
value: "{{ .Values.clearml.apiHost }}"
- name: CLEARML_FILES_HOST
value: "{{ .Values.clearml.filesHost }}"
- name: CLEARML_WEB_HOST
value: "{{ .Values.clearml.webHost }}"
- name: CLEARML_DEFAULT_KAFKA_SERVE_URL
value: clearml-serving-kafka:9092
- name: CLEARML_SERVING_POLL_FREQ
value: "1.0"
- name: CLEARML_DEFAULT_BASE_SERVE_URL
value: "{{ .Values.clearml.defaultBaseServeUrl }}"
- name: CLEARML_DEFAULT_TRITON_GRPC_ADDR
{{- if .Values.clearml_serving_triton.enabled }}
value: "clearml-serving-triton:8001"
{{- else }}
value: ""
{{- end }}
- name: CLEARML_SERVING_NUM_PROCESS
value: "2"
- name: CLEARML_SERVING_PORT
value: "8080"
- name: CLEARML_SERVING_TASK_ID
value: "{{ .Values.clearml.servingTaskId }}"
- name: CLEARML_USE_GUNICORN
value: "true"
{{- if .Values.clearml_serving_inference.extraPythonPackages }}
- name: EXTRA_PYTHON_PACKAGES
value: '{{ join " " .Values.clearml_serving_inference.extraPythonPackages }}'
{{- end }}
image: "{{ .Values.clearml_serving_inference.image.repository }}:{{ .Values.clearml_serving_inference.image.tag }}"
name: clearml-serving-inference
ports:
- containerPort: 8080
resources: {}
restartPolicy: Always

View File

@@ -0,0 +1,42 @@
{{- if .Values.clearml_serving_inference.autoscaling.enabled }}
apiVersion: {{ include "common.capabilities.hpa.apiVersion" ( dict "context" $ ) }}
kind: HorizontalPodAutoscaler
metadata:
name: clearml-serving-inference-hpa
namespace: {{ .Release.Namespace | quote }}
annotations: {}
labels:
clearml.serving.service: clearml-serving-inference
spec:
scaleTargetRef:
apiVersion: "apps/v1"
kind: Deployment
name: clearml-serving-inference
minReplicas: {{ .Values.clearml_serving_inference.autoscaling.minReplicas }}
maxReplicas: {{ .Values.clearml_serving_inference.autoscaling.maxReplicas }}
metrics:
{{- if .Values.clearml_serving_inference.autoscaling.targetCPU }}
- type: Resource
resource:
name: cpu
{{- if semverCompare "<1.23-0" (include "common.capabilities.kubeVersion" .) }}
targetAverageUtilization: {{ .Values.clearml_serving_inference.autoscaling.targetCPU }}
{{- else }}
target:
type: Utilization
averageUtilization: {{ .Values.clearml_serving_inference.autoscaling.targetCPU }}
{{- end }}
{{- end }}
{{- if .Values.clearml_serving_inference.autoscaling.targetMemory }}
- type: Resource
resource:
name: memory
{{- if semverCompare "<1.23-0" (include "common.capabilities.kubeVersion" .) }}
targetAverageUtilization: {{ .Values.clearml_serving_inference.autoscaling.targetMemory }}
{{- else }}
target:
type: Utilization
averageUtilization: {{ .Values.clearml_serving_inference.autoscaling.targetMemory }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,40 @@
{{- if .Values.clearml_serving_inference.ingress.enabled -}}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: clearml-serving-inference
labels:
clearml.serving.service: clearml-serving-inference
annotations:
{{- toYaml .Values.clearml_serving_inference.ingress.annotations | nindent 4 }}
spec:
{{- if .Values.clearml_serving_inference.ingress.tlsSecretName }}
tls:
- hosts:
- {{ .Values.clearml_serving_inference.ingress.hostName }}
secretName: {{ .Values.clearml_serving_inference.ingress.tlsSecretName }}
{{- end }}
rules:
- host: {{ .Values.clearml_serving_inference.ingress.hostName }}
http:
paths:
- path: {{ .Values.clearml_serving_inference.ingress.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: clearml-serving-inference
port:
number: 8080
{{ else }}
backend:
serviceName: clearml-serving-inference
servicePort: 8080
{{ end }}
{{- end }}

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-inference
name: clearml-serving-inference
spec:
ports:
- name: "8080"
port: 8080
targetPort: 8080
selector:
clearml.serving.service: clearml-serving-inference

View File

@@ -0,0 +1,48 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-statistics
name: clearml-serving-statistics
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: clearml-serving-statistics
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: clearml-serving-statistics
spec:
containers:
- env:
- name: CLEARML_API_ACCESS_KEY
value: "{{ .Values.clearml.apiAccessKey }}"
- name: CLEARML_API_SECRET_KEY
value: "{{ .Values.clearml.apiSecretKey }}"
- name: CLEARML_API_HOST
value: "{{ .Values.clearml.apiHost }}"
- name: CLEARML_FILES_HOST
value: "{{ .Values.clearml.filesHost }}"
- name: CLEARML_WEB_HOST
value: "{{ .Values.clearml.webHost }}"
- name: CLEARML_DEFAULT_KAFKA_SERVE_URL
value: clearml-serving-kafka:9092
- name: CLEARML_SERVING_POLL_FREQ
value: "1.0"
- name: CLEARML_SERVING_TASK_ID
value: "{{ .Values.clearml.servingTaskId }}"
{{- if .Values.clearml_serving_statistics.extraPythonPackages }}
- name: EXTRA_PYTHON_PACKAGES
value: '{{ join " " .Values.clearml_serving_statistics.extraPythonPackages }}'
{{- end }}
image: "{{ .Values.clearml_serving_statistics.image.repository }}:{{ .Values.clearml_serving_statistics.image.tag }}"
name: clearml-serving-statistics
ports:
- containerPort: 9999
resources: {}
restartPolicy: Always

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-statistics
name: clearml-serving-statistics
spec:
ports:
- name: "9999"
port: 9999
targetPort: 9999
selector:
clearml.serving.service: clearml-serving-statistics

View File

@@ -0,0 +1,51 @@
{{ if .Values.clearml_serving_triton.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-triton
name: clearml-serving-triton
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: clearml-serving-triton
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: clearml-serving-triton
spec:
containers:
- env:
- name: CLEARML_API_ACCESS_KEY
value: "{{ .Values.clearml.apiAccessKey }}"
- name: CLEARML_API_SECRET_KEY
value: "{{ .Values.clearml.apiSecretKey }}"
- name: CLEARML_API_HOST
value: "{{ .Values.clearml.apiHost }}"
- name: CLEARML_FILES_HOST
value: "{{ .Values.clearml.filesHost }}"
- name: CLEARML_WEB_HOST
value: "{{ .Values.clearml.webHost }}"
- name: CLEARML_SERVING_TASK_ID
value: "{{ .Values.clearml.servingTaskId }}"
- name: CLEARML_TRITON_POLL_FREQ
value: "1.0"
- name: CLEARML_TRITON_METRIC_FREQ
value: "1.0"
{{- if .Values.clearml_serving_triton.extraPythonPackages }}
- name: EXTRA_PYTHON_PACKAGES
value: '{{ join " " .Values.clearml_serving_triton.extraPythonPackages }}'
{{- end }}
image: "{{ .Values.clearml_serving_triton.image.repository }}:{{ .Values.clearml_serving_triton.image.tag }}"
name: clearml-serving-triton
ports:
- containerPort: 8001
resources: {}
restartPolicy: Always
{{ end }}

View File

@@ -0,0 +1,42 @@
{{- if .Values.clearml_serving_triton.autoscaling.enabled }}
apiVersion: {{ include "common.capabilities.hpa.apiVersion" ( dict "context" $ ) }}
kind: HorizontalPodAutoscaler
metadata:
name: clearml-serving-triton-hpa
namespace: {{ .Release.Namespace | quote }}
annotations: {}
labels:
clearml.serving.service: clearml-serving-triton
spec:
scaleTargetRef:
apiVersion: "apps/v1"
kind: Deployment
name: clearml-serving-triton
minReplicas: {{ .Values.clearml_serving_triton.autoscaling.minReplicas }}
maxReplicas: {{ .Values.clearml_serving_triton.autoscaling.maxReplicas }}
metrics:
{{- if .Values.clearml_serving_triton.autoscaling.targetCPU }}
- type: Resource
resource:
name: cpu
{{- if semverCompare "<1.23-0" (include "common.capabilities.kubeVersion" .) }}
targetAverageUtilization: {{ .Values.clearml_serving_triton.autoscaling.targetCPU }}
{{- else }}
target:
type: Utilization
averageUtilization: {{ .Values.clearml_serving_triton.autoscaling.targetCPU }}
{{- end }}
{{- end }}
{{- if .Values.clearml_serving_triton.autoscaling.targetMemory }}
- type: Resource
resource:
name: memory
{{- if semverCompare "<1.23-0" (include "common.capabilities.kubeVersion" .) }}
targetAverageUtilization: {{ .Values.clearml_serving_triton.autoscaling.targetMemory }}
{{- else }}
target:
type: Utilization
averageUtilization: {{ .Values.clearml_serving_triton.autoscaling.targetMemory }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,42 @@
{{- if .Values.clearml_serving_triton.enabled -}}
{{- if .Values.clearml_serving_triton.ingress.enabled -}}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: clearml-serving-triton
labels:
clearml.serving.service: clearml-serving-triton
annotations:
{{- toYaml .Values.clearml_serving_triton.ingress.annotations | nindent 4 }}
spec:
{{- if .Values.clearml_serving_triton.ingress.tlsSecretName }}
tls:
- hosts:
- {{ .Values.clearml_serving_triton.ingress.hostName }}
secretName: {{ .Values.clearml_serving_triton.ingress.tlsSecretName }}
{{- end }}
rules:
- host: {{ .Values.clearml_serving_triton.ingress.hostName }}
http:
paths:
- path: {{ .Values.clearml_serving_triton.ingress.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: clearml-serving-triton
port:
number: 8001
{{ else }}
backend:
serviceName: clearml-serving-triton
servicePort: 8001
{{ end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,16 @@
{{ if .Values.clearml_serving_triton.enabled }}
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-triton
name: clearml-serving-triton
spec:
ports:
- name: "8001"
port: 8001
targetPort: 8001
selector:
clearml.serving.service: clearml-serving-triton
{{ end }}

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Secret
metadata:
name: grafana-config
stringData:
datasource.yaml: |-
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
# Access mode - proxy (server in the UI) or direct (browser in the UI).
access: proxy
url: http://clearml-serving-prometheus:9090

View File

@@ -0,0 +1,35 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: grafana
name: grafana
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: grafana
strategy:
type: Recreate
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: grafana
spec:
containers:
- image: "{{ .Values.grafana.image.repository }}:{{ .Values.grafana.image.tag }}"
name: clearml-serving-grafana
ports:
- containerPort: 3000
resources: {}
volumeMounts:
- mountPath: /etc/grafana/provisioning/datasources/
name: grafana-conf
restartPolicy: Always
volumes:
- name: grafana-conf
secret:
secretName: grafana-config

View File

@@ -0,0 +1,40 @@
{{- if .Values.grafana.ingress.enabled -}}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: clearml-serving-grafana
labels:
clearml.serving.service: clearml-serving-grafana
annotations:
{{- toYaml .Values.grafana.ingress.annotations | nindent 4 }}
spec:
{{- if .Values.grafana.ingress.tlsSecretName }}
tls:
- hosts:
- {{ .Values.grafana.ingress.hostName }}
secretName: {{ .Values.grafana.ingress.tlsSecretName }}
{{- end }}
rules:
- host: {{ .Values.grafana.ingress.hostName }}
http:
paths:
- path: {{ .Values.grafana.ingress.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: clearml-serving-grafana
port:
number: 3000
{{ else }}
backend:
serviceName: clearml-serving-grafana
servicePort: 3000
{{ end }}
{{- end }}

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: grafana
name: clearml-serving-grafana
spec:
ports:
- name: "3000"
port: 3000
targetPort: 3000
selector:
clearml.serving.service: grafana

View File

@@ -0,0 +1,40 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: kafka
name: kafka
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: kafka
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: kafka
spec:
containers:
- env:
- name: ALLOW_PLAINTEXT_LISTENER
value: "yes"
- name: KAFKA_BROKER_ID
value: "1"
- name: KAFKA_CFG_ADVERTISED_LISTENERS
value: PLAINTEXT://clearml-serving-kafka:9092
- name: KAFKA_CFG_LISTENERS
value: PLAINTEXT://0.0.0.0:9092
- name: KAFKA_CFG_ZOOKEEPER_CONNECT
value: clearml-serving-zookeeper:2181
- name: KAFKA_CREATE_TOPICS
value: '"topic_test:1:1"'
image: "{{ .Values.kafka.image.repository }}:{{ .Values.kafka.image.tag }}"
name: clearml-serving-kafka
ports:
- containerPort: 9092
resources: {}
restartPolicy: Always

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: kafka
name: clearml-serving-kafka
spec:
ports:
- name: "9092"
port: 9092
targetPort: 9092
selector:
clearml.serving.service: kafka

View File

@@ -0,0 +1,28 @@
apiVersion: v1
kind: Secret
metadata:
name: prometheus-config
stringData:
prometheus.yml: |-
global:
scrape_interval: "15s" # By default, scrape targets every 15 seconds.
evaluation_interval: 15s # By default, scrape targets every 15 seconds.
external_labels:
monitor: 'clearml-serving'
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'prometheus'
scrape_interval: 5s
static_configs:
- targets: ['localhost:9090']
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'clearml-inference-stats'
scrape_interval: 5s
static_configs:
- targets: ['clearml-serving-statistics:9999']

View File

@@ -0,0 +1,42 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: prometheus
name: prometheus
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: prometheus
strategy:
type: Recreate
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: prometheus
spec:
containers:
- args:
- --config.file=/mnt/prometheus.yml
- --storage.tsdb.path=/prometheus
- --web.console.libraries=/etc/prometheus/console_libraries
- --web.console.templates=/etc/prometheus/consoles
- --storage.tsdb.retention.time=200h
- --web.enable-lifecycle
image: "{{ .Values.prometheus.image.repository }}:{{ .Values.prometheus.image.tag }}"
name: clearml-serving-prometheus
ports:
- containerPort: 9090
resources: {}
volumeMounts:
- mountPath: /mnt
name: prometheus-conf
restartPolicy: Always
volumes:
- name: prometheus-conf
secret:
secretName: prometheus-config

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: prometheus
name: clearml-serving-prometheus
spec:
ports:
- name: "9090"
port: 9090
targetPort: 9090
selector:
clearml.serving.service: prometheus

View File

@@ -0,0 +1,30 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: zookeeper
name: zookeeper
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: zookeeper
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: zookeeper
spec:
containers:
- env:
- name: ALLOW_ANONYMOUS_LOGIN
value: "yes"
image: "{{ .Values.zookeeper.image.repository }}:{{ .Values.zookeeper.image.tag }}"
name: clearml-serving-zookeeper
ports:
- containerPort: 2181
resources: {}
restartPolicy: Always

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: zookeeper
name: clearml-serving-zookeeper
spec:
ports:
- name: "2181"
port: 2181
targetPort: 2181
selector:
clearml.serving.service: zookeeper

View File

@@ -0,0 +1,164 @@
# -- ClearMl generic configurations
clearml:
apiAccessKey: "ClearML API Access Key"
apiSecretKey: "ClearML API Secret Key"
apiHost: http://clearml-server-apiserver:8008
filesHost: http://clearml-server-fileserver:8081
webHost: http://clearml-server-webserver:80
defaultBaseServeUrl: http://127.0.0.1:8080/serve
servingTaskId: "ClearML Serving Task ID"
# -- Zookeeper generic configigurations
zookeeper:
image:
repository: "bitnami/zookeeper"
tag: "3.7.0"
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
# -- Kafka generic configigurations
kafka:
image:
repository: "bitnami/kafka"
tag: "3.1.0"
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
# -- Prometheus generic configigurations
prometheus:
image:
repository: "prom/prometheus"
tag: "v2.34.0"
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
# -- Grafana generic configigurations
grafana:
image:
repository: "grafana/grafana"
tag: "8.4.4-ubuntu"
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
ingress:
enabled: false
hostName: "serving-grafana.clearml.127-0-0-1.nip.io"
tlsSecretName: ""
annotations: {}
path: "/"
# -- Alertmanager generic configigurations
alertmanager:
image:
repository: "prom/alertmanager"
tag: "v0.23.0"
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
# -- ClearML serving statistics configurations
clearml_serving_statistics:
# -- Container Image
image:
repository: "allegroai/clearml-serving-statistics"
tag: "1.2.0"
# -- Node Selector configuration
nodeSelector: {}
# -- Tolerations configuration
tolerations: []
# -- Affinity configuration
affinity: {}
# -- Pod resources definition
resources: {}
# -- Extra Python Packages to be installed in running pods
extraPythonPackages: []
# - numpy==1.22.4
# - pandas==1.4.2
# -- ClearML serving inference configurations
clearml_serving_inference:
# -- Container Image
image:
repository: "allegroai/clearml-serving-inference"
tag: "1.2.0"
# -- Node Selector configuration
nodeSelector: {}
# -- Tolerations configuration
tolerations: []
# -- Affinity configuration
affinity: {}
# -- Pod resources definition
resources: {}
# -- Extra Python Packages to be installed in running pods
extraPythonPackages: []
# - numpy==1.22.4
# - pandas==1.4.2
# -- Autoscaling configuration
autoscaling:
enabled: false
minReplicas: 1
maxReplicas: 11
targetCPU: 50
targetMemory: 50
# -- Ingress exposing configurations
ingress:
enabled: false
hostName: "serving.clearml.127-0-0-1.nip.io"
tlsSecretName: ""
annotations: {}
path: "/"
# -- ClearML serving Triton configurations
clearml_serving_triton:
# -- Triton pod creation enable/disable
enabled: true
# -- Container Image
image:
repository: "allegroai/clearml-serving-triton"
tag: "1.2.0-22.07"
# -- Node Selector configuration
nodeSelector: {}
# -- Tolerations configuration
tolerations: []
# -- Affinity configuration
affinity: {}
# -- Pod resources definition
resources: {}
# -- Extra Python Packages to be installed in running pods
extraPythonPackages: []
# - numpy==1.22.4
# - pandas==1.4.2
# -- Autoscaling configuration
autoscaling:
enabled: false
minReplicas: 1
maxReplicas: 11
targetCPU: 50
targetMemory: 50
# -- Ingress exposing configurations
ingress:
enabled: false
hostName: "serving-grpc.clearml.127-0-0-1.nip.io"
tlsSecretName: ""
annotations: {}
# # Example for AWS ALB
# kubernetes.io/ingress.class: alb
# alb.ingress.kubernetes.io/backend-protocol: HTTP
# alb.ingress.kubernetes.io/backend-protocol-version: GRPC
# alb.ingress.kubernetes.io/certificate-arn: <cerntificate arn>
# alb.ingress.kubernetes.io/ssl-redirect: '443'
# alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
# alb.ingress.kubernetes.io/target-type: ip
#
# # Example for NNGINX ingress controller
# nginx.ingress.kubernetes.io/ssl-redirect: "true"
# nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
path: "/"

View File

@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

12
charts/clearml/Chart.lock Normal file
View File

@@ -0,0 +1,12 @@
dependencies:
- name: redis
repository: file://../../dependency_charts/redis
version: 10.9.0
- name: mongodb
repository: file://../../dependency_charts/mongodb
version: 10.3.4
- name: elasticsearch
repository: file://../../dependency_charts/elasticsearch
version: 7.16.2
digest: sha256:149b5a49382d280b1e083f3c193d014d3d2eb7fcdf3ec1402008996960cc173a
generated: "2022-06-02T21:09:00.961174+02:00"

36
charts/clearml/Chart.yaml Normal file
View File

@@ -0,0 +1,36 @@
apiVersion: v2
name: clearml
description: MLOps platform
type: application
version: "5.4.0"
appVersion: "1.9.2"
kubeVersion: ">= 1.21.0-0 < 1.26.0-0"
home: https://clear.ml
icon: https://raw.githubusercontent.com/allegroai/clearml/master/docs/clearml-logo.svg
sources:
- https://github.com/allegroai/clearml-helm-charts
- https://github.com/allegroai/clearml
maintainers:
- name: valeriano-manassero
url: https://github.com/valeriano-manassero
keywords:
- clearml
- "machine learning"
- mlops
dependencies:
- name: redis
version: "10.9.0"
repository: "file://../../dependency_charts/redis"
condition: redis.enabled
- name: mongodb
version: "10.3.4"
repository: "file://../../dependency_charts/mongodb"
condition: mongodb.enabled
- name: elasticsearch
version: "7.16.2"
repository: "file://../../dependency_charts/elasticsearch"
condition: elasticsearch.enabled
annotations:
artifacthub.io/changes: |
- kind: changed
description: external mongodb improved connection string

201
charts/clearml/LICENSE Normal file
View File

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

278
charts/clearml/README.md Normal file
View File

@@ -0,0 +1,278 @@
# ClearML Ecosystem for Kubernetes
![Version: 5.4.0](https://img.shields.io/badge/Version-5.4.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.9.2](https://img.shields.io/badge/AppVersion-1.9.2-informational?style=flat-square)
MLOps platform
**Homepage:** <https://clear.ml>
## Maintainers
| Name | Email | Url |
| ---- | ------ | --- |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
## Introduction
The **clearml-server** is the backend service infrastructure for [ClearML](https://github.com/allegroai/clearml).
It allows multiple users to collaborate and manage their experiments.
**clearml-server** contains the following components:
* The ClearML Web-App, a single-page UI for experiment management and browsing
* RESTful API for:
* Documenting and logging experiment information, statistics and results
* Querying experiments history, logs and results
* Locally-hosted file server for storing images and models making them easily accessible using the Web-App
## Local environment
For development/evaluation it's possible to use [kind](https://kind.sigs.k8s.io).
After installation, following commands will create a complete ClearML insatllation:
```
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraPortMappings:
# API server's default nodePort is 30008. If you customize it in helm values by
# `apiserver.service.nodePort`, `containerPort` should match it
- containerPort: 30008
hostPort: 30008
listenAddress: "127.0.0.1"
protocol: TCP
# Web server's default nodePort is 30080. If you customize it in helm values by
# `webserver.service.nodePort`, `containerPort` should match it
- containerPort: 30080
hostPort: 30080
listenAddress: "127.0.0.1"
protocol: TCP
# File server's default nodePort is 30081. If you customize it in helm values by
# `fileserver.service.nodePort`, `containerPort` should match it
- containerPort: 30081
hostPort: 30081
listenAddress: "127.0.0.1"
protocol: TCP
extraMounts:
- hostPath: /tmp/clearml-kind/
containerPath: /var/local-path-provisioner
EOF
helm install clearml allegroai/clearml
```
After deployment, the services will be exposed on localhost on the following ports:
* API server on `30008`
* Web server on `30080`
* File server on `30081`
Data persisted in every Kubernetes volume by ClearML will be accessible in /tmp/clearml-kind folder on the host.
## Production cluster environment
In a production environment it's suggested to install an ingress controller and verify that is working correctly.
During ClearML deployment enable `ingress` section of chart values.
This will create 3 ingress rules:
* `app.<your domain name>`
* `files.<your domain name>`
* `api.<your domain name>`
(*for example, `app.clearml.mydomainname.com`, `files.clearml.mydomainname.com` and `api.clearml.mydomainname.com`*)
Just pointing the domain records to the IP where ingress controller is responding will complete the deployment process.
A production ready cluster should also have some different configuration like the one proposed in `values-production.yaml` that can be applied with:
```
helm install clearml allegroai/clearml -f values-production.yaml
```
## Upgrades/ Values upgrades
Updating to latest version of this chart can be done in two steps:
```
helm repo update
helm upgrade clearml allegroai/clearml
```
Changing values on existing installation can be done with:
```
helm upgrade clearml allegroai/clearml --version <CURRENT CHART VERSION> -f custom_values.yaml
```
Please note: updating values only should always be done setting explicit chart version to avoid a possible chart update.
Keeping separate updates procedures between version and values can be a good practice to seprate potential concerns.
## ENTERPRISE Version
There are some specific Enterprise version features that can be enabled only with specific Enterprise licensed images.
Enabling this features on OSS version can cause the entire installation to break.
## Additional Configuration for ClearML Server
You can also configure the **clearml-server** for:
* fixed users (users with credentials)
* non-responsive experiment watchdog settings
For detailed instructions, see the [Optional Configuration](https://github.com/allegroai/clearml-server#optional-configuration) section in the **clearml-server** repository README file.
## Source Code
* <https://github.com/allegroai/clearml-helm-charts>
* <https://github.com/allegroai/clearml>
## Requirements
Kubernetes: `>= 1.21.0-0 < 1.26.0-0`
| Repository | Name | Version |
|------------|------|---------|
| file://../../dependency_charts/elasticsearch | elasticsearch | 7.16.2 |
| file://../../dependency_charts/mongodb | mongodb | 10.3.4 |
| file://../../dependency_charts/redis | redis | 10.9.0 |
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| apiserver | object | `{"additionalConfigs":{},"affinity":{},"enabled":true,"existingAdditionalConfigsConfigMap":"","existingAdditionalConfigsSecret":"","extraEnvs":[],"image":{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"},"indexReplicas":0,"indexShards":1,"ingress":{"annotations":{},"enabled":false,"hostName":"api.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""},"nodeSelector":{},"podAnnotations":{},"prepopulateEnabled":true,"processes":{"count":8,"maxRequests":1000,"maxRequestsJitter":300,"timeout":24000},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"securityContext":{},"service":{"nodePort":30008,"port":8008,"type":"NodePort"},"tolerations":[]}` | Api Server configurations |
| apiserver.additionalConfigs | object | `{}` | files declared in this parameter will be mounted and read by apiserver (examples in values.yaml) if not overridden by existingAdditionalConfigsSecret |
| apiserver.affinity | object | `{}` | Api Server affinity setup |
| apiserver.enabled | bool | `true` | Enable/Disable component deployment |
| apiserver.existingAdditionalConfigsConfigMap | string | `""` | reference for files declared in existing ConfigMap will be mounted and read by apiserver (examples in values.yaml) |
| apiserver.existingAdditionalConfigsSecret | string | `""` | reference for files declared in existing Secret will be mounted and read by apiserver (examples in values.yaml) if not overridden by existingAdditionalConfigsConfigMap |
| apiserver.extraEnvs | list | `[]` | Api Server extra envrinoment variables |
| apiserver.image | object | `{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"}` | Api Server image configuration |
| apiserver.indexReplicas | int | `0` | Number of additional replicas in Elasticsearch indexes |
| apiserver.indexShards | int | `1` | Number of shards in Elasticsearch indexes |
| apiserver.ingress | object | `{"annotations":{},"enabled":false,"hostName":"api.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""}` | Ingress configuration for Api Server component |
| apiserver.ingress.annotations | object | `{}` | Ingress annotations |
| apiserver.ingress.enabled | bool | `false` | Enable/Disable ingress |
| apiserver.ingress.hostName | string | `"api.clearml.127-0-0-1.nip.io"` | Ingress hostname domain |
| apiserver.ingress.ingressClassName | string | `""` | ClassName (must be defined if no default ingressClassName is available) |
| apiserver.ingress.path | string | `"/"` | Ingress root path url |
| apiserver.ingress.tlsSecretName | string | `""` | Reference to secret containing TLS certificate. If set, it enables HTTPS on ingress rule. |
| apiserver.nodeSelector | object | `{}` | Api Server nodeselector |
| apiserver.podAnnotations | object | `{}` | specific annotation for Api Server pods |
| apiserver.prepopulateEnabled | bool | `true` | Enable/Disable example data load |
| apiserver.processes | object | `{"count":8,"maxRequests":1000,"maxRequestsJitter":300,"timeout":24000}` | Api Server internal processes configuration |
| apiserver.processes.count | int | `8` | Api Server internal listing processes |
| apiserver.processes.maxRequests | int | `1000` | Api Server maximum number of concurrent requests |
| apiserver.processes.maxRequestsJitter | int | `300` | Api Server max jitter on api request |
| apiserver.processes.timeout | int | `24000` | Api timeout (ms) |
| apiserver.replicaCount | int | `1` | Api Server number of pods |
| apiserver.resources | object | `{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}}` | Api Server resources per pod; these are minimal requirements, it's suggested to increase these values in production environments |
| apiserver.securityContext | object | `{}` | Api Server pod security context |
| apiserver.service | object | `{"nodePort":30008,"port":8008,"type":"NodePort"}` | Api Server internal service configuration |
| apiserver.service.nodePort | int | `30008` | If service.type set to NodePort, this will be set to service's nodePort field. If service.type is set to others, this field will be ignored |
| apiserver.tolerations | list | `[]` | Api Server tolerations setup |
| clearml | object | `{"apiserverKey":"GGS9F4M6XB2DXJ5AFT9F","apiserverSecret":"2oGujVFhPfaozhpuz2GzQfA5OyxmMsR3WVJpsCR5hrgHFs20PO","clientConfigurationApiUrl":"","clientConfigurationFilesUrl":"","cookieDomain":"","cookieName":"clearml-token-k8s","defaultCompany":"d1bd92a3b039400cbafc60a7a5b1e52b","fileserverKey":"XXCRJ123CEE2KSQ068WO","fileserverSecret":"YIy8EVAC7QCT4FtgitxAQGyW7xRHDZ4jpYlTE7HKiscpORl1hG","readinessprobeKey":"GK4PRTVT3706T25K6BA1","readinessprobeSecret":"ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2","secureAuthTokenSecret":"ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2","testUserKey":"ENP39EQM4SLACGD5FXB7","testUserSecret":"lPcm0imbcBZ8mwgO7tpadutiS3gnJD05x9j7afwXPS35IKbpiQ"}` | ClearMl generic configurations |
| clearml.apiserverKey | string | `"GGS9F4M6XB2DXJ5AFT9F"` | Api Server basic auth key |
| clearml.apiserverSecret | string | `"2oGujVFhPfaozhpuz2GzQfA5OyxmMsR3WVJpsCR5hrgHFs20PO"` | Api Server basic auth secret |
| clearml.clientConfigurationApiUrl | string | `""` | Override the API Urls displayed when showing an example of the SDK's clearml.conf configuration |
| clearml.clientConfigurationFilesUrl | string | `""` | Override the Files Urls displayed when showing an example of the SDK's clearml.conf configuration |
| clearml.cookieDomain | string | `""` | Cookie domain to be left empty if not exposed with an ingress |
| clearml.cookieName | string | `"clearml-token-k8s"` | Name fo the UI cookie |
| clearml.defaultCompany | string | `"d1bd92a3b039400cbafc60a7a5b1e52b"` | Company name |
| clearml.fileserverKey | string | `"XXCRJ123CEE2KSQ068WO"` | File Server basic auth key |
| clearml.fileserverSecret | string | `"YIy8EVAC7QCT4FtgitxAQGyW7xRHDZ4jpYlTE7HKiscpORl1hG"` | File Server basic auth secret |
| clearml.readinessprobeKey | string | `"GK4PRTVT3706T25K6BA1"` | Readiness probe basic auth key |
| clearml.readinessprobeSecret | string | `"ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2"` | Readiness probe basic auth secret |
| clearml.secureAuthTokenSecret | string | `"ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2"` | Secure Auth secret |
| clearml.testUserKey | string | `"ENP39EQM4SLACGD5FXB7"` | Test Server basic auth key |
| clearml.testUserSecret | string | `"lPcm0imbcBZ8mwgO7tpadutiS3gnJD05x9j7afwXPS35IKbpiQ"` | Test File Server basic auth secret |
| elasticsearch | object | `{"clusterHealthCheckParams":"wait_for_status=yellow&timeout=1s","clusterName":"clearml-elastic","enabled":true,"esConfig":{"elasticsearch.yml":"xpack.security.enabled: false\n"},"esJavaOpts":"-Xmx2g -Xms2g","extraEnvs":[{"name":"bootstrap.memory_lock","value":"false"},{"name":"cluster.routing.allocation.node_initial_primaries_recoveries","value":"500"},{"name":"cluster.routing.allocation.disk.watermark.low","value":"500mb"},{"name":"cluster.routing.allocation.disk.watermark.high","value":"500mb"},{"name":"cluster.routing.allocation.disk.watermark.flood_stage","value":"500mb"},{"name":"http.compression_level","value":"7"},{"name":"reindex.remote.whitelist","value":"*.*"},{"name":"xpack.monitoring.enabled","value":"false"},{"name":"xpack.security.enabled","value":"false"}],"httpPort":9200,"minimumMasterNodes":1,"persistence":{"enabled":true},"replicas":1,"resources":{"limits":{"cpu":"2000m","memory":"4Gi"},"requests":{"cpu":"100m","memory":"2Gi"}},"roles":{"data":"true","ingest":"true","master":"true","remote_cluster_client":"true"},"volumeClaimTemplate":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"50Gi"}},"storageClassName":null}}` | Configuration from https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/values.yaml |
| enterpriseFeatures | object | `{"airGappedDocumentation":{"enabled":false,"image":{"repository":"","tag":"4"}},"apiserverImageTagOverride":"3.15.3-909","clearmlApplications":{"affinity":{},"agentKey":"GK4PRTVT3706T25K6BA1","agentSecret":"ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2","basePodImage":{"repository":"","tag":"app-1.1.1-47"},"enabled":true,"extraEnvs":[],"gitAgentPass":"git_password","gitAgentUser":"git_user","image":{"pullPolicy":"IfNotPresent","repository":"","tag":"1.24-57"},"nodeSelector":{},"podAnnotations":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"tolerations":[]},"defaultCompanyGuid":"d1bd92a3b039400cbafc60a7a5b1e52b","enabled":false,"extraIndexUrl":"","fileserverImageTagOverride":"3.15.3-909","overrideReferenceApiUrl":"","overrideReferenceFileUrl":"","webserverImageTagOverride":"3.15.3-801"}` | Enterprise features (work only with an Enterprise license) |
| enterpriseFeatures.airGappedDocumentation | object | `{"enabled":false,"image":{"repository":"","tag":"4"}}` | Air gapped documentation configurations |
| enterpriseFeatures.airGappedDocumentation.enabled | bool | `false` | Enable/Disable air gapped documentation deployment |
| enterpriseFeatures.airGappedDocumentation.image | object | `{"repository":"","tag":"4"}` | Air gapped documentation image configuration |
| enterpriseFeatures.apiserverImageTagOverride | string | `"3.15.3-909"` | Image tag override for apiserver enterprise version |
| enterpriseFeatures.clearmlApplications | object | `{"affinity":{},"agentKey":"GK4PRTVT3706T25K6BA1","agentSecret":"ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2","basePodImage":{"repository":"","tag":"app-1.1.1-47"},"enabled":true,"extraEnvs":[],"gitAgentPass":"git_password","gitAgentUser":"git_user","image":{"pullPolicy":"IfNotPresent","repository":"","tag":"1.24-57"},"nodeSelector":{},"podAnnotations":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"tolerations":[]}` | APPS configurations |
| enterpriseFeatures.clearmlApplications.affinity | object | `{}` | APPS affinity setup |
| enterpriseFeatures.clearmlApplications.agentKey | string | `"GK4PRTVT3706T25K6BA1"` | Apps Server basic auth key |
| enterpriseFeatures.clearmlApplications.agentSecret | string | `"ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2"` | Apps Server basic auth secret |
| enterpriseFeatures.clearmlApplications.basePodImage | object | `{"repository":"","tag":"app-1.1.1-47"}` | APPS base spawning pods image |
| enterpriseFeatures.clearmlApplications.enabled | bool | `true` | Enable/Disable component deployment |
| enterpriseFeatures.clearmlApplications.extraEnvs | list | `[]` | APPS extra envrinoment variables |
| enterpriseFeatures.clearmlApplications.gitAgentPass | string | `"git_password"` | Apps Server Git password |
| enterpriseFeatures.clearmlApplications.gitAgentUser | string | `"git_user"` | Apps Server Git user |
| enterpriseFeatures.clearmlApplications.image | object | `{"pullPolicy":"IfNotPresent","repository":"","tag":"1.24-57"}` | APPS image configuration |
| enterpriseFeatures.clearmlApplications.nodeSelector | object | `{}` | APPS nodeselector |
| enterpriseFeatures.clearmlApplications.podAnnotations | object | `{}` | specific annotation for APPS pods |
| enterpriseFeatures.clearmlApplications.replicaCount | int | `1` | APPS number of pods |
| enterpriseFeatures.clearmlApplications.resources | object | `{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}}` | APPS resources per pod; these are minimal requirements, it's suggested to increase these values in production environments |
| enterpriseFeatures.clearmlApplications.tolerations | list | `[]` | APPS tolerations setup |
| enterpriseFeatures.defaultCompanyGuid | string | `"d1bd92a3b039400cbafc60a7a5b1e52b"` | Company ID |
| enterpriseFeatures.enabled | bool | `false` | Enable/Disable Enterprise features |
| enterpriseFeatures.extraIndexUrl | string | `""` | extra index URL for Enterprise packages |
| enterpriseFeatures.fileserverImageTagOverride | string | `"3.15.3-909"` | Image tag override for fileserver enterprise version |
| enterpriseFeatures.overrideReferenceApiUrl | string | `""` | set this value AND overrideReferenceFileUrl if external endpoint exposure is in place (like a LoadBalancer) example: "https://api.clearml.local" |
| enterpriseFeatures.overrideReferenceFileUrl | string | `""` | set this value AND overrideReferenceAPIUrl if external endpoint exposure is in place (like a LoadBalancer) example: "https://files.clearml.local" |
| enterpriseFeatures.webserverImageTagOverride | string | `"3.15.3-801"` | Image tag override for webserver enterprise version |
| externalServices | object | `{"elasticsearchHost":"","elasticsearchPort":9200,"mongodbConnectionStringAuth":"","mongodbConnectionStringBackend":"","redisHost":"","redisPort":6379}` | Definition of external services to use if not enabled as dependency charts here |
| externalServices.elasticsearchHost | string | `""` | Existing ElasticSearch Hostname to use if elasticsearch.enabled is false |
| externalServices.elasticsearchPort | int | `9200` | Existing ElasticSearch Port to use if elasticsearch.enabled is false |
| externalServices.mongodbConnectionStringAuth | string | `""` | Existing MongoDB connection string for BACKEND to use if mongodb.enabled is false |
| externalServices.mongodbConnectionStringBackend | string | `""` | Existing MongoDB connection string for AUTH to use if mongodb.enabled is false |
| externalServices.redisHost | string | `""` | Existing Redis Hostname to use if redis.enabled is false |
| externalServices.redisPort | int | `6379` | Existing Redis Port to use if redis.enabled is false |
| fileserver | object | `{"affinity":{},"enabled":true,"extraEnvs":[],"image":{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"},"ingress":{"annotations":{},"enabled":false,"hostName":"files.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""},"nodeSelector":{},"podAnnotations":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"securityContext":{},"service":{"nodePort":30081,"port":8081,"type":"NodePort"},"storage":{"data":{"accessMode":"ReadWriteOnce","class":"","size":"50Gi"}},"tolerations":[]}` | File Server configurations |
| fileserver.affinity | object | `{}` | File Server affinity setup |
| fileserver.enabled | bool | `true` | Enable/Disable component deployment |
| fileserver.extraEnvs | list | `[]` | File Server extra envrinoment variables |
| fileserver.image | object | `{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"}` | File Server image configuration |
| fileserver.ingress | object | `{"annotations":{},"enabled":false,"hostName":"files.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""}` | Ingress configuration for File Server component |
| fileserver.ingress.annotations | object | `{}` | Ingress annotations |
| fileserver.ingress.enabled | bool | `false` | Enable/Disable ingress |
| fileserver.ingress.hostName | string | `"files.clearml.127-0-0-1.nip.io"` | Ingress hostname domain |
| fileserver.ingress.ingressClassName | string | `""` | ClassName (must be defined if no default ingressClassName is available) |
| fileserver.ingress.path | string | `"/"` | Ingress root path url |
| fileserver.ingress.tlsSecretName | string | `""` | Reference to secret containing TLS certificate. If set, it enables HTTPS on ingress rule. |
| fileserver.nodeSelector | object | `{}` | File Server nodeselector |
| fileserver.podAnnotations | object | `{}` | specific annotation for File Server pods |
| fileserver.replicaCount | int | `1` | File Server number of pods |
| fileserver.resources | object | `{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}}` | File Server resources per pod; these are minimal requirements, it's suggested to increase these values in production environments |
| fileserver.securityContext | object | `{}` | File Server pod security context |
| fileserver.service | object | `{"nodePort":30081,"port":8081,"type":"NodePort"}` | File Server internal service configuration |
| fileserver.service.nodePort | int | `30081` | If service.type set to NodePort, this will be set to service's nodePort field. If service.type is set to others, this field will be ignored |
| fileserver.storage | object | `{"data":{"accessMode":"ReadWriteOnce","class":"","size":"50Gi"}}` | File server persistence settings |
| fileserver.storage.data.accessMode | string | `"ReadWriteOnce"` | Access mode (must be ReadWriteMany if fileserver replica > 1) |
| fileserver.storage.data.class | string | `""` | Storage class (use default if empty) |
| fileserver.tolerations | list | `[]` | File Server tolerations setup |
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Container registry configuration |
| imageCredentials.email | string | `"someone@host.com"` | Email |
| imageCredentials.enabled | bool | `false` | Use private authentication mode |
| imageCredentials.existingSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| imageCredentials.password | string | `"pwd"` | Registry password |
| imageCredentials.registry | string | `"docker.io"` | Registry name |
| imageCredentials.username | string | `"someone"` | Registry username |
| mongodb | object | `{"architecture":"standalone","auth":{"enabled":false},"enabled":true,"persistence":{"accessModes":["ReadWriteOnce"],"enabled":true,"size":"50Gi","storageClass":null},"replicaCount":1}` | Configuration from https://github.com/bitnami/charts/blob/master/bitnami/mongodb/values.yaml |
| redis | object | `{"cluster":{"enabled":false},"databaseNumber":0,"enabled":true,"master":{"name":"{{ .Release.Name }}-redis-master","persistence":{"accessModes":["ReadWriteOnce"],"enabled":true,"size":"5Gi","storageClass":null},"port":6379},"usePassword":false}` | Configuration from https://github.com/bitnami/charts/blob/master/bitnami/redis/values.yaml |
| webserver | object | `{"additionalConfigs":{},"affinity":{},"enabled":true,"extraEnvs":[],"image":{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"},"ingress":{"annotations":{},"enabled":false,"hostName":"app.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""},"nodeSelector":{},"podAnnotations":{},"podSecurityContext":{},"replicaCount":1,"resources":{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}},"service":{"nodePort":30080,"port":8080,"type":"NodePort"},"tolerations":[]}` | Web Server configurations |
| webserver.additionalConfigs | object | `{}` | Additional specific webserver configurations |
| webserver.affinity | object | `{}` | Web Server affinity setup |
| webserver.enabled | bool | `true` | Enable/Disable component deployment |
| webserver.extraEnvs | list | `[]` | Web Server extra envrinoment variables |
| webserver.image | object | `{"pullPolicy":"IfNotPresent","repository":"allegroai/clearml","tag":"1.9.2-317"}` | Web Server image configuration |
| webserver.ingress | object | `{"annotations":{},"enabled":false,"hostName":"app.clearml.127-0-0-1.nip.io","ingressClassName":"","path":"/","tlsSecretName":""}` | Ingress configuration for Web Server component |
| webserver.ingress.annotations | object | `{}` | Ingress annotations |
| webserver.ingress.enabled | bool | `false` | Enable/Disable ingress |
| webserver.ingress.hostName | string | `"app.clearml.127-0-0-1.nip.io"` | Ingress hostname domain |
| webserver.ingress.ingressClassName | string | `""` | ClassName (must be defined if no default ingressClassName is available) |
| webserver.ingress.path | string | `"/"` | Ingress root path url |
| webserver.ingress.tlsSecretName | string | `""` | Reference to secret containing TLS certificate. If set, it enables HTTPS on ingress rule. |
| webserver.nodeSelector | object | `{}` | Web Server nodeselector |
| webserver.podAnnotations | object | `{}` | specific annotation for Web Server pods |
| webserver.podSecurityContext | object | `{}` | Web Server pod security context |
| webserver.replicaCount | int | `1` | Web Server number of pods |
| webserver.resources | object | `{"limits":{"cpu":"2000m","memory":"1Gi"},"requests":{"cpu":"100m","memory":"256Mi"}}` | Web Server resources per pod; these are minimal requirements, it's suggested to increase these values in production environments |
| webserver.service | object | `{"nodePort":30080,"port":8080,"type":"NodePort"}` | Web Server internal service configuration |
| webserver.service.nodePort | int | `30080` | If service.type set to NodePort, this will be set to service's nodePort field. If service.type is set to others, this field will be ignored |
| webserver.tolerations | list | `[]` | Web Server tolerations setup |

View File

@@ -0,0 +1,127 @@
# ClearML Ecosystem for Kubernetes
{{ template "chart.deprecationWarning" . }}
{{ template "chart.badgesSection" . }}
{{ template "chart.description" . }}
{{ template "chart.homepageLine" . }}
{{ template "chart.maintainersSection" . }}
## Introduction
The **clearml-server** is the backend service infrastructure for [ClearML](https://github.com/allegroai/clearml).
It allows multiple users to collaborate and manage their experiments.
**clearml-server** contains the following components:
* The ClearML Web-App, a single-page UI for experiment management and browsing
* RESTful API for:
* Documenting and logging experiment information, statistics and results
* Querying experiments history, logs and results
* Locally-hosted file server for storing images and models making them easily accessible using the Web-App
## Local environment
For development/evaluation it's possible to use [kind](https://kind.sigs.k8s.io).
After installation, following commands will create a complete ClearML insatllation:
```
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraPortMappings:
# API server's default nodePort is 30008. If you customize it in helm values by
# `apiserver.service.nodePort`, `containerPort` should match it
- containerPort: 30008
hostPort: 30008
listenAddress: "127.0.0.1"
protocol: TCP
# Web server's default nodePort is 30080. If you customize it in helm values by
# `webserver.service.nodePort`, `containerPort` should match it
- containerPort: 30080
hostPort: 30080
listenAddress: "127.0.0.1"
protocol: TCP
# File server's default nodePort is 30081. If you customize it in helm values by
# `fileserver.service.nodePort`, `containerPort` should match it
- containerPort: 30081
hostPort: 30081
listenAddress: "127.0.0.1"
protocol: TCP
extraMounts:
- hostPath: /tmp/clearml-kind/
containerPath: /var/local-path-provisioner
EOF
helm install clearml allegroai/clearml
```
After deployment, the services will be exposed on localhost on the following ports:
* API server on `30008`
* Web server on `30080`
* File server on `30081`
Data persisted in every Kubernetes volume by ClearML will be accessible in /tmp/clearml-kind folder on the host.
## Production cluster environment
In a production environment it's suggested to install an ingress controller and verify that is working correctly.
During ClearML deployment enable `ingress` section of chart values.
This will create 3 ingress rules:
* `app.<your domain name>`
* `files.<your domain name>`
* `api.<your domain name>`
(*for example, `app.clearml.mydomainname.com`, `files.clearml.mydomainname.com` and `api.clearml.mydomainname.com`*)
Just pointing the domain records to the IP where ingress controller is responding will complete the deployment process.
A production ready cluster should also have some different configuration like the one proposed in `values-production.yaml` that can be applied with:
```
helm install clearml allegroai/clearml -f values-production.yaml
```
## Upgrades/ Values upgrades
Updating to latest version of this chart can be done in two steps:
```
helm repo update
helm upgrade clearml allegroai/clearml
```
Changing values on existing installation can be done with:
```
helm upgrade clearml allegroai/clearml --version <CURRENT CHART VERSION> -f custom_values.yaml
```
Please note: updating values only should always be done setting explicit chart version to avoid a possible chart update.
Keeping separate updates procedures between version and values can be a good practice to seprate potential concerns.
## ENTERPRISE Version
There are some specific Enterprise version features that can be enabled only with specific Enterprise licensed images.
Enabling this features on OSS version can cause the entire installation to break.
## Additional Configuration for ClearML Server
You can also configure the **clearml-server** for:
* fixed users (users with credentials)
* non-responsive experiment watchdog settings
For detailed instructions, see the [Optional Configuration](https://github.com/allegroai/clearml-server#optional-configuration) section in the **clearml-server** repository README file.
{{ template "chart.sourcesSection" . }}
{{ template "chart.requirementsSection" . }}
{{ template "chart.valuesSection" . }}

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -0,0 +1,7 @@
Place values files with different values in this directory to ensure these cases are tested by the CI as well.
https://github.com/helm/chart-testing/blob/main/doc/ct_install.md
```
"Charts may have multiple custom values files matching the glob pattern '*-values.yaml' in a directory named 'ci' in the root of the chart's directory. The chart is installed and tested for each of these files. If no custom values file is present, the chart is installed and tested with defaults."
```

View File

@@ -0,0 +1 @@
# empty so default values.yaml gets tested

View File

@@ -0,0 +1,18 @@
1. Get the application URL:
{{- if .Values.webserver.ingress.enabled }}
http{{ if $.Values.webserver.ingress.tls }}s{{ end }}://{{ .Values.webserver.ingress.hostName }}
{{- else if contains "NodePort" .Values.webserver.service.type }}
export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "clearml.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.webserver.service.type }}
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "clearml.fullname" . }}'
export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "clearml.fullname" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}")
echo http://$SERVICE_IP:{{ .Values.webserver.service.port }}
{{- else if contains "ClusterIP" .Values.webserver.service.type }}
export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "clearml.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}")
export CONTAINER_PORT=$(kubectl get pod --namespace {{ .Release.Namespace }} $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:$CONTAINER_PORT
{{- end }}

View File

@@ -0,0 +1,212 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "clearml.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "clearml.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "clearml.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "clearml.labels" -}}
helm.sh/chart: {{ include "clearml.chart" . }}
{{ include "clearml.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "clearml.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Reference Name (apiserver)
*/}}
{{- define "apiserver.referenceName" -}}
{{- include "clearml.name" . }}-apiserver
{{- end }}
{{/*
Selector labels (apiserver)
*/}}
{{- define "apiserver.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ include "apiserver.referenceName" . }}
{{- end }}
{{/*
Reference Name (fileserver)
*/}}
{{- define "fileserver.referenceName" -}}
{{- include "clearml.name" . }}-fileserver
{{- end }}
{{/*
Selector labels (fileserver)
*/}}
{{- define "fileserver.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ include "fileserver.referenceName" . }}
{{- end }}
{{/*
Reference Name (webserver)
*/}}
{{- define "webserver.referenceName" -}}
{{- include "clearml.name" . }}-webserver
{{- end }}
{{/*
Selector labels (webserver)
*/}}
{{- define "webserver.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ include "webserver.referenceName" . }}
{{- end }}
{{/*
Reference Name (apps)
*/}}
{{- define "clearmlApplications.referenceName" -}}
{{- include "clearml.name" . }}-apps
{{- end }}
{{/*
Selector labels (apps)
*/}}
{{- define "clearmlApplications.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ include "clearmlApplications.referenceName" . }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "clearml.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "clearml.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
{{/*
Create secret to access docker registry
*/}}
{{- define "imagePullSecret" }}
{{- with .Values.imageCredentials }}
{{- printf "{\"auths\":{\"%s\":{\"username\":\"%s\",\"password\":\"%s\",\"email\":\"%s\",\"auth\":\"%s\"}}}" .registry .username .password .email (printf "%s:%s" .username .password | b64enc) | b64enc }}
{{- end }}
{{- end }}
{{/*
Create readiness probe auth token
*/}}
{{- define "readinessProbeAuth" }}
{{- printf "%s:%s" .Values.clearml.readinessprobeKey .Values.clearml.readinessprobeSecret | b64enc }}
{{- end }}
{{/*
Elasticsearch Service name
*/}}
{{- define "elasticsearch.servicename" -}}
{{- if .Values.elasticsearch.enabled }}
{{- .Values.elasticsearch.clusterName }}-master
{{- else }}
{{- .Values.externalServices.elasticsearchHost }}
{{- end }}
{{- end }}
{{/*
Elasticsearch Service port
*/}}
{{- define "elasticsearch.serviceport" -}}
{{- if .Values.elasticsearch.enabled }}
{{- .Values.elasticsearch.httpPort }}
{{- else }}
{{- .Values.externalServices.elasticsearchPort }}
{{- end }}
{{- end }}
{{/*
MongoDB Comnnection string
*/}}
{{- define "mongodb.connectionstring" -}}
{{- if eq .Values.mongodb.architecture "standalone" }}
{{- printf "%s%s%s" "mongodb://" .Release.Name "-mongodb:27017" }}
{{- else }}
{{- $connectionString := "mongodb://" }}
{{- range $i,$e := until (.Values.mongodb.replicaCount | int) }}
{{- $connectionString = printf "%s%s%s%s%s%s%s%s%s" $connectionString $.Release.Name "-mongodb-" ( $i | toString ) "." $.Release.Name "-mongodb-headless." $.Release.Namespace ".svc.cluster.local," }}
{{- end }}
{{- printf "%s" ( trimSuffix "," $connectionString ) }}
{{- end }}
{{- end }}
{{/*
Redis Service name
*/}}
{{- define "redis.servicename" -}}
{{- if .Values.redis.enabled }}
{{- tpl .Values.redis.master.name . }}
{{- else }}
{{- .Values.externalServices.redisHost }}
{{- end }}
{{- end }}
{{/*
Redis Service port
*/}}
{{- define "redis.serviceport" -}}
{{- if .Values.redis.enabled }}
{{- .Values.redis.master.port }}
{{- else }}
{{- .Values.externalServices.redisPort }}
{{- end }}
{{- end }}
{{/*
clientConfiguration string compose
*/}}
{{- define "clearml.clientConfiguration" -}}
{{- $clientConfiguration := "" }}
{{- if and (.Values.clearml.clientConfigurationApiUrl) .Values.clearml.clientConfigurationFilesUrl }}
{{- $clientConfiguration = "{\"apiServer\":\"{{ .Values.clearml.clientConfigurationApiUrl }}\",\"filesServer\":\"{{ .Values.clearml.clientConfigurationFilesUrl }}\"}" }}
{{- else if .Values.clearml.clientConfigurationApiUrl }}
{{- $clientConfiguration = "{\"apiServer\":\"{{ .Values.clearml.clientConfigurationApiUrl }}\"}" }}
{{- else if .Values.clearml.clientConfigurationFilesUrl }}
{{- $clientConfiguration = "{\"filesServer\":\"{{ .Values.clearml.clientConfigurationFilesUrl }}\"}" }}
{{- end }}
{{- $clientConfiguration }}
{{- end }}

View File

@@ -0,0 +1,15 @@
{{- if .Values.apiserver.enabled }}
{{- if .Values.apiserver.additionalConfigs }}
apiVersion: v1
kind: ConfigMap
metadata:
name: "{{ include "apiserver.referenceName" . }}-configmap"
labels:
{{- include "clearml.labels" . | nindent 4 }}
data:
{{- range $key, $val := .Values.apiserver.additionalConfigs }}
{{ $key }}: |
{{- $val | nindent 4 }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,249 @@
{{- if .Values.apiserver.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "apiserver.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.apiserver.replicaCount }}
selector:
matchLabels:
{{- include "apiserver.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.apiserver.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "apiserver.selectorLabels" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
{{- else }}
- name: clearml-registry-key
{{- end }}
{{- end }}
{{- if or .Values.apiserver.additionalConfigs .Values.apiserver.existingAdditionalConfigsConfigMap .Values.apiserver.existingAdditionalConfigsSecret }}
volumes:
- name: apiserver-config
{{- if or .Values.apiserver.existingAdditionalConfigsConfigMap }}
configMap:
name: {{ .Values.apiserver.existingAdditionalConfigsConfigMap }}
{{- else if or .Values.apiserver.existingAdditionalConfigsSecret }}
secret:
secretName: {{ .Values.apiserver.existingAdditionalConfigsSecret }}
{{- else if or .Values.apiserver.additionalConfigs }}
configMap:
name: "{{ include "apiserver.referenceName" . }}-configmap"
{{- end }}
{{- end }}
securityContext: {{ toYaml .Values.apiserver.podSecurityContext | nindent 8 }}
initContainers:
- name: init-apiserver
{{- if .Values.enterpriseFeatures.enabled }}
image: "{{ .Values.apiserver.image.repository }}:{{ .Values.enterpriseFeatures.apiserverImageTagOverride }}"
{{- else }}
image: "{{ .Values.apiserver.image.repository }}:{{ .Values.apiserver.image.tag }}"
{{- end }}
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl -sw '%{http_code}' "http://{{ include "elasticsearch.servicename" . }}:{{ include "elasticsearch.serviceport" . }}/_cluster/health" -o /dev/null) -ne 200 ] ; do
echo "waiting for elasticsearch" ;
sleep 5 ;
done
containers:
- name: clearml-apiserver
{{- if .Values.enterpriseFeatures.enabled }}
image: "{{ .Values.apiserver.image.repository }}:{{ .Values.enterpriseFeatures.apiserverImageTagOverride }}"
{{- else }}
image: "{{ .Values.apiserver.image.repository }}:{{ .Values.apiserver.image.tag }}"
{{- end }}
imagePullPolicy: {{ .Values.apiserver.image.pullPolicy }}
ports:
- name: http
containerPort: 8008
protocol: TCP
env:
- name: CLEARML_ELASTIC_SERVICE_HOST
value: {{ include "elasticsearch.servicename" . }}
- name: CLEARML_ELASTIC_SERVICE_PORT
value: "{{ include "elasticsearch.serviceport" . }}"
{{- if .Values.mongodb.enabled }}
- name: CLEARML_MONGODB_SERVICE_CONNECTION_STRING
value: {{ include "mongodb.connectionstring" . | quote }}
{{- else }}
- name: CLEARML__HOSTS__MONGO__BACKEND__HOST
value: {{ .Values.externalServices.mongodbConnectionStringBackend | quote }}
- name: CLEARML__HOSTS__MONGO__AUTH__HOST
value: {{ .Values.externalServices.mongodbConnectionStringAuth | quote }}
{{- end }}
- name: CLEARML_REDIS_SERVICE_HOST
value: {{ include "redis.servicename" . }}
- name: CLEARML_REDIS_SERVICE_PORT
value: "{{ include "redis.serviceport" . }}"
- name: CLEARML_CONFIG_PATH
value: /opt/clearml/config
- name: CLEARML__apiserver__default_company_name
value: "{{ .Values.clearml.defaultCompany }}"
{{- if not (eq .Values.clearml.cookieDomain "") }}
- name: CLEARML__APISERVER__AUTH__SESSION_AUTH_COOKIE_NAME
value: {{ .Values.clearml.cookieName }}
- name: CLEARML__APISERVER__AUTH__COOKIES__DOMAIN
value: ".{{ .Values.clearml.cookieDomain }}"
{{- end }}
- name: CLEARML__secure__credentials__apiserver__user_key
valueFrom:
secretKeyRef:
name: clearml-conf
key: apiserver_key
- name: CLEARML__secure__credentials__apiserver__user_secret
valueFrom:
secretKeyRef:
name: clearml-conf
key: apiserver_secret
- name: CLEARML__secure__auth__token_secret
valueFrom:
secretKeyRef:
name: clearml-conf
key: secure_auth_token_secret
{{- if .Values.apiserver.prepopulateEnabled }}
- name: CLEARML__APISERVER__PRE_POPULATE__ENABLED
value: "true"
- name: CLEARML__APISERVER__PRE_POPULATE__ZIP_FILES
value: "/opt/clearml/db-pre-populate"
{{- end }}
{{- if .Values.enterpriseFeatures.enabled }}
- name: CLEARML__apiserver__default_company
value: "{{ .Values.enterpriseFeatures.defaultCompanyGuid }}"
- name: APPLY_ES_MAPPINGS
value: "false"
- name: CLEARML__HOSTS__ELASTIC__LOGS__HOSTS
value: "[\"http://{{ include "elasticsearch.servicename" . }}:{{ include "elasticsearch.serviceport" . }}\"]"
- name: NUMBER_OF_GUNICORN_WORKERS
value: "{{ .Values.apiserver.processes.count }}"
- name: GUNICORN_TIMEOUT
value: "{{ .Values.apiserver.processes.timeout }}"
- name: GUNICORN_MAX_REQUESTS
value: "{{ .Values.apiserver.processes.maxRequests }}"
- name: GUNICORN_MAX_REQUESTS_JITTER
value: "{{ .Values.apiserver.processes.maxRequestsJitter }}"
- name: CLEARML_CONFIG_VERBOSE
value: "0"
- name: CLEARML__SERVICES__APPLICATIONS__TEMPLATES__FOLDER
value: "/opt/allegro/config/applications"
- name: CLEARML__apiserver__apilog__prefix
value: "fluentd."
- name: CLEARML__apiserver__apilog__index_name_prefix__default
value: "allegro.apiserver.api-logs."
- name: CLEARML__apiserver__apilog__adapter
value: "logging"
- name: CLEARML__apiserver__apilog__rotation__index_size
value: "225000"
- name: CLEARML__services__tasks__non_responsive_tasks_watchdog__enabled
value: "false"
- name: CLEARML__APISERVER__AUTH__COOKIES__MAX_AGE
value: "2678400"
- name: CLEARML__services__frames__scroll_state_expiration_hours
value: "6"
- name: CLEARML__services__organization__features__applications
value: "true"
- name: CLEARML__services__organization__features__app_management
value: "true"
- name: CLEARML__SERVICES___ELASTIC__MAPPINGS__EVENTS__NUMBER_OF_REPLICAS
value: {{ .Values.apiserver.indexReplicas | quote }}
- name: CLEARML__SERVICES___ELASTIC__MAPPINGS__EVENTS__NUMBER_OF_SHARDS
value: {{ .Values.apiserver.indexShards | quote }}
- name: CLEARML__APISERVER__LOG_CALLS
value: "false"
- name: ALLEGRO_ENV
value: "onprem_k8s"
- name: CLEARML__secure__credentials__fileserver__user_key
valueFrom:
secretKeyRef:
name: clearml-conf
key: fileserver_key
- name: CLEARML__secure__credentials__fileserver__user_secret
valueFrom:
secretKeyRef:
name: clearml-conf
key: fileserver_secret
- name: CLEARML__secure__applications__agents_credentials__apps_agent__user_key
valueFrom:
secretKeyRef:
name: clearml-conf
key: apps_agent_key
- name: CLEARML__secure__applications__agents_credentials__apps_agent__user_secret
valueFrom:
secretKeyRef:
name: clearml-conf
key: apps_agent_secret
{{- else }}
- name: CLEARML__SECURE__CREDENTIALS__TESTS__USER_KEY
valueFrom:
secretKeyRef:
name: "clearml-conf"
key: test_user_key
- name: CLEARML__SECURE__CREDENTIALS__TESTS__USER_SECRET
valueFrom:
secretKeyRef:
name: "clearml-conf"
key: test_user_secret
- name: CLEARML_ENV
value: "helm-cloud"
{{- end }}
{{- if .Values.apiserver.extraEnvs }}
{{ toYaml .Values.apiserver.extraEnvs | nindent 10 }}
{{- end }}
{{- if not .Values.enterpriseFeatures.enabled }}
args:
- apiserver
{{- end }}
livenessProbe:
initialDelaySeconds: 60
httpGet:
path: /debug.ping
port: 8008
readinessProbe:
initialDelaySeconds: 60
failureThreshold: 8
httpGet:
{{- if .Values.enterpriseFeatures.enabled }}
path: /server.health_check
{{- else }}
path: /debug.ping
{{- end }}
port: 8008
httpHeaders:
- name: Authorization
value: Basic {{ include "readinessProbeAuth" . }}
{{- if or .Values.apiserver.additionalConfigs .Values.apiserver.existingAdditionalConfigsConfigMap .Values.apiserver.existingAdditionalConfigsSecret }}
volumeMounts:
- name: apiserver-config
{{- if .Values.enterpriseFeatures.enabled }}
mountPath: /opt/clearml/config/default
{{- else }}
mountPath: /opt/clearml/config
{{- end }}
{{- end }}
resources:
{{- toYaml .Values.apiserver.resources | nindent 12 }}
{{- with .Values.apiserver.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.apiserver.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.apiserver.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,50 @@
{{- if .Values.apiserver.enabled }}
{{- if .Values.apiserver.ingress.enabled }}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ include "apiserver.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- $annotations := .Values.apiserver.ingress.annotations }}
{{- if .Values.apiserver.ingress.annotations }}
{{- $annotations = mergeOverwrite $annotations .Values.apiserver.ingress.annotations }}
{{- end }}
annotations:
{{- toYaml $annotations | nindent 4 }}
spec:
{{- if .Values.apiserver.ingress.ingressClassName }}
ingressClassName: {{ .Values.apiserver.ingress.ingressClassName }}
{{- end }}
{{- if .Values.apiserver.ingress.tlsSecretName }}
tls:
- hosts:
- {{ .Values.apiserver.ingress.hostName }}
secretName: {{ .Values.apiserver.ingress.tlsSecretName }}
{{- end }}
rules:
- host: {{ .Values.apiserver.ingress.hostName }}
http:
paths:
- path: {{ .Values.apiserver.ingress.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: {{ include "apiserver.referenceName" . }}
port:
number: {{ .Values.apiserver.service.port }}
{{ else }}
backend:
serviceName: {{ include "apiserver.referenceName" . }}
servicePort: {{ .Values.apiserver.service.port }}
{{ end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,19 @@
{{- if .Values.apiserver.enabled }}
apiVersion: v1
kind: Service
metadata:
name: {{ include "apiserver.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
type: {{ .Values.apiserver.service.type }}
ports:
- port: {{ .Values.apiserver.service.port }}
targetPort: {{ .Values.apiserver.service.port }}
{{- if eq .Values.apiserver.service.type "NodePort" }}
nodePort: {{ .Values.apiserver.service.nodePort }}
{{- end }}
protocol: TCP
selector:
{{- include "apiserver.selectorLabels" . | nindent 4 }}
{{- end }}

View File

@@ -0,0 +1,120 @@
{{- if .Values.enterpriseFeatures.enabled }}
{{- if .Values.enterpriseFeatures.clearmlApplications.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "clearmlApplications.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.enterpriseFeatures.clearmlApplications.replicaCount }}
selector:
matchLabels:
{{- include "clearmlApplications.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.enterpriseFeatures.clearmlApplications.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "clearmlApplications.selectorLabels" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
{{- else }}
- name: clearml-registry-key
{{- end }}
{{- end }}
{{- if .Values.enterpriseFeatures.clearmlApplications.additionalConfigs }}
volumes:
- name: apps-config
configMap:
name: "{{ include "clearmlApplications.referenceName" . }}-configmap"
{{- end }}
serviceAccountName: "clearml-apps-sa"
initContainers:
- name: init-apps
image: "{{ .Values.enterpriseFeatures.clearmlApplications.image.repository }}:{{ .Values.enterpriseFeatures.clearmlApplications.image.tag | default .Chart.AppVersion }}"
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl -sw '%{http_code}' "http://{{ include "apiserver.referenceName" . }}:{{ .Values.apiserver.service.port }}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done
containers:
- name: clearml-apps
image: "{{ .Values.enterpriseFeatures.clearmlApplications.image.repository }}:{{ .Values.enterpriseFeatures.clearmlApplications.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.enterpriseFeatures.clearmlApplications.image.pullPolicy }}
ports:
- name: http
containerPort: 8008
protocol: TCP
env:
- name: CLEARML_API_HOST
value: "http://{{ include "apiserver.referenceName" . }}:{{ .Values.apiserver.service.port }}"
- name: CLEARML_FILES_HOST
value: "http://{{ include "fileserver.referenceName" . }}:{{ .Values.fileserver.service.port }}"
- name: CLEARML_WEB_HOST
value: "http://{{ include "webserver.referenceName" . }}:{{ .Values.webserver.service.port }}"
- name: CLEARML_AGENT_DEFAULT_BASE_DOCKER
value: "{{ .Values.enterpriseFeatures.clearmlApplications.basePodImage.repository }}:{{ .Values.enterpriseFeatures.clearmlApplications.basePodImage.tag }}"
- name: CLEARML_WORKER_ID
value: "apps-agent-1"
- name: CLEARML_NO_DEFAULT_SERVER
value: "true"
- name: CLEARML_AGENT_DAEMON_OPTIONS
value: "--foreground --create-queue --use-owner-token --child-report-tags application --services-mode=5"
- name: K8S_GLUE_QUEUE
value: "apps_queue"
- name: CLEARML_AGENT_DISABLE_SSH_MOUNT
value: "1"
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: apps_agent_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: apps_agent_secret
- name: CLEARML_AGENT_GIT_USER
valueFrom:
secretKeyRef:
name: clearml-conf
key: apps_git_agent_user
- name: CLEARML_AGENT_GIT_PASS
valueFrom:
secretKeyRef:
name: clearml-conf
key: apps_git_agent_pass
{{- if .Values.enterpriseFeatures.clearmlApplications.extraEnvs }}
{{ toYaml .Values.enterpriseFeatures.clearmlApplications.extraEnvs | nindent 10 }}
{{- end }}
{{- if .Values.enterpriseFeatures.clearmlApplications.additionalConfigs }}
volumeMounts:
- name: apps-config
mountPath: /opt/clearml/config/default
{{- end }}
resources:
{{- toYaml .Values.enterpriseFeatures.clearmlApplications.resources | nindent 12 }}
{{- with .Values.enterpriseFeatures.clearmlApplications.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.enterpriseFeatures.clearmlApplications.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.enterpriseFeatures.clearmlApplications.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,33 @@
{{- if .Values.enterpriseFeatures.enabled }}
{{- if .Values.enterpriseFeatures.clearmlApplications.enabled }}
apiVersion: v1
kind: ServiceAccount
metadata:
name: "clearml-apps-sa"
namespace: {{ .Release.Namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "clearmlApplications.referenceName" . }}-kpa
rules:
- apiGroups:
- ""
resources:
- pods
verbs: ["get", "list", "watch", "create", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "clearmlApplications.referenceName" . }}-kpa
subjects:
- kind: ServiceAccount
name: "clearml-apps-sa"
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "clearmlApplications.referenceName" . }}-kpa
{{- end }}
{{- end }}

View File

@@ -0,0 +1,28 @@
apiVersion: v1
kind: Secret
metadata:
name: clearml-conf
data:
apiserver_key: {{ .Values.clearml.apiserverKey | b64enc }}
apiserver_secret: {{ .Values.clearml.apiserverSecret | b64enc }}
fileserver_key: {{ .Values.clearml.fileserverKey | b64enc }}
fileserver_secret: {{ .Values.clearml.fileserverSecret | b64enc }}
apps_agent_key: {{ .Values.enterpriseFeatures.clearmlApplications.agentKey | b64enc }}
apps_agent_secret: {{ .Values.enterpriseFeatures.clearmlApplications.agentSecret | b64enc }}
secure_auth_token_secret: {{ .Values.clearml.secureAuthTokenSecret | b64enc }}
apps_git_agent_user: {{ .Values.enterpriseFeatures.clearmlApplications.gitAgentUser | b64enc }}
apps_git_agent_pass: {{ .Values.enterpriseFeatures.clearmlApplications.gitAgentPass | b64enc }}
test_user_key: {{ .Values.clearml.testUserKey | b64enc }}
test_user_secret: {{ .Values.clearml.testUserSecret | b64enc }}
---
{{- if .Values.imageCredentials.enabled }}
{{- if not .Values.imageCredentials.existingSecret }}
apiVersion: v1
kind: Secret
metadata:
name: clearml-registry-key
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: {{ template "imagePullSecret" . }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,120 @@
{{- if .Values.fileserver.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "fileserver.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.fileserver.replicaCount }}
selector:
matchLabels:
{{- include "fileserver.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.fileserver.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "fileserver.selectorLabels" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
{{- else }}
- name: clearml-registry-key
{{- end }}
{{- end }}
volumes:
- name: fileserver-data
persistentVolumeClaim:
claimName: {{ include "fileserver.referenceName" . }}-data
securityContext: {{ toYaml .Values.fileserver.podSecurityContext | nindent 8 }}
initContainers:
- name: init-fileserver
{{- if .Values.enterpriseFeatures.enabled }}
image: "{{ .Values.fileserver.image.repository }}:{{ .Values.enterpriseFeatures.fileserverImageTagOverride }}"
{{- else }}
image: "{{ .Values.fileserver.image.repository }}:{{ .Values.fileserver.image.tag }}"
{{- end }}
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl -sw '%{http_code}' "http://{{ include "apiserver.referenceName" . }}:{{ .Values.apiserver.service.port }}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done
containers:
- name: clearml-fileserver
{{- if .Values.enterpriseFeatures.enabled }}
image: "{{ .Values.fileserver.image.repository }}:{{ .Values.enterpriseFeatures.fileserverImageTagOverride }}"
{{- else }}
image: "{{ .Values.fileserver.image.repository }}:{{ .Values.fileserver.image.tag }}"
{{- end }}
imagePullPolicy: {{ .Values.fileserver.image.pullPolicy }}
ports:
- name: http
containerPort: 8081
protocol: TCP
env:
- name: CLEARML__HOSTS__API_SERVER
value: "http://{{ include "apiserver.referenceName" . }}:{{ .Values.apiserver.service.port }}"
- name: CLEARML_REDIS_SERVICE_HOST
value: {{ include "redis.servicename" . }}
- name: CLEARML_REDIS_SERVICE_PORT
value: "{{ include "redis.serviceport" . }}"
{{- if not (eq .Values.clearml.cookieDomain "") }}
- name: CLEARML__FILESERVER__AUTH__COOKIE_NAMES
value: "[ {{ .Values.clearml.cookieName }} ]"
{{- end }}
- name: USER_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: fileserver_key
- name: USER_SECRET
valueFrom:
secretKeyRef:
name: clearml-conf
key: fileserver_secret
{{- if .Values.fileserver.extraEnvs }}
{{ toYaml .Values.fileserver.extraEnvs | nindent 10 }}
{{- end }}
{{- if not .Values.enterpriseFeatures.enabled }}
args:
- fileserver
{{- end }}
livenessProbe:
exec:
command:
- curl
- -X OPTIONS
- http://localhost:8081/
readinessProbe:
exec:
command:
- curl
- -X OPTIONS
- http://localhost:8081/
volumeMounts:
- name: fileserver-data
mountPath: /mnt/fileserver
resources:
{{- toYaml .Values.fileserver.resources | nindent 12 }}
{{- with .Values.fileserver.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.fileserver.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.fileserver.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,49 @@
{{- if .Values.fileserver.enabled }}
{{- if .Values.fileserver.ingress.enabled }}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ include "fileserver.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- $annotations := .Values.fileserver.ingress.annotations }}
{{- if .Values.fileserver.ingress.annotations }}
{{- $annotations = mergeOverwrite $annotations .Values.fileserver.ingress.annotations }}
{{- end }}
annotations:
{{- toYaml $annotations | nindent 4 }}
spec:
{{- if .Values.fileserver.ingress.ingressClassName }}
ingressClassName: {{ .Values.fileserver.ingress.ingressClassName }}
{{- end }}
{{- if .Values.fileserver.ingress.tlsSecretName }}
tls:
- hosts:
- {{ .Values.fileserver.ingress.hostName }}
secretName: {{ .Values.fileserver.ingress.tlsSecretName }}
{{- end }}
rules:
- host: {{ .Values.fileserver.ingress.hostName }}
http:
paths:
- path: {{ .Values.fileserver.ingress.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: {{ include "fileserver.referenceName" . }}
port:
number: {{ .Values.fileserver.service.port }}
{{ else }}
backend:
serviceName: {{ include "fileserver.referenceName" . }}
servicePort: {{ .Values.fileserver.service.port }}
{{ end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,17 @@
{{- if .Values.fileserver.enabled }}
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: {{ include "fileserver.referenceName" . }}-data
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
accessModes:
- {{ .Values.fileserver.storage.data.accessMode }}
resources:
requests:
storage: {{ .Values.fileserver.storage.data.size | quote }}
{{- if .Values.fileserver.storage.data.class }}
storageClassName: {{ .Values.fileserver.storage.data.class | quote }}
{{- end -}}
{{- end }}

View File

@@ -0,0 +1,19 @@
{{- if .Values.fileserver.enabled }}
apiVersion: v1
kind: Service
metadata:
name: {{ include "fileserver.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
type: {{ .Values.fileserver.service.type }}
ports:
- port: {{ .Values.fileserver.service.port }}
targetPort: 8081
{{- if eq .Values.fileserver.service.type "NodePort" }}
nodePort: {{ .Values.fileserver.service.nodePort }}
{{- end }}
protocol: TCP
selector:
{{- include "fileserver.selectorLabels" . | nindent 4 }}
{{- end }}

View File

@@ -0,0 +1,31 @@
{{- if .Values.webserver.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
name: "{{ include "webserver.referenceName" . }}-configmap"
labels:
{{- include "clearml.labels" . | nindent 4 }}
data:
{{- if .Values.enterpriseFeatures.enabled }}
configuration.json: |
{
"gettingStartedContext": {
"install":"pip install -U --extra-index-url {{ .Values.enterpriseFeatures.extraIndexUrl }} allegroai",
"configure": "allegroai-init",
"packageName": "allegroai",
"agentName": "allegroai"
},
"docsLink": "https://clear.ml/docs/",
"applicationsBackground": "ui-assets/apps-message.svg"
{{- if and .Values.webserver.overrideReferenceApiUrl .Values.enterpriseFeatures.overrideReferenceFileUrl }}
,
"fileBaseUrl": "{{ .Values.enterpriseFeatures.overrideReferenceFileUrl }}",
"apiBaseUrl": "{{ .Values.enterpriseFeatures.overrideReferenceApiUrl }}"
{{- end }}
}
{{- end }}
{{- range $key, $val := .Values.webserver.additionalConfigs }}
{{ $key }}: |
{{- $val | nindent 4 }}
{{- end }}
{{- end -}}

View File

@@ -0,0 +1,165 @@
{{- if .Values.webserver.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "webserver.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.webserver.replicaCount }}
selector:
matchLabels:
{{- include "webserver.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.webserver.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "webserver.selectorLabels" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
{{- else }}
- name: clearml-registry-key
{{- end }}
{{- end }}
volumes:
- name: webserver-config
configMap:
name: "{{ include "webserver.referenceName" . }}-configmap"
{{- if .Values.enterpriseFeatures.airGappedDocumentation.enabled }}
- name: documentation
emptyDir: {}
{{- end }}
securityContext: {{ toYaml .Values.webserver.podSecurityContext | nindent 8 }}
initContainers:
{{- if .Values.enterpriseFeatures.airGappedDocumentation.enabled }}
- name: init-airgap-docs
image: "{{ .Values.enterpriseFeatures.airGappedDocumentation.image.repository }}:{{ .Values.enterpriseFeatures.airGappedDocumentation.image.tag | default .Chart.AppVersion }}"
command:
- /bin/sh
- -c
- cp -a /docs_site/* /usr/share/nginx/html/clearml
volumeMounts:
- name: webserver-config
mountPath: /mnt/external_files/configs
{{- if .Values.enterpriseFeatures.airGappedDocumentation.enabled }}
- mountPath: /usr/share/nginx/html/clearml
name: documentation
{{- end }}
{{- end }}
- name: init-webserver
{{- if .Values.enterpriseFeatures.enabled }}
image: "{{ .Values.webserver.image.repository }}:{{ .Values.enterpriseFeatures.webserverImageTagOverride }}"
{{- else }}
image: "{{ .Values.webserver.image.repository }}:{{ .Values.webserver.image.tag }}"
{{- end }}
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl -sw '%{http_code}' "http://{{ include "apiserver.referenceName" . }}:{{ .Values.apiserver.service.port }}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done
containers:
- name: clearml-webserver
{{- if .Values.enterpriseFeatures.enabled }}
image: "{{ .Values.webserver.image.repository }}:{{ .Values.enterpriseFeatures.webserverImageTagOverride }}"
{{- else }}
image: "{{ .Values.webserver.image.repository }}:{{ .Values.webserver.image.tag }}"
{{- end }}
imagePullPolicy: {{ .Values.webserver.image.pullPolicy }}
ports:
- name: http
{{- if .Values.enterpriseFeatures.enabled }}
containerPort: 8080
{{- else }}
containerPort: 80
{{- end }}
protocol: TCP
livenessProbe:
exec:
command:
- curl
- -X OPTIONS
{{- if .Values.enterpriseFeatures.enabled }}
- http://localhost:8080/
{{- else }}
- http://localhost:80/
{{- end }}
readinessProbe:
exec:
command:
- curl
- -X OPTIONS
{{- if .Values.enterpriseFeatures.enabled }}
- http://localhost:8080/
{{- else }}
- http://localhost:80/
{{- end }}
env:
- name: NGINX_APISERVER_ADDRESS
value: "http://{{ include "apiserver.referenceName" . }}:{{ .Values.apiserver.service.port }}"
- name: NGINX_FILESERVER_ADDRESS
value: "http://{{ include "fileserver.referenceName" . }}:{{ .Values.fileserver.service.port }}"
{{- if .Values.enterpriseFeatures.enabled }}
{{- if .Values.enterpriseFeatures.airGappedDocumentation.enabled }}
- name: WEBSERVER__docsLink
value: "\"clearml/docs/\""
{{- end }}
- name: COMPANY_ID
value: "{{ .Values.clearml.defaultCompany }}"
- name: WEBSERVER__appsYouTubeIntroLink
value: "\"https://www.youtube.com/embed/HACL60h1Z54\""
- name: WEBSERVER__appsYouTubeIntroVideoId
value: "\"HACL60h1Z54\""
- name: USER_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: apiserver_key
- name: USER_SECRET
valueFrom:
secretKeyRef:
name: clearml-conf
key: apiserver_secret
{{- end }}
{{- if include "clearml.clientConfiguration" . }}
- name: WEBSERVER__displayedServerUrls
value: {{ include "clearml.clientConfiguration" . | quote }}
{{- end }}
{{- if .Values.webserver.extraEnvs }}
{{ toYaml .Values.webserver.extraEnvs | nindent 10 }}
{{- end }}
{{- if not .Values.enterpriseFeatures.enabled }}
args:
- webserver
{{- end }}
volumeMounts:
- name: webserver-config
mountPath: /mnt/external_files/configs
{{- if .Values.enterpriseFeatures.airGappedDocumentation.enabled }}
- mountPath: /usr/share/nginx/html/clearml
name: documentation
{{- end }}
resources:
{{- toYaml .Values.webserver.resources | nindent 12 }}
{{- with .Values.webserver.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webserver.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webserver.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,49 @@
{{- if .Values.webserver.enabled }}
{{- if .Values.webserver.ingress.enabled }}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ include "webserver.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- $annotations := .Values.webserver.ingress.annotations }}
{{- if .Values.webserver.ingress.annotations }}
{{- $annotations = mergeOverwrite $annotations .Values.webserver.ingress.annotations }}
{{- end }}
annotations:
{{- toYaml $annotations | nindent 4 }}
spec:
{{- if .Values.webserver.ingress.ingressClassName }}
ingressClassName: {{ .Values.webserver.ingress.ingressClassName }}
{{- end }}
{{- if .Values.webserver.ingress.tlsSecretName }}
tls:
- hosts:
- {{ .Values.webserver.ingress.hostName }}
secretName: {{ .Values.webserver.ingress.tlsSecretName }}
{{- end }}
rules:
- host: {{ .Values.webserver.ingress.hostName }}
http:
paths:
- path: {{ .Values.webserver.ingress.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: {{ include "webserver.referenceName" . }}
port:
number: {{ .Values.webserver.service.port }}
{{ else }}
backend:
serviceName: {{ include "webserver.referenceName" . }}
servicePort: {{ .Values.webserver.service.port }}
{{ end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,23 @@
{{- if .Values.webserver.enabled -}}
apiVersion: v1
kind: Service
metadata:
name: {{ include "webserver.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
type: {{ .Values.webserver.service.type }}
ports:
- port: {{ .Values.webserver.service.port }}
{{- if .Values.enterpriseFeatures.enabled }}
targetPort: 8080
{{- else }}
targetPort: 80
{{- end }}
{{- if eq .Values.webserver.service.type "NodePort" }}
nodePort: {{ .Values.webserver.service.nodePort }}
{{- end }}
protocol: TCP
selector:
{{- include "webserver.selectorLabels" . | nindent 4 }}
{{- end }}

View File

@@ -0,0 +1,29 @@
apiserver:
service:
type: ClusterIP
ingress:
enabled: true
hostName: "api.clearml.127-0-0-1.nip.io"
fileserver:
service:
type: ClusterIP
ingress:
enabled: true
hostName: "files.clearml.127-0-0-1.nip.io"
webserver:
service:
type: ClusterIP
ingress:
enabled: true
hostName: "app.clearml.127-0-0-1.nip.io"
mongodb:
enabled: true
architecture: replicaset
replicaCount: 3
arbiter:
enabled: false
pdb:
create: true
podAntiAffinityPreset: soft
elasticsearch:
replicas: 3

447
charts/clearml/values.yaml Normal file
View File

@@ -0,0 +1,447 @@
# -- Container registry configuration
imageCredentials:
# -- Use private authentication mode
enabled: false
# -- If this is set, chart will not generate a secret but will use what is defined here
existingSecret: ""
# -- Registry name
registry: docker.io
# -- Registry username
username: someone
# -- Registry password
password: pwd
# -- Email
email: someone@host.com
# -- ClearMl generic configurations
clearml:
# -- Name fo the UI cookie
cookieName: "clearml-token-k8s"
# -- Cookie domain to be left empty if not exposed with an ingress
cookieDomain: ""
# -- Company name
defaultCompany: "d1bd92a3b039400cbafc60a7a5b1e52b"
# -- Api Server basic auth key
apiserverKey: GGS9F4M6XB2DXJ5AFT9F
# -- Api Server basic auth secret
apiserverSecret: 2oGujVFhPfaozhpuz2GzQfA5OyxmMsR3WVJpsCR5hrgHFs20PO
# -- File Server basic auth key
fileserverKey: XXCRJ123CEE2KSQ068WO
# -- File Server basic auth secret
fileserverSecret: YIy8EVAC7QCT4FtgitxAQGyW7xRHDZ4jpYlTE7HKiscpORl1hG
# -- Readiness probe basic auth key
readinessprobeKey: GK4PRTVT3706T25K6BA1
# -- Readiness probe basic auth secret
readinessprobeSecret: ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2
# -- Secure Auth secret
secureAuthTokenSecret: ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2
# -- Test Server basic auth key
testUserKey: "ENP39EQM4SLACGD5FXB7"
# -- Test File Server basic auth secret
testUserSecret: "lPcm0imbcBZ8mwgO7tpadutiS3gnJD05x9j7afwXPS35IKbpiQ"
# -- Override the API Urls displayed when showing an example of the SDK's clearml.conf configuration
clientConfigurationApiUrl: ""
# -- Override the Files Urls displayed when showing an example of the SDK's clearml.conf configuration
clientConfigurationFilesUrl: ""
# -- Api Server configurations
apiserver:
# -- Enable/Disable component deployment
enabled: true
# -- Enable/Disable example data load
prepopulateEnabled: true
# -- Api Server image configuration
image:
repository: "allegroai/clearml"
pullPolicy: IfNotPresent
tag: "1.9.2-317"
# -- Api Server internal service configuration
service:
type: NodePort
port: 8008
# -- If service.type set to NodePort, this will be set to service's nodePort field.
# If service.type is set to others, this field will be ignored
nodePort: 30008
# -- Api Server number of pods
replicaCount: 1
# -- Ingress configuration for Api Server component
ingress:
# -- Enable/Disable ingress
enabled: false
# -- ClassName (must be defined if no default ingressClassName is available)
ingressClassName: ""
# -- Ingress hostname domain
hostName: "api.clearml.127-0-0-1.nip.io"
# -- Reference to secret containing TLS certificate. If set, it enables HTTPS on ingress rule.
tlsSecretName: ""
# -- Ingress annotations
annotations: {}
# -- Ingress root path url
path: "/"
# -- Api Server internal processes configuration
processes:
# -- Api Server internal listing processes
count: 8
# -- Api timeout (ms)
timeout: 24000
# -- Api Server maximum number of concurrent requests
maxRequests: 1000
# -- Api Server max jitter on api request
maxRequestsJitter: 300
# -- Api Server extra envrinoment variables
extraEnvs: []
# -- Number of additional replicas in Elasticsearch indexes
indexReplicas: 0
# -- Number of shards in Elasticsearch indexes
indexShards: 1
# -- specific annotation for Api Server pods
podAnnotations: {}
# -- Api Server resources per pod; these are minimal requirements, it's suggested to increase
# these values in production environments
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 2000m
memory: 1Gi
# -- Api Server nodeselector
nodeSelector: {}
# -- Api Server tolerations setup
tolerations: []
# -- Api Server affinity setup
affinity: {}
# -- Api Server pod security context
securityContext: {}
# runAsUser: 1001
# fsGroup: 1001
# -- reference for files declared in existing ConfigMap will be mounted and read by apiserver (examples in values.yaml)
existingAdditionalConfigsConfigMap: ""
# -- reference for files declared in existing Secret will be mounted and read by apiserver (examples in values.yaml) if not overridden by existingAdditionalConfigsConfigMap
existingAdditionalConfigsSecret: ""
# -- files declared in this parameter will be mounted and read by apiserver (examples in values.yaml) if not overridden by existingAdditionalConfigsSecret
additionalConfigs: {}
# services.conf: |
# tasks {
# non_responsive_tasks_watchdog {
# # In-progress tasks that haven't been updated for at least 'value' seconds will be stopped by the watchdog
# threshold_sec: 21000
# # Watchdog will sleep for this number of seconds after each cycle
# watch_interval_sec: 900
# }
# }
# apiserver.conf: |
# auth {
# fixed_users {
# enabled: true
# pass_hashed: false
# users: [
# {
# username: "jane"
# password: "12345678"
# name: "Jane Doe"
# },
# {
# username: "john"
# password: "12345678"
# name: "John Doe"
# },
# ]
# }
# }
# -- File Server configurations
fileserver:
# -- Enable/Disable component deployment
enabled: true
# -- File Server image configuration
image:
repository: "allegroai/clearml"
pullPolicy: IfNotPresent
tag: "1.9.2-317"
# -- File Server internal service configuration
service:
type: NodePort
port: 8081
# -- If service.type set to NodePort, this will be set to service's nodePort field.
# If service.type is set to others, this field will be ignored
nodePort: 30081
# -- File Server number of pods
replicaCount: 1
# -- Ingress configuration for File Server component
ingress:
# -- Enable/Disable ingress
enabled: false
# -- ClassName (must be defined if no default ingressClassName is available)
ingressClassName: ""
# -- Ingress hostname domain
hostName: "files.clearml.127-0-0-1.nip.io"
# -- Reference to secret containing TLS certificate. If set, it enables HTTPS on ingress rule.
tlsSecretName: ""
# -- Ingress annotations
annotations: {}
# -- Ingress root path url
path: "/"
# -- File Server extra envrinoment variables
extraEnvs: []
# -- specific annotation for File Server pods
podAnnotations: {}
# -- File Server resources per pod; these are minimal requirements, it's suggested to increase
# these values in production environments
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 2000m
memory: 1Gi
# -- File Server nodeselector
nodeSelector: {}
# -- File Server tolerations setup
tolerations: []
# -- File Server affinity setup
affinity: {}
# -- File Server pod security context
securityContext: {}
# runAsUser: 1001
# fsGroup: 1001
# -- File server persistence settings
storage:
data:
# -- Storage class (use default if empty)
class: ""
# -- Access mode (must be ReadWriteMany if fileserver replica > 1)
accessMode: ReadWriteOnce
size: 50Gi
# -- Web Server configurations
webserver:
# -- Enable/Disable component deployment
enabled: true
# -- Web Server image configuration
image:
repository: "allegroai/clearml"
pullPolicy: IfNotPresent
tag: "1.9.2-317"
# -- Web Server internal service configuration
service:
type: NodePort
port: 8080
# -- If service.type set to NodePort, this will be set to service's nodePort field.
# If service.type is set to others, this field will be ignored
nodePort: 30080
# -- Web Server number of pods
replicaCount: 1
# -- Ingress configuration for Web Server component
ingress:
# -- Enable/Disable ingress
enabled: false
# -- ClassName (must be defined if no default ingressClassName is available)
ingressClassName: ""
# -- Ingress hostname domain
hostName: "app.clearml.127-0-0-1.nip.io"
# -- Reference to secret containing TLS certificate. If set, it enables HTTPS on ingress rule.
tlsSecretName: ""
# -- Ingress annotations
annotations: {}
# -- Ingress root path url
path: "/"
# -- Web Server extra envrinoment variables
extraEnvs: []
# -- specific annotation for Web Server pods
podAnnotations: {}
# -- Web Server resources per pod; these are minimal requirements, it's suggested to increase
# these values in production environments
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 2000m
memory: 1Gi
# -- Web Server nodeselector
nodeSelector: {}
# -- Web Server tolerations setup
tolerations: []
# -- Web Server affinity setup
affinity: {}
# -- Web Server pod security context
podSecurityContext: {}
# runAsUser: 1001
# fsGroup: 1001
# -- Additional specific webserver configurations
additionalConfigs: {}
# -- Definition of external services to use if not enabled as dependency charts here
externalServices:
# -- Existing ElasticSearch Hostname to use if elasticsearch.enabled is false
elasticsearchHost: ""
# -- Existing ElasticSearch Port to use if elasticsearch.enabled is false
elasticsearchPort: 9200
# -- Existing MongoDB connection string for BACKEND to use if mongodb.enabled is false
mongodbConnectionStringAuth: ""
# -- Existing MongoDB connection string for AUTH to use if mongodb.enabled is false
mongodbConnectionStringBackend: ""
# -- Existing Redis Hostname to use if redis.enabled is false
redisHost: ""
# -- Existing Redis Port to use if redis.enabled is false
redisPort: 6379
# -- Configuration from https://github.com/bitnami/charts/blob/master/bitnami/redis/values.yaml
redis:
enabled: true
usePassword: false
databaseNumber: 0
master:
name: "{{ .Release.Name }}-redis-master"
port: 6379
persistence:
enabled: true
accessModes:
- ReadWriteOnce
size: 5Gi
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
storageClass: null
cluster:
enabled: false
# -- Configuration from https://github.com/bitnami/charts/blob/master/bitnami/mongodb/values.yaml
mongodb:
enabled: true
architecture: standalone
auth:
enabled: false
replicaCount: 1
persistence:
enabled: true
accessModes:
- ReadWriteOnce
size: 50Gi
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
storageClass: null
# -- Configuration from https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/values.yaml
elasticsearch:
enabled: true
httpPort: 9200
roles:
master: "true"
ingest: "true"
data: "true"
remote_cluster_client: "true"
replicas: 1
# Readiness probe hack for a single-node cluster (where status will never be green). Should be removed if using replicas > 1
clusterHealthCheckParams: "wait_for_status=yellow&timeout=1s"
minimumMasterNodes: 1
clusterName: clearml-elastic
esJavaOpts: "-Xmx2g -Xms2g"
extraEnvs:
- name: bootstrap.memory_lock
value: "false"
- name: cluster.routing.allocation.node_initial_primaries_recoveries
value: "500"
- name: cluster.routing.allocation.disk.watermark.low
value: 500mb
- name: cluster.routing.allocation.disk.watermark.high
value: 500mb
- name: cluster.routing.allocation.disk.watermark.flood_stage
value: 500mb
- name: http.compression_level
value: "7"
- name: reindex.remote.whitelist
value: '*.*'
- name: xpack.monitoring.enabled
value: "false"
- name: xpack.security.enabled
value: "false"
resources:
requests:
cpu: 100m
memory: 2Gi
limits:
cpu: 2000m
memory: 4Gi
persistence:
enabled: true
volumeClaimTemplate:
accessModes: ["ReadWriteOnce"]
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
storageClassName: null
resources:
requests:
storage: 50Gi
esConfig:
elasticsearch.yml: |
xpack.security.enabled: false
# -- Enterprise features (work only with an Enterprise license)
enterpriseFeatures:
# -- Enable/Disable Enterprise features
enabled: false
# -- Company ID
defaultCompanyGuid: "d1bd92a3b039400cbafc60a7a5b1e52b"
# -- Image tag override for apiserver enterprise version
apiserverImageTagOverride: "3.15.3-909"
# -- Image tag override for fileserver enterprise version
fileserverImageTagOverride: "3.15.3-909"
# -- Image tag override for webserver enterprise version
webserverImageTagOverride: "3.15.3-801"
# -- Air gapped documentation configurations
airGappedDocumentation:
# -- Enable/Disable air gapped documentation deployment
enabled: false
# -- Air gapped documentation image configuration
image:
repository: ""
tag: "4"
# -- set this value AND overrideReferenceFileUrl if external endpoint exposure is in place (like a LoadBalancer)
# example: "https://api.clearml.local"
overrideReferenceApiUrl: ""
# -- set this value AND overrideReferenceAPIUrl if external endpoint exposure is in place (like a LoadBalancer)
# example: "https://files.clearml.local"
overrideReferenceFileUrl: ""
# -- extra index URL for Enterprise packages
extraIndexUrl: ""
# -- APPS configurations
clearmlApplications:
# -- Apps Server basic auth key
agentKey: GK4PRTVT3706T25K6BA1
# -- Apps Server basic auth secret
agentSecret: ymLh1ok5k5xNUQfS944Xdx9xjf0wueokqKM2dMZfHuH9ayItG2
# -- Apps Server Git user
gitAgentUser: "git_user"
# -- Apps Server Git password
gitAgentPass: "git_password"
# -- Enable/Disable component deployment
enabled: true
# -- APPS image configuration
image:
repository: ""
pullPolicy: IfNotPresent
tag: "1.24-57"
# -- APPS base spawning pods image
basePodImage:
repository: ""
tag: "app-1.1.1-47"
# -- APPS number of pods
replicaCount: 1
# -- APPS extra envrinoment variables
extraEnvs: []
# -- specific annotation for APPS pods
podAnnotations: {}
# -- APPS resources per pod; these are minimal requirements, it's suggested to increase
# these values in production environments
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 2000m
memory: 1Gi
# -- APPS nodeselector
nodeSelector: {}
# -- APPS tolerations setup
tolerations: []
# -- APPS affinity setup
affinity: {}

View File

@@ -0,0 +1,2 @@
tests/
.pytest_cache/

View File

@@ -0,0 +1,12 @@
apiVersion: v1
appVersion: 7.16.2
description: Official Elastic helm chart for Elasticsearch
home: https://github.com/elastic/helm-charts
icon: https://helm.elastic.co/icons/elasticsearch.png
maintainers:
- email: helm-charts@elastic.co
name: Elastic
name: elasticsearch
sources:
- https://github.com/elastic/elasticsearch
version: 7.16.2

View File

@@ -0,0 +1 @@
include ../helpers/common.mk

View File

@@ -0,0 +1,457 @@
# Elasticsearch Helm Chart
[![Build Status](https://img.shields.io/jenkins/s/https/devops-ci.elastic.co/job/elastic+helm-charts+master.svg)](https://devops-ci.elastic.co/job/elastic+helm-charts+master/) [![Artifact HUB](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/elastic)](https://artifacthub.io/packages/search?repo=elastic)
This Helm chart is a lightweight way to configure and run our official
[Elasticsearch Docker image][].
<!-- development warning placeholder -->
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
- [Requirements](#requirements)
- [Installing](#installing)
- [Install released version using Helm repository](#install-released-version-using-helm-repository)
- [Install development version from a branch](#install-development-version-from-a-branch)
- [Upgrading](#upgrading)
- [Usage notes](#usage-notes)
- [Configuration](#configuration)
- [Deprecated](#deprecated)
- [FAQ](#faq)
- [How to deploy this chart on a specific K8S distribution?](#how-to-deploy-this-chart-on-a-specific-k8s-distribution)
- [How to deploy dedicated nodes types?](#how-to-deploy-dedicated-nodes-types)
- [Clustering and Node Discovery](#clustering-and-node-discovery)
- [How to deploy clusters with security (authentication and TLS) enabled?](#how-to-deploy-clusters-with-security-authentication-and-tls-enabled)
- [How to migrate from helm/charts stable chart?](#how-to-migrate-from-helmcharts-stable-chart)
- [How to install plugins?](#how-to-install-plugins)
- [How to use the keystore?](#how-to-use-the-keystore)
- [Basic example](#basic-example)
- [Multiple keys](#multiple-keys)
- [Custom paths and keys](#custom-paths-and-keys)
- [How to enable snapshotting?](#how-to-enable-snapshotting)
- [How to configure templates post-deployment?](#how-to-configure-templates-post-deployment)
- [Contributing](#contributing)
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
<!-- Use this to update TOC: -->
<!-- docker run --rm -it -v $(pwd):/usr/src jorgeandrada/doctoc --github -->
## Requirements
* Kubernetes >= 1.14
* [Helm][] >= 2.17.0
* Minimum cluster requirements include the following to run this chart with
default settings. All of these settings are configurable.
* Three Kubernetes nodes to respect the default "hard" affinity settings
* 1GB of RAM for the JVM heap
See [supported configurations][] for more details.
## Installing
This chart is tested with the latest 7.16.2 version.
### Install released version using Helm repository
* Add the Elastic Helm charts repo:
`helm repo add elastic https://helm.elastic.co`
* Install it:
- with Helm 3: `helm install elasticsearch --version <version> elastic/elasticsearch`
- with Helm 2 (deprecated): `helm install --name elasticsearch --version <version> elastic/elasticsearch`
### Install development version from a branch
* Clone the git repo: `git clone git@github.com:elastic/helm-charts.git`
* Checkout the branch : `git checkout 7.16`
* Install it:
- with Helm 3: `helm install elasticsearch ./helm-charts/elasticsearch --set imageTag=7.16.2`
- with Helm 2 (deprecated): `helm install --name elasticsearch ./helm-charts/elasticsearch --set imageTag=7.16.2`
## Upgrading
Please always check [CHANGELOG.md][] and [BREAKING_CHANGES.md][] before
upgrading to a new chart version.
## Usage notes
* This repo includes a number of [examples][] configurations which can be used
as a reference. They are also used in the automated testing of this chart.
* Automated testing of this chart is currently only run against GKE (Google
Kubernetes Engine).
* The chart deploys a StatefulSet and by default will do an automated rolling
update of your cluster. It does this by waiting for the cluster health to become
green after each instance is updated. If you prefer to update manually you can
set `OnDelete` [updateStrategy][].
* It is important to verify that the JVM heap size in `esJavaOpts` and to set
the CPU/Memory `resources` to something suitable for your cluster.
* To simplify chart and maintenance each set of node groups is deployed as a
separate Helm release. Take a look at the [multi][] example to get an idea for
how this works. Without doing this it isn't possible to resize persistent
volumes in a StatefulSet. By setting it up this way it makes it possible to add
more nodes with a new storage size then drain the old ones. It also solves the
problem of allowing the user to determine which node groups to update first when
doing upgrades or changes.
* We have designed this chart to be very un-opinionated about how to configure
Elasticsearch. It exposes ways to set environment variables and mount secrets
inside of the container. Doing this makes it much easier for this chart to
support multiple versions with minimal changes.
## Configuration
| Parameter | Description | Default |
|------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------|
| `antiAffinityTopologyKey` | The [anti-affinity][] topology key. By default this will prevent multiple Elasticsearch nodes from running on the same Kubernetes node | `kubernetes.io/hostname` |
| `antiAffinity` | Setting this to hard enforces the [anti-affinity][] rules. If it is set to soft it will be done "best effort". Other values will be ignored | `hard` |
| `clusterHealthCheckParams` | The [Elasticsearch cluster health status params][] that will be used by readiness [probe][] command | `wait_for_status=green&timeout=1s` |
| `clusterName` | This will be used as the Elasticsearch [cluster.name][] and should be unique per cluster in the namespace | `elasticsearch` |
| `clusterDeprecationIndexing` | Enable or disable deprecation logs to be indexed (should be disabled when deploying master only node groups) | `false` |
| `enableServiceLinks` | Set to false to disabling service links, which can cause slow pod startup times when there are many services in the current namespace. | `true` |
| `envFrom` | Templatable string to be passed to the [environment from variables][] which will be appended to the `envFrom:` definition for the container | `[]` |
| `esConfig` | Allows you to add any config files in `/usr/share/elasticsearch/config/` such as `elasticsearch.yml` and `log4j2.properties`. See [values.yaml][] for an example of the formatting | `{}` |
| `esJavaOpts` | [Java options][] for Elasticsearch. This is where you could configure the [jvm heap size][] | `""` |
| `esMajorVersion` | Deprecated. Instead, use the version of the chart corresponding to your ES minor version. Used to set major version specific configuration. If you are using a custom image and not running the default Elasticsearch version you will need to set this to the version you are running (e.g. `esMajorVersion: 6`) | `""` |
| `extraContainers` | Templatable string of additional `containers` to be passed to the `tpl` function | `""` |
| `extraEnvs` | Extra [environment variables][] which will be appended to the `env:` definition for the container | `[]` |
| `extraInitContainers` | Templatable string of additional `initContainers` to be passed to the `tpl` function | `""` |
| `extraVolumeMounts` | Templatable string of additional `volumeMounts` to be passed to the `tpl` function | `""` |
| `extraVolumes` | Templatable string of additional `volumes` to be passed to the `tpl` function | `""` |
| `fullnameOverride` | Overrides the `clusterName` and `nodeGroup` when used in the naming of resources. This should only be used when using a single `nodeGroup`, otherwise you will have name conflicts | `""` |
| `healthNameOverride` | Overrides `test-elasticsearch-health` pod name | `""` |
| `hostAliases` | Configurable [hostAliases][] | `[]` |
| `httpPort` | The http port that Kubernetes will use for the healthchecks and the service. If you change this you will also need to set [http.port][] in `extraEnvs` | `9200` |
| `imagePullPolicy` | The Kubernetes [imagePullPolicy][] value | `IfNotPresent` |
| `imagePullSecrets` | Configuration for [imagePullSecrets][] so that you can use a private registry for your image | `[]` |
| `imageTag` | The Elasticsearch Docker image tag | `7.16.2` |
| `image` | The Elasticsearch Docker image | `docker.elastic.co/elasticsearch/elasticsearch` |
| `ingress` | Configurable [ingress][] to expose the Elasticsearch service. See [values.yaml][] for an example | see [values.yaml][] |
| `initResources` | Allows you to set the [resources][] for the `initContainer` in the StatefulSet | `{}` |
| `keystore` | Allows you map Kubernetes secrets into the keystore. See the [config example][] and [how to use the keystore][] | `[]` |
| `labels` | Configurable [labels][] applied to all Elasticsearch pods | `{}` |
| `lifecycle` | Allows you to add [lifecycle hooks][]. See [values.yaml][] for an example of the formatting | `{}` |
| `masterService` | The service name used to connect to the masters. You only need to set this if your master `nodeGroup` is set to something other than `master`. See [Clustering and Node Discovery][] for more information | `""` |
| `maxUnavailable` | The [maxUnavailable][] value for the pod disruption budget. By default this will prevent Kubernetes from having more than 1 unhealthy pod in the node group | `1` |
| `minimumMasterNodes` | The value for [discovery.zen.minimum_master_nodes][]. Should be set to `(master_eligible_nodes / 2) + 1`. Ignored in Elasticsearch versions >= 7 | `2` |
| `nameOverride` | Overrides the `clusterName` when used in the naming of resources | `""` |
| `networkHost` | Value for the [network.host Elasticsearch setting][] | `0.0.0.0` |
| `networkPolicy` | The [NetworkPolicy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) to set. See [`values.yaml`](./values.yaml) for an example | `{http.enabled: false,transport.enabled: false}` |
| `nodeAffinity` | Value for the [node affinity settings][] | `{}` |
| `nodeGroup` | This is the name that will be used for each group of nodes in the cluster. The name will be `clusterName-nodeGroup-X` , `nameOverride-nodeGroup-X` if a `nameOverride` is specified, and `fullnameOverride-X` if a `fullnameOverride` is specified | `master` |
| `nodeSelector` | Configurable [nodeSelector][] so that you can target specific nodes for your Elasticsearch cluster | `{}` |
| `persistence` | Enables a persistent volume for Elasticsearch data. Can be disabled for nodes that only have [roles][] which don't require persistent data | see [values.yaml][] |
| `podAnnotations` | Configurable [annotations][] applied to all Elasticsearch pods | `{}` |
| `podManagementPolicy` | By default Kubernetes [deploys StatefulSets serially][]. This deploys them in parallel so that they can discover each other | `Parallel` |
| `podSecurityContext` | Allows you to set the [securityContext][] for the pod | see [values.yaml][] |
| `podSecurityPolicy` | Configuration for create a pod security policy with minimal permissions to run this Helm chart with `create: true`. Also can be used to reference an external pod security policy with `name: "externalPodSecurityPolicy"` | see [values.yaml][] |
| `priorityClassName` | The name of the [PriorityClass][]. No default is supplied as the PriorityClass must be created first | `""` |
| `protocol` | The protocol that will be used for the readiness [probe][]. Change this to `https` if you have `xpack.security.http.ssl.enabled` set | `http` |
| `rbac` | Configuration for creating a role, role binding and ServiceAccount as part of this Helm chart with `create: true`. Also can be used to reference an external ServiceAccount with `serviceAccountName: "externalServiceAccountName"`, or automount the service account token | see [values.yaml][] |
| `readinessProbe` | Configuration fields for the readiness [probe][] | see [values.yaml][] |
| `replicas` | Kubernetes replica count for the StatefulSet (i.e. how many pods) | `3` |
| `resources` | Allows you to set the [resources][] for the StatefulSet | see [values.yaml][] |
| `roles` | A hash map with the specific [roles][] for the `nodeGroup` | see [values.yaml][] |
| `schedulerName` | Name of the [alternate scheduler][] | `""` |
| `secretMounts` | Allows you easily mount a secret as a file inside the StatefulSet. Useful for mounting certificates and other secrets. See [values.yaml][] for an example | `[]` |
| `securityContext` | Allows you to set the [securityContext][] for the container | see [values.yaml][] |
| `service.annotations` | [LoadBalancer annotations][] that Kubernetes will use for the service. This will configure load balancer if `service.type` is `LoadBalancer` | `{}` |
| `service.enabled` | Enable non-headless service | `true` |
| `service.externalTrafficPolicy` | Some cloud providers allow you to specify the [LoadBalancer externalTrafficPolicy][]. Kubernetes will use this to preserve the client source IP. This will configure load balancer if `service.type` is `LoadBalancer` | `""` |
| `service.httpPortName` | The name of the http port within the service | `http` |
| `service.labelsHeadless` | Labels to be added to headless service | `{}` |
| `service.labels` | Labels to be added to non-headless service | `{}` |
| `service.loadBalancerIP` | Some cloud providers allow you to specify the [loadBalancer][] IP. If the `loadBalancerIP` field is not specified, the IP is dynamically assigned. If you specify a `loadBalancerIP` but your cloud provider does not support the feature, it is ignored. | `""` |
| `service.loadBalancerSourceRanges` | The IP ranges that are allowed to access | `[]` |
| `service.nodePort` | Custom [nodePort][] port that can be set if you are using `service.type: nodePort` | `""` |
| `service.transportPortName` | The name of the transport port within the service | `transport` |
| `service.type` | Elasticsearch [Service Types][] | `ClusterIP` |
| `sysctlInitContainer` | Allows you to disable the `sysctlInitContainer` if you are setting [sysctl vm.max_map_count][] with another method | `enabled: true` |
| `sysctlVmMaxMapCount` | Sets the [sysctl vm.max_map_count][] needed for Elasticsearch | `262144` |
| `terminationGracePeriod` | The [terminationGracePeriod][] in seconds used when trying to stop the pod | `120` |
| `tests.enabled` | Enable creating test related resources when running `helm template` or `helm test` | `true` |
| `tolerations` | Configurable [tolerations][] | `[]` |
| `transportPort` | The transport port that Kubernetes will use for the service. If you change this you will also need to set [transport port configuration][] in `extraEnvs` | `9300` |
| `updateStrategy` | The [updateStrategy][] for the StatefulSet. By default Kubernetes will wait for the cluster to be green after upgrading each pod. Setting this to `OnDelete` will allow you to manually delete each pod during upgrades | `RollingUpdate` |
| `volumeClaimTemplate` | Configuration for the [volumeClaimTemplate for StatefulSets][]. You will want to adjust the storage (default `30Gi` ) and the `storageClassName` if you are using a different storage class | see [values.yaml][] |
### Deprecated
| Parameter | Description | Default |
|-----------|---------------------------------------------------------------------------------------------------------------|---------|
| `fsGroup` | The Group ID (GID) for [securityContext][] so that the Elasticsearch user can read from the persistent volume | `""` |
## FAQ
### How to deploy this chart on a specific K8S distribution?
This chart is designed to run on production scale Kubernetes clusters with
multiple nodes, lots of memory and persistent storage. For that reason it can be
a bit tricky to run them against local Kubernetes environments such as
[Minikube][].
This chart is highly tested with [GKE][], but some K8S distribution also
requires specific configurations.
We provide examples of configuration for the following K8S providers:
- [Docker for Mac][]
- [KIND][]
- [Minikube][]
- [MicroK8S][]
- [OpenShift][]
### How to deploy dedicated nodes types?
All the Elasticsearch pods deployed share the same configuration. If you need to
deploy dedicated [nodes types][] (for example dedicated master and data nodes),
you can deploy multiple releases of this chart with different configurations
while they share the same `clusterName` value.
For each Helm release, the nodes types can then be defined using `roles` value.
An example of Elasticsearch cluster using 2 different Helm releases for master
and data nodes can be found in [examples/multi][].
#### Clustering and Node Discovery
This chart facilitates Elasticsearch node discovery and services by creating two
`Service` definitions in Kubernetes, one with the name `$clusterName-$nodeGroup`
and another named `$clusterName-$nodeGroup-headless`.
Only `Ready` pods are a part of the `$clusterName-$nodeGroup` service, while all
pods ( `Ready` or not) are a part of `$clusterName-$nodeGroup-headless`.
If your group of master nodes has the default `nodeGroup: master` then you can
just add new groups of nodes with a different `nodeGroup` and they will
automatically discover the correct master. If your master nodes have a different
`nodeGroup` name then you will need to set `masterService` to
`$clusterName-$masterNodeGroup`.
The chart value for `masterService` is used to populate
`discovery.zen.ping.unicast.hosts` , which Elasticsearch nodes will use to
contact master nodes and form a cluster.
Therefore, to add a group of nodes to an existing cluster, setting
`masterService` to the desired `Service` name of the related cluster is
sufficient.
### How to deploy clusters with security (authentication and TLS) enabled?
This Helm chart can use existing [Kubernetes secrets][] to setup
credentials or certificates for examples. These secrets should be created
outside of this chart and accessed using [environment variables][] and volumes.
An example of Elasticsearch cluster using security can be found in
[examples/security][].
### How to migrate from helm/charts stable chart?
If you currently have a cluster deployed with the [helm/charts stable][] chart
you can follow the [migration guide][].
### How to install plugins?
The recommended way to install plugins into our Docker images is to create a
[custom Docker image][].
The Dockerfile would look something like:
```
ARG elasticsearch_version
FROM docker.elastic.co/elasticsearch/elasticsearch:${elasticsearch_version}
RUN bin/elasticsearch-plugin install --batch repository-gcs
```
And then updating the `image` in values to point to your custom image.
There are a couple reasons we recommend this.
1. Tying the availability of Elasticsearch to the download service to install
plugins is not a great idea or something that we recommend. Especially in
Kubernetes where it is normal and expected for a container to be moved to
another host at random times.
2. Mutating the state of a running Docker image (by installing plugins) goes
against best practices of containers and immutable infrastructure.
### How to use the keystore?
#### Basic example
Create the secret, the key name needs to be the keystore key path. In this
example we will create a secret from a file and from a literal string.
```
kubectl create secret generic encryption-key --from-file=xpack.watcher.encryption_key=./watcher_encryption_key
kubectl create secret generic slack-hook --from-literal=xpack.notification.slack.account.monitoring.secure_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
```
To add these secrets to the keystore:
```
keystore:
- secretName: encryption-key
- secretName: slack-hook
```
#### Multiple keys
All keys in the secret will be added to the keystore. To create the previous
example in one secret you could also do:
```
kubectl create secret generic keystore-secrets --from-file=xpack.watcher.encryption_key=./watcher_encryption_key --from-literal=xpack.notification.slack.account.monitoring.secure_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
```
```
keystore:
- secretName: keystore-secrets
```
#### Custom paths and keys
If you are using these secrets for other applications (besides the Elasticsearch
keystore) then it is also possible to specify the keystore path and which keys
you want to add. Everything specified under each `keystore` item will be passed
through to the `volumeMounts` section for mounting the [secret][]. In this
example we will only add the `slack_hook` key from a secret that also has other
keys. Our secret looks like this:
```
kubectl create secret generic slack-secrets --from-literal=slack_channel='#general' --from-literal=slack_hook='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
```
We only want to add the `slack_hook` key to the keystore at path
`xpack.notification.slack.account.monitoring.secure_url`:
```
keystore:
- secretName: slack-secrets
items:
- key: slack_hook
path: xpack.notification.slack.account.monitoring.secure_url
```
You can also take a look at the [config example][] which is used as part of the
automated testing pipeline.
### How to enable snapshotting?
1. Install your [snapshot plugin][] into a custom Docker image following the
[how to install plugins guide][].
2. Add any required secrets or credentials into an Elasticsearch keystore
following the [how to use the keystore][] guide.
3. Configure the [snapshot repository][] as you normally would.
4. To automate snapshots you can use [Snapshot Lifecycle Management][] or a tool
like [curator][].
### How to configure templates post-deployment?
You can use `postStart` [lifecycle hooks][] to run code triggered after a
container is created.
Here is an example of `postStart` hook to configure templates:
```yaml
lifecycle:
postStart:
exec:
command:
- bash
- -c
- |
#!/bin/bash
# Add a template to adjust number of shards/replicas
TEMPLATE_NAME=my_template
INDEX_PATTERN="logstash-*"
SHARD_COUNT=8
REPLICA_COUNT=1
ES_URL=http://localhost:9200
while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
curl -XPUT "$ES_URL/_template/$TEMPLATE_NAME" -H 'Content-Type: application/json' -d'{"index_patterns":['\""$INDEX_PATTERN"\"'],"settings":{"number_of_shards":'$SHARD_COUNT',"number_of_replicas":'$REPLICA_COUNT'}}'
```
## Contributing
Please check [CONTRIBUTING.md][] before any contribution or for any questions
about our development and testing process.
[7.16]: https://github.com/elastic/helm-charts/releases
[#63]: https://github.com/elastic/helm-charts/issues/63
[BREAKING_CHANGES.md]: https://github.com/elastic/helm-charts/blob/master/BREAKING_CHANGES.md
[CHANGELOG.md]: https://github.com/elastic/helm-charts/blob/master/CHANGELOG.md
[CONTRIBUTING.md]: https://github.com/elastic/helm-charts/blob/master/CONTRIBUTING.md
[alternate scheduler]: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/#specify-schedulers-for-pods
[annotations]: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/
[anti-affinity]: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
[cluster.name]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/important-settings.html#cluster-name
[clustering and node discovery]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/README.md#clustering-and-node-discovery
[config example]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/config/values.yaml
[curator]: https://www.elastic.co/guide/en/elasticsearch/client/curator/7.9/snapshot.html
[custom docker image]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/docker.html#_c_customized_image
[deploys statefulsets serially]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-management-policies
[discovery.zen.minimum_master_nodes]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/discovery-settings.html#minimum_master_nodes
[docker for mac]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/docker-for-mac
[elasticsearch cluster health status params]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/cluster-health.html#request-params
[elasticsearch docker image]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/docker.html
[environment variables]: https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/#using-environment-variables-inside-of-your-config
[environment from variables]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#configure-all-key-value-pairs-in-a-configmap-as-container-environment-variables
[examples]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/
[examples/multi]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/multi
[examples/security]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/security
[gke]: https://cloud.google.com/kubernetes-engine
[helm]: https://helm.sh
[helm/charts stable]: https://github.com/helm/charts/tree/master/stable/elasticsearch/
[how to install plugins guide]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/README.md#how-to-install-plugins
[how to use the keystore]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/README.md#how-to-use-the-keystore
[http.port]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-http.html#_settings
[imagePullPolicy]: https://kubernetes.io/docs/concepts/containers/images/#updating-images
[imagePullSecrets]: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/#create-a-pod-that-uses-your-secret
[ingress]: https://kubernetes.io/docs/concepts/services-networking/ingress/
[java options]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/jvm-options.html
[jvm heap size]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/heap-size.html
[hostAliases]: https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
[kind]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/kubernetes-kind
[kubernetes secrets]: https://kubernetes.io/docs/concepts/configuration/secret/
[labels]: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
[lifecycle hooks]: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
[loadBalancer annotations]: https://kubernetes.io/docs/concepts/services-networking/service/#ssl-support-on-aws
[loadBalancer externalTrafficPolicy]: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
[loadBalancer]: https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer
[maxUnavailable]: https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
[migration guide]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/migration/README.md
[minikube]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/minikube
[microk8s]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/microk8s
[multi]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/multi/
[network.host elasticsearch setting]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/network.host.html
[node affinity settings]: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
[node-certificates]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/configuring-tls.html#node-certificates
[nodePort]: https://kubernetes.io/docs/concepts/services-networking/service/#nodeport
[nodes types]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-node.html
[nodeSelector]: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
[openshift]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/openshift
[priorityClass]: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
[probe]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
[resources]: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
[roles]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-node.html
[secret]: https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets
[securityContext]: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
[service types]: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
[snapshot lifecycle management]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/snapshot-lifecycle-management.html
[snapshot plugin]: https://www.elastic.co/guide/en/elasticsearch/plugins/7.16/repository.html
[snapshot repository]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-snapshots.html
[supported configurations]: https://github.com/elastic/helm-charts/tree/7.16/README.md#supported-configurations
[sysctl vm.max_map_count]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/vm-max-map-count.html#vm-max-map-count
[terminationGracePeriod]: https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
[tolerations]: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
[transport port configuration]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-transport.html#_transport_settings
[updateStrategy]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
[values.yaml]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/values.yaml
[volumeClaimTemplate for statefulsets]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#stable-storage

View File

@@ -0,0 +1,21 @@
default: test
include ../../../helpers/examples.mk
RELEASE := helm-es-config
TIMEOUT := 1200s
install:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../
secrets:
kubectl delete secret elastic-config-credentials elastic-config-secret elastic-config-slack elastic-config-custom-path || true
kubectl create secret generic elastic-config-credentials --from-literal=password=changeme --from-literal=username=elastic
kubectl create secret generic elastic-config-slack --from-literal=xpack.notification.slack.account.monitoring.secure_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
kubectl create secret generic elastic-config-secret --from-file=xpack.watcher.encryption_key=./watcher_encryption_key
kubectl create secret generic elastic-config-custom-path --from-literal=slack_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd' --from-literal=thing_i_don_tcare_about=test
test: secrets install goss
purge:
helm del $(RELEASE)

View File

@@ -0,0 +1,27 @@
# Config
This example deploy a single node Elasticsearch 7.16.2 with authentication and
custom [values][].
## Usage
* Create the required secrets: `make secrets`
* Deploy Elasticsearch chart with the default values: `make install`
* You can now setup a port forward to query Elasticsearch API:
```
kubectl port-forward svc/config-master 9200
curl -u elastic:changeme http://localhost:9200/_cat/indices
```
## Testing
You can also run [goss integration tests][] using `make test`
[goss integration tests]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/config/test/goss.yaml
[values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/config/values.yaml

View File

@@ -0,0 +1,29 @@
http:
http://localhost:9200/_cluster/health:
status: 200
timeout: 2000
username: elastic
password: "{{ .Env.ELASTIC_PASSWORD }}"
body:
- "green"
- '"number_of_nodes":1'
- '"number_of_data_nodes":1'
http://localhost:9200:
status: 200
timeout: 2000
username: elastic
password: "{{ .Env.ELASTIC_PASSWORD }}"
body:
- '"cluster_name" : "config"'
- "You Know, for Search"
command:
"elasticsearch-keystore list":
exit-status: 0
stdout:
- keystore.seed
- bootstrap.password
- xpack.notification.slack.account.monitoring.secure_url
- xpack.notification.slack.account.otheraccount.secure_url
- xpack.watcher.encryption_key

View File

@@ -0,0 +1,27 @@
---
clusterName: "config"
replicas: 1
extraEnvs:
- name: ELASTIC_PASSWORD
valueFrom:
secretKeyRef:
name: elastic-config-credentials
key: password
# This is just a dummy file to make sure that
# the keystore can be mounted at the same time
# as a custom elasticsearch.yml
esConfig:
elasticsearch.yml: |
xpack.security.enabled: true
path.data: /usr/share/elasticsearch/data
keystore:
- secretName: elastic-config-secret
- secretName: elastic-config-slack
- secretName: elastic-config-custom-path
items:
- key: slack_url
path: xpack.notification.slack.account.otheraccount.secure_url

View File

@@ -0,0 +1 @@
supersecret

View File

@@ -0,0 +1,14 @@
default: test
include ../../../helpers/examples.mk
RELEASE := helm-es-default
TIMEOUT := 1200s
install:
helm upgrade --wait --timeout=$(TIMEOUT) --install $(RELEASE) ../../
test: install goss
purge:
helm del $(RELEASE)

View File

@@ -0,0 +1,25 @@
# Default
This example deploy a 3 nodes Elasticsearch 7.16.2 cluster using
[default values][].
## Usage
* Deploy Elasticsearch chart with the default values: `make install`
* You can now setup a port forward to query Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
## Testing
You can also run [goss integration tests][] using `make test`
[goss integration tests]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/default/test/goss.yaml
[default values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/values.yaml

View File

@@ -0,0 +1,19 @@
#!/usr/bin/env bash -x
kubectl proxy || true &
make &
PROC_ID=$!
while kill -0 "$PROC_ID" >/dev/null 2>&1; do
echo "PROCESS IS RUNNING"
if curl --fail 'http://localhost:8001/api/v1/proxy/namespaces/default/services/elasticsearch-master:9200/_search' ; then
echo "cluster is healthy"
else
echo "cluster not healthy!"
exit 1
fi
sleep 1
done
echo "PROCESS TERMINATED"
exit 0

View File

@@ -0,0 +1,38 @@
kernel-param:
vm.max_map_count:
value: "262144"
http:
http://elasticsearch-master:9200/_cluster/health:
status: 200
timeout: 2000
body:
- "green"
- '"number_of_nodes":3'
- '"number_of_data_nodes":3'
http://localhost:9200:
status: 200
timeout: 2000
body:
- '"number" : "7.16.2"'
- '"cluster_name" : "elasticsearch"'
- "You Know, for Search"
file:
/usr/share/elasticsearch/data:
exists: true
mode: "2775"
owner: root
group: elasticsearch
filetype: directory
mount:
/usr/share/elasticsearch/data:
exists: true
user:
elasticsearch:
exists: true
uid: 1000
gid: 1000

View File

@@ -0,0 +1,13 @@
default: test
RELEASE := helm-es-docker-for-mac
TIMEOUT := 1200s
install:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../
test: install
helm test $(RELEASE)
purge:
helm del $(RELEASE)

View File

@@ -0,0 +1,23 @@
# Docker for Mac
This example deploy a 3 nodes Elasticsearch 7.16.2 cluster on [Docker for Mac][]
using [custom values][].
Note that this configuration should be used for test only and isn't recommended
for production.
## Usage
* Deploy Elasticsearch chart with the default values: `make install`
* You can now setup a port forward to query Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
[custom values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/docker-for-mac/values.yaml
[docker for mac]: https://docs.docker.com/docker-for-mac/kubernetes/

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"
# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"
# Allocate smaller chunks of memory per pod.
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "512M"
# Request smaller persistent volumes.
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "hostpath"
resources:
requests:
storage: 100M

View File

@@ -0,0 +1,17 @@
default: test
RELEASE := helm-es-kind
TIMEOUT := 1200s
install:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../
install-local-path:
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
helm upgrade --wait --timeout=$(TIMEOUT) --install --values values-local-path.yaml $(RELEASE) ../../
test: install
helm test $(RELEASE)
purge:
helm del $(RELEASE)

View File

@@ -0,0 +1,36 @@
# KIND
This example deploy a 3 nodes Elasticsearch 7.16.2 cluster on [Kind][]
using [custom values][].
Note that this configuration should be used for test only and isn't recommended
for production.
Note that Kind < 0.7.0 are affected by a [kind issue][] with mount points
created from PVCs not writable by non-root users. [kubernetes-sigs/kind#1157][]
fix it in Kind 0.7.0.
The workaround for Kind < 0.7.0 is to install manually
[Rancher Local Path Provisioner][] and use `local-path` storage class for
Elasticsearch volumes (see [Makefile][] instructions).
## Usage
* For Kind >= 0.7.0: Deploy Elasticsearch chart with the default values: `make install`
* For Kind < 0.7.0: Deploy Elasticsearch chart with `local-path` storage class: `make install-local-path`
* You can now setup a port forward to query Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
[custom values]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/kubernetes-kind/values.yaml
[kind]: https://kind.sigs.k8s.io/
[kind issue]: https://github.com/kubernetes-sigs/kind/issues/830
[kubernetes-sigs/kind#1157]: https://github.com/kubernetes-sigs/kind/pull/1157
[rancher local path provisioner]: https://github.com/rancher/local-path-provisioner
[Makefile]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/kubernetes-kind/Makefile#L5

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"
# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"
# Allocate smaller chunks of memory per pod.
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "512M"
# Request smaller persistent volumes.
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "local-path"
resources:
requests:
storage: 100M

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"
# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"
# Allocate smaller chunks of memory per pod.
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "512M"
# Request smaller persistent volumes.
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "local-path"
resources:
requests:
storage: 100M

Some files were not shown because too many files have changed in this diff Show More