Compare commits

...

58 Commits

Author SHA1 Message Date
Nikolay Shamanovich
19a6785a03 Allowing auth secrets to be optional #2 (#100)
* Allowing auth secrets to be optional

* Add value secret.existingSecret for clearml chart.

* Add value clearml.existingAgentk8sglueSecret for clearml-agent chart.

* Add value clearml.existingClearmlConfigSecret for clearml-agent chart.

* Split Secret clearml-agent-conf in clearml-agent chart into two
  Secrets: clearml-agent-conf (agent.conf file) and clearml-agent-k8sglue
  (environment variables).

* Update helm-docs
2022-08-22 10:35:47 +02:00
Valeriano Manassero
fdea0c4a3f Use fullname (#97)
* Changed: use fullname to generate resources

* Changed: version bump
2022-08-11 11:23:02 +02:00
zandolsi-psee
ace37019a8 Use existing config map or secret for api-server config (#95)
* adapt ingress

* update docs

* remove idea

* fix imagepullsecret error

* add apiserver configuration

* add apiserver configuration

* add apiserver configuration
2022-08-09 07:08:12 +02:00
Luca Cerone
c3b5198dc9 Clearml agentk8s extra envs (#94)
* Added "extraEnvs" to agentk8s deployment definition

* added extraEnvs to values.yaml

* Bumped version

* Fixed linting issue

* updated docs

* Regenerated README.md

* Update values.yaml

Co-authored-by: Luca Cerone <luca.cerone@team.bumble.com>
Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-08-04 14:17:47 +02:00
Valeriano Manassero
9fd65b68f7 Apache2 License (#93)
* Added: reference to Apache2 license

* Changed: bump up version

* Fixed: License reference
2022-07-21 11:12:10 +02:00
Valeriano Manassero
56880de8bb Enable multiple agents installations (#92)
* Changed: dynamic names

* Changed: bump up version
2022-07-15 08:28:40 +02:00
Valeriano Manassero
d778d0ef93 Changed: update helm chart version (#90) 2022-07-13 10:23:34 +02:00
zandolsi-psee
e28a2f991b Fix imagepullsecret error (#89)
* adapt ingress

* update docs

* remove idea

* fix imagepullsecret error
2022-07-13 10:14:16 +02:00
Valeriano Manassero
dc30518c26 ClearML Version upgrade to 1.6.0 (#88)
* Changed: version upgrade

* Changed: helm docs version update

* Changed: helm-docs update
2022-07-12 10:48:16 +02:00
Valeriano Manassero
50237dcb9d Update agent 1.24 (#86)
* Changed: image update to 1.24

* Changed: bump up version
2022-06-23 11:19:03 +02:00
Valeriano Manassero
1b164c2906 Add configurable default base serve url (#83)
* Added: configurable default base serving url

* Changed: chart version bump
2022-06-23 10:46:18 +02:00
Valeriano Manassero
43806b8e21 Add insecure cert check flag (#85)
* Added: clearmlcheckCertificate flag

* Changed: bump chart
2022-06-23 10:43:39 +02:00
Valeriano Manassero
80072c0654 Add editable config for k8s Agent (#84)
* Added: editable configuration

* Changed: bump up version
2022-06-23 09:52:19 +02:00
Valeriano Manassero
e22bd30764 Upgrade to app version 1.5.0 (#81)
* Changed: upgrade to 1.5.0

* Fixed: inject after ct check

* Fixed: list changd

* Fixed: typo
2022-06-23 07:49:45 +02:00
Valeriano Manassero
84a003b7bc Fix fileserver check on Agent (#82)
* Fixed: fileserver check

* Changed: version bump
2022-06-22 16:52:51 +02:00
Valeriano Manassero
1d95f0c27f Pullsecrets pod template (#80)
* Added: pullsecrets management for pod template

* Changed: version bump
2022-06-22 15:52:14 +02:00
Valeriano Manassero
562815e97a ClearML standalone agent chart (#79)
* Added: agent chart

* Changed: base image tag

* Changed: updated helm-docs

* Fixed: maintainers

* Changed: updated radme

* Fixed: http code to check

* Fixed: default values to check

* Changed: updated helm-docs

* Added: default values to be substituted by GH action

* Added: sed on the fly for testing

* Changed: updated CI images for Kind
2022-06-08 10:01:33 +02:00
Leon Rotim
e317610397 Fix linebreak formatting (#77)
* fix wrong line break delete

* version, bugfix inc

* regenerate README.md

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-06-08 08:16:52 +02:00
Luca Cerone
16f172fc1c Allowing extraEnv to be added to agentservices and agent deployments. (#76)
* Allowing extraEnv to be added to agentservices and agent deployments.

* Bumped version and generated documentation chart

* Lint fix

* Chart version update

* helm-docs update

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-06-08 08:06:20 +02:00
Valeriano Manassero
69048b5c96 Fix glue agent image (#78)
* Changed: avoid latest image

* Changed: version bump

* Fixed: pull policy

* Removed: specific ci for glue since now it's on by default

* Fixed: don't refresh dependencies

* Changed: testing chart action version update

* Fixed: action

* Changed: dependency updates required

* Fixed: lint and install

* Revert "Changed: dependency updates required"

This reverts commit 34ee22d7d0.

* Changed: use copy of dep charts because ththey may become unavailable

* Changed: updated readme
2022-06-02 21:20:00 +02:00
wasabipeas
9cf2868738 Clearml serving add support for triton (#74)
* added triton deployment and service, added triton block to values file, added value for CLEARML_DEFAULT_TRITON_GRPC_ADDR env variable in the serving-inference deployment

* re-generated README

* fixed yaml

* added condition to enable triton support

* changed chart version

* changed chart version

* bumped version to 0.3.0

* added conditional extraPythonPackages variable to clearml_serving_triton deploymnent

* added conditional extraPythonPackages to all the relevant deployments

* bumped version to 0.3.0
2022-05-24 13:12:15 +02:00
Valeriano Manassero
8098fd82df Add extra packages (#72)
* Added: extra python packages

* Changed: chart version
2022-05-23 13:06:48 +02:00
wasabipeas
4422cf433d added clearml-serving chart (#69)
* added clearml-serving chart

* fixed typo, added autogenerated README.md

* removed trailing space from values.yaml

* removed namespace definition from the values file and all the templates

* fixed typo

* re-run helm-docs
2022-05-19 07:35:30 +02:00
Valeriano Manassero
10296ac979 Update helm-docs.sh (#70) 2022-05-18 13:21:43 +02:00
Valeriano Manassero
06070a5c20 Default storageclass (#66)
* Changed: use deffault storageclass if not declared

* Changed: chart version
2022-05-02 18:00:46 +02:00
Niels ten Boom
5972fd8e5f fix: k8sagent indentation (#65) 2022-04-27 22:55:27 +02:00
Valeriano Manassero
7a7bd930f8 Fix glue namespace handling (#63)
* Changed: namespace handling for glue

* Changed: set glue as default agent system

* Changed: bump up version
2022-04-22 10:19:03 +02:00
Valeriano Manassero
25dfbd12d6 Changed: bump up versions (#62)
* Changed: bump up versions

* Changed: helmpdocs to 1.8.1
2022-04-19 08:22:08 +02:00
Valeriano Manassero
d7c3b9d5d9 Added: upgrade procedures (#61)
* Added: upgrade procedures

* Changed: template

* Changed: updated chart version
2022-04-04 10:32:51 +02:00
Valeriano Manassero
e16060f2ad Fix empty glue configs (#59)
* Added: use empty values without breaking glue agent

* Added: release namespace

* Changed: bump up version
2022-03-30 16:33:06 +02:00
Valeriano Manassero
27a666d2ae Clarml app 1.3.0 (#57)
* Changed: clarml app version

* Changed: chart version bump

* Added: comment on additional configs
2022-03-28 09:29:04 +02:00
Valeriano Manassero
d7bef0ff9d Add authentication example (#56)
* Added: auth enabled example in additionalConfigs

* Changed: bump up version

* Fixed: remove trailing spaces
2022-03-25 10:27:40 +01:00
Zied ANDOLSI
049e609ce0 add image pull secret + add ingress path (#55) 2022-03-16 18:04:56 +01:00
Niels ten Boom
fa3739b643 Improvements k8sagent (#54) 2022-03-01 17:48:33 +01:00
Valeriano Manassero
018348bc1d Fix image versions (#53)
* Fixed: image versions

* Changed: chart version

* Changed: readme update by helm-docs
2022-02-22 11:42:23 +01:00
Valeriano Manassero
57b85cbfce Update clearml image 1.2.0 (#52) 2022-02-17 15:33:30 +01:00
Niels ten Boom
9c15a8a348 fix: faulty service values references in k8s agent (#50)
* add k8s glue deployment

* more docs

* bump

* disabled by default

* run helm-docs

* fix service references

* fix readme

* add values file where k8sagent enabled

* empty files

* newline

* fix linter

Co-authored-by: Valeriano Manassero <14011549+valeriano-manassero@users.noreply.github.com>
2022-01-21 16:15:09 +01:00
Niels ten Boom
cd7f22f7d8 feat: Add k8s glue agent deployment (#49) 2022-01-18 23:27:12 +01:00
Shaun Howell
078e394e24 update ingress templates to accept per-service annotation overrides (#48)
* update ingress yamls to accept annotation overrides

* bump version to 3.3.1

* update readme via helm-docs
2022-01-18 18:06:01 +01:00
Valeriano Manassero
70b07c637a Update Elasticsearch (#47)
* update elasticsearch

* update elasticsearch reference

* bump up chart version
2022-01-05 08:26:57 +01:00
Valeriano Manassero
7b8e40c626 Agent foreground mode (#46)
* use foreground to push output on console

* bump up version
2021-12-13 09:04:02 +01:00
Valeriano Manassero
d8117eeb0d add k8s 1.22.1 to ci procedure (#44)
After some tests I found 1.22.1 doesn't have ulimit issue so I can include it into the ci process
2021-12-09 11:47:13 +01:00
Valeriano Manassero
4c09ae2c92 Fix env typo (#39)
* typo fix

* bump up version
2021-12-09 11:39:04 +01:00
Valeriano Manassero
478eecd5f2 remove k8s 1.22 from ci (#43)
It looks 1.22 k8s image from kind has a very low ulimit preventing elastic search from installing, removing it waiting for a fix.
2021-12-09 11:30:15 +01:00
Valeriano Manassero
43f4c44219 test one single kind cluster at time to avoid pressure fails (#42) 2021-12-09 11:00:57 +01:00
Valeriano Manassero
b83c8cd0e8 indentation fix (#41) 2021-12-09 10:32:42 +01:00
Valeriano Manassero
97f219228d update kind k8s versions (#40) 2021-12-09 10:31:17 +01:00
Valeriano Manassero
1b5b9407f6 Configurable auth cookies age (#38)
* configurable auth cookies age

* version bump up
2021-12-09 08:14:09 +01:00
Valeriano Manassero
b494a8c0cf External services (#36)
* use external services switch

* bump up version

* readme update
2021-11-26 08:11:55 +01:00
Weixiao Huang
266a1e3c41 feat: make service nodePort configurable and add some doc descriptions (#33)
* feat: make service nodePort configurable

* feat: bump version to 3.0.6

* docs: add descriptions for secret and service fields

* feat: add comments in clearml-kind.yaml of README.md

Co-authored-by: 黄维啸 <huangweixiao@megvii.com>
2021-11-08 14:23:10 +01:00
Weixiao Huang
bba5c0769f feat: make secret configurable and add secret annotations to deployment (#32) 2021-11-04 20:36:21 +01:00
Valeriano Manassero
b7f73e3bd9 Switch enabler agentservices (#31)
* switch to enable/disable agentservices

* bump up version
2021-09-21 14:16:30 +02:00
Valeriano Manassero
d3f6f3e50d Fix helper typo (#30)
* fix helper typo on api service name

* bump up version
2021-09-16 11:21:25 +02:00
Valeriano Manassero
979e73fe3d Fix ingress compat (#29)
* fix ingress compatibility with different k8s version

* bump up version
2021-09-16 10:54:25 +02:00
Valeriano Manassero
7352f35836 Helpers fix (#28)
* fix wrong service names

* bump up version
2021-09-16 09:11:58 +02:00
Valeriano Manassero
82ad17860d New ingress style (#27)
* new ingress style

* bump up version

* hostName fix

* helm-docs update
2021-09-16 08:51:07 +02:00
Valeriano Manassero
aa761dd450 Agent enable switch (#26)
* enable/disable switch

* bump up chart
2021-09-15 08:13:01 +02:00
Valeriano Manassero
7ff2f94d1a Apiserver configmap (#25)
* metadata name fix

* use toString

* use configmap for apiserver configs

* bump up version

* indentation fix

* fix trailing whitespaces
2021-09-14 15:43:10 +02:00
221 changed files with 16901 additions and 1166 deletions

View File

@@ -1,7 +1,7 @@
#!/bin/bash
CHART_DIRS="$(git diff --find-renames --name-only "$(git rev-parse --abbrev-ref HEAD)" remotes/origin/main -- 'charts' | grep '[cC]hart.yaml' | sed -e 's#/[Cc]hart.yaml##g')"
HELM_DOCS_VERSION="1.5.0"
HELM_DOCS_VERSION="1.11.0"
curl --silent --show-error --fail --location --output /tmp/helm-docs.tar.gz https://github.com/norwoodj/helm-docs/releases/download/v"${HELM_DOCS_VERSION}"/helm-docs_"${HELM_DOCS_VERSION}"_Linux_x86_64.tar.gz
tar -xf /tmp/helm-docs.tar.gz helm-docs

View File

@@ -21,28 +21,32 @@ jobs:
strategy:
matrix:
k8s:
- v1.20.7
- v1.21.1
- v1.22.7
- v1.23.6
- v1.24.0
steps:
- name: Checkout
uses: actions/checkout@v1
- name: Create kind ${{ matrix.k8s }} cluster
uses: helm/kind-action@v1.1.0
uses: helm/kind-action@v1.2.0
with:
version: v0.11.1
version: v0.13.0
node_image: kindest/node:${{ matrix.k8s }}
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.0.1
- name: Add bitnami repo
run: helm repo add bitnami https://charts.bitnami.com/bitnami
- name: Add elastic repo
run: helm repo add elastic https://helm.elastic.co
uses: helm/chart-testing-action@v2.2.1
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed --chart-dirs=charts --target-branch=main)
if [[ -n "$changed" ]]; then
echo "::set-output name=changed::true"
echo "::set-output name=changed_charts::\"${changed//$'\n'/,}\""
fi
- name: Inject secrets
run: |
find ./charts/*/ci/*.yaml -type f -exec sed -i "s/AGENTK8SGLUEKEY/${{ secrets.agentk8sglueKey }}/g" {} \;
find ./charts/*/ci/*.yaml -type f -exec sed -i "s/AGENTK8SGLUESECRET/${{ secrets.agentk8sglueSecret }}/g" {} \;
if: steps.list-changed.outputs.changed == 'true'
- name: Run chart-testing (lint and install)
run: ct lint-and-install --chart-dirs=charts --target-branch=main --helm-extra-args="--timeout=15m" --debug=true
run: ct lint-and-install --chart-dirs=charts --target-branch=main --helm-extra-args="--timeout=15m" --charts=${{steps.list-changed.outputs.changed_charts}} --debug=true
if: steps.list-changed.outputs.changed == 'true'

View File

@@ -38,7 +38,7 @@ will always upgrade with you.
## License
Server Side Public License, Version 1 (see the [LICENSE](https://en.wikipedia.org/wiki/Server_Side_Public_License) for more information)
Apache License, Version 2.0, (see the [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) for more information)
## Requirements

View File

@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

View File

@@ -0,0 +1,19 @@
apiVersion: v2
name: clearml-agent
description: MLOps platform
type: application
version: "1.3.0"
appVersion: "1.24"
kubeVersion: ">= 1.19.0-0 < 1.25.0-0"
home: https://clear.ml
icon: https://raw.githubusercontent.com/allegroai/clearml/master/docs/clearml-logo.svg
sources:
- https://github.com/allegroai/clearml-helm-charts
- https://github.com/allegroai/clearml
maintainers:
- name: valeriano-manassero
url: https://github.com/valeriano-manassero
keywords:
- clearml
- "machine learning"
- mlops

View File

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -0,0 +1,62 @@
# clearml-agent
![Version: 1.3.0](https://img.shields.io/badge/Version-1.3.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
MLOps platform
**Homepage:** <https://clear.ml>
## Maintainers
| Name | Email | Url |
| ---- | ------ | --- |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
## Source Code
* <https://github.com/allegroai/clearml-helm-charts>
* <https://github.com/allegroai/clearml>
## Requirements
Kubernetes: `>= 1.19.0-0 < 1.25.0-0`
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentk8sglue | object | `{"apiServerUrlReference":"https://api.clear.ml","clearmlcheckCertificate":true,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileServerUrlReference":"https://files.clear.ml","id":"k8s-agent","image":{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-18"},"maxPods":10,"podTemplate":{"env":[],"nodeSelector":{},"resources":{},"tolerations":[],"volumes":[]},"queue":"default","replicaCount":1,"serviceAccountName":"default","webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue.apiServerUrlReference | string | `"https://api.clear.ml"` | Reference to Api server url |
| agentk8sglue.clearmlcheckCertificate | bool | `true` | Check certificates validity for evefry UrlReference below. |
| agentk8sglue.defaultContainerImage | string | `"ubuntu:18.04"` | default container image for ClearML Task pod |
| agentk8sglue.extraEnvs | list | `[]` | Environment variables to be exposed in the agentk8sglue pods |
| agentk8sglue.fileServerUrlReference | string | `"https://files.clear.ml"` | Reference to File server url |
| agentk8sglue.id | string | `"k8s-agent"` | ClearML worker ID (must be unique across the entire ClearMLenvironment) |
| agentk8sglue.image | object | `{"repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-18"}` | Glue Agent image configuration |
| agentk8sglue.maxPods | int | `10` | maximum concurrent consume ClearML Task pod |
| agentk8sglue.podTemplate | object | `{"env":[],"nodeSelector":{},"resources":{},"tolerations":[],"volumes":[]}` | template for pods spawned to consume ClearML Task |
| agentk8sglue.podTemplate.env | list | `[]` | environment variables for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.nodeSelector | object | `{}` | nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.resources | object | `{}` | resources declaration for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.tolerations | list | `[]` | tolerations setup for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.podTemplate.volumes | list | `[]` | volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.queue | string | `"default"` | ClearML queue this agent will consume |
| agentk8sglue.replicaCount | int | `1` | Glue Agent number of pods |
| agentk8sglue.serviceAccountName | string | `"default"` | serviceAccountName for pods spawned to consume ClearML Task |
| agentk8sglue.webServerUrlReference | string | `"https://app.clear.ml"` | Reference to Web server url |
| clearml | object | `{"agentk8sglueKey":"ACCESSKEY","agentk8sglueSecret":"SECRETKEY","clearmlConfig":"sdk {\n}","existingAgentk8sglueSecret":"","existingClearmlConfigSecret":""}` | ClearMl generic configurations |
| clearml.agentk8sglueKey | string | `"ACCESSKEY"` | Agent k8s Glue basic auth key |
| clearml.agentk8sglueSecret | string | `"SECRETKEY"` | Agent k8s Glue basic auth secret |
| clearml.clearmlConfig | string | `"sdk {\n}"` | ClearML configuration file |
| clearml.existingAgentk8sglueSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| clearml.existingClearmlConfigSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Private image registry configuration |
| imageCredentials.email | string | `"someone@host.com"` | Email |
| imageCredentials.enabled | bool | `false` | Use private authentication mode |
| imageCredentials.existingSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| imageCredentials.password | string | `"pwd"` | Registry password |
| imageCredentials.registry | string | `"docker.io"` | Registry name |
| imageCredentials.username | string | `"someone"` | Registry username |
----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.11.0](https://github.com/norwoodj/helm-docs/releases/v1.11.0)

View File

@@ -0,0 +1,3 @@
clearml:
agentk8sglueKey: "AGENTK8SGLUEKEY"
agentk8sglueSecret: "AGENTK8SGLUESECRET"

View File

@@ -0,0 +1 @@
Glue Agent deployed.

View File

@@ -0,0 +1,86 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "clearml.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "clearml.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "clearml.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "clearml.labels" -}}
helm.sh/chart: {{ include "clearml.chart" . }}
{{ include "clearml.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "clearml.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Reference Name (agentk8sglue)
*/}}
{{- define "agentk8sglue.referenceName" -}}
{{- include "clearml.fullname" . }}-agentk8sglue
{{- end }}
{{/*
Selector labels (agentk8sglue)
*/}}
{{- define "agentk8sglue.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml.name" . }}
app.kubernetes.io/instance: {{ include "agentk8sglue.referenceName" . }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "clearml.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "clearml.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
{{/*
Create secret to access docker registry
*/}}
{{- define "imagePullSecret" }}
{{- with .Values.imageCredentials }}
{{- printf "{\"auths\":{\"%s\":{\"username\":\"%s\",\"password\":\"%s\",\"email\":\"%s\",\"auth\":\"%s\"}}}" .registry .username .password .email (printf "%s:%s" .username .password | b64enc) | b64enc }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,71 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pod-template
data:
template.yaml: |
apiVersion: v1
metadata:
namespace: {{ .Release.Namespace }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
{{- else }}
- name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-registry-key
{{- end }}
{{- end }}
serviceAccountName: {{ .Values.agentk8sglue.serviceAccountName }}
volumes:
{{- range .Values.agentk8sglue.podTemplate.volumes }}
- name: {{ .name }}
persistentVolumeClaim:
claimName: {{ .name }}
{{- end }}
containers:
- resources:
{{- toYaml .Values.agentk8sglue.podTemplate.resources | nindent 10 }}
ports:
- containerPort: 10022
volumeMounts:
{{- range .Values.agentk8sglue.podTemplate.volumes }}
- mountPath: {{ .path }}
name: {{ .name }}
{{- end }}
env:
- name: CLEARML_API_HOST
value: {{.Values.agentk8sglue.apiServerUrlReference}}
- name: CLEARML_WEB_HOST
value: {{.Values.agentk8sglue.webServerUrlReference}}
- name: CLEARML_FILES_HOST
value: {{.Values.agentk8sglue.fileServerUrlReference}}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
{{- end }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
{{- end }}
key: agentk8sglue_secret
{{- if .Values.agentk8sglue.podTemplate.env }}
{{ toYaml .Values.agentk8sglue.podTemplate.env | nindent 8 }}
{{- end }}
{{- with .Values.agentk8sglue.podTemplate.nodeSelector}}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentk8sglue.podTemplate.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}

View File

@@ -0,0 +1,120 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "agentk8sglue.referenceName" . }}
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.agentk8sglue.replicaCount }}
selector:
matchLabels:
{{- include "agentk8sglue.selectorLabels" . | nindent 6 }}
template:
metadata:
annotations:
checksum/config: {{ printf "%s%s" .Values.clearml .Values.agentk8sglue | sha256sum }}
labels:
{{- include "agentk8sglue.selectorLabels" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: .Values.imageCredentials.existingSecret
{{- else }}
- name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-registry-key
{{- end }}
{{- end }}
initContainers:
- name: init-k8s-glue
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.agentk8sglue.image.tag }}"
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.apiServerUrlReference}}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done;
while [[ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.fileServerUrlReference}}/" -o /dev/null) =~ 403|405 ]] ; do
echo "waiting for fileserver" ;
sleep 5 ;
done;
while [ $(curl {{ if not .Values.agentk8sglue.clearmlcheckCertificate }}--insecure{{ end }} -sw '%{http_code}' "{{.Values.agentk8sglue.webServerUrlReference}}/" -o /dev/null) -ne 200 ] ; do
echo "waiting for webserver" ;
sleep 5 ;
done
containers:
- name: k8s-glue
image: "{{ .Values.agentk8sglue.image.repository }}:{{ .Values.agentk8sglue.image.tag }}"
imagePullPolicy: IfNotPresent
command: ["/bin/bash", "-c", "export PATH=$PATH:$HOME/bin; source /root/.bashrc && /root/entrypoint.sh"]
volumeMounts:
- name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pod-template
mountPath: /root/template
{{- if or .Values.clearml.clearmlConfig .Values.clearml.existingClearmlConfigSecret }}
- name: k8sagent-clearml-conf-volume
mountPath: /root/clearml.conf
subPath: clearml.conf
readOnly: true
{{- end }}
env:
- name: CLEARML_API_HOST
value: "{{.Values.agentk8sglue.apiServerUrlReference}}"
- name: CLEARML_WEB_HOST
value: "{{.Values.agentk8sglue.webServerUrlReference}}"
- name: CLEARML_FILES_HOST
value: "{{.Values.agentk8sglue.fileServerUrlReference}}"
- name: K8S_GLUE_MAX_PODS
value: "{{.Values.agentk8sglue.maxPods}}"
- name: K8S_GLUE_QUEUE
value: "{{.Values.agentk8sglue.queue}}"
- name: K8S_GLUE_EXTRA_ARGS
value: "--namespace {{ .Release.Namespace }} --template-yaml /root/template/template.yaml"
- name: K8S_DEFAULT_NAMESPACE
value: "{{ .Release.Namespace }}"
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
{{- end }}
key: agentk8sglue_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
{{- if .Values.clearml.existingAgentk8sglueSecret }}
name: {{ .Values.clearml.existingAgentk8sglueSecret }}
{{- else }}
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
{{- end }}
key: agentk8sglue_secret
- name: CLEARML_WORKER_ID
value: "{{.Values.agentk8sglue.id}}"
- name: CLEARML_AGENT_UPDATE_REPO
value: ""
- name: FORCE_CLEARML_AGENT_REPO
value: ""
- name: CLEARML_DOCKER_IMAGE
value: "{{.Values.agentk8sglue.defaultContainerImage}}"
{{- if .Values.agentk8sglue.extraEnvs }}
{{ toYaml .Values.agentk8sglue.extraEnvs | nindent 10 }}
{{- end }}
volumes:
- name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pod-template
configMap:
name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pod-template
{{- if or .Values.clearml.clearmlConfig .Values.clearml.existingClearmlConfigSecret }}
- name: k8sagent-clearml-conf-volume
secret:
{{- if .Values.clearml.existingClearmlConfigSecret }}
secretName: {{ .Values.clearml.existingClearmlConfigSecret }}
{{- else }}
secretName: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-conf
{{- end }}
items:
- key: clearml.conf
path: clearml.conf
{{ end }}

View File

@@ -0,0 +1,23 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pods-access
rules:
- apiGroups:
- ""
resources:
- pods
verbs: ["get", "list", "watch", "create", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pods-access
subjects:
- kind: ServiceAccount
name: default
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "agentk8sglue.referenceName" . }}-k8sagent-pods-access

View File

@@ -0,0 +1,30 @@
{{- if not .Values.clearml.existingAgentk8sglueSecret }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-k8sglue
data:
agentk8sglue_key: {{ .Values.clearml.agentk8sglueKey | b64enc }}
agentk8sglue_secret: {{ .Values.clearml.agentk8sglueSecret | b64enc }}
{{- end }}
---
{{- if not .Values.clearml.existingClearmlConfigSecret }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-conf
data:
clearml.conf: {{ .Values.clearml.clearmlConfig | b64enc }}
---
{{- end }}
{{- if .Values.imageCredentials.enabled }}
{{- if not .Values.imageCredentials.existingSecret }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "agentk8sglue.referenceName" . }}-clearml-agent-registry-key
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: {{ template "imagePullSecret" . }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,96 @@
# -- Private image registry configuration
imageCredentials:
# -- Use private authentication mode
enabled: false
# -- If this is set, chart will not generate a secret but will use what is defined here
existingSecret: ""
# -- Registry name
registry: docker.io
# -- Registry username
username: someone
# -- Registry password
password: pwd
# -- Email
email: someone@host.com
# -- ClearMl generic configurations
clearml:
# -- If this is set, chart will not generate a secret but will use what is defined here
existingAgentk8sglueSecret: ""
# -- Agent k8s Glue basic auth key
agentk8sglueKey: "ACCESSKEY"
# -- Agent k8s Glue basic auth secret
agentk8sglueSecret: "SECRETKEY"
# -- If this is set, chart will not generate a secret but will use what is defined here
existingClearmlConfigSecret: ""
# -- ClearML configuration file
clearmlConfig: |-
sdk {
}
# -- This agent will spawn queued experiments in new pods, a good use case is to combine this with
# GPU autoscaling nodes.
# https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue
agentk8sglue:
# -- Glue Agent image configuration
image:
repository: "allegroai/clearml-agent-k8s-base"
tag: "1.24-18"
# -- Glue Agent number of pods
replicaCount: 1
# -- Check certificates validity for evefry UrlReference below.
clearmlcheckCertificate: true
# -- Reference to Api server url
apiServerUrlReference: "https://api.clear.ml"
# -- Reference to File server url
fileServerUrlReference: "https://files.clear.ml"
# -- Reference to Web server url
webServerUrlReference: "https://app.clear.ml"
# -- serviceAccountName for pods spawned to consume ClearML Task
serviceAccountName: default
# -- maximum concurrent consume ClearML Task pod
maxPods: 10
# -- default container image for ClearML Task pod
defaultContainerImage: ubuntu:18.04
# -- ClearML queue this agent will consume
queue: default
# -- ClearML worker ID (must be unique across the entire ClearMLenvironment)
id: k8s-agent
# -- Environment variables to be exposed in the agentk8sglue pods
extraEnvs: []
# -- template for pods spawned to consume ClearML Task
podTemplate:
# -- volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments)
volumes: []
# - name: "yourvolume"
# path: "/yourpath"
# -- environment variables for pods spawned to consume ClearML Task (example in values.yaml comments)
env: []
# # to setup access to private repo, setup secret with git credentials:
# - name: CLEARML_AGENT_GIT_USER
# value: mygitusername
# - name: CLEARML_AGENT_GIT_PASS
# valueFrom:
# secretKeyRef:
# name: git-password
# key: git-password
# -- resources declaration for pods spawned to consume ClearML Task (example in values.yaml comments)
resources: {}
# limits:
# nvidia.com/gpu: 1
# -- tolerations setup for pods spawned to consume ClearML Task (example in values.yaml comments)
tolerations: []
# - key: "nvidia.com/gpu"
# operator: Exists
# effect: "NoSchedule"
# -- nodeSelector setup for pods spawned to consume ClearML Task (example in values.yaml comments)
nodeSelector: {}
# fleet: gpu-nodes

View File

@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

View File

@@ -0,0 +1,16 @@
apiVersion: v2
name: clearml-serving
description: ClearML Serving Helm Chart
type: application
version: 0.4.1
appVersion: "0.9.0"
maintainers:
- name: valeriano-manassero
url: https://github.com/valeriano-manassero
- name: stefano-cherchi
url: https://github.com/stefano-cherchi
keywords:
- clearml
- "machine learning"
- mlops
- "model serving"

View File

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -0,0 +1,71 @@
# clearml-serving
![Version: 0.4.1](https://img.shields.io/badge/Version-0.4.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 0.9.0](https://img.shields.io/badge/AppVersion-0.9.0-informational?style=flat-square)
ClearML Serving Helm Chart
## Maintainers
| Name | Email | Url |
| ---- | ------ | --- |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
| stefano-cherchi | | <https://github.com/stefano-cherchi> |
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| alertmanager.affinity | object | `{}` | |
| alertmanager.image | string | `"prom/alertmanager:v0.23.0"` | |
| alertmanager.nodeSelector | object | `{}` | |
| alertmanager.resources | object | `{}` | |
| alertmanager.tolerations | list | `[]` | |
| clearml.apiAccessKey | string | `"ClearML API Access Key"` | |
| clearml.apiHost | string | `"http://clearml-server-apiserver:8008"` | |
| clearml.apiSecretKey | string | `"ClearML API Secret Key"` | |
| clearml.defaultBaseServeUrl | string | `"http://127.0.0.1:8080/serve"` | |
| clearml.filesHost | string | `"http://clearml-server-fileserver:8081"` | |
| clearml.servingTaskId | string | `"ClearML Serving Task ID"` | |
| clearml.webHost | string | `"http://clearml-server-webserver:80"` | |
| clearml_serving_inference.affinity | object | `{}` | |
| clearml_serving_inference.extraPythonPackages | list | `[]` | Extra Python Packages to be installed in running pods |
| clearml_serving_inference.image | string | `"allegroai/clearml-serving-inference"` | |
| clearml_serving_inference.nodeSelector | object | `{}` | |
| clearml_serving_inference.resources | object | `{}` | |
| clearml_serving_inference.tolerations | list | `[]` | |
| clearml_serving_statistics.affinity | object | `{}` | |
| clearml_serving_statistics.extraPythonPackages | list | `[]` | Extra Python Packages to be installed in running pods |
| clearml_serving_statistics.image | string | `"allegroai/clearml-serving-statistics"` | |
| clearml_serving_statistics.nodeSelector | object | `{}` | |
| clearml_serving_statistics.resources | object | `{}` | |
| clearml_serving_statistics.tolerations | list | `[]` | |
| clearml_serving_triton.affinity | object | `{}` | |
| clearml_serving_triton.enabled | bool | `true` | |
| clearml_serving_triton.extraPythonPackages | list | `[]` | Extra Python Packages to be installed in running pods |
| clearml_serving_triton.image | string | `"allegroai/clearml-serving-triton"` | |
| clearml_serving_triton.nodeSelector | object | `{}` | |
| clearml_serving_triton.resources | object | `{}` | |
| clearml_serving_triton.tolerations | list | `[]` | |
| grafana.affinity | object | `{}` | |
| grafana.image | string | `"grafana/grafana:8.4.4-ubuntu"` | |
| grafana.nodeSelector | object | `{}` | |
| grafana.resources | object | `{}` | |
| grafana.tolerations | list | `[]` | |
| kafka.affinity | object | `{}` | |
| kafka.image | string | `"bitnami/kafka:3.1.0"` | |
| kafka.nodeSelector | object | `{}` | |
| kafka.resources | object | `{}` | |
| kafka.tolerations | list | `[]` | |
| prometheus.affinity | object | `{}` | |
| prometheus.image | string | `"prom/prometheus:v2.34.0"` | |
| prometheus.nodeSelector | object | `{}` | |
| prometheus.resources | object | `{}` | |
| prometheus.tolerations | list | `[]` | |
| zookeeper.affinity | object | `{}` | |
| zookeeper.image | string | `"bitnami/zookeeper:3.7.0"` | |
| zookeeper.nodeSelector | object | `{}` | |
| zookeeper.resources | object | `{}` | |
| zookeeper.tolerations | list | `[]` | |
----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.11.0](https://github.com/norwoodj/helm-docs/releases/v1.11.0)

View File

@@ -0,0 +1,62 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "clearml-serving.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "clearml-serving.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "clearml-serving.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "clearml-serving.labels" -}}
helm.sh/chart: {{ include "clearml-serving.chart" . }}
{{ include "clearml-serving.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "clearml-serving.selectorLabels" -}}
app.kubernetes.io/name: {{ include "clearml-serving.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "clearml-serving.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "clearml-serving.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,28 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: alertmanager
name: alertmanager
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: alertmanager
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: alertmanager
spec:
containers:
- image: {{ .Values.alertmanager.image }}
name: clearml-serving-alertmanager
ports:
- containerPort: 9093
resources: {}
restartPolicy: Always
status: {}

View File

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: alertmanager
name: clearml-serving-alertmanager
spec:
ports:
- name: "9093"
port: 9093
targetPort: 9093
selector:
clearml.serving.service: alertmanager
status:
loadBalancer: {}

View File

@@ -0,0 +1,13 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: clearml-serving-backend
spec:
ingress:
- from:
- podSelector:
matchLabels:
clearml.serving.network/clearml-serving-backend: "true"
podSelector:
matchLabels:
clearml.serving.network/clearml-serving-backend: "true"

View File

@@ -0,0 +1,63 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-inference
name: clearml-serving-inference
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: clearml-serving-inference
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: clearml-serving-inference
spec:
containers:
- env:
- name: CLEARML_API_ACCESS_KEY
value: "{{ .Values.clearml.apiAccessKey }}"
- name: CLEARML_API_SECRET_KEY
value: "{{ .Values.clearml.apiSecretKey }}"
- name: CLEARML_API_HOST
value: "{{ .Values.clearml.apiHost }}"
- name: CLEARML_FILES_HOST
value: "{{ .Values.clearml.filesHost }}"
- name: CLEARML_WEB_HOST
value: "{{ .Values.clearml.webHost }}"
- name: CLEARML_DEFAULT_KAFKA_SERVE_URL
value: clearml-serving-kafka:9092
- name: CLEARML_SERVING_POLL_FREQ
value: "1.0"
- name: CLEARML_DEFAULT_BASE_SERVE_URL
value: "{{ .Values.clearml.defaultBaseServeUrl }}"
- name: CLEARML_DEFAULT_TRITON_GRPC_ADDR
{{- if .Values.clearml_serving_triton.enabled }}
value: "clearml-serving-triton:8001"
{{- else }}
value: ""
{{- end }}
- name: CLEARML_SERVING_NUM_PROCESS
value: "2"
- name: CLEARML_SERVING_PORT
value: "8080"
- name: CLEARML_SERVING_TASK_ID
value: "{{ .Values.clearml.servingTaskId }}"
- name: CLEARML_USE_GUNICORN
value: "true"
{{- if .Values.clearml_serving_inference.extraPythonPackages }}
- name: EXTRA_PYTHON_PACKAGES
value: '{{ join " " .Values.clearml_serving_inference.extraPythonPackages }}'
{{- end }}
image: "{{ .Values.clearml_serving_inference.image }}:{{ .Chart.AppVersion }}"
name: clearml-serving-inference
ports:
- containerPort: 8080
resources: {}
restartPolicy: Always
status: {}

View File

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-inference
name: clearml-serving-inference
spec:
ports:
- name: "8080"
port: 8080
targetPort: 8080
selector:
clearml.serving.service: clearml-serving-inference
status:
loadBalancer: {}

View File

@@ -0,0 +1,49 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-statistics
name: clearml-serving-statistics
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: clearml-serving-statistics
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: clearml-serving-statistics
spec:
containers:
- env:
- name: CLEARML_API_ACCESS_KEY
value: "{{ .Values.clearml.apiAccessKey }}"
- name: CLEARML_API_SECRET_KEY
value: "{{ .Values.clearml.apiSecretKey }}"
- name: CLEARML_API_HOST
value: "{{ .Values.clearml.apiHost }}"
- name: CLEARML_FILES_HOST
value: "{{ .Values.clearml.filesHost }}"
- name: CLEARML_WEB_HOST
value: "{{ .Values.clearml.webHost }}"
- name: CLEARML_DEFAULT_KAFKA_SERVE_URL
value: clearml-serving-kafka:9092
- name: CLEARML_SERVING_POLL_FREQ
value: "1.0"
- name: CLEARML_SERVING_TASK_ID
value: "{{ .Values.clearml.servingTaskId }}"
{{- if .Values.clearml_serving_statistics.extraPythonPackages }}
- name: EXTRA_PYTHON_PACKAGES
value: '{{ join " " .Values.clearml_serving_statistics.extraPythonPackages }}'
{{- end }}
image: "{{ .Values.clearml_serving_statistics.image }}:{{ .Chart.AppVersion }}"
name: clearml-serving-statistics
ports:
- containerPort: 9999
resources: {}
restartPolicy: Always
status: {}

View File

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-statistics
name: clearml-serving-statistics
spec:
ports:
- name: "9999"
port: 9999
targetPort: 9999
selector:
clearml.serving.service: clearml-serving-statistics
status:
loadBalancer: {}

View File

@@ -0,0 +1,52 @@
{{ if .Values.clearml_serving_triton.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-triton
name: clearml-serving-triton
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: clearml-serving-triton
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: clearml-serving-triton
spec:
containers:
- env:
- name: CLEARML_API_ACCESS_KEY
value: "{{ .Values.clearml.apiAccessKey }}"
- name: CLEARML_API_SECRET_KEY
value: "{{ .Values.clearml.apiSecretKey }}"
- name: CLEARML_API_HOST
value: "{{ .Values.clearml.apiHost }}"
- name: CLEARML_FILES_HOST
value: "{{ .Values.clearml.filesHost }}"
- name: CLEARML_WEB_HOST
value: "{{ .Values.clearml.webHost }}"
- name: CLEARML_SERVING_TASK_ID
value: "{{ .Values.clearml.servingTaskId }}"
- name: CLEARML_TRITON_POLL_FREQ
value: "1.0"
- name: CLEARML_TRITON_METRIC_FREQ
value: "1.0"
{{- if .Values.clearml_serving_triton.extraPythonPackages }}
- name: EXTRA_PYTHON_PACKAGES
value: '{{ join " " .Values.clearml_serving_triton.extraPythonPackages }}'
{{- end }}
image: "{{ .Values.clearml_serving_triton.image }}:{{ .Chart.AppVersion }}"
name: clearml-serving-triton
ports:
- containerPort: 8001
resources: {}
restartPolicy: Always
status: {}
{{ end }}

View File

@@ -0,0 +1,18 @@
{{ if .Values.clearml_serving_triton.enabled }}
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: clearml-serving-triton
name: clearml-serving-triton
spec:
ports:
- name: "8001"
port: 8001
targetPort: 8001
selector:
clearml.serving.service: clearml-serving-triton
status:
loadBalancer: {}
{{ end }}

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Secret
metadata:
name: grafana-config
stringData:
datasource.yaml: |-
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
# Access mode - proxy (server in the UI) or direct (browser in the UI).
access: proxy
url: http://clearml-serving-prometheus:9090

View File

@@ -0,0 +1,36 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: grafana
name: grafana
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: grafana
strategy:
type: Recreate
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: grafana
spec:
containers:
- image: {{ .Values.grafana.image }}
name: clearml-serving-grafana
ports:
- containerPort: 3000
resources: {}
volumeMounts:
- mountPath: /etc/grafana/provisioning/datasources/
name: grafana-conf
restartPolicy: Always
volumes:
- name: grafana-conf
secret:
secretName: grafana-config
status: {}

View File

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: grafana
name: clearml-serving-grafana
spec:
ports:
- name: "3000"
port: 3000
targetPort: 3000
selector:
clearml.serving.service: grafana
status:
loadBalancer: {}

View File

@@ -0,0 +1,41 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: kafka
name: kafka
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: kafka
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: kafka
spec:
containers:
- env:
- name: ALLOW_PLAINTEXT_LISTENER
value: "yes"
- name: KAFKA_BROKER_ID
value: "1"
- name: KAFKA_CFG_ADVERTISED_LISTENERS
value: PLAINTEXT://clearml-serving-kafka:9092
- name: KAFKA_CFG_LISTENERS
value: PLAINTEXT://0.0.0.0:9092
- name: KAFKA_CFG_ZOOKEEPER_CONNECT
value: clearml-serving-zookeeper:2181
- name: KAFKA_CREATE_TOPICS
value: '"topic_test:1:1"'
image: {{ .Values.kafka.image }}
name: clearml-serving-kafka
ports:
- containerPort: 9092
resources: {}
restartPolicy: Always
status: {}

View File

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: kafka
name: clearml-serving-kafka
spec:
ports:
- name: "9092"
port: 9092
targetPort: 9092
selector:
clearml.serving.service: kafka
status:
loadBalancer: {}

View File

@@ -0,0 +1,28 @@
apiVersion: v1
kind: Secret
metadata:
name: prometheus-config
stringData:
prometheus.yml: |-
global:
scrape_interval: "15s" # By default, scrape targets every 15 seconds.
evaluation_interval: 15s # By default, scrape targets every 15 seconds.
external_labels:
monitor: 'clearml-serving'
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'prometheus'
scrape_interval: 5s
static_configs:
- targets: ['localhost:9090']
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'clearml-inference-stats'
scrape_interval: 5s
static_configs:
- targets: ['clearml-serving-statistics:9999']

View File

@@ -0,0 +1,43 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: prometheus
name: prometheus
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: prometheus
strategy:
type: Recreate
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: prometheus
spec:
containers:
- args:
- --config.file=/mnt/prometheus.yml
- --storage.tsdb.path=/prometheus
- --web.console.libraries=/etc/prometheus/console_libraries
- --web.console.templates=/etc/prometheus/consoles
- --storage.tsdb.retention.time=200h
- --web.enable-lifecycle
image: {{ .Values.prometheus.image }}
name: clearml-serving-prometheus
ports:
- containerPort: 9090
resources: {}
volumeMounts:
- mountPath: /mnt
name: prometheus-conf
restartPolicy: Always
volumes:
- name: prometheus-conf
secret:
secretName: prometheus-config
status: {}

View File

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: prometheus
name: clearml-serving-prometheus
spec:
ports:
- name: "9090"
port: 9090
targetPort: 9090
selector:
clearml.serving.service: prometheus
status:
loadBalancer: {}

View File

@@ -0,0 +1,31 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations: {}
labels:
clearml.serving.service: zookeeper
name: zookeeper
spec:
replicas: 1
selector:
matchLabels:
clearml.serving.service: zookeeper
strategy: {}
template:
metadata:
annotations: {}
labels:
clearml.serving.network/clearml-serving-backend: "true"
clearml.serving.service: zookeeper
spec:
containers:
- env:
- name: ALLOW_ANONYMOUS_LOGIN
value: "yes"
image: {{ .Values.zookeeper.image }}
name: clearml-serving-zookeeper
ports:
- containerPort: 2181
resources: {}
restartPolicy: Always
status: {}

View File

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
annotations: {}
labels:
clearml.serving.service: zookeeper
name: clearml-serving-zookeeper
spec:
ports:
- name: "2181"
port: 2181
targetPort: 2181
selector:
clearml.serving.service: zookeeper
status:
loadBalancer: {}

View File

@@ -0,0 +1,79 @@
# Default values for clearml-serving.
clearml:
apiAccessKey: "ClearML API Access Key"
apiSecretKey: "ClearML API Secret Key"
apiHost: http://clearml-server-apiserver:8008
filesHost: http://clearml-server-fileserver:8081
webHost: http://clearml-server-webserver:80
defaultBaseServeUrl: http://127.0.0.1:8080/serve
servingTaskId: "ClearML Serving Task ID"
zookeeper:
image: bitnami/zookeeper:3.7.0
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
kafka:
image: bitnami/kafka:3.1.0
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
prometheus:
image: prom/prometheus:v2.34.0
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
grafana:
image: grafana/grafana:8.4.4-ubuntu
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
alertmanager:
image: prom/alertmanager:v0.23.0
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
clearml_serving_statistics:
image: allegroai/clearml-serving-statistics
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
# -- Extra Python Packages to be installed in running pods
extraPythonPackages: []
# - numpy==1.22.4
# - pandas==1.4.2
clearml_serving_inference:
image: allegroai/clearml-serving-inference
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
# -- Extra Python Packages to be installed in running pods
extraPythonPackages: []
# - numpy==1.22.4
# - pandas==1.4.2
clearml_serving_triton:
enabled: true
image: allegroai/clearml-serving-triton
nodeSelector: {}
tolerations: []
affinity: {}
resources: {}
# -- Extra Python Packages to be installed in running pods
extraPythonPackages: []
# - numpy==1.22.4
# - pandas==1.4.2

View File

@@ -1,12 +1,12 @@
dependencies:
- name: redis
repository: https://charts.bitnami.com/bitnami
repository: file://../../dependency_charts/redis
version: 10.9.0
- name: mongodb
repository: https://charts.bitnami.com/bitnami
repository: file://../../dependency_charts/mongodb
version: 10.3.4
- name: elasticsearch
repository: https://helm.elastic.co
version: 7.10.1
digest: sha256:aefd3992b2ab085161e4cca35c6f73dd33f8d19272a9405b5ee4e8c2a0e79bba
generated: "2021-01-05T14:26:33.629164+01:00"
repository: file://../../dependency_charts/elasticsearch
version: 7.16.2
digest: sha256:149b5a49382d280b1e083f3c193d014d3d2eb7fcdf3ec1402008996960cc173a
generated: "2022-06-02T21:09:00.961174+02:00"

View File

@@ -2,8 +2,8 @@ apiVersion: v2
name: clearml
description: MLOps platform
type: application
version: "2.2.2"
appVersion: "1.1.1"
version: "4.2.0"
appVersion: "1.6.0"
home: https://clear.ml
icon: https://raw.githubusercontent.com/allegroai/clearml/master/docs/clearml-logo.svg
sources:
@@ -18,14 +18,14 @@ keywords:
- mlops
dependencies:
- name: redis
version: "~10.9.0"
repository: "https://charts.bitnami.com/bitnami"
version: "10.9.0"
repository: "file://../../dependency_charts/redis"
condition: redis.enabled
- name: mongodb
version: "~10.3.2"
repository: "https://charts.bitnami.com/bitnami"
version: "10.3.4"
repository: "file://../../dependency_charts/mongodb"
condition: mongodb.enabled
- name: elasticsearch
version: "~7.10.1"
repository: "https://helm.elastic.co"
version: "7.16.2"
repository: "file://../../dependency_charts/elasticsearch"
condition: elasticsearch.enabled

View File

@@ -1,557 +1,201 @@
Server Side Public License
VERSION 1, OCTOBER 16, 2018
Copyright © 2021 allegro.ai, Inc.
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
TERMS AND CONDITIONS
0. Definitions.
“This License” refers to Server Side Public License.
“Copyright” also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
“The Program” refers to any copyrightable work licensed under this
License. Each licensee is addressed as “you”. “Licensees” and
“recipients” may be individuals or organizations.
To “modify” a work means to copy from or adapt all or part of the work in
a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a “modified version” of the
earlier work or a work “based on” the earlier work.
A “covered work” means either the unmodified Program or a work based on
the Program.
To “propagate” a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To “convey” a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through a
computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays “Appropriate Legal Notices” to the
extent that it includes a convenient and prominently visible feature that
(1) displays an appropriate copyright notice, and (2) tells the user that
there is no warranty for the work (except to the extent that warranties
are provided), that licensees may convey the work under this License, and
how to view a copy of this License. If the interface presents a list of
user commands or options, such as a menu, a prominent item in the list
meets this criterion.
1. Source Code.
The “source code” for a work means the preferred form of the work for
making modifications to it. “Object code” means any non-source form of a
work.
A “Standard Interface” means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that is
widely used among developers working in that language. The “System
Libraries” of an executable work include anything, other than the work as
a whole, that (a) is included in the normal form of packaging a Major
Component, but which is not part of that Major Component, and (b) serves
only to enable use of the work with that Major Component, or to implement
a Standard Interface for which an implementation is available to the
public in source code form. A “Major Component”, in this context, means a
major essential component (kernel, window system, and so on) of the
specific operating system (if any) on which the executable work runs, or
a compiler used to produce the work, or an object code interpreter used
to run it.
The “Corresponding Source” for a work in object code form means all the
source code needed to generate, install, and (for an executable work) run
the object code and to modify the work, including scripts to control
those activities. However, it does not include the work's System
Libraries, or general-purpose tools or generally available free programs
which are used unmodified in performing those activities but which are
not part of the work. For example, Corresponding Source includes
interface definition files associated with source files for the work, and
the source code for shared libraries and dynamically linked subprograms
that the work is specifically designed to require, such as by intimate
data communication or control flow between those subprograms and other
parts of the work.
The Corresponding Source need not include anything that users can
regenerate automatically from other parts of the Corresponding Source.
The Corresponding Source for a work in source code form is that same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program, subject to section 13. The
output from running a covered work is covered by this License only if the
output, given its content, constitutes a covered work. This License
acknowledges your rights of fair use or other equivalent, as provided by
copyright law. Subject to section 13, you may make, run and propagate
covered works that you do not convey, without conditions so long as your
license otherwise remains in force. You may convey covered works to
others for the sole purpose of having them make modifications exclusively
for you, or provide you with facilities for running those works, provided
that you comply with the terms of this License in conveying all
material for which you do not control copyright. Those thus making or
running the covered works for you must do so exclusively on your
behalf, under your direction and control, on terms that prohibit them
from making any copies of your copyrighted material outside their
relationship with you.
Conveying under any other circumstances is permitted solely under the
conditions stated below. Sublicensing is not allowed; section 10 makes it
unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article 11
of the WIPO copyright treaty adopted on 20 December 1996, or similar laws
prohibiting or restricting circumvention of such measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention is
effected by exercising rights under this License with respect to the
covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's users,
your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice; keep
intact all notices stating that this License and any non-permissive terms
added in accord with section 7 apply to the code; keep intact all notices
of the absence of any warranty; and give all recipients a copy of this
License along with the Program. You may charge any price or no price for
each copy that you convey, and you may offer support or warranty
protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the terms
of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified it,
and giving a relevant date.
b) The work must carry prominent notices stating that it is released
under this License and any conditions added under section 7. This
requirement modifies the requirement in section 4 to “keep intact all
notices”.
c) You must license the entire work, as a whole, under this License to
anyone who comes into possession of a copy. This License will therefore
apply, along with any applicable section 7 additional terms, to the
whole of the work, and all its parts, regardless of how they are
packaged. This License gives no permission to license the work in any
other way, but it does not invalidate such permission if you have
separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your work
need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work, and
which are not combined with it such as to form a larger program, in or on
a volume of a storage or distribution medium, is called an “aggregate” if
the compilation and its resulting copyright are not used to limit the
access or legal rights of the compilation's users beyond what the
individual works permit. Inclusion of a covered work in an aggregate does
not cause this License to apply to the other parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms of
sections 4 and 5, provided that you also convey the machine-readable
Corresponding Source under the terms of this License, in one of these
ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium customarily
used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a written
offer, valid for at least three years and valid for as long as you
offer spare parts or customer support for that product model, to give
anyone who possesses the object code either (1) a copy of the
Corresponding Source for all the software in the product that is
covered by this License, on a durable physical medium customarily used
for software interchange, for a price no more than your reasonable cost
of physically performing this conveying of source, or (2) access to
copy the Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This alternative is
allowed only occasionally and noncommercially, and only if you received
the object code with such an offer, in accord with subsection 6b.
d) Convey the object code by offering access from a designated place
(gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to copy
the object code is a network server, the Corresponding Source may be on
a different server (operated by you or a third party) that supports
equivalent copying facilities, provided you maintain clear directions
next to the object code saying where to find the Corresponding Source.
Regardless of what server hosts the Corresponding Source, you remain
obligated to ensure that it is available for as long as needed to
satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided you
inform other peers where the object code and Corresponding Source of
the work are being offered to the general public at no charge under
subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be included
in conveying the object code work.
A “User Product” is either (1) a “consumer product”, which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, “normally used” refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
“Installation Information” for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as part
of a transaction in which the right of possession and use of the User
Product is transferred to the recipient in perpetuity or for a fixed term
(regardless of how the transaction is characterized), the Corresponding
Source conveyed under this section must be accompanied by the
Installation Information. But this requirement does not apply if neither
you nor any third party retains the ability to install modified object
code on the User Product (for example, the work has been installed in
ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access
to a network may be denied when the modification itself materially
and adversely affects the operation of the network or violates the
rules and protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided, in
accord with this section must be in a format that is publicly documented
(and with an implementation available to the public in source code form),
and must require no special password or key for unpacking, reading or
copying.
7. Additional Terms.
“Additional permissions” are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall be
treated as though they were included in this License, to the extent that
they are valid under applicable law. If additional permissions apply only
to part of the Program, that part may be used separately under those
permissions, but the entire Program remains governed by this License
without regard to the additional permissions. When you convey a copy of
a covered work, you may at your option remove any additional permissions
from that copy, or from any part of it. (Additional permissions may be
written to require their own removal in certain cases when you modify the
work.) You may place additional permissions on material, added by you to
a covered work, for which you have or can give appropriate copyright
permission.
Notwithstanding any other provision of this License, for material you add
to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some trade
names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that material
by anyone who conveys the material (or modified versions of it) with
contractual assumptions of liability to the recipient, for any
liability that these contractual assumptions directly impose on those
licensors and authors.
All other non-permissive additional terms are considered “further
restrictions” within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further restriction,
you may remove that term. If a license document contains a further
restriction but permits relicensing or conveying under this License, you
may add to a covered work material governed by the terms of that license
document, provided that the further restriction does not survive such
relicensing or conveying.
If you add terms to a covered work in accord with this section, you must
place, in the relevant source files, a statement of the additional terms
that apply to those files, or a notice indicating where to find the
applicable terms. Additional terms, permissive or non-permissive, may be
stated in the form of a separately written license, or stated as
exceptions; the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or modify
it is void, and will automatically terminate your rights under this
License (including any patent licenses granted under the third paragraph
of section 11).
However, if you cease all violation of this License, then your license
from a particular copyright holder is reinstated (a) provisionally,
unless and until the copyright holder explicitly and finally terminates
your license, and (b) permanently, if the copyright holder fails to
notify you of the violation by some reasonable means prior to 60 days
after the cessation.
Moreover, your license from a particular copyright holder is reinstated
permanently if the copyright holder notifies you of the violation by some
reasonable means, this is the first time you have received notice of
violation of this License (for any work) from that copyright holder, and
you cure the violation prior to 30 days after your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or run a
copy of the Program. Ancillary propagation of a covered work occurring
solely as a consequence of using peer-to-peer transmission to receive a
copy likewise does not require acceptance. However, nothing other than
this License grants you permission to propagate or modify any covered
work. These actions infringe copyright if you do not accept this License.
Therefore, by modifying or propagating a covered work, you indicate your
acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically receives
a license from the original licensors, to run, modify and propagate that
work, subject to this License. You are not responsible for enforcing
compliance by third parties with this License.
An “entity transaction” is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered work
results from an entity transaction, each party to that transaction who
receives a copy of the work also receives whatever licenses to the work
the party's predecessor in interest had or could give under the previous
paragraph, plus a right to possession of the Corresponding Source of the
work from the predecessor in interest, if the predecessor has it or can
get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the rights
granted or affirmed under this License. For example, you may not impose a
license fee, royalty, or other charge for exercise of rights granted
under this License, and you may not initiate litigation (including a
cross-claim or counterclaim in a lawsuit) alleging that any patent claim
is infringed by making, using, selling, offering for sale, or importing
the Program or any portion of it.
11. Patents.
A “contributor” is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The work
thus licensed is called the contributor's “contributor version”.
A contributor's “essential patent claims” are all patent claims owned or
controlled by the contributor, whether already acquired or hereafter
acquired, that would be infringed by some manner, permitted by this
License, of making, using, or selling its contributor version, but do not
include claims that would be infringed only as a consequence of further
modification of the contributor version. For purposes of this definition,
“control” includes the right to grant patent sublicenses in a manner
consistent with the requirements of this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to make,
use, sell, offer for sale, import and otherwise run, modify and propagate
the contents of its contributor version.
In the following three paragraphs, a “patent license” is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To “grant” such a patent license to a party
means to make such an agreement or commitment not to enforce a patent
against the party.
If you convey a covered work, knowingly relying on a patent license, and
the Corresponding Source of the work is not available for anyone to copy,
free of charge and under the terms of this License, through a publicly
available network server or other readily accessible means, then you must
either (1) cause the Corresponding Source to be so available, or (2)
arrange to deprive yourself of the benefit of the patent license for this
particular work, or (3) arrange, in a manner consistent with the
requirements of this License, to extend the patent license to downstream
recipients. “Knowingly relying” means you have actual knowledge that, but
for the patent license, your conveying the covered work in a country, or
your recipient's use of the covered work in a country, would infringe
one or more identifiable patents in that country that you have reason
to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties receiving
the covered work authorizing them to use, propagate, modify or convey a
specific copy of the covered work, then the patent license you grant is
automatically extended to all recipients of the covered work and works
based on it.
A patent license is “discriminatory” if it does not include within the
scope of its coverage, prohibits the exercise of, or is conditioned on
the non-exercise of one or more of the rights that are specifically
granted under this License. You may not convey a covered work if you are
a party to an arrangement with a third party that is in the business of
distributing software, under which you make payment to the third party
based on the extent of your activity of conveying the work, and under
which the third party grants, to any of the parties who would receive the
covered work from you, a discriminatory patent license (a) in connection
with copies of the covered work conveyed by you (or copies made from
those copies), or (b) primarily for and in connection with specific
products or compilations that contain the covered work, unless you
entered into that arrangement, or that patent license was granted, prior
to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting any
implied license or other defenses to infringement that may otherwise be
available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot use,
propagate or convey a covered work so as to satisfy simultaneously your
obligations under this License and any other pertinent obligations, then
as a consequence you may not use, propagate or convey it at all. For
example, if you agree to terms that obligate you to collect a royalty for
further conveying from those to whom you convey the Program, the only way
you could satisfy both those terms and this License would be to refrain
entirely from conveying the Program.
13. Offering the Program as a Service.
If you make the functionality of the Program or a modified version
available to third parties as a service, you must make the Service Source
Code available via network download to everyone at no charge, under the
terms of this License. Making the functionality of the Program or
modified version available to third parties as a service includes,
without limitation, enabling third parties to interact with the
functionality of the Program or modified version remotely through a
computer network, offering a service the value of which entirely or
primarily derives from the value of the Program or modified version, or
offering a service that accomplishes for users the primary purpose of the
Program or modified version.
“Service Source Code” means the Corresponding Source for the Program or
the modified version, and the Corresponding Source for all programs that
you use to make the Program or modified version available as a service,
including, without limitation, management software, user interfaces,
application program interfaces, automation software, monitoring software,
backup software, storage software and hosting software, all such that a
user could run an instance of the service using the Service Source Code
you make available.
14. Revised Versions of this License.
MongoDB, Inc. may publish revised and/or new versions of the Server Side
Public License from time to time. Such new versions will be similar in
spirit to the present version, but may differ in detail to address new
problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies that a certain numbered version of the Server Side Public
License “or any later version” applies to it, you have the option of
following the terms and conditions either of that numbered version or of
any later version published by MongoDB, Inc. If the Program does not
specify a version number of the Server Side Public License, you may
choose any version ever published by MongoDB, Inc.
If the Program specifies that a proxy can decide which future versions of
the Server Side Public License can be used, that proxy's public statement
of acceptance of a version permanently authorizes you to choose that
version for the Program.
Later license versions may give you additional or different permissions.
However, no additional obligations are imposed on any author or copyright
holder as a result of your choosing to follow a later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM “AS IS” WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING
ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF
THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO
LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU
OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided above
cannot be given local legal effect according to their terms, reviewing
courts shall apply local law that most closely approximates an absolute
waiver of all civil liability in connection with the Program, unless a
warranty or assumption of liability accompanies a copy of the Program in
return for a fee.
END OF TERMS AND CONDITIONS
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -1,6 +1,6 @@
# ClearML Ecosystem for Kubernetes
![Version: 2.2.2](https://img.shields.io/badge/Version-2.2.2-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.1.1](https://img.shields.io/badge/AppVersion-1.1.1-informational?style=flat-square)
![Version: 4.2.0](https://img.shields.io/badge/Version-4.2.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.6.0](https://img.shields.io/badge/AppVersion-1.6.0-informational?style=flat-square)
MLOps platform
@@ -10,7 +10,7 @@ MLOps platform
| Name | Email | Url |
| ---- | ------ | --- |
| valeriano-manassero | | https://github.com/valeriano-manassero |
| valeriano-manassero | | <https://github.com/valeriano-manassero> |
## Introduction
@@ -31,22 +31,26 @@ For development/evaluation it's possible to use [kind](https://kind.sigs.k8s.io)
After installation, following commands will create a complete ClearML insatllation:
```
mkdir -pm 777 /tmp/clearml-kind
cat <<EOF > /tmp/clearml-kind.yaml
cat <<EOF | kind create cluster --config=- ─╯
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraPortMappings:
# API server's default nodePort is 30008. If you customize it in helm values by
# `apiserver.service.nodePort`, `containerPort` should match it
- containerPort: 30008
hostPort: 30008
listenAddress: "127.0.0.1"
protocol: TCP
# Web server's default nodePort is 30080. If you customize it in helm values by
# `webserver.service.nodePort`, `containerPort` should match it
- containerPort: 30080
hostPort: 30080
listenAddress: "127.0.0.1"
protocol: TCP
# File server's default nodePort is 30081. If you customize it in helm values by
# `fileserver.service.nodePort`, `containerPort` should match it
- containerPort: 30081
hostPort: 30081
listenAddress: "127.0.0.1"
@@ -56,8 +60,6 @@ nodes:
containerPath: /var/local-path-provisioner
EOF
kind create cluster --config /tmp/clearml-kind.yaml
helm install clearml allegroai/clearml
```
@@ -83,6 +85,24 @@ This will create 3 ingress rules:
Just pointing the domain records to the IP where ingress controller is responding will complete the deployment process.
## Upgrades/ Values upgrades
Updating to latest version of this chart can be done in two steps:
```
helm repo update
helm upgrade clearml allegroai/clearml
```
Changing values on existing installation can be done with:
```
helm upgrade clearml allegroai/clearml --version <CURRENT CHART VERSION> -f custom_values.yaml
```
Please note: updating values only should always be done setting explicit chart version to avoid a possible chart update.
Keeping separate updates procedures between version and values can be a good practice to seprate potential concerns.
## Additional Configuration for ClearML Server
You can also configure the **clearml-server** for:
@@ -101,91 +121,22 @@ For detailed instructions, see the [Optional Configuration](https://github.com/a
| Repository | Name | Version |
|------------|------|---------|
| https://charts.bitnami.com/bitnami | mongodb | ~10.3.2 |
| https://charts.bitnami.com/bitnami | redis | ~10.9.0 |
| https://helm.elastic.co | elasticsearch | ~7.10.1 |
| file://../../dependency_charts/elasticsearch | elasticsearch | 7.16.2 |
| file://../../dependency_charts/mongodb | mongodb | 10.3.4 |
| file://../../dependency_charts/redis | redis | 10.9.0 |
## Values
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentGroups.agent-group-cpu.affinity | object | `{}` | |
| agentGroups.agent-group-cpu.agentVersion | string | `""` | |
| agentGroups.agent-group-cpu.awsAccessKeyId | string | `nil` | |
| agentGroups.agent-group-cpu.awsDefaultRegion | string | `nil` | |
| agentGroups.agent-group-cpu.awsSecretAccessKey | string | `nil` | |
| agentGroups.agent-group-cpu.azureStorageAccount | string | `nil` | |
| agentGroups.agent-group-cpu.azureStorageKey | string | `nil` | |
| agentGroups.agent-group-cpu.clearmlAccessKey | string | `nil` | |
| agentGroups.agent-group-cpu.clearmlConfig | string | `"sdk {\n}"` | |
| agentGroups.agent-group-cpu.clearmlGitPassword | string | `nil` | |
| agentGroups.agent-group-cpu.clearmlGitUser | string | `nil` | |
| agentGroups.agent-group-cpu.clearmlSecretKey | string | `nil` | |
| agentGroups.agent-group-cpu.image.pullPolicy | string | `"IfNotPresent"` | |
| agentGroups.agent-group-cpu.image.repository | string | `"ubuntu"` | |
| agentGroups.agent-group-cpu.image.tag | string | `"18.04"` | |
| agentGroups.agent-group-cpu.name | string | `"agent-group-cpu"` | |
| agentGroups.agent-group-cpu.nodeSelector | object | `{}` | |
| agentGroups.agent-group-cpu.nvidiaGpusPerAgent | int | `0` | |
| agentGroups.agent-group-cpu.podAnnotations | object | `{}` | |
| agentGroups.agent-group-cpu.queues | string | `"default"` | |
| agentGroups.agent-group-cpu.replicaCount | int | `1` | |
| agentGroups.agent-group-cpu.tolerations | list | `[]` | |
| agentGroups.agent-group-cpu.updateStrategy | string | `"Recreate"` | |
| agentGroups.agent-group-gpu.affinity | object | `{}` | |
| agentGroups.agent-group-gpu.agentVersion | string | `""` | |
| agentGroups.agent-group-gpu.awsAccessKeyId | string | `nil` | |
| agentGroups.agent-group-gpu.awsDefaultRegion | string | `nil` | |
| agentGroups.agent-group-gpu.awsSecretAccessKey | string | `nil` | |
| agentGroups.agent-group-gpu.azureStorageAccount | string | `nil` | |
| agentGroups.agent-group-gpu.azureStorageKey | string | `nil` | |
| agentGroups.agent-group-gpu.clearmlAccessKey | string | `nil` | |
| agentGroups.agent-group-gpu.clearmlConfig | string | `"sdk {\n}"` | |
| agentGroups.agent-group-gpu.clearmlGitPassword | string | `nil` | |
| agentGroups.agent-group-gpu.clearmlGitUser | string | `nil` | |
| agentGroups.agent-group-gpu.clearmlSecretKey | string | `nil` | |
| agentGroups.agent-group-gpu.image.pullPolicy | string | `"IfNotPresent"` | |
| agentGroups.agent-group-gpu.image.repository | string | `"nvidia/cuda"` | |
| agentGroups.agent-group-gpu.image.tag | string | `"11.0-base-ubuntu18.04"` | |
| agentGroups.agent-group-gpu.name | string | `"agent-group-gpu"` | |
| agentGroups.agent-group-gpu.nodeSelector | object | `{}` | |
| agentGroups.agent-group-gpu.nvidiaGpusPerAgent | int | `1` | |
| agentGroups.agent-group-gpu.podAnnotations | object | `{}` | |
| agentGroups.agent-group-gpu.queues | string | `"default"` | |
| agentGroups.agent-group-gpu.replicaCount | int | `0` | |
| agentGroups.agent-group-gpu.tolerations | list | `[]` | |
| agentGroups.agent-group-gpu.updateStrategy | string | `"Recreate"` | |
| agentservices.affinity | object | `{}` | |
| agentservices.agentVersion | string | `""` | |
| agentservices.awsAccessKeyId | string | `nil` | |
| agentservices.awsDefaultRegion | string | `nil` | |
| agentservices.awsSecretAccessKey | string | `nil` | |
| agentservices.azureStorageAccount | string | `nil` | |
| agentservices.azureStorageKey | string | `nil` | |
| agentservices.clearmlFilesHost | string | `nil` | |
| agentservices.clearmlGitPassword | string | `nil` | |
| agentservices.clearmlGitUser | string | `nil` | |
| agentservices.clearmlHostIp | string | `nil` | |
| agentservices.clearmlWebHost | string | `nil` | |
| agentservices.clearmlWorkerId | string | `"clearml-services"` | |
| agentservices.extraEnvs | list | `[]` | |
| agentservices.googleCredentials | string | `nil` | |
| agentservices.image.pullPolicy | string | `"IfNotPresent"` | |
| agentservices.image.repository | string | `"allegroai/clearml-agent-services"` | |
| agentservices.image.tag | string | `"latest"` | |
| agentservices.nodeSelector | object | `{}` | |
| agentservices.podAnnotations | object | `{}` | |
| agentservices.replicaCount | int | `1` | |
| agentservices.resources | object | `{}` | |
| agentservices.storage.data.class | string | `"standard"` | |
| agentservices.storage.data.size | string | `"50Gi"` | |
| agentservices.tolerations | list | `[]` | |
| apiserver.affinity | object | `{}` | |
| apiserver.authCookiesMaxAge | int | `864000` | Amount of seconds the authorization cookie will last in user browser |
| apiserver.configDir | string | `"/opt/clearml/config"` | |
| apiserver.configuration | object | `{"additionalConfigs":{},"configRefName":"","secretRefName":""}` | additional configurations that can be used by api server; check examples in values.yaml file |
| apiserver.extraEnvs | list | `[]` | |
| apiserver.image.pullPolicy | string | `"IfNotPresent"` | |
| apiserver.image.repository | string | `"allegroai/clearml"` | |
| apiserver.image.tag | string | `"1.1.1"` | |
| apiserver.image.tag | string | `"1.6.0"` | |
| apiserver.livenessDelay | int | `60` | |
| apiserver.nodeSelector | object | `{}` | |
| apiserver.podAnnotations | object | `{}` | |
@@ -195,13 +146,11 @@ For detailed instructions, see the [Optional Configuration](https://github.com/a
| apiserver.readinessDelay | int | `60` | |
| apiserver.replicaCount | int | `1` | |
| apiserver.resources | object | `{}` | |
| apiserver.service.nodePort | int | `30008` | If service.type set to NodePort, this will be set to service's nodePort field. If service.type is set to others, this field will be ignored |
| apiserver.service.port | int | `8008` | |
| apiserver.service.type | string | `"NodePort"` | |
| apiserver.storage.config.class | string | `"standard"` | |
| apiserver.storage.config.size | string | `"1Gi"` | |
| apiserver.storage.enableConfigVolume | bool | `false` | |
| apiserver.service.type | string | `"NodePort"` | This will set to service's spec.type field |
| apiserver.tolerations | list | `[]` | |
| clearml.defaultCompany | string | `"d1bd92a3b039400cbafc60a7a5b1e52b"` | |
| clearml | object | `{"defaultCompany":"d1bd92a3b039400cbafc60a7a5b1e52b"}` | ClearMl generic configurations |
| elasticsearch.clusterHealthCheckParams | string | `"wait_for_status=yellow&timeout=1s"` | |
| elasticsearch.clusterName | string | `"clearml-elastic"` | |
| elasticsearch.enabled | bool | `true` | |
@@ -237,28 +186,51 @@ For detailed instructions, see the [Optional Configuration](https://github.com/a
| elasticsearch.roles.remote_cluster_client | string | `"true"` | |
| elasticsearch.volumeClaimTemplate.accessModes[0] | string | `"ReadWriteOnce"` | |
| elasticsearch.volumeClaimTemplate.resources.requests.storage | string | `"50Gi"` | |
| externalServices.elasticsearchHost | string | `""` | Existing ElasticSearch Hostname to use if elasticsearch.enabled is false |
| externalServices.elasticsearchPort | int | `9200` | Existing ElasticSearch Port to use if elasticsearch.enabled is false |
| externalServices.mongodbHost | string | `""` | Existing MongoDB Hostname to use if elasticsearch.enabled is false |
| externalServices.mongodbPort | int | `27017` | Existing MongoDB Port to use if elasticsearch.enabled is false |
| externalServices.redisHost | string | `""` | Existing Redis Hostname to use if elasticsearch.enabled is false |
| externalServices.redisPort | int | `6379` | Existing Redis Port to use if elasticsearch.enabled is false |
| fileserver.affinity | object | `{}` | |
| fileserver.extraEnvs | list | `[]` | |
| fileserver.image.pullPolicy | string | `"IfNotPresent"` | |
| fileserver.image.repository | string | `"allegroai/clearml"` | |
| fileserver.image.tag | string | `"1.1.1"` | |
| fileserver.image.tag | string | `"1.6.0"` | |
| fileserver.nodeSelector | object | `{}` | |
| fileserver.podAnnotations | object | `{}` | |
| fileserver.replicaCount | int | `1` | |
| fileserver.resources | object | `{}` | |
| fileserver.service.nodePort | int | `30081` | If service.type set to NodePort, this will be set to service's nodePort field. If service.type is set to others, this field will be ignored |
| fileserver.service.port | int | `8081` | |
| fileserver.service.type | string | `"NodePort"` | |
| fileserver.storage.data.class | string | `"standard"` | |
| fileserver.service.type | string | `"NodePort"` | This will set to service's spec.type field |
| fileserver.storage.data.class | string | `""` | |
| fileserver.storage.data.size | string | `"50Gi"` | |
| fileserver.tolerations | list | `[]` | |
| imageCredentials | object | `{"email":"someone@host.com","enabled":false,"existingSecret":"","password":"pwd","registry":"docker.io","username":"someone"}` | Private image registry configuration |
| imageCredentials.email | string | `"someone@host.com"` | Email |
| imageCredentials.enabled | bool | `false` | Use private authentication mode |
| imageCredentials.existingSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| imageCredentials.password | string | `"pwd"` | Registry password |
| imageCredentials.registry | string | `"docker.io"` | Registry name |
| imageCredentials.username | string | `"someone"` | Registry username |
| ingress.annotations | object | `{}` | |
| ingress.enabled | bool | `false` | |
| ingress.host | string | `""` | |
| ingress.hostPrefixApi | string | `"api."` | |
| ingress.hostPrefixApp | string | `"app."` | |
| ingress.hostPrefixFiles | string | `"files."` | |
| ingress.api.annotations | object | `{}` | |
| ingress.api.enabled | bool | `false` | |
| ingress.api.hostName | string | `"api.clearml.127-0-0-1.nip.io"` | |
| ingress.api.path | string | `"/"` | |
| ingress.api.tlsSecretName | string | `""` | |
| ingress.app.annotations | object | `{}` | |
| ingress.app.enabled | bool | `false` | |
| ingress.app.hostName | string | `"app.clearml.127-0-0-1.nip.io"` | |
| ingress.app.path | string | `"/"` | |
| ingress.app.tlsSecretName | string | `""` | |
| ingress.files.annotations | object | `{}` | |
| ingress.files.enabled | bool | `false` | |
| ingress.files.hostName | string | `"files.clearml.127-0-0-1.nip.io"` | |
| ingress.files.path | string | `"/"` | |
| ingress.files.tlsSecretName | string | `""` | |
| ingress.name | string | `"clearml-server-ingress"` | |
| ingress.tls.secretName | string | `""` | |
| mongodb.architecture | string | `"standalone"` | |
| mongodb.auth.enabled | bool | `false` | |
| mongodb.enabled | bool | `true` | |
@@ -279,15 +251,24 @@ For detailed instructions, see the [Optional Configuration](https://github.com/a
| redis.master.persistence.size | string | `"5Gi"` | |
| redis.master.port | int | `6379` | |
| redis.usePassword | bool | `false` | |
| secret.authToken | string | `"1SCf0ov3Nm544Td2oZ0gXSrsNx5XhMWdVlKz1tOgcx158bD5RV"` | Set for auth_token field |
| secret.credentials.apiserver.accessKey | string | `"5442F3443MJMORWZA3ZH"` | Set for apiserver_key field |
| secret.credentials.apiserver.secretKey | string | `"BxapIRo9ZINi8x25CRxz8Wdmr2pQjzuWVB4PNASZqCtTyWgWVQ"` | Set for apiserver_secret field |
| secret.credentials.tests.accessKey | string | `"ENP39EQM4SLACGD5FXB7"` | Set for tests_user_key field |
| secret.credentials.tests.secretKey | string | `"lPcm0imbcBZ8mwgO7tpadutiS3gnJD05x9j7afwXPS35IKbpiQ"` | Set for tests_user_secret field |
| secret.existingSecret | string | `""` | If this is set, chart will not generate a secret but will use what is defined here |
| secret.httpSession | string | `"9Tw20RbhJ1bLBiHEOWXvhplKGUbTgLzAtwFN2oLQvWwS0uRpD5"` | Set for http_session field |
| webserver.additionalConfigs | object | `{}` | |
| webserver.affinity | object | `{}` | |
| webserver.extraEnvs | list | `[]` | |
| webserver.image.pullPolicy | string | `"IfNotPresent"` | |
| webserver.image.repository | string | `"allegroai/clearml"` | |
| webserver.image.tag | string | `"1.1.1"` | |
| webserver.image.tag | string | `"1.6.0"` | |
| webserver.nodeSelector | object | `{}` | |
| webserver.podAnnotations | object | `{}` | |
| webserver.replicaCount | int | `1` | |
| webserver.resources | object | `{}` | |
| webserver.service.nodePort | int | `30080` | If service.type set to NodePort, this will be set to service's nodePort field. If service.type is set to others, this field will be ignored |
| webserver.service.port | int | `80` | |
| webserver.service.type | string | `"NodePort"` | |
| webserver.service.type | string | `"NodePort"` | This will set to service's spec.type field |
| webserver.tolerations | list | `[]` | |

View File

@@ -28,22 +28,26 @@ For development/evaluation it's possible to use [kind](https://kind.sigs.k8s.io)
After installation, following commands will create a complete ClearML insatllation:
```
mkdir -pm 777 /tmp/clearml-kind
cat <<EOF > /tmp/clearml-kind.yaml
cat <<EOF | kind create cluster --config=- ─╯
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraPortMappings:
# API server's default nodePort is 30008. If you customize it in helm values by
# `apiserver.service.nodePort`, `containerPort` should match it
- containerPort: 30008
hostPort: 30008
listenAddress: "127.0.0.1"
protocol: TCP
# Web server's default nodePort is 30080. If you customize it in helm values by
# `webserver.service.nodePort`, `containerPort` should match it
- containerPort: 30080
hostPort: 30080
listenAddress: "127.0.0.1"
protocol: TCP
# File server's default nodePort is 30081. If you customize it in helm values by
# `fileserver.service.nodePort`, `containerPort` should match it
- containerPort: 30081
hostPort: 30081
listenAddress: "127.0.0.1"
@@ -53,8 +57,6 @@ nodes:
containerPath: /var/local-path-provisioner
EOF
kind create cluster --config /tmp/clearml-kind.yaml
helm install clearml allegroai/clearml
```
@@ -80,6 +82,24 @@ This will create 3 ingress rules:
Just pointing the domain records to the IP where ingress controller is responding will complete the deployment process.
## Upgrades/ Values upgrades
Updating to latest version of this chart can be done in two steps:
```
helm repo update
helm upgrade clearml allegroai/clearml
```
Changing values on existing installation can be done with:
```
helm upgrade clearml allegroai/clearml --version <CURRENT CHART VERSION> -f custom_values.yaml
```
Please note: updating values only should always be done setting explicit chart version to avoid a possible chart update.
Keeping separate updates procedures between version and values can be a good practice to seprate potential concerns.
## Additional Configuration for ClearML Server
You can also configure the **clearml-server** for:

Binary file not shown.

View File

@@ -0,0 +1,7 @@
Place values files with different values in this directory to ensure these cases are tested by the CI as well.
https://github.com/helm/chart-testing/blob/main/doc/ct_install.md
```
"Charts may have multiple custom values files matching the glob pattern '*-values.yaml' in a directory named 'ci' in the root of the chart's directory. The chart is installed and tested for each of these files. If no custom values file is present, the chart is installed and tested with defaults."
```

View File

@@ -0,0 +1 @@
# empty so default values.yaml gets tested

View File

@@ -101,13 +101,13 @@ Create the name of the App service to use
*/}}
{{- define "clearml.serviceApp" -}}
{{- if .Values.ingress.enabled }}
{{- if .Values.ingress.tls.secretName }}
{{- printf "%s%s%s" "https://" .Values.ingress.hostPrefixApp .Values.ingress.host }}
{{- if .Values.ingress.app.tlsSecretName }}
{{- printf "%s%s" "https://" .Values.ingress.app.hostName }}
{{- else }}
{{- printf "%s%s%s" "http://" .Values.ingress.hostPrefixApp .Values.ingress.host }}
{{- printf "%s%s" "http://" .Values.ingress.app.hostName }}
{{- end }}
{{- else }}
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-webserver:" (.Values.webserver.service.port | quote) }}
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-webserver:" (.Values.webserver.service.port | toString) }}
{{- end }}
{{- end }}
@@ -116,13 +116,13 @@ Create the name of the Api service to use
*/}}
{{- define "clearml.serviceApi" -}}
{{- if .Values.ingress.enabled }}
{{- if .Values.ingress.tls.secretName }}
{{- printf "%s%s%s" "https://" .Values.ingress.hostPrefixApi .Values.ingress.host }}
{{- if .Values.ingress.api.tlsSecretName }}
{{- printf "%s%s" "https://" .Values.ingress.api.hostName }}
{{- else }}
{{- printf "%s%s%s" "http://" .Values.ingress.hostPrefixApi .Values.ingress.host }}
{{- printf "%s%s" "http://" .Values.ingress.api.hostName }}
{{- end }}
{{- else }}
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-apiserver:" (.Values.apiserver.service.port | quote) }}
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-apiserver:" (.Values.apiserver.service.port | toString) }}
{{- end }}
{{- end }}
@@ -131,12 +131,19 @@ Create the name of the Files service to use
*/}}
{{- define "clearml.serviceFiles" -}}
{{- if .Values.ingress.enabled }}
{{- if .Values.ingress.tls.secretName }}
{{- printf "%s%s%s" "https://" .Values.ingress.hostPrefixFiles .Values.ingress.host }}
{{- if .Values.ingress.files.tlsSecretName }}
{{- printf "%s%s" "https://" .Values.ingress.files.hostName }}
{{- else }}
{{- printf "%s%s%s" "http://" .Values.ingress.hostPrefixFiles .Values.ingress.host }}
{{- printf "%s%s" "http://" .Values.ingress.files.hostName }}
{{- end }}
{{- else }}
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-fileserver:" (.Values.fileserver.service.port | quote) }}
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-fileserver:" (.Values.fileserver.service.port | toString) }}
{{- end }}
{{- end }}
{{/*
Return the proper Docker Image Registry Secret Names
*/}}
{{- define "imagePullSecret" }}
{{- printf "{\"auths\": {\"%s\": {\"auth\": \"%s\"}}}" .Values.imageCredentials.registry (printf "%s:%s" .Values.imageCredentials.username .Values.imageCredentials.password | b64enc) | b64enc }}
{{- end }}

View File

@@ -0,0 +1,13 @@
{{- if .Values.apiserver.configuration.additionalConfigs -}}
apiVersion: v1
kind: ConfigMap
metadata:
name: "{{ include "clearml.fullname" . }}-apiserver-configmap"
labels:
{{- include "clearml.labels" . | nindent 4 }}
data:
{{- range $key, $val := .Values.apiserver.configuration.additionalConfigs }}
{{ $key }}: |
{{- $val | nindent 4 }}
{{- end }}
{{- end -}}

View File

@@ -0,0 +1,13 @@
{{- if .Values.webserver.additionalConfigs -}}
apiVersion: v1
kind: ConfigMap
metadata:
name: "{{ include "clearml.fullname" . }}-webserver-configmap"
labels:
{{- include "clearml.labels" . | nindent 4 }}
data:
{{- range $key, $val := .Values.webserver.additionalConfigs }}
{{ $key }}: |
{{- $val | nindent 4 }}
{{- end }}
{{- end -}}

View File

@@ -1,116 +0,0 @@
{{- range $key, $value := .Values.agentGroups }}
{{- with $value }}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "clearml.fullname" $ }}-{{ .name }}-agent
labels:
{{- include "clearml.labels" $ | nindent 4 }}
spec:
replicas: {{ .replicaCount }}
strategy:
type: {{ .updateStrategy }}
selector:
matchLabels:
{{- include "clearml.selectorLabelsAgent" $ | nindent 6 }}
template:
metadata:
{{- with .podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "clearml.selectorLabelsAgent" $ | nindent 8 }}
spec:
volumes:
{{ if .clearmlConfig }}
- name: agent-clearml-conf-volume
secret:
secretName: {{ .name }}-conf
items:
- key: clearml.conf
path: clearml.conf
{{ end }}
initContainers:
- name: init-agent-{{ .name }}
image: "{{ .image.repository }}:{{ .image.tag | default $.Chart.AppVersion }}"
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl -sw '%{http_code}' "{{ include "clearml.serviceApi" $ }}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done
containers:
- name: {{ $.Chart.Name }}-{{ .name }}
image: "{{ .image.repository }}:{{ .image.tag }}"
imagePullPolicy: {{ .image.pullPolicy }}
securityContext:
privileged: true
resources:
limits:
nvidia.com/gpu:
{{ .nvidiaGpusPerAgent }}
env:
- name: CLEARML_API_HOST
value: {{ include "clearml.serviceApi" $ }}
- name: CLEARML_WEB_HOST
value: {{ include "clearml.serviceApp" $ }}
- name: CLEARML_FILES_HOST
value: {{ include "clearml.serviceFiles" $ }}
- name: CLEARML_AGENT_GIT_USER
value: {{ .clearmlGitUser}}
- name: CLEARML_AGENT_GIT_PASS
value: {{ .clearmlGitPassword}}
- name: AWS_ACCESS_KEY_ID
value: {{ .awsAccessKeyId}}
- name: AWS_SECRET_ACCESS_KEY
value: {{ .awsSecretAccessKey}}
- name: AWS_DEFAULT_REGION
value: {{ .awsDefaultRegion}}
- name: AZURE_STORAGE_ACCOUNT
value: {{ .azureStorageAccount}}
- name: AZURE_STORAGE_KEY
value: {{ .azureStorageKey}}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: tests_user_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: tests_user_secret
command:
- /bin/sh
- -c
- "apt-get update ;
apt-get install -y curl python3-pip git;
python3 -m pip install -U pip ;
python3 -m pip install clearml-agent{{ .agentVersion}} ;
CLEARML_AGENT_K8S_HOST_MOUNT=/root/.clearml:/root/.clearml clearml-agent daemon --queue {{ .queues}}"
{{ if .clearmlConfig }}
volumeMounts:
- name: agent-clearml-conf-volume
mountPath: /root/clearml.conf
subPath: clearml.conf
readOnly: true
{{- end }}
{{- with .nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -1,100 +0,0 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "clearml.fullname" . }}-agentservices
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.agentservices.replicaCount }}
selector:
matchLabels:
{{- include "clearml.selectorLabelsAgentServices" . | nindent 6 }}
template:
metadata:
{{- with .Values.agentservices.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "clearml.selectorLabelsAgentServices" . | nindent 8 }}
spec:
volumes:
- name: agentservices-data
persistentVolumeClaim:
claimName: {{ include "clearml.fullname" . }}-agentservices-data
initContainers:
- name: init-agentservices
image: "{{ .Values.agentservices.image.repository }}:{{ .Values.agentservices.image.tag | default .Chart.AppVersion }}"
command:
- /bin/sh
- -c
- >
set -x;
while [ $(curl -sw '%{http_code}' "{{ include "clearml.serviceApi" $ }}/debug.ping" -o /dev/null) -ne 200 ] ; do
echo "waiting for apiserver" ;
sleep 5 ;
done
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.agentservices.image.repository }}:{{ .Values.agentservices.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.agentservices.image.pullPolicy }}
env:
- name: CLEARML_HOST_IP
value: {{ .Values.agentservices.clearmlHostIp }}
- name: CLEARML_API_HOST
value: {{ include "clearml.serviceApi" $ }}
- name: CLEARML_WEB_HOST
value: {{ .Values.agentservices.clearmlWebHost }}
- name: CLEARML_FILES_HOST
value: {{ .Values.agentservices.clearmlFilesHost }}
- name: CLEARML_AGENT_GIT_USER
value: {{ .Values.agentservices.clearmlGitUser }}
- name: CLEARML_AGENT_GIT_PASS
value: {{ .Values.agentservices.clearmlGitPassword }}
- name: CLEARML_AGENT_UPDATE_VERSION
value: {{ .Values.agentservices.agentVersion }}
- name: CLEARML_AGENT_DEFAULT_BASE_DOCKER
value: {{ .Values.agentservices.defaultBaseDocker }}
- name: AWS_ACCESS_KEY_ID
value: {{ .Values.agentservices.awsAccessKeyId }}
- name: AWS_SECRET_ACCESS_KEY
value: {{ .Values.agentservices.awsSecretAccessKey }}
- name: AWS_DEFAULT_REGION
value: {{ .Values.agentservices.awsDefaultRegion }}
- name: AZURE_STORAGE_ACCOUNT
value: {{ .Values.agentservices.azureStorageAccount }}
- name: AZURE_STORAGE_KEY
value: {{ .Values.agentservices.azureStorageKey }}
- name: GOOGLE_APPLICATION_CREDENTIALS
value: {{ .Values.agentservices.googleCredentials }}
- name: CLEARML_WORKER_ID
value: {{ .Values.agentservices.clearmlWorkerId }}
- name: CLEARML_API_ACCESS_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: tests_user_key
- name: CLEARML_API_SECRET_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
key: tests_user_secret
args:
- agentservices
volumeMounts:
- name: agentservices-data
mountPath: /root/.clearml
resources:
{{- toYaml .Values.agentservices.resources | nindent 12 }}
{{- with .Values.agentservices.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentservices.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.agentservices.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}

View File

@@ -11,18 +11,21 @@ spec:
{{- include "clearml.selectorLabelsApiServer" . | nindent 6 }}
template:
metadata:
{{- with .Values.apiserver.podAnnotations }}
annotations:
checksum/secret: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
{{- with .Values.apiserver.podAnnotations }}
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "clearml.selectorLabelsApiServer" . | nindent 8 }}
spec:
{{- if .Values.apiserver.storage.enableConfigVolume }}
volumes:
- name: apiserver-config
persistentVolumeClaim:
claimName: {{ include "clearml.fullname" . }}-apiserver-config
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-agent-registry-key
{{- end }}
{{- end }}
containers:
- name: {{ .Chart.Name }}
@@ -34,23 +37,49 @@ spec:
protocol: TCP
env:
- name: CLEARML_ELASTIC_SERVICE_HOST
{{- if .Values.elasticsearch.enabled }}
value: "{{ .Values.elasticsearch.clusterName }}-master"
{{- else }}
value: "{{ .Values.externalServices.elasticsearchHost }}"
{{- end }}
- name: CLEARML_ELASTIC_SERVICE_PORT
{{- if .Values.elasticsearch.enabled }}
value: "{{ .Values.elasticsearch.httpPort }}"
{{- else }}
value: "{{ .Values.externalServices.elasticsearchPort }}"
{{- end }}
- name: CLEARML_MONGODB_SERVICE_HOST
{{- if .Values.mongodb.enabled }}
value: "{{ tpl .Values.mongodb.service.name . }}"
{{- else }}
value: "{{ .Values.externalServices.mongodbHost }}"
{{- end }}
- name: CLEARML_MONGODB_SERVICE_PORT
{{- if .Values.mongodb.enabled }}
value: "{{ .Values.mongodb.service.port }}"
{{- else }}
value: "{{ .Values.externalServices.mongodbPort }}"
{{- end }}
- name: CLEARML_REDIS_SERVICE_HOST
{{- if .Values.redis.enabled }}
value: "{{ tpl .Values.redis.master.name . }}"
{{- else }}
value: "{{ .Values.externalServices.redisHost }}"
{{- end }}
- name: CLEARML_REDIS_SERVICE_PORT
{{- if .Values.redis.enabled }}
value: "{{ .Values.redis.master.port }}"
{{- else }}
value: "{{ .Values.externalServices.redisPort }}"
{{- end }}
- name: CLEARML__APISERVER__PRE_POPULATE__ENABLED
value: "{{ .Values.apiserver.prepopulateEnabled }}"
- name: CLEARML__APISERVER__PRE_POPULATE__ZIP_FILES
value: "{{ .Values.apiserver.prepopulateZipFiles }}"
- name: CLEARML_SERVER_DEPLOYMENT_TYPE
value: "helm-cloud"
- name: CLEARML__APISERVER__AUTH__COOKIES__MAX_AGE
value: "{{ .Values.apiserver.authCookiesMaxAge }}"
- name: CLEARML_CONFIG_DIR
value: /opt/clearml/config
- name: CLEARML__APISERVER__DEFAULT_COMPANY
@@ -58,32 +87,32 @@ spec:
- name: CLEARML__SECURE__HTTP__SESSION_SECRET__APISERVER
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: http_session
- name: CLEARML__SECURE__AUTH__TOKEN_SECRET
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: auth_token
- name: CLEARML__SECURE__CREDENTIALS__APISERVER__USER_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: apiserver_key
- name: CLEARML__SECURE__CREDENTIALS__APISERVER__USER_SECRET
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: apiserver_secret
- name: CLEARML__SECURE__CREDENTIALS__TESTS__USER_KEY
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: tests_user_key
- name: CLEARML__SECURE__CREDENTIALS__TESTS__USER_SECRET
valueFrom:
secretKeyRef:
name: clearml-conf
name: {{ default "clearml-conf" .Values.secret.existingSecret }}
key: tests_user_secret
{{- if .Values.apiserver.extraEnvs }}
{{ toYaml .Values.apiserver.extraEnvs | nindent 10 }}
@@ -101,13 +130,27 @@ spec:
httpGet:
path: /debug.ping
port: 8008
{{- if .Values.apiserver.storage.enableConfigVolume }}
{{- if or .Values.apiserver.configuration.additionalConfigs .Values.apiserver.configuration.configRefName .Values.apiserver.configuration.secretRefName }}
volumeMounts:
- name: apiserver-config
mountPath: /opt/clearml/config
{{- end }}
resources:
{{- toYaml .Values.apiserver.resources | nindent 12 }}
{{- if or .Values.apiserver.configuration.additionalConfigs .Values.apiserver.configuration.configRefName .Values.apiserver.configuration.secretRefName }}
volumes:
- name: apiserver-config
{{- if or .Values.apiserver.configuration.configRefName }}
configMap:
name: {{ .Values.apiserver.configuration.configRefName }}
{{- else if or .Values.apiserver.configuration.secretRefName }}
secret:
secretName: {{ .Values.apiserver.configuration.secretRefName }}
{{- else if or .Values.apiserver.configuration.additionalConfigs }}
configMap:
name: "{{ include "clearml.fullname" . }}-apiserver-configmap"
{{- end }}
{{- end }}
{{- with .Values.apiserver.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}

View File

@@ -22,6 +22,14 @@ spec:
- name: fileserver-data
persistentVolumeClaim:
claimName: {{ include "clearml.fullname" . }}-fileserver-data
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-agent-registry-key
{{- end }}
{{- end }}
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.fileserver.image.repository }}:{{ .Values.fileserver.image.tag | default .Chart.AppVersion }}"

View File

@@ -18,6 +18,14 @@ spec:
labels:
{{- include "clearml.selectorLabelsWebServer" . | nindent 8 }}
spec:
{{- if .Values.imageCredentials.enabled }}
imagePullSecrets:
{{- if .Values.imageCredentials.existingSecret }}
- name: {{ .Values.imageCredentials.existingSecret }}
{{- else }}
- name: clearml-agent-registry-key
{{- end }}
{{- end }}
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.webserver.image.repository }}:{{ .Values.webserver.image.tag | default .Chart.AppVersion }}"
@@ -38,6 +46,11 @@ spec:
- curl
- -X OPTIONS
- http://0.0.0.0:80/
{{- if .Values.webserver.additionalConfigs }}
volumeMounts:
- name: webserver-config
mountPath: /opt/clearml/config
{{- end }}
env:
- name: NGINX_APISERVER_ADDRESS
value: "http://{{ include "clearml.fullname" . }}-apiserver:{{ .Values.apiserver.service.port }}"
@@ -50,6 +63,12 @@ spec:
- webserver
resources:
{{- toYaml .Values.webserver.resources | nindent 12 }}
{{- if .Values.webserver.additionalConfigs }}
volumes:
- name: webserver-config
configMap:
name: "{{ include "clearml.fullname" . }}-webserver-configmap"
{{- end }}
{{- with .Values.webserver.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
@@ -61,4 +80,4 @@ spec:
{{- with .Values.webserver.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,9 @@
{{- if .Values.imageCredentials.enabled -}}
apiVersion: v1
kind: Secret
metadata:
name: clearml-agent-registry-key
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: {{ template "imagePullSecret" . }}
{{- end }}

View File

@@ -1,33 +1,45 @@
{{- if .Values.ingress.enabled -}}
{{- $fullName := include "clearml.fullname" . -}}
{{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
{{- if .Values.ingress.api.enabled -}}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ $fullName }}-api
name: {{ include "clearml.fullname" . }}-api
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- with .Values.ingress.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- $annotations := .Values.ingress.annotations }}
{{- if .Values.ingress.api.annotations }}
{{- $annotations = mergeOverwrite $annotations .Values.ingress.api.annotations }}
{{- end }}
annotations:
{{- toYaml $annotations | nindent 4 }}
spec:
{{- if .Values.ingress.tls.secretName }}
{{- if .Values.ingress.api.tlsSecretName }}
tls:
- hosts:
- "{{ .Values.ingress.hostPrefixApi }}{{ .Values.ingress.host }}"
secretName: {{ .Values.ingress.tls.secretName }}
- {{ .Values.ingress.api.hostName }}
secretName: {{ .Values.ingress.api.tlsSecretName }}
{{- end }}
rules:
- host: "{{ .Values.ingress.hostPrefixApi }}{{ .Values.ingress.host }}"
http:
paths:
- path: "/"
pathType: Prefix
backend:
serviceName: {{ include "clearml.fullname" . }}-apiserver
servicePort: {{ .Values.apiserver.service.port }}
- host: {{ .Values.ingress.api.hostName }}
http:
paths:
- path: {{ .Values.ingress.api.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: {{ include "clearml.fullname" . }}-apiserver
port:
number: {{ .Values.apiserver.service.port }}
{{ else }}
backend:
serviceName: {{ include "clearml.fullname" . }}-apiserver
servicePort: {{ .Values.apiserver.service.port }}
{{ end }}
{{- end }}

View File

@@ -1,33 +1,44 @@
{{- if .Values.ingress.enabled -}}
{{- $fullName := include "clearml.fullname" . -}}
{{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
{{- if .Values.ingress.app.enabled -}}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ $fullName }}-app
name: {{ include "clearml.fullname" . }}-app
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- with .Values.ingress.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- $annotations := .Values.ingress.annotations }}
{{- if .Values.ingress.app.annotations }}
{{- $annotations = mergeOverwrite $annotations .Values.ingress.app.annotations }}
{{- end }}
annotations:
{{- toYaml $annotations | nindent 4 }}
spec:
{{- if .Values.ingress.tls.secretName }}
{{- if .Values.ingress.app.tlsSecretName }}
tls:
- hosts:
- "{{ .Values.ingress.hostPrefixApp }}{{ .Values.ingress.host }}"
secretName: {{ .Values.ingress.tls.secretName }}
- {{ .Values.ingress.app.hostName }}
secretName: {{ .Values.ingress.app.tlsSecretName }}
{{- end }}
rules:
- host: "{{ .Values.ingress.hostPrefixApp }}{{ .Values.ingress.host }}"
http:
paths:
- path: "/"
pathType: Prefix
backend:
serviceName: {{ include "clearml.fullname" . }}-webserver
servicePort: {{ .Values.webserver.service.port }}
- host: {{ .Values.ingress.app.hostName }}
http:
paths:
- path: {{ .Values.ingress.app.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: {{ include "clearml.fullname" . }}-webserver
port:
number: {{ .Values.webserver.service.port }}
{{ else }}
backend:
serviceName: {{ include "clearml.fullname" . }}-webserver
servicePort: {{ .Values.webserver.service.port }}
{{ end }}
{{- end }}

View File

@@ -1,33 +1,44 @@
{{- if .Values.ingress.enabled -}}
{{- $fullName := include "clearml.fullname" . -}}
{{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
{{- if .Values.ingress.files.enabled -}}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ $fullName }}-files
name: {{ include "clearml.fullname" . }}-files
labels:
{{- include "clearml.labels" . | nindent 4 }}
{{- with .Values.ingress.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- $annotations := .Values.ingress.annotations }}
{{- if .Values.ingress.files.annotations }}
{{- $annotations = mergeOverwrite $annotations .Values.ingress.files.annotations }}
{{- end }}
annotations:
{{- toYaml $annotations | nindent 4 }}
spec:
{{- if .Values.ingress.tls.secretName }}
{{- if .Values.ingress.files.tlsSecretName }}
tls:
- hosts:
- "{{ .Values.ingress.hostPrefixFiles }}{{ .Values.ingress.host }}"
secretName: {{ .Values.ingress.tls.secretName }}
- {{ .Values.ingress.files.hostName }}
secretName: {{ .Values.ingress.files.tlsSecretName }}
{{- end }}
rules:
- host: "{{ .Values.ingress.hostPrefixFiles }}{{ .Values.ingress.host }}"
http:
paths:
- path: "/"
pathType: Prefix
backend:
serviceName: {{ include "clearml.fullname" . }}-fileserver
servicePort: {{ .Values.fileserver.service.port }}
- host: {{ .Values.ingress.files.hostName }}
http:
paths:
- path: {{ .Values.ingress.files.path }}
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
pathType: Prefix
backend:
service:
name: {{ include "clearml.fullname" . }}-fileserver
port:
number: {{ .Values.fileserver.service.port }}
{{ else }}
backend:
serviceName: {{ include "clearml.fullname" . }}-fileserver
servicePort: {{ .Values.fileserver.service.port }}
{{ end }}
{{- end }}

View File

@@ -1,13 +0,0 @@
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: {{ include "clearml.fullname" . }}-agentservices-data
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: {{ .Values.agentservices.storage.data.size | quote }}
storageClassName: {{ .Values.agentservices.storage.data.class | quote }}

View File

@@ -1,15 +0,0 @@
{{- if .Values.apiserver.storage.enableConfigVolume }}
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: {{ include "clearml.fullname" . }}-apiserver-config
labels:
{{- include "clearml.labels" . | nindent 4 }}
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: {{ .Values.apiserver.storage.config.size | quote }}
storageClassName: {{ .Values.apiserver.storage.config.class | quote }}
{{- end }}

View File

@@ -10,4 +10,7 @@ spec:
resources:
requests:
storage: {{ .Values.fileserver.storage.data.size | quote }}
storageClassName: {{ .Values.fileserver.storage.data.class | quote }}
{{- if .Values.fileserver.storage.data.class }}
storageClassName: {{ .Values.fileserver.storage.data.class | quote }}
{{- end -}}

View File

@@ -1,13 +0,0 @@
{{- range $key, $value := .Values.agentGroups }}
{{- with $value }}
---
{{ if .clearmlConfig }}
apiVersion: v1
kind: Secret
metadata:
name: {{ .name }}-conf
data:
clearml.conf: {{ .clearmlConfig | b64enc }}
{{ end }}
{{- end }}
{{- end }}

View File

@@ -1,11 +1,13 @@
{{- if not .Values.secret.existingSecret }}
apiVersion: v1
kind: Secret
metadata:
name: clearml-conf
data:
apiserver_key: NTQ0MkYzNDQzTUpNT1JXWkEzWkg=
apiserver_secret: QnhhcElSbzlaSU5pOHgyNUNSeHo4V2RtcjJwUWp6dVdWQjRQTkFTWnFDdFR5V2dXVlE=
http_session: OVR3MjBSYmhKMWJMQmlIRU9XWHZocGxLR1ViVGdMekF0d0ZOMm9MUXZXd1MwdVJwRDU=
auth_token: MVNDZjBvdjNObTU0NFRkMm9aMGdYU3JzTng1WGhNV2RWbEt6MXRPZ2N4MTU4YkQ1UlY=
tests_user_key: RU5QMzlFUU00U0xBQ0dENUZYQjc=
tests_user_secret: bFBjbTBpbWJjQlo4bXdnTzd0cGFkdXRpUzNnbkpEMDV4OWo3YWZ3WFBTMzVJS2JwaVE=
stringData:
apiserver_key: {{ .Values.secret.credentials.apiserver.accessKey }}
apiserver_secret: {{ .Values.secret.credentials.apiserver.secretKey }}
http_session: {{ .Values.secret.httpSession }}
auth_token: {{ .Values.secret.authToken }}
tests_user_key: {{ .Values.secret.credentials.tests.accessKey }}
tests_user_secret: {{ .Values.secret.credentials.tests.secretKey }}
{{- end }}

View File

@@ -9,7 +9,9 @@ spec:
ports:
- port: {{ .Values.apiserver.service.port }}
targetPort: {{ .Values.apiserver.service.port }}
nodePort: 30008
{{- if eq .Values.apiserver.service.type "NodePort" }}
nodePort: {{ .Values.apiserver.service.nodePort }}
{{- end }}
protocol: TCP
selector:
{{- include "clearml.selectorLabelsApiServer" . | nindent 4 }}

View File

@@ -9,7 +9,9 @@ spec:
ports:
- port: {{ .Values.fileserver.service.port }}
targetPort: {{ .Values.fileserver.service.port }}
nodePort: 30081
{{- if eq .Values.fileserver.service.type "NodePort" }}
nodePort: {{ .Values.fileserver.service.nodePort }}
{{- end }}
protocol: TCP
selector:
{{- include "clearml.selectorLabelsFileServer" . | nindent 4 }}

View File

@@ -9,7 +9,9 @@ spec:
ports:
- port: {{ .Values.webserver.service.port }}
targetPort: {{ .Values.webserver.service.port }}
nodePort: 30080
{{- if eq .Values.webserver.service.type "NodePort" }}
nodePort: {{ .Values.webserver.service.nodePort }}
{{- end }}
protocol: TCP
selector:
{{- include "clearml.selectorLabelsWebServer" . | nindent 4 }}

253
charts/clearml/values.yaml Normal file → Executable file
View File

@@ -1,15 +1,62 @@
# -- Private image registry configuration
imageCredentials:
# -- Use private authentication mode
enabled: false
# -- If this is set, chart will not generate a secret but will use what is defined here
existingSecret: ""
# -- Registry name
registry: docker.io
# -- Registry username
username: someone
# -- Registry password
password: pwd
# -- Email
email: someone@host.com
# -- ClearMl generic configurations
clearml:
defaultCompany: "d1bd92a3b039400cbafc60a7a5b1e52b"
ingress:
enabled: false
name: clearml-server-ingress
annotations: {}
host: ""
hostPrefixApp: "app."
hostPrefixApi: "api."
hostPrefixFiles: "files."
tls:
secretName: ""
app:
enabled: false
hostName: "app.clearml.127-0-0-1.nip.io"
tlsSecretName: ""
annotations: {}
path: "/"
api:
enabled: false
hostName: "api.clearml.127-0-0-1.nip.io"
tlsSecretName: ""
annotations: {}
path: "/"
files:
enabled: false
hostName: "files.clearml.127-0-0-1.nip.io"
tlsSecretName: ""
annotations: {}
path: "/"
secret:
# -- If this is set, chart will not generate a secret but will use what is defined here
existingSecret: ""
# -- Set for http_session field
httpSession: "9Tw20RbhJ1bLBiHEOWXvhplKGUbTgLzAtwFN2oLQvWwS0uRpD5"
# -- Set for auth_token field
authToken: "1SCf0ov3Nm544Td2oZ0gXSrsNx5XhMWdVlKz1tOgcx158bD5RV"
credentials:
apiserver:
# -- Set for apiserver_key field
accessKey: "5442F3443MJMORWZA3ZH"
# -- Set for apiserver_secret field
secretKey: "BxapIRo9ZINi8x25CRxz8Wdmr2pQjzuWVB4PNASZqCtTyWgWVQ"
tests:
# -- Set for tests_user_key field
accessKey: "ENP39EQM4SLACGD5FXB7"
# -- Set for tests_user_secret field
secretKey: "lPcm0imbcBZ8mwgO7tpadutiS3gnJD05x9j7afwXPS35IKbpiQ"
apiserver:
prepopulateEnabled: "true"
@@ -17,9 +64,16 @@ apiserver:
prepopulateArtifactsPath: "/mnt/fileserver"
configDir: /opt/clearml/config
# -- Amount of seconds the authorization cookie will last in user browser
authCookiesMaxAge: 864000
service:
# -- This will set to service's spec.type field
type: NodePort
port: 8008
# -- If service.type set to NodePort, this will be set to service's nodePort field.
# If service.type is set to others, this field will be ignored
nodePort: 30008
livenessDelay: 60
readinessDelay: 60
@@ -29,7 +83,7 @@ apiserver:
image:
repository: "allegroai/clearml"
pullPolicy: IfNotPresent
tag: "1.1.1"
tag: "1.6.0"
extraEnvs: []
@@ -53,24 +107,55 @@ apiserver:
affinity: {}
# Optional: used in pvc-apiserver containing optional server configuration files
storage:
enableConfigVolume: false
config:
class: "standard"
size: 1Gi
# -- additional configurations that can be used by api server; check examples in values.yaml file
configuration:
configRefName: ""
secretRefName: ""
additionalConfigs: {}
# services.conf: |
# tasks {
# non_responsive_tasks_watchdog {
# # In-progress tasks that haven't been updated for at least 'value' seconds will be stopped by the watchdog
# threshold_sec: 21000
# # Watchdog will sleep for this number of seconds after each cycle
# watch_interval_sec: 900
# }
# }
# apiserver.conf: |
# auth {
# fixed_users {
# enabled: true
# pass_hashed: false
# users: [
# {
# username: "jane"
# password: "12345678"
# name: "Jane Doe"
# },
# {
# username: "john"
# password: "12345678"
# name: "John Doe"
# },
# ]
# }
# }
fileserver:
service:
# -- This will set to service's spec.type field
type: NodePort
port: 8081
# -- If service.type set to NodePort, this will be set to service's nodePort field.
# If service.type is set to others, this field will be ignored
nodePort: 30081
replicaCount: 1
image:
repository: "allegroai/clearml"
pullPolicy: IfNotPresent
tag: "1.1.1"
tag: "1.6.0"
extraEnvs: []
@@ -96,22 +181,26 @@ fileserver:
storage:
data:
class: "standard"
class: ""
size: 50Gi
webserver:
extraEnvs: []
service:
# -- This will set to service's spec.type field
type: NodePort
port: 80
# -- If service.type set to NodePort, this will be set to service's nodePort field.
# If service.type is set to others, this field will be ignored
nodePort: 30080
replicaCount: 1
image:
repository: "allegroai/clearml"
pullPolicy: IfNotPresent
tag: "1.1.1"
tag: "1.6.0"
podAnnotations: {}
@@ -133,121 +222,21 @@ webserver:
affinity: {}
agentservices:
clearmlHostIp: null
agentVersion: ""
clearmlWebHost: null
clearmlFilesHost: null
clearmlGitUser: null
clearmlGitPassword: null
awsAccessKeyId: null
awsSecretAccessKey: null
awsDefaultRegion: null
azureStorageAccount: null
azureStorageKey: null
googleCredentials: null
clearmlWorkerId: "clearml-services"
additionalConfigs: {}
replicaCount: 1
image:
repository: "allegroai/clearml-agent-services"
pullPolicy: IfNotPresent
tag: "latest"
extraEnvs: []
podAnnotations: {}
resources: {}
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
nodeSelector: {}
tolerations: []
affinity: {}
storage:
data:
class: "standard"
size: 50Gi
agentGroups:
agent-group-cpu:
name: agent-group-cpu
replicaCount: 1
updateStrategy: Recreate
nvidiaGpusPerAgent: 0
agentVersion: "" # if set, it *MUST* include comparison operator (e.g. ">=0.16.1")
queues: "default" # multiple queues can be specified separated by a space (e.g. "important_jobs default")
clearmlGitUser: null
clearmlGitPassword: null
clearmlAccessKey: null
clearmlSecretKey: null
awsAccessKeyId: null
awsSecretAccessKey: null
awsDefaultRegion: null
azureStorageAccount: null
azureStorageKey: null
clearmlConfig: |-
sdk {
}
image:
repository: "ubuntu"
pullPolicy: IfNotPresent
tag: "18.04"
podAnnotations: {}
nodeSelector: {}
tolerations: []
affinity: {}
agent-group-gpu:
name: agent-group-gpu
replicaCount: 0
updateStrategy: Recreate
nvidiaGpusPerAgent: 1
agentVersion: "" # if set, it *MUST* include comparison operator (e.g. ">=0.16.1")
queues: "default" # multiple queues can be specified separated by a space (e.g. "important_jobs default")
clearmlGitUser: null
clearmlGitPassword: null
clearmlAccessKey: null
clearmlSecretKey: null
awsAccessKeyId: null
awsSecretAccessKey: null
awsDefaultRegion: null
azureStorageAccount: null
azureStorageKey: null
clearmlConfig: |-
sdk {
}
image:
repository: "nvidia/cuda"
pullPolicy: IfNotPresent
tag: "11.0-base-ubuntu18.04"
podAnnotations: {}
nodeSelector: {}
tolerations: []
affinity: {}
externalServices:
# -- Existing ElasticSearch Hostname to use if elasticsearch.enabled is false
elasticsearchHost: ""
# -- Existing ElasticSearch Port to use if elasticsearch.enabled is false
elasticsearchPort: 9200
# -- Existing MongoDB Hostname to use if elasticsearch.enabled is false
mongodbHost: ""
# -- Existing MongoDB Port to use if elasticsearch.enabled is false
mongodbPort: 27017
# -- Existing Redis Hostname to use if elasticsearch.enabled is false
redisHost: ""
# -- Existing Redis Port to use if elasticsearch.enabled is false
redisPort: 6379
redis: # configuration from https://github.com/bitnami/charts/blob/master/bitnami/redis/values.yaml
enabled: true
@@ -281,7 +270,7 @@ mongodb: # configuration from https://github.com/bitnami/charts/blob/master/bit
port: 27017
portName: mongo-service
elasticsearch: # configuration from https://github.com/elastic/helm-charts/blob/7.10/elasticsearch/values.yaml
elasticsearch: # configuration from https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/values.yaml
enabled: true
httpPort: 9200
roles:

View File

@@ -0,0 +1,2 @@
tests/
.pytest_cache/

View File

@@ -0,0 +1,12 @@
apiVersion: v1
appVersion: 7.16.2
description: Official Elastic helm chart for Elasticsearch
home: https://github.com/elastic/helm-charts
icon: https://helm.elastic.co/icons/elasticsearch.png
maintainers:
- email: helm-charts@elastic.co
name: Elastic
name: elasticsearch
sources:
- https://github.com/elastic/elasticsearch
version: 7.16.2

View File

@@ -0,0 +1 @@
include ../helpers/common.mk

View File

@@ -0,0 +1,457 @@
# Elasticsearch Helm Chart
[![Build Status](https://img.shields.io/jenkins/s/https/devops-ci.elastic.co/job/elastic+helm-charts+master.svg)](https://devops-ci.elastic.co/job/elastic+helm-charts+master/) [![Artifact HUB](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/elastic)](https://artifacthub.io/packages/search?repo=elastic)
This Helm chart is a lightweight way to configure and run our official
[Elasticsearch Docker image][].
<!-- development warning placeholder -->
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
- [Requirements](#requirements)
- [Installing](#installing)
- [Install released version using Helm repository](#install-released-version-using-helm-repository)
- [Install development version from a branch](#install-development-version-from-a-branch)
- [Upgrading](#upgrading)
- [Usage notes](#usage-notes)
- [Configuration](#configuration)
- [Deprecated](#deprecated)
- [FAQ](#faq)
- [How to deploy this chart on a specific K8S distribution?](#how-to-deploy-this-chart-on-a-specific-k8s-distribution)
- [How to deploy dedicated nodes types?](#how-to-deploy-dedicated-nodes-types)
- [Clustering and Node Discovery](#clustering-and-node-discovery)
- [How to deploy clusters with security (authentication and TLS) enabled?](#how-to-deploy-clusters-with-security-authentication-and-tls-enabled)
- [How to migrate from helm/charts stable chart?](#how-to-migrate-from-helmcharts-stable-chart)
- [How to install plugins?](#how-to-install-plugins)
- [How to use the keystore?](#how-to-use-the-keystore)
- [Basic example](#basic-example)
- [Multiple keys](#multiple-keys)
- [Custom paths and keys](#custom-paths-and-keys)
- [How to enable snapshotting?](#how-to-enable-snapshotting)
- [How to configure templates post-deployment?](#how-to-configure-templates-post-deployment)
- [Contributing](#contributing)
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
<!-- Use this to update TOC: -->
<!-- docker run --rm -it -v $(pwd):/usr/src jorgeandrada/doctoc --github -->
## Requirements
* Kubernetes >= 1.14
* [Helm][] >= 2.17.0
* Minimum cluster requirements include the following to run this chart with
default settings. All of these settings are configurable.
* Three Kubernetes nodes to respect the default "hard" affinity settings
* 1GB of RAM for the JVM heap
See [supported configurations][] for more details.
## Installing
This chart is tested with the latest 7.16.2 version.
### Install released version using Helm repository
* Add the Elastic Helm charts repo:
`helm repo add elastic https://helm.elastic.co`
* Install it:
- with Helm 3: `helm install elasticsearch --version <version> elastic/elasticsearch`
- with Helm 2 (deprecated): `helm install --name elasticsearch --version <version> elastic/elasticsearch`
### Install development version from a branch
* Clone the git repo: `git clone git@github.com:elastic/helm-charts.git`
* Checkout the branch : `git checkout 7.16`
* Install it:
- with Helm 3: `helm install elasticsearch ./helm-charts/elasticsearch --set imageTag=7.16.2`
- with Helm 2 (deprecated): `helm install --name elasticsearch ./helm-charts/elasticsearch --set imageTag=7.16.2`
## Upgrading
Please always check [CHANGELOG.md][] and [BREAKING_CHANGES.md][] before
upgrading to a new chart version.
## Usage notes
* This repo includes a number of [examples][] configurations which can be used
as a reference. They are also used in the automated testing of this chart.
* Automated testing of this chart is currently only run against GKE (Google
Kubernetes Engine).
* The chart deploys a StatefulSet and by default will do an automated rolling
update of your cluster. It does this by waiting for the cluster health to become
green after each instance is updated. If you prefer to update manually you can
set `OnDelete` [updateStrategy][].
* It is important to verify that the JVM heap size in `esJavaOpts` and to set
the CPU/Memory `resources` to something suitable for your cluster.
* To simplify chart and maintenance each set of node groups is deployed as a
separate Helm release. Take a look at the [multi][] example to get an idea for
how this works. Without doing this it isn't possible to resize persistent
volumes in a StatefulSet. By setting it up this way it makes it possible to add
more nodes with a new storage size then drain the old ones. It also solves the
problem of allowing the user to determine which node groups to update first when
doing upgrades or changes.
* We have designed this chart to be very un-opinionated about how to configure
Elasticsearch. It exposes ways to set environment variables and mount secrets
inside of the container. Doing this makes it much easier for this chart to
support multiple versions with minimal changes.
## Configuration
| Parameter | Description | Default |
|------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------|
| `antiAffinityTopologyKey` | The [anti-affinity][] topology key. By default this will prevent multiple Elasticsearch nodes from running on the same Kubernetes node | `kubernetes.io/hostname` |
| `antiAffinity` | Setting this to hard enforces the [anti-affinity][] rules. If it is set to soft it will be done "best effort". Other values will be ignored | `hard` |
| `clusterHealthCheckParams` | The [Elasticsearch cluster health status params][] that will be used by readiness [probe][] command | `wait_for_status=green&timeout=1s` |
| `clusterName` | This will be used as the Elasticsearch [cluster.name][] and should be unique per cluster in the namespace | `elasticsearch` |
| `clusterDeprecationIndexing` | Enable or disable deprecation logs to be indexed (should be disabled when deploying master only node groups) | `false` |
| `enableServiceLinks` | Set to false to disabling service links, which can cause slow pod startup times when there are many services in the current namespace. | `true` |
| `envFrom` | Templatable string to be passed to the [environment from variables][] which will be appended to the `envFrom:` definition for the container | `[]` |
| `esConfig` | Allows you to add any config files in `/usr/share/elasticsearch/config/` such as `elasticsearch.yml` and `log4j2.properties`. See [values.yaml][] for an example of the formatting | `{}` |
| `esJavaOpts` | [Java options][] for Elasticsearch. This is where you could configure the [jvm heap size][] | `""` |
| `esMajorVersion` | Deprecated. Instead, use the version of the chart corresponding to your ES minor version. Used to set major version specific configuration. If you are using a custom image and not running the default Elasticsearch version you will need to set this to the version you are running (e.g. `esMajorVersion: 6`) | `""` |
| `extraContainers` | Templatable string of additional `containers` to be passed to the `tpl` function | `""` |
| `extraEnvs` | Extra [environment variables][] which will be appended to the `env:` definition for the container | `[]` |
| `extraInitContainers` | Templatable string of additional `initContainers` to be passed to the `tpl` function | `""` |
| `extraVolumeMounts` | Templatable string of additional `volumeMounts` to be passed to the `tpl` function | `""` |
| `extraVolumes` | Templatable string of additional `volumes` to be passed to the `tpl` function | `""` |
| `fullnameOverride` | Overrides the `clusterName` and `nodeGroup` when used in the naming of resources. This should only be used when using a single `nodeGroup`, otherwise you will have name conflicts | `""` |
| `healthNameOverride` | Overrides `test-elasticsearch-health` pod name | `""` |
| `hostAliases` | Configurable [hostAliases][] | `[]` |
| `httpPort` | The http port that Kubernetes will use for the healthchecks and the service. If you change this you will also need to set [http.port][] in `extraEnvs` | `9200` |
| `imagePullPolicy` | The Kubernetes [imagePullPolicy][] value | `IfNotPresent` |
| `imagePullSecrets` | Configuration for [imagePullSecrets][] so that you can use a private registry for your image | `[]` |
| `imageTag` | The Elasticsearch Docker image tag | `7.16.2` |
| `image` | The Elasticsearch Docker image | `docker.elastic.co/elasticsearch/elasticsearch` |
| `ingress` | Configurable [ingress][] to expose the Elasticsearch service. See [values.yaml][] for an example | see [values.yaml][] |
| `initResources` | Allows you to set the [resources][] for the `initContainer` in the StatefulSet | `{}` |
| `keystore` | Allows you map Kubernetes secrets into the keystore. See the [config example][] and [how to use the keystore][] | `[]` |
| `labels` | Configurable [labels][] applied to all Elasticsearch pods | `{}` |
| `lifecycle` | Allows you to add [lifecycle hooks][]. See [values.yaml][] for an example of the formatting | `{}` |
| `masterService` | The service name used to connect to the masters. You only need to set this if your master `nodeGroup` is set to something other than `master`. See [Clustering and Node Discovery][] for more information | `""` |
| `maxUnavailable` | The [maxUnavailable][] value for the pod disruption budget. By default this will prevent Kubernetes from having more than 1 unhealthy pod in the node group | `1` |
| `minimumMasterNodes` | The value for [discovery.zen.minimum_master_nodes][]. Should be set to `(master_eligible_nodes / 2) + 1`. Ignored in Elasticsearch versions >= 7 | `2` |
| `nameOverride` | Overrides the `clusterName` when used in the naming of resources | `""` |
| `networkHost` | Value for the [network.host Elasticsearch setting][] | `0.0.0.0` |
| `networkPolicy` | The [NetworkPolicy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) to set. See [`values.yaml`](./values.yaml) for an example | `{http.enabled: false,transport.enabled: false}` |
| `nodeAffinity` | Value for the [node affinity settings][] | `{}` |
| `nodeGroup` | This is the name that will be used for each group of nodes in the cluster. The name will be `clusterName-nodeGroup-X` , `nameOverride-nodeGroup-X` if a `nameOverride` is specified, and `fullnameOverride-X` if a `fullnameOverride` is specified | `master` |
| `nodeSelector` | Configurable [nodeSelector][] so that you can target specific nodes for your Elasticsearch cluster | `{}` |
| `persistence` | Enables a persistent volume for Elasticsearch data. Can be disabled for nodes that only have [roles][] which don't require persistent data | see [values.yaml][] |
| `podAnnotations` | Configurable [annotations][] applied to all Elasticsearch pods | `{}` |
| `podManagementPolicy` | By default Kubernetes [deploys StatefulSets serially][]. This deploys them in parallel so that they can discover each other | `Parallel` |
| `podSecurityContext` | Allows you to set the [securityContext][] for the pod | see [values.yaml][] |
| `podSecurityPolicy` | Configuration for create a pod security policy with minimal permissions to run this Helm chart with `create: true`. Also can be used to reference an external pod security policy with `name: "externalPodSecurityPolicy"` | see [values.yaml][] |
| `priorityClassName` | The name of the [PriorityClass][]. No default is supplied as the PriorityClass must be created first | `""` |
| `protocol` | The protocol that will be used for the readiness [probe][]. Change this to `https` if you have `xpack.security.http.ssl.enabled` set | `http` |
| `rbac` | Configuration for creating a role, role binding and ServiceAccount as part of this Helm chart with `create: true`. Also can be used to reference an external ServiceAccount with `serviceAccountName: "externalServiceAccountName"`, or automount the service account token | see [values.yaml][] |
| `readinessProbe` | Configuration fields for the readiness [probe][] | see [values.yaml][] |
| `replicas` | Kubernetes replica count for the StatefulSet (i.e. how many pods) | `3` |
| `resources` | Allows you to set the [resources][] for the StatefulSet | see [values.yaml][] |
| `roles` | A hash map with the specific [roles][] for the `nodeGroup` | see [values.yaml][] |
| `schedulerName` | Name of the [alternate scheduler][] | `""` |
| `secretMounts` | Allows you easily mount a secret as a file inside the StatefulSet. Useful for mounting certificates and other secrets. See [values.yaml][] for an example | `[]` |
| `securityContext` | Allows you to set the [securityContext][] for the container | see [values.yaml][] |
| `service.annotations` | [LoadBalancer annotations][] that Kubernetes will use for the service. This will configure load balancer if `service.type` is `LoadBalancer` | `{}` |
| `service.enabled` | Enable non-headless service | `true` |
| `service.externalTrafficPolicy` | Some cloud providers allow you to specify the [LoadBalancer externalTrafficPolicy][]. Kubernetes will use this to preserve the client source IP. This will configure load balancer if `service.type` is `LoadBalancer` | `""` |
| `service.httpPortName` | The name of the http port within the service | `http` |
| `service.labelsHeadless` | Labels to be added to headless service | `{}` |
| `service.labels` | Labels to be added to non-headless service | `{}` |
| `service.loadBalancerIP` | Some cloud providers allow you to specify the [loadBalancer][] IP. If the `loadBalancerIP` field is not specified, the IP is dynamically assigned. If you specify a `loadBalancerIP` but your cloud provider does not support the feature, it is ignored. | `""` |
| `service.loadBalancerSourceRanges` | The IP ranges that are allowed to access | `[]` |
| `service.nodePort` | Custom [nodePort][] port that can be set if you are using `service.type: nodePort` | `""` |
| `service.transportPortName` | The name of the transport port within the service | `transport` |
| `service.type` | Elasticsearch [Service Types][] | `ClusterIP` |
| `sysctlInitContainer` | Allows you to disable the `sysctlInitContainer` if you are setting [sysctl vm.max_map_count][] with another method | `enabled: true` |
| `sysctlVmMaxMapCount` | Sets the [sysctl vm.max_map_count][] needed for Elasticsearch | `262144` |
| `terminationGracePeriod` | The [terminationGracePeriod][] in seconds used when trying to stop the pod | `120` |
| `tests.enabled` | Enable creating test related resources when running `helm template` or `helm test` | `true` |
| `tolerations` | Configurable [tolerations][] | `[]` |
| `transportPort` | The transport port that Kubernetes will use for the service. If you change this you will also need to set [transport port configuration][] in `extraEnvs` | `9300` |
| `updateStrategy` | The [updateStrategy][] for the StatefulSet. By default Kubernetes will wait for the cluster to be green after upgrading each pod. Setting this to `OnDelete` will allow you to manually delete each pod during upgrades | `RollingUpdate` |
| `volumeClaimTemplate` | Configuration for the [volumeClaimTemplate for StatefulSets][]. You will want to adjust the storage (default `30Gi` ) and the `storageClassName` if you are using a different storage class | see [values.yaml][] |
### Deprecated
| Parameter | Description | Default |
|-----------|---------------------------------------------------------------------------------------------------------------|---------|
| `fsGroup` | The Group ID (GID) for [securityContext][] so that the Elasticsearch user can read from the persistent volume | `""` |
## FAQ
### How to deploy this chart on a specific K8S distribution?
This chart is designed to run on production scale Kubernetes clusters with
multiple nodes, lots of memory and persistent storage. For that reason it can be
a bit tricky to run them against local Kubernetes environments such as
[Minikube][].
This chart is highly tested with [GKE][], but some K8S distribution also
requires specific configurations.
We provide examples of configuration for the following K8S providers:
- [Docker for Mac][]
- [KIND][]
- [Minikube][]
- [MicroK8S][]
- [OpenShift][]
### How to deploy dedicated nodes types?
All the Elasticsearch pods deployed share the same configuration. If you need to
deploy dedicated [nodes types][] (for example dedicated master and data nodes),
you can deploy multiple releases of this chart with different configurations
while they share the same `clusterName` value.
For each Helm release, the nodes types can then be defined using `roles` value.
An example of Elasticsearch cluster using 2 different Helm releases for master
and data nodes can be found in [examples/multi][].
#### Clustering and Node Discovery
This chart facilitates Elasticsearch node discovery and services by creating two
`Service` definitions in Kubernetes, one with the name `$clusterName-$nodeGroup`
and another named `$clusterName-$nodeGroup-headless`.
Only `Ready` pods are a part of the `$clusterName-$nodeGroup` service, while all
pods ( `Ready` or not) are a part of `$clusterName-$nodeGroup-headless`.
If your group of master nodes has the default `nodeGroup: master` then you can
just add new groups of nodes with a different `nodeGroup` and they will
automatically discover the correct master. If your master nodes have a different
`nodeGroup` name then you will need to set `masterService` to
`$clusterName-$masterNodeGroup`.
The chart value for `masterService` is used to populate
`discovery.zen.ping.unicast.hosts` , which Elasticsearch nodes will use to
contact master nodes and form a cluster.
Therefore, to add a group of nodes to an existing cluster, setting
`masterService` to the desired `Service` name of the related cluster is
sufficient.
### How to deploy clusters with security (authentication and TLS) enabled?
This Helm chart can use existing [Kubernetes secrets][] to setup
credentials or certificates for examples. These secrets should be created
outside of this chart and accessed using [environment variables][] and volumes.
An example of Elasticsearch cluster using security can be found in
[examples/security][].
### How to migrate from helm/charts stable chart?
If you currently have a cluster deployed with the [helm/charts stable][] chart
you can follow the [migration guide][].
### How to install plugins?
The recommended way to install plugins into our Docker images is to create a
[custom Docker image][].
The Dockerfile would look something like:
```
ARG elasticsearch_version
FROM docker.elastic.co/elasticsearch/elasticsearch:${elasticsearch_version}
RUN bin/elasticsearch-plugin install --batch repository-gcs
```
And then updating the `image` in values to point to your custom image.
There are a couple reasons we recommend this.
1. Tying the availability of Elasticsearch to the download service to install
plugins is not a great idea or something that we recommend. Especially in
Kubernetes where it is normal and expected for a container to be moved to
another host at random times.
2. Mutating the state of a running Docker image (by installing plugins) goes
against best practices of containers and immutable infrastructure.
### How to use the keystore?
#### Basic example
Create the secret, the key name needs to be the keystore key path. In this
example we will create a secret from a file and from a literal string.
```
kubectl create secret generic encryption-key --from-file=xpack.watcher.encryption_key=./watcher_encryption_key
kubectl create secret generic slack-hook --from-literal=xpack.notification.slack.account.monitoring.secure_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
```
To add these secrets to the keystore:
```
keystore:
- secretName: encryption-key
- secretName: slack-hook
```
#### Multiple keys
All keys in the secret will be added to the keystore. To create the previous
example in one secret you could also do:
```
kubectl create secret generic keystore-secrets --from-file=xpack.watcher.encryption_key=./watcher_encryption_key --from-literal=xpack.notification.slack.account.monitoring.secure_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
```
```
keystore:
- secretName: keystore-secrets
```
#### Custom paths and keys
If you are using these secrets for other applications (besides the Elasticsearch
keystore) then it is also possible to specify the keystore path and which keys
you want to add. Everything specified under each `keystore` item will be passed
through to the `volumeMounts` section for mounting the [secret][]. In this
example we will only add the `slack_hook` key from a secret that also has other
keys. Our secret looks like this:
```
kubectl create secret generic slack-secrets --from-literal=slack_channel='#general' --from-literal=slack_hook='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
```
We only want to add the `slack_hook` key to the keystore at path
`xpack.notification.slack.account.monitoring.secure_url`:
```
keystore:
- secretName: slack-secrets
items:
- key: slack_hook
path: xpack.notification.slack.account.monitoring.secure_url
```
You can also take a look at the [config example][] which is used as part of the
automated testing pipeline.
### How to enable snapshotting?
1. Install your [snapshot plugin][] into a custom Docker image following the
[how to install plugins guide][].
2. Add any required secrets or credentials into an Elasticsearch keystore
following the [how to use the keystore][] guide.
3. Configure the [snapshot repository][] as you normally would.
4. To automate snapshots you can use [Snapshot Lifecycle Management][] or a tool
like [curator][].
### How to configure templates post-deployment?
You can use `postStart` [lifecycle hooks][] to run code triggered after a
container is created.
Here is an example of `postStart` hook to configure templates:
```yaml
lifecycle:
postStart:
exec:
command:
- bash
- -c
- |
#!/bin/bash
# Add a template to adjust number of shards/replicas
TEMPLATE_NAME=my_template
INDEX_PATTERN="logstash-*"
SHARD_COUNT=8
REPLICA_COUNT=1
ES_URL=http://localhost:9200
while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
curl -XPUT "$ES_URL/_template/$TEMPLATE_NAME" -H 'Content-Type: application/json' -d'{"index_patterns":['\""$INDEX_PATTERN"\"'],"settings":{"number_of_shards":'$SHARD_COUNT',"number_of_replicas":'$REPLICA_COUNT'}}'
```
## Contributing
Please check [CONTRIBUTING.md][] before any contribution or for any questions
about our development and testing process.
[7.16]: https://github.com/elastic/helm-charts/releases
[#63]: https://github.com/elastic/helm-charts/issues/63
[BREAKING_CHANGES.md]: https://github.com/elastic/helm-charts/blob/master/BREAKING_CHANGES.md
[CHANGELOG.md]: https://github.com/elastic/helm-charts/blob/master/CHANGELOG.md
[CONTRIBUTING.md]: https://github.com/elastic/helm-charts/blob/master/CONTRIBUTING.md
[alternate scheduler]: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/#specify-schedulers-for-pods
[annotations]: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/
[anti-affinity]: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
[cluster.name]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/important-settings.html#cluster-name
[clustering and node discovery]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/README.md#clustering-and-node-discovery
[config example]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/config/values.yaml
[curator]: https://www.elastic.co/guide/en/elasticsearch/client/curator/7.9/snapshot.html
[custom docker image]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/docker.html#_c_customized_image
[deploys statefulsets serially]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-management-policies
[discovery.zen.minimum_master_nodes]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/discovery-settings.html#minimum_master_nodes
[docker for mac]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/docker-for-mac
[elasticsearch cluster health status params]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/cluster-health.html#request-params
[elasticsearch docker image]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/docker.html
[environment variables]: https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/#using-environment-variables-inside-of-your-config
[environment from variables]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#configure-all-key-value-pairs-in-a-configmap-as-container-environment-variables
[examples]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/
[examples/multi]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/multi
[examples/security]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/security
[gke]: https://cloud.google.com/kubernetes-engine
[helm]: https://helm.sh
[helm/charts stable]: https://github.com/helm/charts/tree/master/stable/elasticsearch/
[how to install plugins guide]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/README.md#how-to-install-plugins
[how to use the keystore]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/README.md#how-to-use-the-keystore
[http.port]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-http.html#_settings
[imagePullPolicy]: https://kubernetes.io/docs/concepts/containers/images/#updating-images
[imagePullSecrets]: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/#create-a-pod-that-uses-your-secret
[ingress]: https://kubernetes.io/docs/concepts/services-networking/ingress/
[java options]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/jvm-options.html
[jvm heap size]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/heap-size.html
[hostAliases]: https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
[kind]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/kubernetes-kind
[kubernetes secrets]: https://kubernetes.io/docs/concepts/configuration/secret/
[labels]: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
[lifecycle hooks]: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
[loadBalancer annotations]: https://kubernetes.io/docs/concepts/services-networking/service/#ssl-support-on-aws
[loadBalancer externalTrafficPolicy]: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
[loadBalancer]: https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer
[maxUnavailable]: https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
[migration guide]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/migration/README.md
[minikube]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/minikube
[microk8s]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/microk8s
[multi]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/multi/
[network.host elasticsearch setting]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/network.host.html
[node affinity settings]: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
[node-certificates]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/configuring-tls.html#node-certificates
[nodePort]: https://kubernetes.io/docs/concepts/services-networking/service/#nodeport
[nodes types]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-node.html
[nodeSelector]: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
[openshift]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/openshift
[priorityClass]: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
[probe]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
[resources]: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
[roles]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-node.html
[secret]: https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets
[securityContext]: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
[service types]: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
[snapshot lifecycle management]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/snapshot-lifecycle-management.html
[snapshot plugin]: https://www.elastic.co/guide/en/elasticsearch/plugins/7.16/repository.html
[snapshot repository]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-snapshots.html
[supported configurations]: https://github.com/elastic/helm-charts/tree/7.16/README.md#supported-configurations
[sysctl vm.max_map_count]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/vm-max-map-count.html#vm-max-map-count
[terminationGracePeriod]: https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
[tolerations]: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
[transport port configuration]: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/modules-transport.html#_transport_settings
[updateStrategy]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
[values.yaml]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/values.yaml
[volumeClaimTemplate for statefulsets]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#stable-storage

View File

@@ -0,0 +1,21 @@
default: test
include ../../../helpers/examples.mk
RELEASE := helm-es-config
TIMEOUT := 1200s
install:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../
secrets:
kubectl delete secret elastic-config-credentials elastic-config-secret elastic-config-slack elastic-config-custom-path || true
kubectl create secret generic elastic-config-credentials --from-literal=password=changeme --from-literal=username=elastic
kubectl create secret generic elastic-config-slack --from-literal=xpack.notification.slack.account.monitoring.secure_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd'
kubectl create secret generic elastic-config-secret --from-file=xpack.watcher.encryption_key=./watcher_encryption_key
kubectl create secret generic elastic-config-custom-path --from-literal=slack_url='https://hooks.slack.com/services/asdasdasd/asdasdas/asdasd' --from-literal=thing_i_don_tcare_about=test
test: secrets install goss
purge:
helm del $(RELEASE)

View File

@@ -0,0 +1,27 @@
# Config
This example deploy a single node Elasticsearch 7.16.2 with authentication and
custom [values][].
## Usage
* Create the required secrets: `make secrets`
* Deploy Elasticsearch chart with the default values: `make install`
* You can now setup a port forward to query Elasticsearch API:
```
kubectl port-forward svc/config-master 9200
curl -u elastic:changeme http://localhost:9200/_cat/indices
```
## Testing
You can also run [goss integration tests][] using `make test`
[goss integration tests]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/config/test/goss.yaml
[values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/config/values.yaml

View File

@@ -0,0 +1,29 @@
http:
http://localhost:9200/_cluster/health:
status: 200
timeout: 2000
username: elastic
password: "{{ .Env.ELASTIC_PASSWORD }}"
body:
- "green"
- '"number_of_nodes":1'
- '"number_of_data_nodes":1'
http://localhost:9200:
status: 200
timeout: 2000
username: elastic
password: "{{ .Env.ELASTIC_PASSWORD }}"
body:
- '"cluster_name" : "config"'
- "You Know, for Search"
command:
"elasticsearch-keystore list":
exit-status: 0
stdout:
- keystore.seed
- bootstrap.password
- xpack.notification.slack.account.monitoring.secure_url
- xpack.notification.slack.account.otheraccount.secure_url
- xpack.watcher.encryption_key

View File

@@ -0,0 +1,27 @@
---
clusterName: "config"
replicas: 1
extraEnvs:
- name: ELASTIC_PASSWORD
valueFrom:
secretKeyRef:
name: elastic-config-credentials
key: password
# This is just a dummy file to make sure that
# the keystore can be mounted at the same time
# as a custom elasticsearch.yml
esConfig:
elasticsearch.yml: |
xpack.security.enabled: true
path.data: /usr/share/elasticsearch/data
keystore:
- secretName: elastic-config-secret
- secretName: elastic-config-slack
- secretName: elastic-config-custom-path
items:
- key: slack_url
path: xpack.notification.slack.account.otheraccount.secure_url

View File

@@ -0,0 +1 @@
supersecret

View File

@@ -0,0 +1,14 @@
default: test
include ../../../helpers/examples.mk
RELEASE := helm-es-default
TIMEOUT := 1200s
install:
helm upgrade --wait --timeout=$(TIMEOUT) --install $(RELEASE) ../../
test: install goss
purge:
helm del $(RELEASE)

View File

@@ -0,0 +1,25 @@
# Default
This example deploy a 3 nodes Elasticsearch 7.16.2 cluster using
[default values][].
## Usage
* Deploy Elasticsearch chart with the default values: `make install`
* You can now setup a port forward to query Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
## Testing
You can also run [goss integration tests][] using `make test`
[goss integration tests]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/default/test/goss.yaml
[default values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/values.yaml

View File

@@ -0,0 +1,19 @@
#!/usr/bin/env bash -x
kubectl proxy || true &
make &
PROC_ID=$!
while kill -0 "$PROC_ID" >/dev/null 2>&1; do
echo "PROCESS IS RUNNING"
if curl --fail 'http://localhost:8001/api/v1/proxy/namespaces/default/services/elasticsearch-master:9200/_search' ; then
echo "cluster is healthy"
else
echo "cluster not healthy!"
exit 1
fi
sleep 1
done
echo "PROCESS TERMINATED"
exit 0

View File

@@ -0,0 +1,38 @@
kernel-param:
vm.max_map_count:
value: "262144"
http:
http://elasticsearch-master:9200/_cluster/health:
status: 200
timeout: 2000
body:
- "green"
- '"number_of_nodes":3'
- '"number_of_data_nodes":3'
http://localhost:9200:
status: 200
timeout: 2000
body:
- '"number" : "7.16.2"'
- '"cluster_name" : "elasticsearch"'
- "You Know, for Search"
file:
/usr/share/elasticsearch/data:
exists: true
mode: "2775"
owner: root
group: elasticsearch
filetype: directory
mount:
/usr/share/elasticsearch/data:
exists: true
user:
elasticsearch:
exists: true
uid: 1000
gid: 1000

View File

@@ -0,0 +1,13 @@
default: test
RELEASE := helm-es-docker-for-mac
TIMEOUT := 1200s
install:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../
test: install
helm test $(RELEASE)
purge:
helm del $(RELEASE)

View File

@@ -0,0 +1,23 @@
# Docker for Mac
This example deploy a 3 nodes Elasticsearch 7.16.2 cluster on [Docker for Mac][]
using [custom values][].
Note that this configuration should be used for test only and isn't recommended
for production.
## Usage
* Deploy Elasticsearch chart with the default values: `make install`
* You can now setup a port forward to query Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
[custom values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/docker-for-mac/values.yaml
[docker for mac]: https://docs.docker.com/docker-for-mac/kubernetes/

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"
# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"
# Allocate smaller chunks of memory per pod.
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "512M"
# Request smaller persistent volumes.
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "hostpath"
resources:
requests:
storage: 100M

View File

@@ -0,0 +1,17 @@
default: test
RELEASE := helm-es-kind
TIMEOUT := 1200s
install:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../
install-local-path:
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
helm upgrade --wait --timeout=$(TIMEOUT) --install --values values-local-path.yaml $(RELEASE) ../../
test: install
helm test $(RELEASE)
purge:
helm del $(RELEASE)

View File

@@ -0,0 +1,36 @@
# KIND
This example deploy a 3 nodes Elasticsearch 7.16.2 cluster on [Kind][]
using [custom values][].
Note that this configuration should be used for test only and isn't recommended
for production.
Note that Kind < 0.7.0 are affected by a [kind issue][] with mount points
created from PVCs not writable by non-root users. [kubernetes-sigs/kind#1157][]
fix it in Kind 0.7.0.
The workaround for Kind < 0.7.0 is to install manually
[Rancher Local Path Provisioner][] and use `local-path` storage class for
Elasticsearch volumes (see [Makefile][] instructions).
## Usage
* For Kind >= 0.7.0: Deploy Elasticsearch chart with the default values: `make install`
* For Kind < 0.7.0: Deploy Elasticsearch chart with `local-path` storage class: `make install-local-path`
* You can now setup a port forward to query Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
[custom values]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/kubernetes-kind/values.yaml
[kind]: https://kind.sigs.k8s.io/
[kind issue]: https://github.com/kubernetes-sigs/kind/issues/830
[kubernetes-sigs/kind#1157]: https://github.com/kubernetes-sigs/kind/pull/1157
[rancher local path provisioner]: https://github.com/rancher/local-path-provisioner
[Makefile]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/kubernetes-kind/Makefile#L5

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"
# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"
# Allocate smaller chunks of memory per pod.
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "512M"
# Request smaller persistent volumes.
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "local-path"
resources:
requests:
storage: 100M

View File

@@ -0,0 +1,23 @@
---
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"
# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"
# Allocate smaller chunks of memory per pod.
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "512M"
# Request smaller persistent volumes.
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "local-path"
resources:
requests:
storage: 100M

View File

@@ -0,0 +1,13 @@
default: test
RELEASE := helm-es-microk8s
TIMEOUT := 1200s
install:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values values.yaml $(RELEASE) ../../
test: install
helm test $(RELEASE)
purge:
helm del $(RELEASE)

View File

@@ -0,0 +1,32 @@
# MicroK8S
This example deploy a 3 nodes Elasticsearch 7.16.2 cluster on [MicroK8S][]
using [custom values][].
Note that this configuration should be used for test only and isn't recommended
for production.
## Requirements
The following MicroK8S [addons][] need to be enabled:
- `dns`
- `helm`
- `storage`
## Usage
* Deploy Elasticsearch chart with the default values: `make install`
* You can now setup a port forward to query Elasticsearch API:
```
kubectl port-forward svc/elasticsearch-master 9200
curl localhost:9200/_cat/indices
```
[addons]: https://microk8s.io/docs/addons
[custom values]: https://github.com/elastic/helm-charts/tree/7.16/elasticsearch/examples/microk8s/values.yaml
[MicroK8S]: https://microk8s.io

View File

@@ -0,0 +1,32 @@
---
# Disable privileged init Container creation.
sysctlInitContainer:
enabled: false
# Restrict the use of the memory-mapping when sysctlInitContainer is disabled.
esConfig:
elasticsearch.yml: |
node.store.allow_mmap: false
# Permit co-located instances for solitary minikube virtual machines.
antiAffinity: "soft"
# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"
# Allocate smaller chunks of memory per pod.
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "512M"
# Request smaller persistent volumes.
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "microk8s-hostpath"
resources:
requests:
storage: 100M

View File

@@ -0,0 +1,10 @@
PREFIX := helm-es-migration
data:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values data.yaml $(PREFIX)-data ../../
master:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values master.yaml $(PREFIX)-master ../../
client:
helm upgrade --wait --timeout=$(TIMEOUT) --install --values client.yaml $(PREFIX)-client ../../

View File

@@ -0,0 +1,167 @@
# Migration Guide from helm/charts
There are two viable options for migrating from the community Elasticsearch Helm
chart from the [helm/charts][] repo.
1. Restoring from Snapshot to a fresh cluster
2. Live migration by joining a new cluster to the existing cluster.
## Restoring from Snapshot
This is the recommended and preferred option. The downside is that it will
involve a period of write downtime during the migration. If you have a way to
temporarily stop writes to your cluster then this is the way to go. This is also
a lot simpler as it just involves launching a fresh cluster and restoring a
snapshot following the [restoring to a different cluster guide][].
## Live migration
If restoring from a snapshot is not possible due to the write downtime then a
live migration is also possible. It is very important to first test this in a
testing environment to make sure you are comfortable with the process and fully
understand what is happening.
This process will involve joining a new set of master, data and client nodes to
an existing cluster that has been deployed using the [helm/charts][] community
chart. Nodes will then be replaced one by one in a controlled fashion to
decommission the old cluster.
This example will be using the default values for the existing helm/charts
release and for the Elastic helm-charts release. If you have changed any of the
default values then you will need to first make sure that your values are
configured in a compatible way before starting the migration.
The process will involve a re-sync and a rolling restart of all of your data
nodes. Therefore it is important to disable shard allocation and perform a synced
flush like you normally would during any other rolling upgrade. See the
[rolling upgrades guide][] for more information.
* The default image for this chart is
`docker.elastic.co/elasticsearch/elasticsearch` which contains the default
distribution of Elasticsearch with a [basic license][]. Make sure to update the
`image` and `imageTag` values to the correct Docker image and Elasticsearch
version that you currently have deployed.
* Convert your current helm/charts configuration into something that is
compatible with this chart.
* Take a fresh snapshot of your cluster. If something goes wrong you want to be
able to restore your data no matter what.
* Check that your clusters health is green. If not abort and make sure your
cluster is healthy before continuing:
```
curl localhost:9200/_cluster/health
```
* Deploy new data nodes which will join the existing cluster. Take a look at the
configuration in [data.yaml][]:
```
make data
```
* Check that the new nodes have joined the cluster (run this and any other curl
commands from within one of your pods):
```
curl localhost:9200/_cat/nodes
```
* Check that your cluster is still green. If so we can now start to scale down
the existing data nodes. Assuming you have the default amount of data nodes (2)
we now want to scale it down to 1:
```
kubectl scale statefulsets my-release-elasticsearch-data --replicas=1
```
* Wait for your cluster to become green again:
```
watch 'curl -s localhost:9200/_cluster/health'
```
* Once the cluster is green we can scale down again:
```
kubectl scale statefulsets my-release-elasticsearch-data --replicas=0
```
* Wait for the cluster to be green again.
* OK. We now have all data nodes running in the new cluster. Time to replace the
masters by firstly scaling down the masters from 3 to 2. Between each step make
sure to wait for the cluster to become green again, and check with
`curl localhost:9200/_cat/nodes` that you see the correct amount of master
nodes. During this process we will always make sure to keep at least 2 master
nodes as to not lose quorum:
```
kubectl scale statefulsets my-release-elasticsearch-master --replicas=2
```
* Now deploy a single new master so that we have 3 masters again. See
[master.yaml][] for the configuration:
```
make master
```
* Scale down old masters to 1:
```
kubectl scale statefulsets my-release-elasticsearch-master --replicas=1
```
* Edit the masters in [masters.yaml][] to 2 and redeploy:
```
make master
```
* Scale down the old masters to 0:
```
kubectl scale statefulsets my-release-elasticsearch-master --replicas=0
```
* Edit the [masters.yaml][] to have 3 replicas and remove the
`discovery.zen.ping.unicast.hosts` entry from `extraEnvs` then redeploy the
masters. This will make sure all 3 masters are running in the new cluster and
are pointing at each other for discovery:
```
make master
```
* Remove the `discovery.zen.ping.unicast.hosts` entry from `extraEnvs` then
redeploy the data nodes to make sure they are pointing at the new masters:
```
make data
```
* Deploy the client nodes:
```
make client
```
* Update any processes that are talking to the existing client nodes and point
them to the new client nodes. Once this is done you can scale down the old
client nodes:
```
kubectl scale deployment my-release-elasticsearch-client --replicas=0
```
* The migration should now be complete. After verifying that everything is
working correctly you can cleanup leftover resources from your old cluster.
[basic license]: https://www.elastic.co/subscriptions
[data.yaml]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/migration/data.yaml
[helm/charts]: https://github.com/helm/charts/tree/7.16/stable/elasticsearch
[master.yaml]: https://github.com/elastic/helm-charts/blob/7.16/elasticsearch/examples/migration/master.yaml
[restoring to a different cluster guide]: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/modules-snapshots.html#_restoring_to_a_different_cluster
[rolling upgrades guide]: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/rolling-upgrades.html

View File

@@ -0,0 +1,23 @@
---
replicas: 2
clusterName: "elasticsearch"
nodeGroup: "client"
esMajorVersion: 6
roles:
master: "false"
ingest: "false"
data: "false"
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "standard"
resources:
requests:
storage: 1Gi # Currently needed till pvcs are made optional
persistence:
enabled: false

View File

@@ -0,0 +1,17 @@
---
replicas: 2
esMajorVersion: 6
extraEnvs:
- name: discovery.zen.ping.unicast.hosts
value: "my-release-elasticsearch-discovery"
clusterName: "elasticsearch"
nodeGroup: "data"
roles:
master: "false"
ingest: "false"
data: "true"

View File

@@ -0,0 +1,26 @@
---
# Temporarily set to 3 so we can scale up/down the old a new cluster
# one at a time whilst always keeping 3 masters running
replicas: 1
esMajorVersion: 6
extraEnvs:
- name: discovery.zen.ping.unicast.hosts
value: "my-release-elasticsearch-discovery"
clusterName: "elasticsearch"
nodeGroup: "master"
roles:
master: "true"
ingest: "false"
data: "false"
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "standard"
resources:
requests:
storage: 4Gi

Some files were not shown because too many files have changed in this diff Show More