Compare commits

...

7 Commits

Author SHA1 Message Date
Valeriano Manassero
78ba93a0df 210 wrong usage of extra python packages environment variable (#212)
* Fixed: env var name reference

* Changed: version bump
2023-05-29 12:55:00 +02:00
Valeriano Manassero
1b6b3dce94 Changed: slack channel reference (#209) 2023-05-18 11:12:03 +03:00
Valeriano Manassero
1ba6440c58 [Serving] Fix resources setting (#208)
* Fixed: resources

* Changed: bump up version

* Fixed: indentation

* Fixed: indentation
2023-05-11 16:35:40 +02:00
Valeriano Manassero
5b31ea8599 Remove unsupported dynamic svc (#206)
* Removed: unsupported values

* Changed: version bump

* Changed: removed not needed value

* Changed: helm-docs

* Removed: unsupported values
2023-05-08 17:25:01 +02:00
Valeriano Manassero
876df432d4 Changed: version bump 2023-04-14 12:11:19 +02:00
Valeriano Manassero
bf755ed6b8 Changed: set replication for this scenario 2023-04-14 12:11:10 +02:00
Valeriano Manassero
9b6372d730 Fixed: redis svc name creation 2023-04-14 12:10:55 +02:00
16 changed files with 31 additions and 79 deletions

View File

@@ -4,7 +4,7 @@
Contribution comes in many forms:
* Reporting [issues](https://github.com/allegroai/clearml-helm-charts/issues) you've come upon
* Participating in issue discussions in the [issue tracker](https://github.com/allegroai/clearml-helm-charts/issues) and the [ClearML community slack space](https://join.slack.com/t/allegroai-trains/shared_invite/enQtOTQyMTI1MzQxMzE4LTY5NTUxOTY1NmQ1MzQ5MjRhMGRhZmM4ODE5NTNjMTg2NTBlZGQzZGVkMWU3ZDg1MGE1MjQxNDEzMWU2NmVjZmY)
* Participating in issue discussions in the [issue tracker](https://github.com/allegroai/clearml-helm-charts/issues) and the [ClearML community slack space](https://join.slack.com/t/clearml/shared_invite/zt-1v74jzwkn-~XsuWB0btXOlfFQCh8DJQw)
* Suggesting new features or enhancements
* Implementing new features or fixing outstanding issues
@@ -51,7 +51,7 @@ Enhancement suggestions are tracked as GitHub issues. After you determine which
Before you submit a new PR:
* Verify the work you plan to merge addresses an existing [issue](https://github.com/allegroai/clearml-helm-charts/issues) (If not, open a new one)
* Check related discussions in the [ClearML slack community](https://join.slack.com/t/allegroai-trains/shared_invite/enQtOTQyMTI1MzQxMzE4LTY5NTUxOTY1NmQ1MzQ5MjRhMGRhZmM4ODE5NTNjMTg2NTBlZGQzZGVkMWU3ZDg1MGE1MjQxNDEzMWU2NmVjZmY) (Or start your own discussion on the `#clearml-dev` channel)
* Check related discussions in the [ClearML slack community](https://join.slack.com/t/clearml/shared_invite/zt-1v74jzwkn-~XsuWB0btXOlfFQCh8DJQw) (Or start your own discussion on the `#clearml-dev` channel)
* Make sure your code conforms to the ClearML coding standards by running:
`flake8 --max-line-length=120 --statistics --show-source --extend-ignore=E501 ./clearml*`

View File

@@ -52,7 +52,7 @@ For installation instruction, follow related [Installation Guide](INSTALL.md).
See more information in the [official documentation](https://clear.ml/docs/latest/docs) and [on YouTube](https://www.youtube.com/c/ClearML).
If you have any questions, post on our [Slack Channel](https://join.slack.com/t/clearml/shared_invite/zt-c0t13pty-aVUZZW1TSSSg2vyIGVPBhg), or tag your questions on [stackoverflow](https://stackoverflow.com/questions/tagged/clearml) with '**[clearml](https://stackoverflow.com/questions/tagged/clearml)**' tag (*previously [trains](https://stackoverflow.com/questions/tagged/trains) tag*).
If you have any questions, post on our [Slack Channel](https://join.slack.com/t/clearml/shared_invite/zt-1v74jzwkn-~XsuWB0btXOlfFQCh8DJQw), or tag your questions on [stackoverflow](https://stackoverflow.com/questions/tagged/clearml) with '**[clearml](https://stackoverflow.com/questions/tagged/clearml)**' tag (*previously [trains](https://stackoverflow.com/questions/tagged/trains) tag*).
For feature requests or bug reports, please use [GitHub issues](https://github.com/allegroai/clearml-helm-charts/issues).

View File

@@ -2,7 +2,7 @@ apiVersion: v2
name: clearml-agent
description: MLOps platform Task running agent
type: application
version: "5.0.0"
version: "5.0.1"
appVersion: "1.24"
kubeVersion: ">= 1.21.0-0 < 1.28.0-0"
home: https://clear.ml

View File

@@ -1,6 +1,6 @@
# ClearML Kubernetes Agent
![Version: 5.0.0](https://img.shields.io/badge/Version-5.0.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
![Version: 5.0.1](https://img.shields.io/badge/Version-5.0.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
MLOps platform Task running agent
@@ -53,7 +53,7 @@ Kubernetes: `>= 1.21.0-0 < 1.28.0-0`
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentk8sglue | object | `{"additionalClusterRoleBindings":[],"additionalRoleBindings":[],"affinity":{},"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"affinity":{},"annotations":{},"containerSecurityContext":{},"env":[],"fileMounts":[],"hostAliases":[],"initContainers":[],"labels":{},"nodeSelector":{},"podSecurityContext":{},"priorityClassName":"","resources":{},"schedulerName":"","tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerCustomBashScript":"","containerSecurityContext":{},"customBashScript":"","debugMode":false,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"registry":"","repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"labels":{},"nodeSelector":{},"podSecurityContext":{},"queue":"default","replicaCount":1,"serviceExistingAccountName":"","taskAsJob":false,"tolerations":[],"volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue | object | `{"additionalClusterRoleBindings":[],"additionalRoleBindings":[],"affinity":{},"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"affinity":{},"annotations":{},"containerSecurityContext":{},"env":[],"fileMounts":[],"hostAliases":[],"initContainers":[],"labels":{},"nodeSelector":{},"podSecurityContext":{},"priorityClassName":"","resources":{},"schedulerName":"","tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerSecurityContext":{},"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"registry":"","repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"labels":{},"nodeSelector":{},"podSecurityContext":{},"queue":"default","replicaCount":1,"serviceExistingAccountName":"","tolerations":[],"volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue.additionalClusterRoleBindings | list | `[]` | additional existing ClusterRoleBindings |
| agentk8sglue.additionalRoleBindings | list | `[]` | additional existing RoleBindings |
| agentk8sglue.affinity | object | `{}` | affinity setup for Agent pod (example in values.yaml comments) |
@@ -77,10 +77,7 @@ Kubernetes: `>= 1.21.0-0 < 1.28.0-0`
| agentk8sglue.basePodTemplate.volumeMounts | list | `[]` | volume mounts definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.basePodTemplate.volumes | list | `[]` | volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.clearmlcheckCertificate | bool | `true` | Check certificates validity for evefry UrlReference below. |
| agentk8sglue.containerCustomBashScript | string | `""` | Custom Bash script for the Task Pods ran by Glue Agent |
| agentk8sglue.containerSecurityContext | object | `{}` | container securityContext setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.customBashScript | string | `""` | Custom Bash script for the Agent pod ran by Glue Agent |
| agentk8sglue.debugMode | bool | `false` | Enable Debugging logs for Agent pod |
| agentk8sglue.defaultContainerImage | string | `"ubuntu:18.04"` | default container image for ClearML Task pod |
| agentk8sglue.extraEnvs | list | `[]` | Extra Environment variables for Glue Agent |
| agentk8sglue.fileMounts | list | `[]` | file definition for Glue Agent (example in values.yaml comments) |
@@ -92,7 +89,6 @@ Kubernetes: `>= 1.21.0-0 < 1.28.0-0`
| agentk8sglue.queue | string | `"default"` | ClearML queue this agent will consume |
| agentk8sglue.replicaCount | int | `1` | Glue Agent number of pods |
| agentk8sglue.serviceExistingAccountName | string | `""` | if set, don't create a serviceAccountName but use defined existing one |
| agentk8sglue.taskAsJob | bool | `false` | ClearML spawn tasks as jobs instead of pods |
| agentk8sglue.tolerations | list | `[]` | tolerations setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.volumeMounts | list | `[]` | volume mounts definition for Glue Agent (example in values.yaml comments) |
| agentk8sglue.volumes | list | `[]` | volumes definition for Glue Agent (example in values.yaml comments) |
@@ -112,12 +108,10 @@ Kubernetes: `>= 1.21.0-0 < 1.28.0-0`
| imageCredentials.password | string | `"pwd"` | Registry password |
| imageCredentials.registry | string | `"docker.io"` | Registry name |
| imageCredentials.username | string | `"someone"` | Registry username |
| sessions | object | `{"dynamicSvcs":false,"externalIP":"0.0.0.0","maxServices":20,"portModeEnabled":false,"setInteractiveQueuesTag":true,"startingPort":30000,"svcAnnotations":{},"svcType":"NodePort"}` | Sessions internal service configuration |
| sessions.dynamicSvcs | bool | `false` | Enable/Disable dynamic svc for sessions pods |
| sessions | object | `{"externalIP":"0.0.0.0","maxServices":20,"portModeEnabled":false,"startingPort":30000,"svcAnnotations":{},"svcType":"NodePort"}` | Sessions internal service configuration |
| sessions.externalIP | string | `"0.0.0.0"` | External IP sessions clients can connect to |
| sessions.maxServices | int | `20` | maximum number of NodePorts exposed |
| sessions.portModeEnabled | bool | `false` | Enable/Disable sessions portmode WARNING: only one Agent deployment can have this set to true |
| sessions.setInteractiveQueuesTag | bool | `true` | set interactive queue tags |
| sessions.startingPort | int | `30000` | starting range of exposed NodePorts |
| sessions.svcAnnotations | object | `{}` | specific annotations for session services |
| sessions.svcType | string | `"NodePort"` | service type ("NodePort" or "ClusterIP" or "LoadBalancer") |

View File

@@ -97,12 +97,6 @@ spec:
--ports-mode --num-of-services {{ .Values.sessions.maxServices }} \
--base-port {{ .Values.sessions.startingPort }} \
--gateway-address {{ .Values.sessions.externalIP }}"
{{- if .Values.sessions.dynamicSvcs }}
- name: CLEARML_K8S_GLUE_POD_POST_APPLY_CMD
value: "kubectl -n {namespace} apply -f ~/template/services-{pod_number}.yaml ; kubectl -n {namespace} label svc clearml-session-{pod_number} service-for={pod_name}"
- name: CLEARML_K8S_GLUE_POD_POST_DELETE_CMD
value: "kubectl -n {namespace} delete svc -l service-for={pod_name}"
{{- end }}
{{- else}}
- name: K8S_GLUE_EXTRA_ARGS
value: "--namespace {{ .Release.Namespace }} --template-yaml /root/template/template.yaml"
@@ -111,10 +105,6 @@ spec:
- name: CLEARML_CONFIG_FILE
value: /root/clearml.conf
{{- end }}
- name: CLEARML_K8S_GLUE_LIMIT_POD_LABEL
value: "ai.allegro.agent.serial=pod-{pod_number}"
- name: CLEARML_K8S_SECRETS_LIST_FILE
value: /root/template/secrets.yaml
- name: K8S_DEFAULT_NAMESPACE
value: "{{ .Release.Namespace }}"
- name: CLEARML_API_ACCESS_KEY
@@ -135,31 +125,6 @@ spec:
value: ""
- name: CLEARML_DOCKER_IMAGE
value: "{{.Values.agentk8sglue.defaultContainerImage}}"
{{- if .Values.agentk8sglue.customBashScript }}
- name: CLEARML_K8S_GLUE_EXTRA_BASH_SCRIPT
value: "{{.Values.agentk8sglue.customBashScript}}"
{{- end }}
{{- if .Values.agentk8sglue.containerCustomBashScript }}
- name: CLEARML_K8S_GLUE_POD_BASH_SCRIPT
value: "{{.Values.agentk8sglue.containerCustomBashScript}}"
{{- end }}
{{- if .Values.agentk8sglue.debugMode }}
- name: "CLEARML_K8S_GLUE_DEBUG"
value: "1"
{{- end }}
{{- if .Values.sessions.portModeEnabled }}
{{- if .Values.sessions.setInteractiveQueuesTag }}
- name: "CLEARML_K8S_GLUE_SET_QUEUE_SYSTEM_TAGS"
value: "interactive"
{{- end }}
{{- end }}
{{- if .Values.agentk8sglue.taskAsJob }}
- name: "CLEARML_K8S_GLUE_KIND"
value: "job"
{{- else }}
- name: "CLEARML_K8S_GLUE_KIND"
value: "pod"
{{- end }}
- name: K8S_GLUE_QUEUE
value: {{ .Values.agentk8sglue.queue }}
{{- if .Values.agentk8sglue.extraEnvs }}

View File

@@ -1,5 +1,4 @@
{{- if .Values.sessions.portModeEnabled }}
{{- if not .Values.sessions.dynamicSvcs }}
{{- range untilStep 1 ( ( add .Values.sessions.maxServices 1 ) | int ) 1 }}
---
apiVersion: v1
@@ -29,4 +28,3 @@ spec:
ai.allegro.agent.serial: pod-{{ . }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -53,9 +53,6 @@ agentk8sglue:
# -- Check certificates validity for evefry UrlReference below.
clearmlcheckCertificate: true
# -- Enable Debugging logs for Agent pod
debugMode: false
# -- Reference to Api server url
apiServerUrlReference: "https://api.clear.ml"
# -- Reference to File server url
@@ -67,19 +64,12 @@ agentk8sglue:
defaultContainerImage: ubuntu:18.04
# -- ClearML queue this agent will consume
queue: default
# -- ClearML spawn tasks as jobs instead of pods
taskAsJob: false
# -- Custom Bash script for the Glue Agent
# -- labels setup for Agent pod (example in values.yaml comments)
labels: {}
# schedulerName: scheduler
# -- annotations setup for Agent pod (example in values.yaml comments)
annotations: {}
# key1: value1
# -- Custom Bash script for the Agent pod ran by Glue Agent
customBashScript: ""
# -- Custom Bash script for the Task Pods ran by Glue Agent
containerCustomBashScript: ""
# -- Extra Environment variables for Glue Agent
extraEnvs: []
# - name: PYTHONPATH
@@ -216,8 +206,6 @@ agentk8sglue:
sessions:
# -- Enable/Disable sessions portmode WARNING: only one Agent deployment can have this set to true
portModeEnabled: false
# -- Enable/Disable dynamic svc for sessions pods
dynamicSvcs: false
# -- specific annotations for session services
svcAnnotations: {}
# -- service type ("NodePort" or "ClusterIP" or "LoadBalancer")
@@ -228,5 +216,3 @@ sessions:
startingPort: 30000
# -- maximum number of NodePorts exposed
maxServices: 20
# -- set interactive queue tags
setInteractiveQueuesTag: true

View File

@@ -2,7 +2,7 @@ apiVersion: v2
name: clearml-serving
description: ClearML Serving Helm Chart
type: application
version: "1.0.1"
version: "1.0.3"
appVersion: "1.2.0"
kubeVersion: ">= 1.21.0-0 < 1.28.0-0"
home: https://clear.ml
@@ -33,5 +33,5 @@ dependencies:
condition: grafana.enabled
annotations:
artifacthub.io/changes: |
- kind: added
description: support for k8s 1.27
- kind: fixed
description: env var referenced for extra packages

View File

@@ -1,6 +1,6 @@
# ClearML Kubernetes Serving
![Version: 1.0.1](https://img.shields.io/badge/Version-1.0.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.2.0](https://img.shields.io/badge/AppVersion-1.2.0-informational?style=flat-square)
![Version: 1.0.3](https://img.shields.io/badge/Version-1.0.3-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.2.0](https://img.shields.io/badge/AppVersion-1.2.0-informational?style=flat-square)
ClearML Serving Helm Chart

View File

@@ -51,12 +51,13 @@ spec:
- name: CLEARML_USE_GUNICORN
value: "true"
{{- if .Values.clearml_serving_inference.extraPythonPackages }}
- name: EXTRA_PYTHON_PACKAGES
- name: CLEARML_EXTRA_PYTHON_PACKAGES
value: '{{ join " " .Values.clearml_serving_inference.extraPythonPackages }}'
{{- end }}
image: "{{ .Values.clearml_serving_inference.image.repository }}:{{ .Values.clearml_serving_inference.image.tag }}"
name: {{ include "clearmlServing.fullname" . }}-inference
ports:
- containerPort: 8080
resources: {}
resources:
{{- toYaml .Values.clearml_serving_inference.resources | nindent 12 }}
restartPolicy: Always

View File

@@ -37,12 +37,13 @@ spec:
- name: CLEARML_SERVING_TASK_ID
value: "{{ .Values.clearml.servingTaskId }}"
{{- if .Values.clearml_serving_statistics.extraPythonPackages }}
- name: EXTRA_PYTHON_PACKAGES
- name: CLEARML_EXTRA_PYTHON_PACKAGES
value: '{{ join " " .Values.clearml_serving_statistics.extraPythonPackages }}'
{{- end }}
image: "{{ .Values.clearml_serving_statistics.image.repository }}:{{ .Values.clearml_serving_statistics.image.tag }}"
name: {{ include "clearmlServing.fullname" . }}-statistics
ports:
- containerPort: 9999
resources: {}
resources:
{{- toYaml .Values.clearml_serving_statistics.resources | nindent 12 }}
restartPolicy: Always

View File

@@ -38,14 +38,15 @@ spec:
- name: CLEARML_TRITON_METRIC_FREQ
value: "1.0"
{{- if .Values.clearml_serving_triton.extraPythonPackages }}
- name: EXTRA_PYTHON_PACKAGES
- name: CLEARML_EXTRA_PYTHON_PACKAGES
value: '{{ join " " .Values.clearml_serving_triton.extraPythonPackages }}'
{{- end }}
image: "{{ .Values.clearml_serving_triton.image.repository }}:{{ .Values.clearml_serving_triton.image.tag }}"
name: {{ include "clearmlServing.fullname" . }}-triton
ports:
- containerPort: 8001
resources: {}
resources:
{{- toYaml .Values.clearml_serving_triton.resources | nindent 12 }}
restartPolicy: Always
{{ end }}

View File

@@ -2,7 +2,7 @@ apiVersion: v2
name: clearml
description: MLOps platform
type: application
version: "7.0.0"
version: "7.0.1"
appVersion: "1.10.0"
kubeVersion: ">= 1.21.0-0 < 1.28.0-0"
home: https://clear.ml
@@ -32,5 +32,5 @@ dependencies:
condition: elasticsearch.enabled
annotations:
artifacthub.io/changes: |
- kind: changed
description: removed support for enterprise features due to chart split
- kind: fixed
description: redis reference for production

View File

@@ -1,6 +1,6 @@
# ClearML Ecosystem for Kubernetes
![Version: 7.0.0](https://img.shields.io/badge/Version-7.0.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.10.0](https://img.shields.io/badge/AppVersion-1.10.0-informational?style=flat-square)
![Version: 7.0.1](https://img.shields.io/badge/Version-7.0.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.10.0](https://img.shields.io/badge/AppVersion-1.10.0-informational?style=flat-square)
MLOps platform

View File

@@ -194,7 +194,7 @@ MongoDB Comnnection string
{{- end }}
{{/*
MongoDB hotname
MongoDB hostname
*/}}
{{- define "mongodb.hostname" -}}
{{- if eq .Values.mongodb.architecture "standalone" }}
@@ -209,8 +209,12 @@ Redis Service name
*/}}
{{- define "redis.servicename" -}}
{{- if .Values.redis.enabled }}
{{- if eq .Values.redis.architecture "standalone" }}
{{- tpl .Values.redis.master.name . }}
{{- else }}
{{- printf "%s-headless" (tpl .Values.redis.master.name . ) }}
{{- end }}
{{- else }}
{{- .Values.externalServices.redisHost }}
{{- end }}
{{- end }}

View File

@@ -34,6 +34,8 @@ redis:
size: 5Gi
## If undefined (the default) or set to null, no storageClassName spec is set, choosing the default provisioner
storageClass: null
architecture: replication
mongodb:
enabled: true
architecture: replicaset