mirror of
https://github.com/clearml/clearml-helm-charts
synced 2025-04-17 01:31:13 +00:00
Compare commits
20 Commits
clearml-2.
...
clearml-3.
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
979e73fe3d | ||
|
|
7352f35836 | ||
|
|
82ad17860d | ||
|
|
aa761dd450 | ||
|
|
7ff2f94d1a | ||
|
|
618a269c97 | ||
|
|
3f215d2d90 | ||
|
|
03223fc1c1 | ||
|
|
898089b7fb | ||
|
|
732bb970aa | ||
|
|
91d45281fa | ||
|
|
28b6e9f4e4 | ||
|
|
7f6df85ec5 | ||
|
|
97f1077072 | ||
|
|
189de106c9 | ||
|
|
d269374a49 | ||
|
|
cc8789d71f | ||
|
|
6a2e3ed47e | ||
|
|
873fb6f7f0 | ||
|
|
d6e967c9f5 |
2
.github/workflows/release.yaml
vendored
2
.github/workflows/release.yaml
vendored
@@ -4,6 +4,8 @@ on:
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
paths:
|
||||
- 'charts/**'
|
||||
|
||||
jobs:
|
||||
release:
|
||||
|
||||
@@ -1,16 +1,60 @@
|
||||
# Contributing to AllegroAI helm charts repository
|
||||
# Guidelines for Contributing
|
||||
|
||||
:+1::tada: First off, thank you for taking the time to contribute! :tada::+1:
|
||||
:+1::tada: Firstly, we thank you for taking the time to contribute! :tada::+1:
|
||||
|
||||
This page contains information about reporting issues as well as some tips and guidelines useful to experienced open source contributors.
|
||||
Contribution comes in many forms:
|
||||
* Reporting [issues](https://github.com/allegroai/clearml-helm-charts/issues) you've come upon
|
||||
* Participating in issue discussions in the [issue tracker](https://github.com/allegroai/clearml-helm-charts/issues) and the [ClearML community slack space](https://join.slack.com/t/allegroai-trains/shared_invite/enQtOTQyMTI1MzQxMzE4LTY5NTUxOTY1NmQ1MzQ5MjRhMGRhZmM4ODE5NTNjMTg2NTBlZGQzZGVkMWU3ZDg1MGE1MjQxNDEzMWU2NmVjZmY)
|
||||
* Suggesting new features or enhancements
|
||||
* Implementing new features or fixing outstanding issues
|
||||
|
||||
## Reporting issues
|
||||
A great way to contribute to the project is to send a detailed report when you encounter an issue. We always appreciate a well-written, thorough bug report, and will thank you for it!
|
||||
The following is a set of guidelines for contributing to ClearML.
|
||||
These are primarily guidelines, not rules.
|
||||
Use your best judgment and feel free to propose changes to this document in a pull request.
|
||||
|
||||
Check that our [issue database](https://github.com/allegroai/clearml-helm-charts/issues) doesn't already include that problem or suggestion before submitting an issue. If you find a match, you can use the "subscribe" button to get notified on updates. Do not leave random "+1" or "I have this too" comments, as they only clutter the discussion, and don't help resolving it. However, if you have ways to reproduce the issue or have additional information that may help resolving the issue, please leave a comment.
|
||||
## Reporting Issues
|
||||
|
||||
By following these guidelines, you help maintainers and the community understand your report, reproduce the behavior, and find related reports.
|
||||
|
||||
## Contributing code
|
||||
Pull requests are always welcome! Not sure if that typo is worth a pull request? Found a bug and know how to fix it? Do it! We appreciate the help. Any significant improvement should be documented as [a GitHub issue](https://github.com/allegroai/clearml-helm-charts/issues) before starting working on it.
|
||||
Before reporting an issue, please check whether it already appears [here](https://github.com/allegroai/clearml-helm-charts/issues).
|
||||
If it does, join the on-going discussion instead.
|
||||
|
||||
We are always thrilled to receive pull requests and we do our best to process them quickly and provide feedback
|
||||
**Note**: If you find a **Closed** issue that may be the same issue which you are currently experiencing,
|
||||
then open a **New** issue and include a link to the original (Closed) issue in the body of your new one.
|
||||
|
||||
When reporting an issue, please include as much detail as possible: explain the problem and include additional details to help maintainers reproduce the problem:
|
||||
|
||||
* **Use a clear and descriptive title** for the issue to identify the problem.
|
||||
* **Describe the exact steps necessary to reproduce the problem** in as much detail as possible. Please do not just summarize what you did. Make sure to explain how you did it.
|
||||
* **Provide the specific environment setup.** Include the `pip freeze` output, specific environment variables, Python version, and other relevant information.
|
||||
* **Provide specific examples to demonstrate the steps.** Include links to files or GitHub projects, or copy/paste snippets which you use in those examples.
|
||||
* **If you are reporting any ClearML crash,** include a crash report with a stack trace from the operating system. Make sure to add the crash report in the issue and place it in a [code block](https://help.github.com/en/articles/getting-started-with-writing-and-formatting-on-github#multiple-lines),
|
||||
a [file attachment](https://help.github.com/articles/file-attachments-on-issues-and-pull-requests/), or just put it in a [gist](https://gist.github.com/) (and provide link to that gist).
|
||||
* **Describe the behavior you observed after following the steps** and the exact problem with that behavior.
|
||||
* **Explain which behavior you expected to see and why.**
|
||||
* **For Web-App issues, please include screenshots and animated GIFs** which recreate the described steps and clearly demonstrate the problem. You can use [LICEcap](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [silentcast](https://github.com/colinkeenan/silentcast) or [byzanz](https://github.com/threedaymonk/byzanz) on Linux.
|
||||
|
||||
## Suggesting New Features and Enhancements
|
||||
|
||||
By following these guidelines, you help maintainers and the community understand your suggestion and find related suggestions.
|
||||
|
||||
Enhancement suggestions are tracked as GitHub issues. After you determine which repository your enhancement suggestion is related to, create an issue on that repository and provide the following:
|
||||
|
||||
* **A clear and descriptive title** for the issue to identify the suggestion.
|
||||
* **A step-by-step description of the suggested enhancement** in as much detail as possible.
|
||||
* **Specific examples to demonstrate the steps.** Include copy/pasteable snippets which you use in those examples as [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
|
||||
* **Describe the current behavior and explain which behavior you expected to see instead and why.**
|
||||
* **Include screenshots or animated GIFs** which help you demonstrate the steps or point out the part of ClearML which the suggestion is related to. You can use [LICEcap](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [silentcast](https://github.com/colinkeenan/silentcast) or [byzanz](https://github.com/threedaymonk/byzanz) on Linux.
|
||||
|
||||
## Pull Requests
|
||||
|
||||
Before you submit a new PR:
|
||||
|
||||
* Verify the work you plan to merge addresses an existing [issue](https://github.com/allegroai/clearml-helm-charts/issues) (If not, open a new one)
|
||||
* Check related discussions in the [ClearML slack community](https://join.slack.com/t/allegroai-trains/shared_invite/enQtOTQyMTI1MzQxMzE4LTY5NTUxOTY1NmQ1MzQ5MjRhMGRhZmM4ODE5NTNjMTg2NTBlZGQzZGVkMWU3ZDg1MGE1MjQxNDEzMWU2NmVjZmY) (Or start your own discussion on the `#clearml-dev` channel)
|
||||
* Make sure your code conforms to the ClearML coding standards by running:
|
||||
`flake8 --max-line-length=120 --statistics --show-source --extend-ignore=E501 ./clearml*`
|
||||
|
||||
In your PR include:
|
||||
* A reference to the issue it addresses
|
||||
* A brief description of the approach you've taken for implementing
|
||||
|
||||
62
README.md
62
README.md
@@ -1,13 +1,55 @@
|
||||
# ClearML Helm Charts Library for Kubernetes
|
||||
|
||||
## Auto-Magical Experiment Manager & Version Control for AI
|
||||
|
||||
Helm charts provided by [Allegro AI](https://clear.ml), ready to launch on Kubernetes using [Kubernetes Helm](https://github.com/helm/helm).
|
||||
|
||||
## Introduction
|
||||
|
||||
The **clearml-server** is the backend service infrastructure for [ClearML](https://github.com/allegroai/clearml).
|
||||
It allows multiple users to collaborate and manage their experiments.
|
||||
By default, **ClearML** is set up to work with the **ClearML** demo server, which is open to anyone and resets periodically.
|
||||
In order to host your own server, you will need to install **clearml-server** and point **ClearML** to it.
|
||||
|
||||
**clearml-server** contains the following components:
|
||||
|
||||
* The **ClearML** Web-App, a single-page UI for experiment management and browsing
|
||||
* RESTful API for:
|
||||
* Documenting and logging experiment information, statistics and results
|
||||
* Querying experiments history, logs and results
|
||||
* Locally-hosted file server for storing images and models making them easily accessible using the Web-App
|
||||
|
||||
Use this repository to deploy **clearml-server** on Kubernetes clusters.
|
||||
|
||||
## Provided in this repository
|
||||
|
||||
### [All around Helm Chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml)
|
||||
|
||||
## Who We Are
|
||||
|
||||
ClearML is supported by the team behind *allegro.ai*,
|
||||
where we build deep learning pipelines and infrastructure for enterprise companies.
|
||||
|
||||
We built ClearML to track and control the glorious but messy process of training production-grade deep learning models.
|
||||
We are committed to vigorously supporting and expanding the capabilities of ClearML.
|
||||
|
||||
We promise to always be backwardly compatible, making sure all your logs, data and pipelines
|
||||
will always upgrade with you.
|
||||
|
||||
## License
|
||||
|
||||
Server Side Public License, Version 1 (see the [LICENSE](https://en.wikipedia.org/wiki/Server_Side_Public_License) for more information)
|
||||
|
||||
## Requirements
|
||||
|
||||
### Setup a Kubernetes Cluster
|
||||
|
||||
For setting up Kubernetes on various platforms refer to the Kubernetes [getting started guide](http://kubernetes.io/docs/getting-started-guides/).
|
||||
|
||||
### Setup a single node LOCAL Kubernetes on laptop/desktop
|
||||
|
||||
For setting up Kubernetes on your laptop/desktop we suggest [kind](https://kind.sigs.k8s.io).
|
||||
|
||||
### Install Helm
|
||||
|
||||
Helm is a tool for managing Kubernetes charts. Charts are packages of pre-configured Kubernetes resources.
|
||||
@@ -18,6 +60,24 @@ To install Helm, refer to the [Helm install guide](https://github.com/helm/helm#
|
||||
|
||||
```bash
|
||||
$ helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
|
||||
$ helm repo update
|
||||
$ helm search repo allegroai
|
||||
$ helm install my-release allegroai/<chart>
|
||||
$ helm install <release-name> allegroai/<chart>
|
||||
```
|
||||
|
||||
## Documentation, Community & Support
|
||||
|
||||
More information in the [official documentation](https://allegro.ai/clearml/docs) and [on YouTube](https://www.youtube.com/c/ClearML).
|
||||
|
||||
If you have any questions: post on our [Slack Channel](https://join.slack.com/t/clearml/shared_invite/zt-c0t13pty-aVUZZW1TSSSg2vyIGVPBhg), or tag your questions on [stackoverflow](https://stackoverflow.com/questions/tagged/clearml) with '**[clearml](https://stackoverflow.com/questions/tagged/clearml)**' tag (*previously [trains](https://stackoverflow.com/questions/tagged/trains) tag*).
|
||||
|
||||
For feature requests or bug reports, please use [GitHub issues](https://github.com/allegroai/clearml-helm-charts/issues).
|
||||
|
||||
Additionally, you can always find us at *clearml@allegro.ai*
|
||||
|
||||
## Contributing
|
||||
|
||||
**PRs are always welcomed** :heart: See more details in the ClearML [Guidelines for Contributing](https://github.com/allegroai/clearml-helm-charts/blob/main/CONTRIBUTING.md).
|
||||
|
||||
|
||||
_May the force (and the goddess of learning rates) be with you!_
|
||||
|
||||
@@ -2,8 +2,8 @@ apiVersion: v2
|
||||
name: clearml
|
||||
description: MLOps platform
|
||||
type: application
|
||||
version: "2.0.0-alpha1"
|
||||
appVersion: "1.0.2"
|
||||
version: "3.0.2"
|
||||
appVersion: "1.1.1"
|
||||
home: https://clear.ml
|
||||
icon: https://raw.githubusercontent.com/allegroai/clearml/master/docs/clearml-logo.svg
|
||||
sources:
|
||||
|
||||
@@ -9,7 +9,7 @@
|
||||
TERMS AND CONDITIONS
|
||||
|
||||
0. Definitions.
|
||||
|
||||
|
||||
“This License” refers to Server Side Public License.
|
||||
|
||||
“Copyright” also means copyright-like laws that apply to other kinds of
|
||||
@@ -173,7 +173,7 @@
|
||||
access or legal rights of the compilation's users beyond what the
|
||||
individual works permit. Inclusion of a covered work in an aggregate does
|
||||
not cause this License to apply to the other parts of the aggregate.
|
||||
|
||||
|
||||
6. Conveying Non-Source Forms.
|
||||
|
||||
You may convey a covered work in object code form under the terms of
|
||||
@@ -185,7 +185,7 @@
|
||||
(including a physical distribution medium), accompanied by the
|
||||
Corresponding Source fixed on a durable physical medium customarily
|
||||
used for software interchange.
|
||||
|
||||
|
||||
b) Convey the object code in, or embodied in, a physical product
|
||||
(including a physical distribution medium), accompanied by a written
|
||||
offer, valid for at least three years and valid for as long as you
|
||||
@@ -196,12 +196,12 @@
|
||||
for software interchange, for a price no more than your reasonable cost
|
||||
of physically performing this conveying of source, or (2) access to
|
||||
copy the Corresponding Source from a network server at no charge.
|
||||
|
||||
|
||||
c) Convey individual copies of the object code with a copy of the
|
||||
written offer to provide the Corresponding Source. This alternative is
|
||||
allowed only occasionally and noncommercially, and only if you received
|
||||
the object code with such an offer, in accord with subsection 6b.
|
||||
|
||||
|
||||
d) Convey the object code by offering access from a designated place
|
||||
(gratis or for a charge), and offer equivalent access to the
|
||||
Corresponding Source in the same way through the same place at no
|
||||
@@ -214,7 +214,7 @@
|
||||
Regardless of what server hosts the Corresponding Source, you remain
|
||||
obligated to ensure that it is available for as long as needed to
|
||||
satisfy these requirements.
|
||||
|
||||
|
||||
e) Convey the object code using peer-to-peer transmission, provided you
|
||||
inform other peers where the object code and Corresponding Source of
|
||||
the work are being offered to the general public at no charge under
|
||||
@@ -496,7 +496,7 @@
|
||||
application program interfaces, automation software, monitoring software,
|
||||
backup software, storage software and hosting software, all such that a
|
||||
user could run an instance of the service using the Service Source Code
|
||||
you make available.
|
||||
you make available.
|
||||
|
||||
14. Revised Versions of this License.
|
||||
|
||||
@@ -532,9 +532,9 @@
|
||||
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
|
||||
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
|
||||
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
|
||||
|
||||
|
||||
16. Limitation of Liability.
|
||||
|
||||
|
||||
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
|
||||
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
|
||||
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING
|
||||
@@ -544,7 +544,7 @@
|
||||
OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
|
||||
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
|
||||
POSSIBILITY OF SUCH DAMAGES.
|
||||
|
||||
|
||||
17. Interpretation of Sections 15 and 16.
|
||||
|
||||
If the disclaimer of warranty and limitation of liability provided above
|
||||
@@ -553,5 +553,5 @@
|
||||
waiver of all civil liability in connection with the Program, unless a
|
||||
warranty or assumption of liability accompanies a copy of the Program in
|
||||
return for a fee.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
@@ -1,6 +1,6 @@
|
||||
# clearml
|
||||
# ClearML Ecosystem for Kubernetes
|
||||
|
||||
  
|
||||
  
|
||||
|
||||
MLOps platform
|
||||
|
||||
@@ -12,6 +12,86 @@ MLOps platform
|
||||
| ---- | ------ | --- |
|
||||
| valeriano-manassero | | https://github.com/valeriano-manassero |
|
||||
|
||||
## Introduction
|
||||
|
||||
The **clearml-server** is the backend service infrastructure for [ClearML](https://github.com/allegroai/clearml).
|
||||
It allows multiple users to collaborate and manage their experiments.
|
||||
|
||||
**clearml-server** contains the following components:
|
||||
|
||||
* The ClearML Web-App, a single-page UI for experiment management and browsing
|
||||
* RESTful API for:
|
||||
* Documenting and logging experiment information, statistics and results
|
||||
* Querying experiments history, logs and results
|
||||
* Locally-hosted file server for storing images and models making them easily accessible using the Web-App
|
||||
|
||||
## Local environment
|
||||
|
||||
For development/evaluation it's possible to use [kind](https://kind.sigs.k8s.io).
|
||||
After installation, following commands will create a complete ClearML insatllation:
|
||||
|
||||
```
|
||||
mkdir -pm 777 /tmp/clearml-kind
|
||||
|
||||
cat <<EOF > /tmp/clearml-kind.yaml
|
||||
kind: Cluster
|
||||
apiVersion: kind.x-k8s.io/v1alpha4
|
||||
nodes:
|
||||
- role: control-plane
|
||||
extraPortMappings:
|
||||
- containerPort: 30008
|
||||
hostPort: 30008
|
||||
listenAddress: "127.0.0.1"
|
||||
protocol: TCP
|
||||
- containerPort: 30080
|
||||
hostPort: 30080
|
||||
listenAddress: "127.0.0.1"
|
||||
protocol: TCP
|
||||
- containerPort: 30081
|
||||
hostPort: 30081
|
||||
listenAddress: "127.0.0.1"
|
||||
protocol: TCP
|
||||
extraMounts:
|
||||
- hostPath: /tmp/clearml-kind/
|
||||
containerPath: /var/local-path-provisioner
|
||||
EOF
|
||||
|
||||
kind create cluster --config /tmp/clearml-kind.yaml
|
||||
|
||||
helm install clearml allegroai/clearml
|
||||
```
|
||||
|
||||
After deployment, the services will be exposed on localhost on the following ports:
|
||||
|
||||
* API server on `30008`
|
||||
* Web server on `30080`
|
||||
* File server on `30081`
|
||||
|
||||
Data persisted in every Kubernetes volume by ClearML will be accessible in /tmp/clearml-kind folder on the host.
|
||||
|
||||
## Production cluster environment
|
||||
|
||||
In a production environment it's suggested to install an ingress controller and verify that is working correctly.
|
||||
During ClearML deployment enable `ingress` section of chart values.
|
||||
This will create 3 ingress rules:
|
||||
|
||||
* `app.<your domain name>`
|
||||
* `files.<your domain name>`
|
||||
* `api.<your domain name>`
|
||||
|
||||
(*for example, `app.clearml.mydomainname.com`, `files.clearml.mydomainname.com` and `api.clearml.mydomainname.com`*)
|
||||
|
||||
Just pointing the domain records to the IP where ingress controller is responding will complete the deployment process.
|
||||
|
||||
## Additional Configuration for ClearML Server
|
||||
|
||||
You can also configure the **clearml-server** for:
|
||||
|
||||
* fixed users (users with credentials)
|
||||
* non-responsive experiment watchdog settings
|
||||
|
||||
For detailed instructions, see the [Optional Configuration](https://github.com/allegroai/clearml-server#optional-configuration) section in the **clearml-server** repository README file.
|
||||
|
||||
## Source Code
|
||||
|
||||
* <https://github.com/allegroai/clearml-helm-charts>
|
||||
@@ -29,28 +109,54 @@ MLOps platform
|
||||
|
||||
| Key | Type | Default | Description |
|
||||
|-----|------|---------|-------------|
|
||||
| agentGroups.agent-group0.affinity | object | `{}` | |
|
||||
| agentGroups.agent-group0.agentVersion | string | `""` | |
|
||||
| agentGroups.agent-group0.awsAccessKeyId | string | `nil` | |
|
||||
| agentGroups.agent-group0.awsDefaultRegion | string | `nil` | |
|
||||
| agentGroups.agent-group0.awsSecretAccessKey | string | `nil` | |
|
||||
| agentGroups.agent-group0.azureStorageAccount | string | `nil` | |
|
||||
| agentGroups.agent-group0.azureStorageKey | string | `nil` | |
|
||||
| agentGroups.agent-group0.clearmlAccessKey | string | `nil` | |
|
||||
| agentGroups.agent-group0.clearmlConfig | string | `"sdk {\n}"` | |
|
||||
| agentGroups.agent-group0.clearmlGitPassword | string | `nil` | |
|
||||
| agentGroups.agent-group0.clearmlGitUser | string | `nil` | |
|
||||
| agentGroups.agent-group0.clearmlSecretKey | string | `nil` | |
|
||||
| agentGroups.agent-group0.image.pullPolicy | string | `"IfNotPresent"` | |
|
||||
| agentGroups.agent-group0.image.repository | string | `"nvidia/cuda"` | |
|
||||
| agentGroups.agent-group0.image.tag | string | `"11.0-base-ubuntu18.04"` | |
|
||||
| agentGroups.agent-group0.name | string | `"agent-group0"` | |
|
||||
| agentGroups.agent-group0.nodeSelector | object | `{}` | |
|
||||
| agentGroups.agent-group0.nvidiaGpusPerAgent | int | `1` | |
|
||||
| agentGroups.agent-group0.podAnnotations | object | `{}` | |
|
||||
| agentGroups.agent-group0.queues | string | `"default"` | |
|
||||
| agentGroups.agent-group0.replicaCount | int | `0` | |
|
||||
| agentGroups.agent-group0.tolerations | list | `[]` | |
|
||||
| agentGroups.agent-group-cpu.affinity | object | `{}` | |
|
||||
| agentGroups.agent-group-cpu.agentVersion | string | `""` | |
|
||||
| agentGroups.agent-group-cpu.awsAccessKeyId | string | `nil` | |
|
||||
| agentGroups.agent-group-cpu.awsDefaultRegion | string | `nil` | |
|
||||
| agentGroups.agent-group-cpu.awsSecretAccessKey | string | `nil` | |
|
||||
| agentGroups.agent-group-cpu.azureStorageAccount | string | `nil` | |
|
||||
| agentGroups.agent-group-cpu.azureStorageKey | string | `nil` | |
|
||||
| agentGroups.agent-group-cpu.clearmlAccessKey | string | `nil` | |
|
||||
| agentGroups.agent-group-cpu.clearmlConfig | string | `"sdk {\n}"` | |
|
||||
| agentGroups.agent-group-cpu.clearmlGitPassword | string | `nil` | |
|
||||
| agentGroups.agent-group-cpu.clearmlGitUser | string | `nil` | |
|
||||
| agentGroups.agent-group-cpu.clearmlSecretKey | string | `nil` | |
|
||||
| agentGroups.agent-group-cpu.enabled | bool | `true` | |
|
||||
| agentGroups.agent-group-cpu.image.pullPolicy | string | `"IfNotPresent"` | |
|
||||
| agentGroups.agent-group-cpu.image.repository | string | `"ubuntu"` | |
|
||||
| agentGroups.agent-group-cpu.image.tag | string | `"18.04"` | |
|
||||
| agentGroups.agent-group-cpu.name | string | `"agent-group-cpu"` | |
|
||||
| agentGroups.agent-group-cpu.nodeSelector | object | `{}` | |
|
||||
| agentGroups.agent-group-cpu.nvidiaGpusPerAgent | int | `0` | |
|
||||
| agentGroups.agent-group-cpu.podAnnotations | object | `{}` | |
|
||||
| agentGroups.agent-group-cpu.queues | string | `"default"` | |
|
||||
| agentGroups.agent-group-cpu.replicaCount | int | `1` | |
|
||||
| agentGroups.agent-group-cpu.tolerations | list | `[]` | |
|
||||
| agentGroups.agent-group-cpu.updateStrategy | string | `"Recreate"` | |
|
||||
| agentGroups.agent-group-gpu.affinity | object | `{}` | |
|
||||
| agentGroups.agent-group-gpu.agentVersion | string | `""` | |
|
||||
| agentGroups.agent-group-gpu.awsAccessKeyId | string | `nil` | |
|
||||
| agentGroups.agent-group-gpu.awsDefaultRegion | string | `nil` | |
|
||||
| agentGroups.agent-group-gpu.awsSecretAccessKey | string | `nil` | |
|
||||
| agentGroups.agent-group-gpu.azureStorageAccount | string | `nil` | |
|
||||
| agentGroups.agent-group-gpu.azureStorageKey | string | `nil` | |
|
||||
| agentGroups.agent-group-gpu.clearmlAccessKey | string | `nil` | |
|
||||
| agentGroups.agent-group-gpu.clearmlConfig | string | `"sdk {\n}"` | |
|
||||
| agentGroups.agent-group-gpu.clearmlGitPassword | string | `nil` | |
|
||||
| agentGroups.agent-group-gpu.clearmlGitUser | string | `nil` | |
|
||||
| agentGroups.agent-group-gpu.clearmlSecretKey | string | `nil` | |
|
||||
| agentGroups.agent-group-gpu.enabled | bool | `true` | |
|
||||
| agentGroups.agent-group-gpu.image.pullPolicy | string | `"IfNotPresent"` | |
|
||||
| agentGroups.agent-group-gpu.image.repository | string | `"nvidia/cuda"` | |
|
||||
| agentGroups.agent-group-gpu.image.tag | string | `"11.0-base-ubuntu18.04"` | |
|
||||
| agentGroups.agent-group-gpu.name | string | `"agent-group-gpu"` | |
|
||||
| agentGroups.agent-group-gpu.nodeSelector | object | `{}` | |
|
||||
| agentGroups.agent-group-gpu.nvidiaGpusPerAgent | int | `1` | |
|
||||
| agentGroups.agent-group-gpu.podAnnotations | object | `{}` | |
|
||||
| agentGroups.agent-group-gpu.queues | string | `"default"` | |
|
||||
| agentGroups.agent-group-gpu.replicaCount | int | `0` | |
|
||||
| agentGroups.agent-group-gpu.tolerations | list | `[]` | |
|
||||
| agentGroups.agent-group-gpu.updateStrategy | string | `"Recreate"` | |
|
||||
| agentservices.affinity | object | `{}` | |
|
||||
| agentservices.agentVersion | string | `""` | |
|
||||
| agentservices.awsAccessKeyId | string | `nil` | |
|
||||
@@ -76,12 +182,13 @@ MLOps platform
|
||||
| agentservices.storage.data.class | string | `"standard"` | |
|
||||
| agentservices.storage.data.size | string | `"50Gi"` | |
|
||||
| agentservices.tolerations | list | `[]` | |
|
||||
| apiserver.additionalConfigs | object | `{}` | |
|
||||
| apiserver.affinity | object | `{}` | |
|
||||
| apiserver.configDir | string | `"/opt/clearml/config"` | |
|
||||
| apiserver.extraEnvs | list | `[]` | |
|
||||
| apiserver.image.pullPolicy | string | `"IfNotPresent"` | |
|
||||
| apiserver.image.repository | string | `"allegroai/clearml"` | |
|
||||
| apiserver.image.tag | string | `"1.0.2"` | |
|
||||
| apiserver.image.tag | string | `"1.1.1"` | |
|
||||
| apiserver.livenessDelay | int | `60` | |
|
||||
| apiserver.nodeSelector | object | `{}` | |
|
||||
| apiserver.podAnnotations | object | `{}` | |
|
||||
@@ -93,9 +200,6 @@ MLOps platform
|
||||
| apiserver.resources | object | `{}` | |
|
||||
| apiserver.service.port | int | `8008` | |
|
||||
| apiserver.service.type | string | `"NodePort"` | |
|
||||
| apiserver.storage.config.class | string | `"standard"` | |
|
||||
| apiserver.storage.config.size | string | `"1Gi"` | |
|
||||
| apiserver.storage.enableConfigVolume | bool | `false` | |
|
||||
| apiserver.tolerations | list | `[]` | |
|
||||
| clearml.defaultCompany | string | `"d1bd92a3b039400cbafc60a7a5b1e52b"` | |
|
||||
| elasticsearch.clusterHealthCheckParams | string | `"wait_for_status=yellow&timeout=1s"` | |
|
||||
@@ -137,7 +241,7 @@ MLOps platform
|
||||
| fileserver.extraEnvs | list | `[]` | |
|
||||
| fileserver.image.pullPolicy | string | `"IfNotPresent"` | |
|
||||
| fileserver.image.repository | string | `"allegroai/clearml"` | |
|
||||
| fileserver.image.tag | string | `"1.0.2"` | |
|
||||
| fileserver.image.tag | string | `"1.1.1"` | |
|
||||
| fileserver.nodeSelector | object | `{}` | |
|
||||
| fileserver.podAnnotations | object | `{}` | |
|
||||
| fileserver.replicaCount | int | `1` | |
|
||||
@@ -148,10 +252,14 @@ MLOps platform
|
||||
| fileserver.storage.data.size | string | `"50Gi"` | |
|
||||
| fileserver.tolerations | list | `[]` | |
|
||||
| ingress.annotations | object | `{}` | |
|
||||
| ingress.api.hostName | string | `"api.clearml.127-0-0-1.nip.io"` | |
|
||||
| ingress.api.tlsSecretName | string | `""` | |
|
||||
| ingress.app.hostName | string | `"app.clearml.127-0-0-1.nip.io"` | |
|
||||
| ingress.app.tlsSecretName | string | `""` | |
|
||||
| ingress.enabled | bool | `false` | |
|
||||
| ingress.host | string | `""` | |
|
||||
| ingress.files.hostName | string | `"files.clearml.127-0-0-1.nip.io"` | |
|
||||
| ingress.files.tlsSecretName | string | `""` | |
|
||||
| ingress.name | string | `"clearml-server-ingress"` | |
|
||||
| ingress.tls.secretName | string | `""` | |
|
||||
| mongodb.architecture | string | `"standalone"` | |
|
||||
| mongodb.auth.enabled | bool | `false` | |
|
||||
| mongodb.enabled | bool | `true` | |
|
||||
@@ -176,7 +284,7 @@ MLOps platform
|
||||
| webserver.extraEnvs | list | `[]` | |
|
||||
| webserver.image.pullPolicy | string | `"IfNotPresent"` | |
|
||||
| webserver.image.repository | string | `"allegroai/clearml"` | |
|
||||
| webserver.image.tag | string | `"1.0.2"` | |
|
||||
| webserver.image.tag | string | `"1.1.1"` | |
|
||||
| webserver.nodeSelector | object | `{}` | |
|
||||
| webserver.podAnnotations | object | `{}` | |
|
||||
| webserver.replicaCount | int | `1` | |
|
||||
@@ -184,6 +292,3 @@ MLOps platform
|
||||
| webserver.service.port | int | `80` | |
|
||||
| webserver.service.type | string | `"NodePort"` | |
|
||||
| webserver.tolerations | list | `[]` | |
|
||||
|
||||
----------------------------------------------
|
||||
Autogenerated from chart metadata using [helm-docs v1.5.0](https://github.com/norwoodj/helm-docs/releases/v1.5.0)
|
||||
|
||||
96
charts/clearml/README.md.gotmpl
Normal file
96
charts/clearml/README.md.gotmpl
Normal file
@@ -0,0 +1,96 @@
|
||||
# ClearML Ecosystem for Kubernetes
|
||||
{{ template "chart.deprecationWarning" . }}
|
||||
|
||||
{{ template "chart.badgesSection" . }}
|
||||
|
||||
{{ template "chart.description" . }}
|
||||
|
||||
{{ template "chart.homepageLine" . }}
|
||||
|
||||
{{ template "chart.maintainersSection" . }}
|
||||
|
||||
## Introduction
|
||||
|
||||
The **clearml-server** is the backend service infrastructure for [ClearML](https://github.com/allegroai/clearml).
|
||||
It allows multiple users to collaborate and manage their experiments.
|
||||
|
||||
**clearml-server** contains the following components:
|
||||
|
||||
* The ClearML Web-App, a single-page UI for experiment management and browsing
|
||||
* RESTful API for:
|
||||
* Documenting and logging experiment information, statistics and results
|
||||
* Querying experiments history, logs and results
|
||||
* Locally-hosted file server for storing images and models making them easily accessible using the Web-App
|
||||
|
||||
## Local environment
|
||||
|
||||
For development/evaluation it's possible to use [kind](https://kind.sigs.k8s.io).
|
||||
After installation, following commands will create a complete ClearML insatllation:
|
||||
|
||||
```
|
||||
mkdir -pm 777 /tmp/clearml-kind
|
||||
|
||||
cat <<EOF > /tmp/clearml-kind.yaml
|
||||
kind: Cluster
|
||||
apiVersion: kind.x-k8s.io/v1alpha4
|
||||
nodes:
|
||||
- role: control-plane
|
||||
extraPortMappings:
|
||||
- containerPort: 30008
|
||||
hostPort: 30008
|
||||
listenAddress: "127.0.0.1"
|
||||
protocol: TCP
|
||||
- containerPort: 30080
|
||||
hostPort: 30080
|
||||
listenAddress: "127.0.0.1"
|
||||
protocol: TCP
|
||||
- containerPort: 30081
|
||||
hostPort: 30081
|
||||
listenAddress: "127.0.0.1"
|
||||
protocol: TCP
|
||||
extraMounts:
|
||||
- hostPath: /tmp/clearml-kind/
|
||||
containerPath: /var/local-path-provisioner
|
||||
EOF
|
||||
|
||||
kind create cluster --config /tmp/clearml-kind.yaml
|
||||
|
||||
helm install clearml allegroai/clearml
|
||||
```
|
||||
|
||||
After deployment, the services will be exposed on localhost on the following ports:
|
||||
|
||||
* API server on `30008`
|
||||
* Web server on `30080`
|
||||
* File server on `30081`
|
||||
|
||||
Data persisted in every Kubernetes volume by ClearML will be accessible in /tmp/clearml-kind folder on the host.
|
||||
|
||||
## Production cluster environment
|
||||
|
||||
In a production environment it's suggested to install an ingress controller and verify that is working correctly.
|
||||
During ClearML deployment enable `ingress` section of chart values.
|
||||
This will create 3 ingress rules:
|
||||
|
||||
* `app.<your domain name>`
|
||||
* `files.<your domain name>`
|
||||
* `api.<your domain name>`
|
||||
|
||||
(*for example, `app.clearml.mydomainname.com`, `files.clearml.mydomainname.com` and `api.clearml.mydomainname.com`*)
|
||||
|
||||
Just pointing the domain records to the IP where ingress controller is responding will complete the deployment process.
|
||||
|
||||
## Additional Configuration for ClearML Server
|
||||
|
||||
You can also configure the **clearml-server** for:
|
||||
|
||||
* fixed users (users with credentials)
|
||||
* non-responsive experiment watchdog settings
|
||||
|
||||
For detailed instructions, see the [Optional Configuration](https://github.com/allegroai/clearml-server#optional-configuration) section in the **clearml-server** repository README file.
|
||||
|
||||
{{ template "chart.sourcesSection" . }}
|
||||
|
||||
{{ template "chart.requirementsSection" . }}
|
||||
|
||||
{{ template "chart.valuesSection" . }}
|
||||
@@ -95,3 +95,48 @@ Create the name of the service account to use
|
||||
{{- default "default" .Values.serviceAccount.name }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
|
||||
{{/*
|
||||
Create the name of the App service to use
|
||||
*/}}
|
||||
{{- define "clearml.serviceApp" -}}
|
||||
{{- if .Values.ingress.enabled }}
|
||||
{{- if .Values.ingress.app.tlsSecretName }}
|
||||
{{- printf "%s%s%s" "https://" .Values.ingress.app.hostName }}
|
||||
{{- else }}
|
||||
{{- printf "%s%s%s" "http://" .Values.ingress.app.hostName }}
|
||||
{{- end }}
|
||||
{{- else }}
|
||||
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-webserver:" (.Values.webserver.service.port | toString) }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
|
||||
{{/*
|
||||
Create the name of the Api service to use
|
||||
*/}}
|
||||
{{- define "clearml.serviceApi" -}}
|
||||
{{- if .Values.ingress.enabled }}
|
||||
{{- if .Values.ingress.api.tlsSecretName }}
|
||||
{{- printf "%s%s%s" "https://" .Values.ingress.api.hostName }}
|
||||
{{- else }}
|
||||
{{- printf "%s%s%s" "http://" .Values.ingress.api.hostName }}
|
||||
{{- end }}
|
||||
{{- else }}
|
||||
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-apiserver:" (.Values.apiserver.service.port | toString) }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
|
||||
{{/*
|
||||
Create the name of the Files service to use
|
||||
*/}}
|
||||
{{- define "clearml.serviceFiles" -}}
|
||||
{{- if .Values.ingress.enabled }}
|
||||
{{- if .Values.ingress.files.tlsSecretName }}
|
||||
{{- printf "%s%s%s" "https://" .Values.ingress.files.hostName }}
|
||||
{{- else }}
|
||||
{{- printf "%s%s%s" "http://" .Values.ingress.files.hostName }}
|
||||
{{- end }}
|
||||
{{- else }}
|
||||
{{- printf "%s%s%s%s" "http://" (include "clearml.fullname" .) "-fileserver:" (.Values.fileserver.service.port | toString) }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
|
||||
13
charts/clearml/templates/configmap-apiserver.yaml
Normal file
13
charts/clearml/templates/configmap-apiserver.yaml
Normal file
@@ -0,0 +1,13 @@
|
||||
{{- if .Values.apiserver.additionalConfigs -}}
|
||||
apiVersion: v1
|
||||
kind: ConfigMap
|
||||
metadata:
|
||||
name: "{{ include "clearml.fullname" . }}-apiserver-configmap"
|
||||
labels:
|
||||
{{- include "clearml.labels" . | nindent 4 }}
|
||||
data:
|
||||
{{- range $key, $val := .Values.apiserver.additionalConfigs }}
|
||||
{{ $key }}: |
|
||||
{{- $val | nindent 4 }}
|
||||
{{- end }}
|
||||
{{- end -}}
|
||||
@@ -1,5 +1,6 @@
|
||||
{{- range $key, $value := .Values.agentGroups }}
|
||||
{{- with $value }}
|
||||
{{- if .enabled }}
|
||||
---
|
||||
apiVersion: apps/v1
|
||||
kind: Deployment
|
||||
@@ -9,6 +10,8 @@ metadata:
|
||||
{{- include "clearml.labels" $ | nindent 4 }}
|
||||
spec:
|
||||
replicas: {{ .replicaCount }}
|
||||
strategy:
|
||||
type: {{ .updateStrategy }}
|
||||
selector:
|
||||
matchLabels:
|
||||
{{- include "clearml.selectorLabelsAgent" $ | nindent 6 }}
|
||||
@@ -38,7 +41,7 @@ spec:
|
||||
- -c
|
||||
- >
|
||||
set -x;
|
||||
while [ $(curl -sw '%{http_code}' "http://{{ include "clearml.fullname" $ }}-apiserver:{{ $.Values.apiserver.service.port }}/debug.ping" -o /dev/null) -ne 200 ] ; do
|
||||
while [ $(curl -sw '%{http_code}' "{{ include "clearml.serviceApi" $ }}/debug.ping" -o /dev/null) -ne 200 ] ; do
|
||||
echo "waiting for apiserver" ;
|
||||
sleep 5 ;
|
||||
done
|
||||
@@ -54,11 +57,11 @@ spec:
|
||||
{{ .nvidiaGpusPerAgent }}
|
||||
env:
|
||||
- name: CLEARML_API_HOST
|
||||
value: 'http://{{ include "clearml.fullname" $ }}-apiserver:{{ $.Values.apiserver.service.port }}'
|
||||
value: {{ include "clearml.serviceApi" $ }}
|
||||
- name: CLEARML_WEB_HOST
|
||||
value: 'http://{{ include "clearml.fullname" $ }}-webserver:{{ $.Values.webserver.service.port }}'
|
||||
value: {{ include "clearml.serviceApp" $ }}
|
||||
- name: CLEARML_FILES_HOST
|
||||
value: 'http://{{ include "clearml.fullname" $ }}-fileserver:{{ $.Values.fileserver.service.port }}'
|
||||
value: {{ include "clearml.serviceFiles" $ }}
|
||||
- name: CLEARML_AGENT_GIT_USER
|
||||
value: {{ .clearmlGitUser}}
|
||||
- name: CLEARML_AGENT_GIT_PASS
|
||||
@@ -91,6 +94,13 @@ spec:
|
||||
python3 -m pip install -U pip ;
|
||||
python3 -m pip install clearml-agent{{ .agentVersion}} ;
|
||||
CLEARML_AGENT_K8S_HOST_MOUNT=/root/.clearml:/root/.clearml clearml-agent daemon --queue {{ .queues}}"
|
||||
{{ if .clearmlConfig }}
|
||||
volumeMounts:
|
||||
- name: agent-clearml-conf-volume
|
||||
mountPath: /root/clearml.conf
|
||||
subPath: clearml.conf
|
||||
readOnly: true
|
||||
{{- end }}
|
||||
{{- with .nodeSelector }}
|
||||
nodeSelector:
|
||||
{{- toYaml . | nindent 8 }}
|
||||
@@ -105,3 +115,4 @@ spec:
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
|
||||
@@ -30,7 +30,7 @@ spec:
|
||||
- -c
|
||||
- >
|
||||
set -x;
|
||||
while [ $(curl -sw '%{http_code}' "http://{{ include "clearml.fullname" . }}-apiserver:{{ .Values.apiserver.service.port }}/debug.ping" -o /dev/null) -ne 200 ] ; do
|
||||
while [ $(curl -sw '%{http_code}' "{{ include "clearml.serviceApi" $ }}/debug.ping" -o /dev/null) -ne 200 ] ; do
|
||||
echo "waiting for apiserver" ;
|
||||
sleep 5 ;
|
||||
done
|
||||
@@ -42,7 +42,7 @@ spec:
|
||||
- name: CLEARML_HOST_IP
|
||||
value: {{ .Values.agentservices.clearmlHostIp }}
|
||||
- name: CLEARML_API_HOST
|
||||
value: "http://{{ include "clearml.fullname" . }}-apiserver:{{ .Values.apiserver.service.port }}"
|
||||
value: {{ include "clearml.serviceApi" $ }}
|
||||
- name: CLEARML_WEB_HOST
|
||||
value: {{ .Values.agentservices.clearmlWebHost }}
|
||||
- name: CLEARML_FILES_HOST
|
||||
|
||||
@@ -18,12 +18,6 @@ spec:
|
||||
labels:
|
||||
{{- include "clearml.selectorLabelsApiServer" . | nindent 8 }}
|
||||
spec:
|
||||
{{- if .Values.apiserver.storage.enableConfigVolume }}
|
||||
volumes:
|
||||
- name: apiserver-config
|
||||
persistentVolumeClaim:
|
||||
claimName: {{ include "clearml.fullname" . }}-apiserver-config
|
||||
{{- end }}
|
||||
containers:
|
||||
- name: {{ .Chart.Name }}
|
||||
image: "{{ .Values.apiserver.image.repository }}:{{ .Values.apiserver.image.tag | default .Chart.AppVersion }}"
|
||||
@@ -101,13 +95,19 @@ spec:
|
||||
httpGet:
|
||||
path: /debug.ping
|
||||
port: 8008
|
||||
{{- if .Values.apiserver.storage.enableConfigVolume }}
|
||||
{{- if .Values.apiserver.additionalConfigs }}
|
||||
volumeMounts:
|
||||
- name: apiserver-config
|
||||
mountPath: /opt/clearml/config
|
||||
{{- end }}
|
||||
resources:
|
||||
{{- toYaml .Values.apiserver.resources | nindent 12 }}
|
||||
{{- if .Values.apiserver.additionalConfigs }}
|
||||
volumes:
|
||||
- name: apiserver-config
|
||||
configMap:
|
||||
name: "{{ include "clearml.fullname" . }}-apiserver-configmap"
|
||||
{{- end }}
|
||||
{{- with .Values.apiserver.nodeSelector }}
|
||||
nodeSelector:
|
||||
{{- toYaml . | nindent 8 }}
|
||||
|
||||
42
charts/clearml/templates/ingress-api.yaml
Normal file
42
charts/clearml/templates/ingress-api.yaml
Normal file
@@ -0,0 +1,42 @@
|
||||
{{- if .Values.ingress.enabled -}}
|
||||
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
|
||||
apiVersion: networking.k8s.io/v1
|
||||
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
|
||||
apiVersion: networking.k8s.io/v1beta1
|
||||
{{- else -}}
|
||||
apiVersion: extensions/v1beta1
|
||||
{{- end }}
|
||||
kind: Ingress
|
||||
metadata:
|
||||
name: {{ include "clearml.fullname" . }}-api
|
||||
labels:
|
||||
{{- include "clearml.labels" . | nindent 4 }}
|
||||
{{- with .Values.ingress.annotations }}
|
||||
annotations:
|
||||
{{- toYaml . | nindent 4 }}
|
||||
{{- end }}
|
||||
spec:
|
||||
{{- if .Values.ingress.api.tlsSecretName }}
|
||||
tls:
|
||||
- hosts:
|
||||
- {{ .Values.ingress.api.hostName }}
|
||||
secretName: {{ .Values.ingress.api.tlsSecretName }}
|
||||
{{- end }}
|
||||
rules:
|
||||
- host: {{ .Values.ingress.api.hostName }}
|
||||
http:
|
||||
paths:
|
||||
- path: "/"
|
||||
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
|
||||
pathType: Prefix
|
||||
backend:
|
||||
service:
|
||||
name: {{ include "clearml.fullname" . }}-apiserver
|
||||
port:
|
||||
number: {{ .Values.apiserver.service.port }}
|
||||
{{ else }}
|
||||
backend:
|
||||
serviceName: {{ include "clearml.fullname" . }}-apiserver
|
||||
servicePort: {{ .Values.apiserver.service.port }}
|
||||
{{ end }}
|
||||
{{- end }}
|
||||
42
charts/clearml/templates/ingress-app.yaml
Normal file
42
charts/clearml/templates/ingress-app.yaml
Normal file
@@ -0,0 +1,42 @@
|
||||
{{- if .Values.ingress.enabled -}}
|
||||
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
|
||||
apiVersion: networking.k8s.io/v1
|
||||
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
|
||||
apiVersion: networking.k8s.io/v1beta1
|
||||
{{- else -}}
|
||||
apiVersion: extensions/v1beta1
|
||||
{{- end }}
|
||||
kind: Ingress
|
||||
metadata:
|
||||
name: {{ include "clearml.fullname" . }}-app
|
||||
labels:
|
||||
{{- include "clearml.labels" . | nindent 4 }}
|
||||
{{- with .Values.ingress.annotations }}
|
||||
annotations:
|
||||
{{- toYaml . | nindent 4 }}
|
||||
{{- end }}
|
||||
spec:
|
||||
{{- if .Values.ingress.app.tlsSecretName }}
|
||||
tls:
|
||||
- hosts:
|
||||
- {{ .Values.ingress.app.hostName }}
|
||||
secretName: {{ .Values.ingress.app.tlsSecretName }}
|
||||
{{- end }}
|
||||
rules:
|
||||
- host: {{ .Values.ingress.app.hostName }}
|
||||
http:
|
||||
paths:
|
||||
- path: "/"
|
||||
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
|
||||
pathType: Prefix
|
||||
backend:
|
||||
service:
|
||||
name: {{ include "clearml.fullname" . }}-webserver
|
||||
port:
|
||||
number: {{ .Values.webserver.service.port }}
|
||||
{{ else }}
|
||||
backend:
|
||||
serviceName: {{ include "clearml.fullname" . }}-webserver
|
||||
servicePort: {{ .Values.webserver.service.port }}
|
||||
{{ end }}
|
||||
{{- end }}
|
||||
42
charts/clearml/templates/ingress-files.yaml
Normal file
42
charts/clearml/templates/ingress-files.yaml
Normal file
@@ -0,0 +1,42 @@
|
||||
{{- if .Values.ingress.enabled -}}
|
||||
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
|
||||
apiVersion: networking.k8s.io/v1
|
||||
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
|
||||
apiVersion: networking.k8s.io/v1beta1
|
||||
{{- else -}}
|
||||
apiVersion: extensions/v1beta1
|
||||
{{- end }}
|
||||
kind: Ingress
|
||||
metadata:
|
||||
name: {{ include "clearml.fullname" . }}-files
|
||||
labels:
|
||||
{{- include "clearml.labels" . | nindent 4 }}
|
||||
{{- with .Values.ingress.annotations }}
|
||||
annotations:
|
||||
{{- toYaml . | nindent 4 }}
|
||||
{{- end }}
|
||||
spec:
|
||||
{{- if .Values.ingress.files.tlsSecretName }}
|
||||
tls:
|
||||
- hosts:
|
||||
- {{ .Values.ingress.files.hostName }}
|
||||
secretName: {{ .Values.ingress.files.tlsSecretName }}
|
||||
{{- end }}
|
||||
rules:
|
||||
- host: {{ .Values.ingress.files.hostName }}
|
||||
http:
|
||||
paths:
|
||||
- path: "/"
|
||||
{{ if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion }}
|
||||
pathType: Prefix
|
||||
backend:
|
||||
service:
|
||||
name: {{ include "clearml.fullname" . }}-fileserver
|
||||
port:
|
||||
number: {{ .Values.fileserver.service.port }}
|
||||
{{ else }}
|
||||
backend:
|
||||
serviceName: {{ include "clearml.fullname" . }}-fileserver
|
||||
servicePort: {{ .Values.fileserver.service.port }}
|
||||
{{ end }}
|
||||
{{- end }}
|
||||
@@ -1,48 +0,0 @@
|
||||
{{- if .Values.ingress.enabled -}}
|
||||
{{- $fullName := include "clearml.fullname" . -}}
|
||||
{{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
|
||||
apiVersion: networking.k8s.io/v1beta1
|
||||
{{- else -}}
|
||||
apiVersion: extensions/v1beta1
|
||||
{{- end }}
|
||||
kind: Ingress
|
||||
metadata:
|
||||
name: {{ $fullName }}
|
||||
labels:
|
||||
{{- include "clearml.labels" . | nindent 4 }}
|
||||
{{- with .Values.ingress.annotations }}
|
||||
annotations:
|
||||
{{- toYaml . | nindent 4 }}
|
||||
{{- end }}
|
||||
spec:
|
||||
{{- if .Values.ingress.tls.secretName }}
|
||||
tls:
|
||||
- hosts:
|
||||
- "app.{{ .Values.ingress.host }}"
|
||||
- "files.{{ .Values.ingress.host }}"
|
||||
- "api.{{ .Values.ingress.host }}"
|
||||
secretName: {{ .Values.ingress.tls.secretName }}
|
||||
{{- end }}
|
||||
rules:
|
||||
- host: "app.{{ .Values.ingress.host }}"
|
||||
http:
|
||||
paths:
|
||||
- path: "/*"
|
||||
backend:
|
||||
serviceName: {{ include "clearml.fullname" . }}-webserver
|
||||
servicePort: {{ .Values.webserver.service.port }}
|
||||
- host: "api.{{ .Values.ingress.host }}"
|
||||
http:
|
||||
paths:
|
||||
- path: "/*"
|
||||
backend:
|
||||
serviceName: {{ include "clearml.fullname" . }}-apiserver
|
||||
servicePort: {{ .Values.apiserver.service.port }}
|
||||
- host: "files.{{ .Values.ingress.host }}"
|
||||
http:
|
||||
paths:
|
||||
- path: "/*"
|
||||
backend:
|
||||
serviceName: {{ include "clearml.fullname" . }}-fileserver
|
||||
servicePort: {{ .Values.fileserver.service.port }}
|
||||
{{- end }}
|
||||
@@ -1,15 +0,0 @@
|
||||
{{- if .Values.apiserver.storage.enableConfigVolume }}
|
||||
kind: PersistentVolumeClaim
|
||||
apiVersion: v1
|
||||
metadata:
|
||||
name: {{ include "clearml.fullname" . }}-apiserver-config
|
||||
labels:
|
||||
{{- include "clearml.labels" . | nindent 4 }}
|
||||
spec:
|
||||
accessModes:
|
||||
- ReadWriteOnce
|
||||
resources:
|
||||
requests:
|
||||
storage: {{ .Values.apiserver.storage.config.size | quote }}
|
||||
storageClassName: {{ .Values.apiserver.storage.config.class | quote }}
|
||||
{{- end }}
|
||||
@@ -4,9 +4,15 @@ ingress:
|
||||
enabled: false
|
||||
name: clearml-server-ingress
|
||||
annotations: {}
|
||||
host: ""
|
||||
tls:
|
||||
secretName: ""
|
||||
app:
|
||||
hostName: "app.clearml.127-0-0-1.nip.io"
|
||||
tlsSecretName: ""
|
||||
api:
|
||||
hostName: "api.clearml.127-0-0-1.nip.io"
|
||||
tlsSecretName: ""
|
||||
files:
|
||||
hostName: "files.clearml.127-0-0-1.nip.io"
|
||||
tlsSecretName: ""
|
||||
|
||||
apiserver:
|
||||
prepopulateEnabled: "true"
|
||||
@@ -26,7 +32,7 @@ apiserver:
|
||||
image:
|
||||
repository: "allegroai/clearml"
|
||||
pullPolicy: IfNotPresent
|
||||
tag: "1.0.2"
|
||||
tag: "1.1.1"
|
||||
|
||||
extraEnvs: []
|
||||
|
||||
@@ -50,12 +56,16 @@ apiserver:
|
||||
|
||||
affinity: {}
|
||||
|
||||
# Optional: used in pvc-apiserver containing optional server configuration files
|
||||
storage:
|
||||
enableConfigVolume: false
|
||||
config:
|
||||
class: "standard"
|
||||
size: 1Gi
|
||||
additionalConfigs: {}
|
||||
# services.conf: |
|
||||
# tasks {
|
||||
# non_responsive_tasks_watchdog {
|
||||
# # In-progress tasks that haven't been updated for at least 'value' seconds will be stopped by the watchdog
|
||||
# threshold_sec: 21000
|
||||
# # Watchdog will sleep for this number of seconds after each cycle
|
||||
# watch_interval_sec: 900
|
||||
# }
|
||||
# }
|
||||
|
||||
fileserver:
|
||||
service:
|
||||
@@ -67,7 +77,7 @@ fileserver:
|
||||
image:
|
||||
repository: "allegroai/clearml"
|
||||
pullPolicy: IfNotPresent
|
||||
tag: "1.0.2"
|
||||
tag: "1.1.1"
|
||||
|
||||
extraEnvs: []
|
||||
|
||||
@@ -108,7 +118,7 @@ webserver:
|
||||
image:
|
||||
repository: "allegroai/clearml"
|
||||
pullPolicy: IfNotPresent
|
||||
tag: "1.0.2"
|
||||
tag: "1.1.1"
|
||||
|
||||
podAnnotations: {}
|
||||
|
||||
@@ -180,9 +190,45 @@ agentservices:
|
||||
size: 50Gi
|
||||
|
||||
agentGroups:
|
||||
agent-group0:
|
||||
name: agent-group0
|
||||
agent-group-cpu:
|
||||
enabled: true
|
||||
name: agent-group-cpu
|
||||
replicaCount: 1
|
||||
updateStrategy: Recreate
|
||||
nvidiaGpusPerAgent: 0
|
||||
agentVersion: "" # if set, it *MUST* include comparison operator (e.g. ">=0.16.1")
|
||||
queues: "default" # multiple queues can be specified separated by a space (e.g. "important_jobs default")
|
||||
clearmlGitUser: null
|
||||
clearmlGitPassword: null
|
||||
clearmlAccessKey: null
|
||||
clearmlSecretKey: null
|
||||
awsAccessKeyId: null
|
||||
awsSecretAccessKey: null
|
||||
awsDefaultRegion: null
|
||||
azureStorageAccount: null
|
||||
azureStorageKey: null
|
||||
clearmlConfig: |-
|
||||
sdk {
|
||||
}
|
||||
|
||||
image:
|
||||
repository: "ubuntu"
|
||||
pullPolicy: IfNotPresent
|
||||
tag: "18.04"
|
||||
|
||||
podAnnotations: {}
|
||||
|
||||
nodeSelector: {}
|
||||
|
||||
tolerations: []
|
||||
|
||||
affinity: {}
|
||||
|
||||
agent-group-gpu:
|
||||
enabled: true
|
||||
name: agent-group-gpu
|
||||
replicaCount: 0
|
||||
updateStrategy: Recreate
|
||||
nvidiaGpusPerAgent: 1
|
||||
agentVersion: "" # if set, it *MUST* include comparison operator (e.g. ">=0.16.1")
|
||||
queues: "default" # multiple queues can be specified separated by a space (e.g. "important_jobs default")
|
||||
|
||||
Reference in New Issue
Block a user