| Author | Commit | Message | Date |
|---|---|---|---|
| allegroai | 176b4a4cde | Fix --services-mode when the execute agent fails when starting to run with error code 0 | 2021-06-16 18:32:29 +03:00 |
| allegroai | 29bf993be7 | Add printout when using key/secret from env vars | 2021-06-02 21:15:48 +03:00 |
| allegroai | eda597dea5 | Version bump | 2021-06-02 13:17:57 +03:00 |
| allegroai | 8c56777125 | Add CLEARML_AGENT_DISABLE_SSH_MOUNT allowing disabling the auto .ssh mount into the docker | 2021-06-02 13:16:58 +03:00 |
| allegroai | 7e90ebd5db | Fix _dynamic_gpu_get_available worker timeout increase to 10 minutes | 2021-06-02 13:16:17 +03:00 |
| allegroai | 3a07bfe1d7 | Version bump | 2021-05-31 23:19:46 +03:00 |
| allegroai | 742cbf5767 | Add docker environment arguments log masking support (issue #67) | 2021-05-25 19:31:45 +03:00 |
| allegroai | e93384b99b | Fix --stop with dynamic gpus | 2021-05-20 10:58:46 +03:00 |
| allegroai | 3c4e976093 | Add agent.ignore_requested_python_version to config file | 2021-05-19 15:20:44 +03:00 |
| allegroai | 1e795beec8 | Fix support for spaces in docker arguments (issue #358) | 2021-05-19 15:20:03 +03:00 |
| allegroai | 4f7407084d | Fix standalone script with pre-existing conda venv | 2021-05-12 15:46:25 +03:00 |
| allegroai | ae3d034531 | Protect against None in execution.repository | 2021-05-12 15:45:31 +03:00 |
| allegroai | a2db1f5ab5 | Remove queue name from pod name in k8s glue, add queue name and ID to pod labels (issue #64) | 2021-05-05 12:03:35 +03:00 |
| allegroai | cec6420c8f | Version bump to v1.0.0 | 2021-05-03 18:33:53 +03:00 |
| allegroai | 4f18bb7ea0 | Add k8s glue default restartPolicy=Never to template to prevent pods from restarting | 2021-04-28 13:20:13 +03:00 |
| Revital | 24dc59e31f | add space to help message | 2021-04-27 13:50:44 +03:00 |
| allegroai | 08ff5e6db7 | Add number of pods limit to k8s glue | 2021-04-25 10:47:49 +03:00 |
| allegroai | e60a6f9d14 | Fix --stop support for dynamic gpus | 2021-04-25 10:46:43 +03:00 |
| Revital | 35e714d8d9 | fix --downtime help | 2021-04-21 09:13:47 +03:00 |
| allegroai | 6f8d5710d6 | Fix dynamic gpus priority queue | 2021-04-20 18:11:59 +03:00 |
| allegroai | a671692832 | Fix --services-mode with instance limit | 2021-04-20 18:11:36 +03:00 |
| allegroai | 5c8675e43a | Add support for dynamic gpus opportunistic scheduling (with min/max gpus per queue) | 2021-04-20 18:11:16 +03:00 |
| allegroai | 60a58f6fad | Fix poetry support (issue #57) | 2021-04-14 11:22:07 +03:00 |
| allegroai | 537b67e0cd | Fix agent can return non-zero error code and pods will end up restarting forever (issue #56) | 2021-04-12 23:00:59 +03:00 |
| allegroai | 82c5e55fe4 | Fix usage of not_set in k8s template merge | 2021-04-07 21:30:13 +03:00 |
| allegroai | 945dd816ad | Fix no docker arguments | 2021-04-07 18:47:13 +03:00 |
| allegroai | 45009e6cc2 | Add support for updating back docker on new API v2.13 | 2021-04-07 18:46:58 +03:00 |
| allegroai | 3774fa6abd | Add support for new container base setup script feature | 2021-04-07 18:46:14 +03:00 |
| allegroai | e71e6865d2 | Add agent.docker_install_opencv_libs (default: True) to enable auto opencv libs install for faster docker spin-up | 2021-04-07 18:45:44 +03:00 |
| allegroai | 0e8f1528b1 | Remove redundant py2 code | 2021-04-07 18:44:59 +03:00 |
| allegroai | c331babf51 | Add stopping message on Task process termination; Fix --stop on dynamic gpus venv mode | 2021-04-07 18:44:33 +03:00 |
| allegroai | c59d268995 | Fix venv cache crash on bad symbolic links | 2021-04-07 18:44:11 +03:00 |
| allegroai | 9e9fcb0ba9 | Add dynamic mode terminate dockers on sig_term | 2021-04-07 18:43:44 +03:00 |
| allegroai | f33e0b2f78 | Verify docker command exists when running in docker mode | 2021-04-07 18:42:27 +03:00 |
| allegroai | 0e4b99351f | Add --stop support for dynamic gpus; Fix --stop mark tasks as aborted (not failed as before) | 2021-04-07 18:42:10 +03:00 |
| allegroai | 81edd2860f | Fix --dynamic-gpus should keep original queue priority order | 2021-03-31 23:55:12 +03:00 |
| allegroai | 14ac584577 | Support k8s glue container env vars merging | 2021-03-31 23:53:58 +03:00 |
| allegroai | 9ce6baf074 | Fix broken k8s glue docker args parsing; Fix empty env prevents override when merging template | 2021-03-26 12:26:15 +03:00 |
| allegroai | 92a1e07b33 | Fix local path replace back when using cache | 2021-03-26 12:16:05 +03:00 |
| allegroai | cb6bdece39 | Fix cuda version from driver does not return minor version | 2021-03-18 10:07:59 +02:00 |
| allegroai | 2ea38364bb | Change the default conda channel order, so it pulls the correct pytorch | 2021-03-18 10:07:58 +02:00 |
| allegroai | cf6fdc0d81 | Add support for PyJWT v2 | 2021-03-18 10:07:58 +02:00 |
| allegroai | 91eec99563 | Add conda debug prints (--debug) | 2021-03-18 10:07:58 +02:00 |
| allegroai | d9b9b4984b | Version bump to v0.17.2 | 2021-03-04 20:12:50 +02:00 |
| allegroai | 205f9dd816 | Fix k8s glue does not pass docker environment variables; Remove deprecated flags | 2021-03-03 15:07:06 +02:00 |
| allegroai | 9dfa1294e2 | Add agent.enable_task_env set the OS environment based on the Environment section of the Task. | 2021-02-28 19:47:44 +02:00 |
| allegroai | f019905720 | Fix venv cache support for local folders | 2021-02-28 19:47:09 +02:00 |
| allegroai | 9c257858dd | Fix venv cache support for local folders | 2021-02-23 18:54:38 +02:00 |
| allegroai | 2006ab20dd | Fix conda support for git+http links | 2021-02-23 12:46:06 +02:00 |
| allegroai | 0caf31719c | Fix venv caching always reinstall git repositories and local repositories | 2021-02-23 12:45:34 +02:00 |