Commit Graph

777 Commits

Author SHA1 Message Date
allegroai
93df021108 Add support for .ipynb script entry files (install nbconvert in runtime, copnvert to python and execute the python script), including CLEARML_AGENT_FORCE_TASK_INIT patching of ipynb files (post python conversion) 2024-07-24 17:41:59 +03:00
allegroai
700ae85de0 Fix file mode should be optional in configuration files section 2024-07-24 17:41:06 +03:00
allegroai
f367c5a571 Fix git fetch did not update new tags #209 2024-07-24 17:39:53 +03:00
allegroai
ebc5944b44 Fix setting tasks that someone just marked as aborted to started - only force Task to started after dequeuing it otherwise lease it as is 2024-07-24 17:39:26 +03:00
allegroai
8f41002845 Add task.script.binary /bin/bash support
Fix -m module $env to support parsing the $env before launching
2024-07-24 17:37:26 +03:00
allegroai
7e8670d57f Find the correct python version when using a pre-installed python environment 2024-07-21 14:10:38 +03:00
allegroai
77de343863 Use "venv" module if virtualenv is not supported 2024-07-19 13:18:07 +03:00
allegroai
6b31883e45 Fix queue resolution when no queue is passed 2024-05-15 18:30:24 +03:00
allegroai
e48b4756fa Add Python 3.12 support 2024-05-15 18:25:29 +03:00
allegroai
47147e3237 Fix cached repositories were not passing user/token when pulling, agent.vcs_cache.clone_on_pull_fail now defaults to false 2024-04-19 23:50:17 +03:00
allegroai
41fc4ec646 Fix disabling vcs cache should not add vcs mount point to container 2024-04-19 23:48:50 +03:00
allegroai
441e5a73b2 Fix conda env should not be cached if installing into base conda or conda existing env exists 2024-04-19 23:48:10 +03:00
allegroai
27ed6821c4 Add mirrorD config files to gitignore 2024-04-19 23:47:34 +03:00
allegroai
10c6629982 Support skipping re-enqueue on suspected preempted k8s pods 2024-04-19 23:46:57 +03:00
allegroai
6fb48a4c6e Revert version to v1.8.1 2024-04-19 23:44:31 +03:00
allegroai
105ade31f1 Version bump to v1.8.2 2024-04-14 18:18:10 +03:00
allegroai
502e266b6b Fix polling interval missing when not using daemon mode 2024-04-14 18:17:57 +03:00
allegroai
cd9a3b9f4e Version bump to v1.8.1 2024-04-12 20:30:11 +03:00
allegroai
4179ac5234 Fix git pulling on cached invalid git entry. On error, re-clone the entire repo again (disable using "agent.vcs_cache.clone_on_pull_fail: false") 2024-04-12 20:29:36 +03:00
Liron Ilouz
98cc0d86ba
Add option to set daemon polling interval (#197)
* add option to set worker polling interval

* polling interval minimum value

---------

Co-authored-by: Liron <liron@tapwithus.com>
2024-04-03 14:33:52 +03:00
allegroai
293cbc0ac6 Version bump to v1.8.0 2024-04-02 16:38:22 +03:00
allegroai
4387ed73b6 Fix None handling when no limits exist 2024-04-02 16:36:09 +03:00
allegroai
43443ccf08 Pass task_id when resolving k8s template 2024-04-01 11:37:01 +03:00
allegroai
3d43240c8f Improve conda package manager support
Add agent.package_manager.use_conda_base_env (CLEARML_USE_CONDA_BASE_ENV) allowing to use base conda environment (instead of installing a new one)
Fix conda support for python packages with markers and multiple specifications
Added "nvidia" conda channel and support for cuda-toolkit >= 12
2024-04-01 11:36:26 +03:00
allegroai
fc58ba947b Update requirements 2024-04-01 11:35:07 +03:00
allegroai
22672d2444 Improve GPU monitoring 2024-03-17 19:13:57 +02:00
allegroai
6a4fcda1bf Improve resource monitor 2024-03-17 19:06:57 +02:00
allegroai
a4ebf8293d Fix role support 2024-03-17 19:00:59 +02:00
allegroai
10fb157d58 Fix queue handling for backwards compatibility 2024-03-17 19:00:18 +02:00
allegroai
56058beec2 Update deprecated references 2024-03-17 18:59:48 +02:00
allegroai
9f207d5155 Fix dynamic GPU sometimes misses the initial print - if we found the closing print it should be good enough to signal everything is okay 2024-03-17 18:59:04 +02:00
allegroai
8a2bea3c14 Fix comment lines (#) are not ignored in docker startup bash script 2024-03-17 18:58:14 +02:00
allegroai
f1f9278928 Fix torch resolver settings applied to PytorchRequirement instance are not used 2024-03-17 18:56:47 +02:00
nfzd
2de1c926bf
Use correct Python version in Poetry init (#179)
* Use correct Python version in Poetry init

* Use interpreter override if configured

* Don't use agent.python_binary if it is empty

---------

Co-authored-by: Michael Mueller <michael.mueller@wsa.com>
2024-03-11 23:36:10 +02:00
allegroai
e1104e60bb Update README 2024-03-11 16:58:38 +02:00
ae-ae
8b2970350c
Fix FileNotFoundException crash in find_python_executable_for_version… (#192)
* Fix FileNotFoundException crash in find_python_executable_for_version (#164)

* Add a Windows check for error 9009 when searching for Python

---------

Co-authored-by: 12037964+ae-ae@users.noreply.github.com 12037964+ae-ae@users.noreply.github.com <ae-ae>
2024-03-06 09:17:31 +02:00
FeU-aKlos
a2758250b2
Fix queue handling in K8sIntegration and k8s_glue_example.py (#183)
* Fix queue handling in K8sIntegration and k8s_glue_example.py

* Update Dockerfile and k8s_glue_example.py

* Add executable permission to provider_entrypoint.sh

* ADJUST docker

* Update clearml-agent version

* ADDJUST stuff

* ADJUST queue string handling

* DELETE pip install from own repo
2024-02-29 14:20:54 +02:00
allegroai
01e8ffd854 Improve venv cache handling:
- Add FileLock readonly mode, default is write mode (i.e. exclusive lock, preserving behavior)
- Add venv cache now uses readonly lock when copying folders from venv cache into target folder. This enables multiple read, single write operation
- Do not lock the cache folder if we do not need to delete old entries
2024-02-29 14:19:24 +02:00
allegroai
74edf6aa36 Fix IOError on file lock when using shared folder 2024-02-29 14:16:25 +02:00
allegroai
09c5ef99af Fix Python 3.12 support by removing distutil imports 2024-02-29 14:12:21 +02:00
allegroai
17ae28a62f Add agent.venvs_cache.lock_timeout to control the venv cache folder lock timeout (in seconds, default 30) 2024-02-29 14:06:06 +02:00
allegroai
059a9385e9 Fix delete temp console pipe log files after Task execution is completed. This is important for long lasting services agents, avoiding collecting temp files on host machine 2024-02-29 14:03:30 +02:00
allegroai
9a321a410f Add CLEARML_AGENT_FORCE_TASK_INIT to allow runtime patching of script even if no repo is specified and the code is running a preinstalled docker 2024-02-29 14:02:27 +02:00
allegroai
919013d4fe Add CLEARML_AGENT_FORCE_POETRY to allow forcing poetry even when using pip requirements manager 2024-02-29 13:59:26 +02:00
allegroai
05530b712b Fix sanitization did not cover all keys 2024-02-29 13:56:14 +02:00
allegroai
8d15fd8798 Fix pippip is returned as a pip version if no value exists in agent.package_manager.pip_version 2024-02-29 13:55:41 +02:00
allegroai
b34329934b Add queue ID report before pulling task 2024-02-29 13:52:17 +02:00
allegroai
85049d8705 Move configuration sanitization settings to the default config file 2024-02-29 13:51:40 +02:00
allegroai
6fbd70786e Add protection for truncate() call 2024-02-29 13:51:09 +02:00
allegroai
05a65548da Fix agent.enable_git_ask_pass does not show in configuration dump 2024-02-29 13:50:52 +02:00