allegroai
441e5a73b2
Fix conda env should not be cached if installing into base conda or conda existing env exists
2024-04-19 23:48:10 +03:00
allegroai
27ed6821c4
Add mirrorD config files to gitignore
2024-04-19 23:47:34 +03:00
allegroai
10c6629982
Support skipping re-enqueue on suspected preempted k8s pods
2024-04-19 23:46:57 +03:00
allegroai
6fb48a4c6e
Revert version to v1.8.1
2024-04-19 23:44:31 +03:00
allegroai
105ade31f1
Version bump to v1.8.2
2024-04-14 18:18:10 +03:00
allegroai
502e266b6b
Fix polling interval missing when not using daemon mode
2024-04-14 18:17:57 +03:00
allegroai
cd9a3b9f4e
Version bump to v1.8.1
2024-04-12 20:30:11 +03:00
allegroai
4179ac5234
Fix git pulling on cached invalid git entry. On error, re-clone the entire repo again (disable using "agent.vcs_cache.clone_on_pull_fail: false")
2024-04-12 20:29:36 +03:00
Liron Ilouz
98cc0d86ba
Add option to set daemon polling interval ( #197 )
* add option to set worker polling interval
* polling interval minimum value
---------
Co-authored-by: Liron <liron@tapwithus.com>
2024-04-03 14:33:52 +03:00
allegroai
293cbc0ac6
Version bump to v1.8.0
2024-04-02 16:38:22 +03:00
allegroai
4387ed73b6
Fix None handling when no limits exist
2024-04-02 16:36:09 +03:00
allegroai
43443ccf08
Pass task_id when resolving k8s template
2024-04-01 11:37:01 +03:00
allegroai
3d43240c8f
Improve conda package manager support
Add agent.package_manager.use_conda_base_env (CLEARML_USE_CONDA_BASE_ENV) allowing to use base conda environment (instead of installing a new one)
Fix conda support for python packages with markers and multiple specifications
Added "nvidia" conda channel and support for cuda-toolkit >= 12
2024-04-01 11:36:26 +03:00
allegroai
fc58ba947b
Update requirements
2024-04-01 11:35:07 +03:00
allegroai
22672d2444
Improve GPU monitoring
2024-03-17 19:13:57 +02:00
allegroai
6a4fcda1bf
Improve resource monitor
2024-03-17 19:06:57 +02:00
allegroai
a4ebf8293d
Fix role support
2024-03-17 19:00:59 +02:00
allegroai
10fb157d58
Fix queue handling for backwards compatibility
2024-03-17 19:00:18 +02:00
allegroai
56058beec2
Update deprecated references
2024-03-17 18:59:48 +02:00
allegroai
9f207d5155
Fix dynamic GPU sometimes misses the initial print - if we found the closing print it should be good enough to signal everything is okay
2024-03-17 18:59:04 +02:00
allegroai
8a2bea3c14
Fix comment lines (#) are not ignored in docker startup bash script
2024-03-17 18:58:14 +02:00
allegroai
f1f9278928
Fix torch resolver settings applied to PytorchRequirement instance are not used
2024-03-17 18:56:47 +02:00
nfzd
2de1c926bf
Use correct Python version in Poetry init ( #179 )
* Use correct Python version in Poetry init
* Use interpreter override if configured
* Don't use agent.python_binary if it is empty
---------
Co-authored-by: Michael Mueller <michael.mueller@wsa.com>
2024-03-11 23:36:10 +02:00
allegroai
e1104e60bb
Update README
2024-03-11 16:58:38 +02:00
ae-ae
8b2970350c
Fix FileNotFoundException crash in find_python_executable_for_version… ( #192 )
* Fix FileNotFoundException crash in find_python_executable_for_version (#164 )
* Add a Windows check for error 9009 when searching for Python
---------
Co-authored-by: 12037964+ae-ae@users.noreply.github.com <ae-ae>
2024-03-06 09:17:31 +02:00
FeU-aKlos
a2758250b2
Fix queue handling in K8sIntegration and k8s_glue_example.py ( #183 )
* Fix queue handling in K8sIntegration and k8s_glue_example.py
* Update Dockerfile and k8s_glue_example.py
* Add executable permission to provider_entrypoint.sh
* ADJUST docker
* Update clearml-agent version
* ADDJUST stuff
* ADJUST queue string handling
* DELETE pip install from own repo
2024-02-29 14:20:54 +02:00
allegroai
01e8ffd854
Improve venv cache handling:
- Add FileLock readonly mode, default is write mode (i.e. exclusive lock, preserving behavior)
- Add venv cache now uses readonly lock when copying folders from venv cache into target folder. This enables multiple read, single write operation
- Do not lock the cache folder if we do not need to delete old entries
2024-02-29 14:19:24 +02:00
allegroai
74edf6aa36
Fix IOError on file lock when using shared folder
2024-02-29 14:16:25 +02:00
allegroai
09c5ef99af
Fix Python 3.12 support by removing distutil imports
2024-02-29 14:12:21 +02:00
allegroai
17ae28a62f
Add agent.venvs_cache.lock_timeout to control the venv cache folder lock timeout (in seconds, default 30)
2024-02-29 14:06:06 +02:00
allegroai
059a9385e9
Fix delete temp console pipe log files after Task execution is completed. This is important for long lasting services agents, avoiding collecting temp files on host machine
2024-02-29 14:03:30 +02:00
allegroai
9a321a410f
Add CLEARML_AGENT_FORCE_TASK_INIT to allow runtime patching of script even if no repo is specified and the code is running a preinstalled docker
2024-02-29 14:02:27 +02:00
allegroai
919013d4fe
Add CLEARML_AGENT_FORCE_POETRY to allow forcing poetry even when using pip requirements manager
2024-02-29 13:59:26 +02:00
allegroai
05530b712b
Fix sanitization did not cover all keys
2024-02-29 13:56:14 +02:00
allegroai
8d15fd8798
Fix "pip" is returned as a pip version if no value exists in agent.package_manager.pip_version
2024-02-29 13:55:41 +02:00
allegroai
b34329934b
Add queue ID report before pulling task
2024-02-29 13:52:17 +02:00
allegroai
85049d8705
Move configuration sanitization settings to the default config file
2024-02-29 13:51:40 +02:00
allegroai
6fbd70786e
Add protection for truncate() call
2024-02-29 13:51:09 +02:00
allegroai
05a65548da
Fix agent.enable_git_ask_pass does not show in configuration dump
2024-02-29 13:50:52 +02:00
allegroai
6657003d65
Fix using controller-uid will not always return required pods
2024-02-29 13:49:30 +02:00
allegroai
95dde6ca0c
Update README
2024-01-25 11:27:56 +02:00
allegroai
c9fc092f4e
Support force_system_packages argument in k8s glue class
2023-12-26 10:12:32 +02:00
allegroai
432ee395e1
Version bump to v1.7.0
2023-12-20 18:08:38 +02:00
allegroai
98fc4f0fb9
Add agent.resource_monitoring.disk_use_path configuration option to allow monitoring a different volume than the one containing the home folder
2023-12-20 17:49:33 +02:00
allegroai
111e774c21
Add extra_index_url sanitization in configuration printout
2023-12-20 17:49:04 +02:00
allegroai
3dd8d783e1
Fix agent.git_host setting will cause git@domain URLs to not be replaced by SSH URLs since furl cannot parse them to obtain host
2023-12-20 17:48:18 +02:00
allegroai
7c3e420df4
Add git clone verbosity using CLEARML_AGENT_GIT_CLONE_VERBOSE env var
2023-12-20 17:47:52 +02:00
allegroai
55b065a114
Update GPU stats and pynvml support
2023-12-20 17:47:19 +02:00
allegroai
faa97b6cc2
Set worker ID in k8s glue mode
2023-12-20 17:45:34 +02:00
allegroai
f5861b1e4a
Change default agent.enable_git_ask_pass to True
2023-12-20 17:44:41 +02:00