allegroai
8a2bea3c14
Fix comment lines (#) are not ignored in docker startup bash script
2024-03-17 18:58:14 +02:00
allegroai
f1f9278928
Fix torch resolver settings applied to PytorchRequirement instance are not used
2024-03-17 18:56:47 +02:00
nfzd
2de1c926bf
Use correct Python version in Poetry init (#179)
* Use correct Python version in Poetry init
* Use interpreter override if configured
* Don't use agent.python_binary if it is empty
---------
Co-authored-by: Michael Mueller <michael.mueller@wsa.com>
2024-03-11 23:36:10 +02:00
ae-ae
8b2970350c
Fix FileNotFoundException crash in find_python_executable_for_version… (#192)
* Fix FileNotFoundException crash in find_python_executable_for_version (#164)
* Add a Windows check for error 9009 when searching for Python
---------
Co-authored-by: 12037964+ae-ae@users.noreply.github.com <ae-ae>
2024-03-06 09:17:31 +02:00
FeU-aKlos
a2758250b2
Fix queue handling in K8sIntegration and k8s_glue_example.py (#183)
* Fix queue handling in K8sIntegration and k8s_glue_example.py
* Update Dockerfile and k8s_glue_example.py
* Add executable permission to provider_entrypoint.sh
* ADJUST docker
* Update clearml-agent version
* ADJUST stuff
* ADJUST queue string handling
* DELETE pip install from own repo
2024-02-29 14:20:54 +02:00
allegroai
01e8ffd854
Improve venv cache handling:
- Add FileLock readonly mode, default is write mode (i.e. exclusive lock, preserving behavior)
- Venv cache now uses a readonly lock when copying folders from the venv cache into the target folder. This enables multiple-reader, single-writer operation
- Do not lock the cache folder if we do not need to delete old entries
2024-02-29 14:19:24 +02:00
allegroai
74edf6aa36
Fix IOError on file lock when using shared folder
2024-02-29 14:16:25 +02:00
allegroai
09c5ef99af
Fix Python 3.12 support by removing distutils imports
2024-02-29 14:12:21 +02:00
allegroai
17ae28a62f
Add agent.venvs_cache.lock_timeout to control the venv cache folder lock timeout (in seconds, default 30)
2024-02-29 14:06:06 +02:00
allegroai
059a9385e9
Fix temp console pipe log files not being deleted after Task execution is completed. This is important for long-lasting services agents, as it avoids accumulating temp files on the host machine
2024-02-29 14:03:30 +02:00
allegroai
9a321a410f
Add CLEARML_AGENT_FORCE_TASK_INIT to allow runtime patching of script even if no repo is specified and the code is running a preinstalled docker
2024-02-29 14:02:27 +02:00
allegroai
919013d4fe
Add CLEARML_AGENT_FORCE_POETRY to allow forcing poetry even when using pip requirements manager
2024-02-29 13:59:26 +02:00
allegroai
05530b712b
Fix sanitization did not cover all keys
2024-02-29 13:56:14 +02:00
allegroai
8d15fd8798
Fix `pip` is returned as a pip version if no value exists in agent.package_manager.pip_version
2024-02-29 13:55:41 +02:00
allegroai
b34329934b
Add queue ID report before pulling task
2024-02-29 13:52:17 +02:00
allegroai
85049d8705
Move configuration sanitization settings to the default config file
2024-02-29 13:51:40 +02:00
allegroai
6fbd70786e
Add protection for truncate() call
2024-02-29 13:51:09 +02:00
allegroai
05a65548da
Fix agent.enable_git_ask_pass does not show in configuration dump
2024-02-29 13:50:52 +02:00
allegroai
6657003d65
Fix using controller-uid will not always return required pods
2024-02-29 13:49:30 +02:00
allegroai
c9fc092f4e
Support force_system_packages argument in k8s glue class
2023-12-26 10:12:32 +02:00
allegroai
432ee395e1
Version bump to v1.7.0
2023-12-20 18:08:38 +02:00
allegroai
98fc4f0fb9
Add agent.resource_monitoring.disk_use_path configuration option to allow monitoring a different volume than the one containing the home folder
2023-12-20 17:49:33 +02:00
allegroai
111e774c21
Add extra_index_url sanitization in configuration printout
2023-12-20 17:49:04 +02:00
allegroai
3dd8d783e1
Fix agent.git_host setting will cause git@domain URLs to not be replaced by SSH URLs since furl cannot parse them to obtain host
2023-12-20 17:48:18 +02:00
allegroai
7c3e420df4
Add git clone verbosity using CLEARML_AGENT_GIT_CLONE_VERBOSE env var
2023-12-20 17:47:52 +02:00
allegroai
55b065a114
Update GPU stats and pynvml support
2023-12-20 17:47:19 +02:00
allegroai
faa97b6cc2
Set worker ID in k8s glue mode
2023-12-20 17:45:34 +02:00
allegroai
f5861b1e4a
Change default agent.enable_git_ask_pass to True
2023-12-20 17:44:41 +02:00
allegroai
030cbb69f1
Fix: if the process return code indicates SIGKILL (-9 or 137) and the abort callback was called, mark the Task as aborted instead of failed
2023-12-20 17:43:02 +02:00
allegroai
564f769ff7
Add agent.docker_args_extra_precedes_task, agent.protected_docker_extra_args to prevent the same switch from being used by both `extra_docker_args` and a Task's docker args
2023-12-20 17:42:36 +02:00
allegroai
dd5d24b0ca
Add CLEARML_AGENT_TEMP_STDOUT_FILE_DIR to allow specifying temp dir used for storing agent log files and temporary log files (daemon and execution)
2023-11-14 11:45:13 +02:00
allegroai
996bb797c3
Add env var in case we're running a service task
2023-11-14 11:44:36 +02:00
allegroai
9ad49a0d21
Fix KeyError if container does not contain the arguments field
2023-11-01 15:11:07 +02:00
allegroai
ba4fee7b19
Fix agent.package_manager.poetry_install_extra_args are used in all Poetry commands and not just in install (#173)
2023-11-01 15:10:40 +02:00
allegroai
0131db8b7d
Add support for resource_applied() callback in k8s glue
Add support for sending log events with k8s-provided timestamps
Refactor env vars infrastructure
2023-11-01 15:10:08 +02:00
allegroai
d2384a9a95
Add example and support for prebuilt containers, including services-mode support, with the overrides CLEARML_AGENT_FORCE_CODE_DIR and CLEARML_AGENT_FORCE_EXEC_SCRIPT
2023-11-01 15:05:57 +02:00
allegroai
5b86c230c1
Fix an environment variable that should be set with a numerical value of 0 (i.e. end up as "0" or "0.0") is set to an empty string
2023-11-01 15:04:59 +02:00
allegroai
21e4be966f
Fix recursion issue when deep-copying a session
2023-11-01 15:04:24 +02:00
allegroai
9c6cb421b3
When cleaning up pending pods, verify task is still aborted and pod is still pending before deleting the pod
2023-11-01 15:04:01 +02:00
allegroai
52405c343d
Fix k8s glue configuration might be contaminated when changed during apply
2023-11-01 15:03:37 +02:00
allegroai
46f0c991c8
Add status reason when aborting before moving to k8s_scheduler queue
2023-11-01 15:02:24 +02:00
allegroai
0254279ed5
Version bump to v1.6.1
2023-09-06 15:41:29 +03:00
allegroai
58e0dc42ec
Version bump to v1.6.0
2023-09-05 15:05:11 +03:00
allegroai
d16825029d
Add new pytorch no resolver mode and CLEARML_AGENT_PACKAGE_PYTORCH_RESOLVE to change resolver on a Task basis, now supports "pip", "direct", "none"
2023-09-02 17:45:10 +03:00
allegroai
fb639afcb9
Fix PyTorch extra index pip resolver
2023-09-02 17:43:41 +03:00
Alex Burlacu
ed1356976b
Move extra configurations to Worker init to make sure all available configurations can be overridden
2023-08-24 19:00:36 +03:00
Alex Burlacu
2b815354e0
Improve file mode comment
2023-08-24 18:53:00 +03:00
Alex Burlacu
edae380a9e
Version bump
2023-08-24 18:51:47 +03:00
Alex Burlacu
946e9d9ce9
Fix invalid reference
2023-08-24 18:51:27 +03:00
allegroai
159a6e9a5a
Fix runtime property overriding existing properties
2023-07-20 10:41:15 +03:00