clearml
66494c598d
Fix on_abort bash callback: if the main process leaves while the on_abort callback is running, wait for on_abort to complete
2025-02-24 13:46:55 +02:00
clearml
d30d4e7e61
Fix session should retry on any error if send fails
2025-02-24 13:46:29 +02:00
clearml
0e2657421f
Added CLEARML_AGENT_ABORT_CALLBACK_CMD and CLEARML_AGENT_ABORT_CALLBACK_TIMEOUT (default 180 sec) to define a callback command to be called on abort status change
2025-02-24 13:46:00 +02:00
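As a rough illustration of the abort callback settings named in the entry above, the sketch below reads the two environment variables and runs the command with a timeout. Only the variable names and the 180-second default come from the commit message. The helper `run_abort_callback` and its return convention are assumptions for illustration, not the agent's actual implementation.

```python
import os
import subprocess


def run_abort_callback(environ=None):
    """Hypothetical helper: run the configured abort callback, if any.

    Returns the command's exit code, or None when no callback is set
    or the callback did not finish within the timeout.
    """
    environ = os.environ if environ is None else environ
    cmd = environ.get("CLEARML_AGENT_ABORT_CALLBACK_CMD")
    if not cmd:
        return None
    # 180 seconds is the default stated in the commit message.
    timeout = float(environ.get("CLEARML_AGENT_ABORT_CALLBACK_TIMEOUT", 180))
    try:
        # shell=True because the setting is described as a command line.
        return subprocess.run(cmd, shell=True, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        return None
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to try without touching the real environment.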
clearml
ee286e2fb7
Fix container default arguments should never be a list
2025-02-24 13:44:52 +02:00
clearml
d87521c36c
Add support for container rulebook overrides ('force_container_rules: true') and container rulebook task update ('update_back_task: true').
...
This addition allows users to forcefully override container arguments based on the task's properties (repo, tags, project, user, etc.), and to offer additional defaults based on required Python packages or Python versions.
2025-02-24 13:44:26 +02:00
clearml
8887453328
Cleanup error prints on bash startup script
2025-02-24 13:42:37 +02:00
clearml
8d3cb34390
Add default support for dnf, i.e. Rocky/CentOS/Fedora containers
2025-02-24 13:41:32 +02:00
clearml
528bf314ef
Update GpuFractionsHandler GPU name to mem size
2025-02-24 13:30:30 +02:00
clearml
4f91c45d38
Fix untitled file is now py/sh based on the requested binary
2025-02-24 13:29:56 +02:00
clearml
0a13fd79fc
Make sure that if we fail to kill a child process we continue to try the rest
2025-02-24 13:26:49 +02:00
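The behavior described in the entry above (keep going when one kill fails) can be sketched generically as follows. The function name `kill_all` and the injected `kill` callable are hypothetical, chosen so the example runs without real processes. This is not the agent's code.

```python
def kill_all(pids, kill):
    """Try to kill every pid; return the pids whose kill attempt failed.

    `kill` is any callable taking a pid, for example
    `lambda pid: os.kill(pid, signal.SIGTERM)`.
    """
    failed = []
    for pid in pids:
        try:
            kill(pid)
        except Exception:
            # Record the failure and continue with the remaining pids
            # instead of aborting the whole loop.
            failed.append(pid)
    return failed
```

Collecting the failures rather than raising on the first one lets the caller retry or log them after every process has been attempted.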
clearml
c9f5b3d19a
Force the stop command to avoid a potential race
2025-02-24 13:26:26 +02:00
clearml
4219835aa1
Fix pip requirements print dump should be sorted
2025-02-24 13:25:23 +02:00
clearml
d32b82cb01
Integrate docker port mapping, to control non network=host port mapping, including port reassigning for multiple running agents on the same machine
2025-02-24 13:24:58 +02:00
clearml
97cb47d48e
Add docker port mapping parsing and reassigning feature support
...
Add initial component import from clearml-sdk for easier integration
2025-02-24 13:24:35 +02:00
clearml
8f28d2882a
Fix pip freeze dump to comply with yaml fancy print
2025-02-24 13:23:50 +02:00
clearml
546ffff95d
Fix cached venv tried to reinstall priority packages even though they are preinstalled
2025-02-24 13:23:00 +02:00
clearml
a6ae6b2095
Add initial support for --break-system-packages version detection, but the reality is that we do not need it, because by the time we are running it is too late, so we do rm /usr/lib/python3.*/EXTERNALLY-MANAGED
2025-01-26 23:07:09 +02:00
clearml
369b440b96
Reduce required packages
2025-01-26 23:05:57 +02:00
clearml
28e9280a4f
Reduce required packages
2025-01-26 23:03:16 +02:00
clearml
7e9e3ad08b
Add printout when using custom configuration file
2025-01-26 22:51:09 +02:00
clearml
44709673f4
Add CLEARML_AGENT_CONFIG_VERBOSE for verbose configuration file loading
2025-01-26 22:50:49 +02:00
clearml
4158146420
Version bump to v1.9.3
2025-01-19 16:17:56 +02:00
clearml
91dfa09466
Fix Python 3.13 support
2025-01-05 12:14:24 +02:00
clearml
070919973b
Fix Python 3.6 compatibility, no := operator
2025-01-05 12:13:21 +02:00
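For context on the entry above: the `:=` assignment expression was added in Python 3.8, so code that must run on Python 3.6 has to assign on a separate line. A minimal sketch of the pattern, using a hypothetical `parse_version` helper that is not taken from the agent's source:

```python
import re


def parse_version(text):
    """Return (major, minor) parsed from the start of text, or None."""
    # On Python 3.8+ this could be written as:
    #     if (m := re.match(r"(\d+)\.(\d+)", text)):
    # The two-step form below also works on Python 3.6.
    m = re.match(r"(\d+)\.(\d+)", text)
    if m:
        return int(m.group(1)), int(m.group(2))
    return None
```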
clearml
47d35ef48f
Fix managed Python environment inside container (PEP 668): remove /usr/lib/python3.*/EXTERNALLY-MANAGED
2024-12-26 18:59:42 +02:00
clearml
54ed234fca
Add agent.docker_args_filters to configuration docs
2024-12-26 18:58:58 +02:00
clearml
a26860e79f
Fix default value handling in merge_dicts()
2024-12-26 18:58:24 +02:00
clearml
fc1abbab0b
Refactor k8s glue
2024-12-26 18:58:00 +02:00
clearml
4fa61dde1f
Support ignoring kubectl errors
2024-12-12 23:41:31 +02:00
clearml
26d748a4d8
Support creating queue with tags
2024-12-12 23:40:57 +02:00
clearml
d8366dedc6
Fix UV priority
...
Fix UV cache is disabled, UV handles its own cache
Fix UV freeze
Fix make sure we do not use pip cache if poetry/uv is used (even if we reverted to pip, we cannot know whether someone changed the repository so that a lock file now exists in a new version)
2024-12-12 23:38:42 +02:00
mads-oestergaard
cc656e2969
Add support for uv as package manager ( #218 )
...
* add uv as a package manager
* update configs
* update worker and defs
* update environ
* Update configs to highlight sync command
* rename to sync_extra_args and set UV_CACHE_DIR
2024-11-27 13:44:55 +02:00
clearml
b65e5fed94
Scan more Python 3 versions
2024-11-17 13:55:51 +02:00
clearml
3273f76b46
Version bump to v1.9.2
2024-10-28 18:33:04 +02:00
clearml
9af0f9fe41
Fix reload method is found in the config object
2024-10-28 18:12:22 +02:00
clearml
205cd47cb9
Fix use req_token_expiration_sec when creating a task session and not the default value
2024-10-28 18:11:42 +02:00
clearml
0ff428bb96
Fix report index not advancing in resource monitoring causes more than one GPU not to be reported
2024-10-28 18:11:00 +02:00
Matteo Destro
bf8d9c96e9
Handle OSError when checking for is_file ( #215 )
2024-10-13 10:08:03 +03:00
allegroai
a88487ff25
Add support for pip legacy resolver for versions specified in the agent.package_manager.pip_legacy_resolver configuration option
...
Add skip existing packages
2024-09-22 22:36:06 +03:00
Jake Henning
785e22dc87
Version bump to v1.9.1
2024-09-02 01:04:49 +03:00
Jake Henning
6a2b778d53
Add default pip version support for Python 3.12
2024-09-02 01:03:52 +03:00
allegroai
b2c3702830
Version bump to v1.9.0
2024-08-28 23:18:26 +03:00
allegroai
6302d43990
Add support for skipping container apt installs using CLEARML_AGENT_SKIP_CONTAINER_APT env var in k8s
...
Add runtime callback support for setting runtime properties per task in k8s
Fix remove task from pending queue and set to failed when kubectl apply fails
2024-08-27 23:01:27 +03:00
allegroai
760bbca74e
Fix failed Task in services mode logged "User aborted" instead of failed, add Task reason string
2024-08-27 22:56:37 +03:00
allegroai
e63fd31420
Fix string format
2024-08-27 22:55:49 +03:00
allegroai
2ff9985db7
Add user ID to the vault loading print
2024-08-27 22:55:32 +03:00
allegroai
b8c762401b
Fix use same state transition if supported by the server (instead of stopping the task before re-enqueue)
2024-08-27 22:54:45 +03:00
allegroai
99e1e54f94
Add support for tasks containing only bash script or python module command
2024-08-27 22:53:14 +03:00
allegroai
a4d3b5bad6
Fix only set Task started status on node rank 0
2024-08-27 22:52:31 +03:00
allegroai
b21665ed6e
Fix do not cache venv cache if venv/python skip env var was set
2024-08-27 22:52:01 +03:00