allegroai
f5861b1e4a
Change default `agent.enable_git_ask_pass` to True
2023-12-20 17:44:41 +02:00
allegroai
030cbb69f1
Fix check if process return code is SIGKILL (-9 or 137) and abort callback was called, do not mark as failed but as aborted
2023-12-20 17:43:02 +02:00
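The check described in the commit above can be sketched roughly as follows. This is only an illustration, not the agent's actual code: the function name `resolve_final_status` and the status strings are hypothetical, and only the return codes -9 and 137 come from the commit message.

```python
# Return codes that indicate the process was killed with SIGKILL:
# -9 is how Python's subprocess reports it, 137 (128 + 9) is how a shell
# or container runtime typically reports it.
SIGKILL_RETURN_CODES = (-9, 137)


def resolve_final_status(return_code, abort_callback_called):
    """Hypothetical helper: map a process return code to a task status."""
    if return_code == 0:
        return "completed"
    # A SIGKILL that follows an abort callback is an intentional stop,
    # so it is reported as aborted rather than failed.
    if return_code in SIGKILL_RETURN_CODES and abort_callback_called:
        return "aborted"
    return "failed"
```

With this sketch, `resolve_final_status(137, True)` yields `"aborted"`, while the same return code without an abort callback is still treated as a failure.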
allegroai
564f769ff7
Add `agent.docker_args_extra_precedes_task`, `agent.protected_docker_extra_args` to prevent the same switch from being used by both `extra_docker_args` and a Task's docker args
2023-12-20 17:42:36 +02:00
allegroai
dd5d24b0ca
Add CLEARML_AGENT_TEMP_STDOUT_FILE_DIR to allow specifying temp dir used for storing agent log files and temporary log files (daemon and execution)
2023-11-14 11:45:13 +02:00
allegroai
996bb797c3
Add env var in case we're running a service task
2023-11-14 11:44:36 +02:00
allegroai
9ad49a0d21
Fix KeyError if container does not contain the arguments field
2023-11-01 15:11:07 +02:00
allegroai
ba4fee7b19
Fix agent.package_manager.poetry_install_extra_args are used in all Poetry commands and not just in install (#173)
2023-11-01 15:10:40 +02:00
allegroai
0131db8b7d
Add support for resource_applied() callback in k8s glue
...
Add support for sending log events with k8s-provided timestamps
Refactor env vars infrastructure
2023-11-01 15:10:08 +02:00
allegroai
d2384a9a95
Add example and support for prebuilt containers, including services-mode support, with the CLEARML_AGENT_FORCE_CODE_DIR and CLEARML_AGENT_FORCE_EXEC_SCRIPT overrides
2023-11-01 15:05:57 +02:00
allegroai
5b86c230c1
Fix an environment variable that should be set with a numerical value of 0 (i.e. end up as "0" or "0.0") is set to an empty string
2023-11-01 15:04:59 +02:00
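A common way this kind of bug arises is a truthiness test on the value, which treats numeric zero as "unset". The sketch below illustrates that pattern and one fix. It is an assumption about the general failure mode, not the agent's actual code, and both function names are hypothetical.

```python
def env_value_buggy(value):
    # Truthiness check: 0 and 0.0 are falsy, so they collapse to "".
    return str(value) if value else ""


def env_value_fixed(value):
    # Only None means "unset"; numeric zero is kept as "0" or "0.0".
    return "" if value is None else str(value)
```

Here `env_value_buggy(0)` returns an empty string, while `env_value_fixed(0)` returns `"0"`.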
allegroai
21e4be966f
Fix recursion issue when deep-copying a session
2023-11-01 15:04:24 +02:00
allegroai
9c6cb421b3
When cleaning up pending pods, verify task is still aborted and pod is still pending before deleting the pod
2023-11-01 15:04:01 +02:00
allegroai
52405c343d
Fix k8s glue configuration might be contaminated when changed during apply
2023-11-01 15:03:37 +02:00
allegroai
46f0c991c8
Add status reason when aborting before moving to k8s_scheduler queue
2023-11-01 15:02:24 +02:00
allegroai
0254279ed5
Version bump to v1.6.1
2023-09-06 15:41:29 +03:00
allegroai
58e0dc42ec
Version bump to v1.6.0
2023-09-05 15:05:11 +03:00
allegroai
d16825029d
Add new pytorch no resolver mode and CLEARML_AGENT_PACKAGE_PYTORCH_RESOLVE to change resolver on a Task basis, now supports "pip", "direct", "none"
2023-09-02 17:45:10 +03:00
allegroai
fb639afcb9
Fix PyTorch extra index pip resolver
2023-09-02 17:43:41 +03:00
Alex Burlacu
ed1356976b
Move extra configurations to Worker init to make sure all available configurations can be overridden
2023-08-24 19:00:36 +03:00
Alex Burlacu
2b815354e0
Improve file mode comment
2023-08-24 18:53:00 +03:00
Alex Burlacu
edae380a9e
Version bump
2023-08-24 18:51:47 +03:00
Alex Burlacu
946e9d9ce9
Fix invalid reference
2023-08-24 18:51:27 +03:00
allegroai
159a6e9a5a
Fix runtime property overriding existing properties
2023-07-20 10:41:15 +03:00
pollfly
6e7d35a42a
Improve configuration files (#160)
2023-07-11 10:32:01 +03:00
allegroai
4c056a17b9
Add support for k8s jobs execution
...
Strip docker container obtained from task in k8s apply
2023-07-04 14:45:00 +03:00
allegroai
21d98afca5
Add support for extra docker arguments referencing the machine's environment variables, using the agent.docker_allow_host_environ configuration option, so that users can also use $ENV in the task's docker arguments
2023-07-04 14:42:28 +03:00
allegroai
6a1bf11549
Fix Task docker arguments passed twice
2023-07-04 14:41:07 +03:00
allegroai
7115a9b9a7
Add CLEARML_EXTRA_PIP_INSTALL_FLAGS / agent.package_manager.extra_pip_install_flags to control additional pip install flags
...
Fix pip version marking in "installed packages" is now preserved and reinstalled
2023-07-04 14:39:40 +03:00
allegroai
450df2f8d3
Support skipping agent pip upgrade in container bash script using the CLEARML_AGENT_NO_UPDATE env var
2023-07-04 14:38:50 +03:00
allegroai
ccf752c4e4
Add support for setting mode on files applied by the agent
2023-07-04 14:37:58 +03:00
allegroai
3ed63e2154
Fix docker container backwards compatibility for API <2.13
...
Fix default docker match rules resolver (used incorrect field "container" instead of "image")
Remove "container" (image) match rule option from default docker image resolver
2023-07-04 14:37:18 +03:00
allegroai
a535f93cd6
Add support for CLEARML_AGENT_FORCE_MAX_API_VERSION for testing
2023-07-04 14:35:54 +03:00
allegroai
b380ec54c6
Improve config file comments
2023-07-04 14:34:43 +03:00
allegroai
a1274299ce
Add support for CLEARML_AGENT_EXTRA_DOCKER_LABELS env var
2023-07-03 11:08:59 +03:00
allegroai
c77224af68
Add support for task field injection into container docker name
2023-07-03 11:07:12 +03:00
allegroai
95dadca45c
Refactor k8s glue running/used pods getter
2023-05-21 22:56:12 +03:00
allegroai
685918fd9b
Version bump to v1.5.3rc3
2023-05-21 22:54:38 +03:00
allegroai
bc85ddf78d
Fix pytorch direct resolve replacing wheel link with directly installed version
2023-05-21 22:53:51 +03:00
allegroai
5b5fb0b8a6
Add `agent.package_manager.pytorch_resolve` configuration setting with `pip` or `direct` values. `pip` sets extra index based on cuda and lets pip resolve, `direct` is the previous parsing algorithm that does the matching and downloading (default `pip`)
2023-05-21 22:53:11 +03:00
allegroai
fec0ce1756
Better message for agent init when an existing clearml.conf is found
2023-05-21 22:51:11 +03:00
allegroai
1e09b88b7a
Add alias `CLEARML_AGENT_DOCKER_AGENT_REPO` env var for the `FORCE_CLEARML_AGENT_REPO` env var
2023-05-21 22:50:01 +03:00
allegroai
b6ca0fa6a5
Print error on resource monitor failure
2023-05-11 16:18:11 +03:00
allegroai
307ec9213e
Fix git+ssh:// links inside installed packages not being converted properly to HTTPS authenticated and vice versa
2023-05-11 16:16:51 +03:00
allegroai
a78a25d966
Support new `Retry.DEFAULT_BACKOFF_MAX` in a backwards-compatible way
2023-05-11 16:16:18 +03:00
allegroai
ebb6231f5a
Add CLEARML_AGENT_STANDALONE_CONFIG_BC to support backwards compatibility in standalone mode
2023-05-11 16:15:06 +03:00
allegroai
3fe92a92ba
Version bump to v1.5.2
2023-03-29 12:49:33 +03:00
allegroai
154db59ce6
Add agent.package_manager.poetry_install_extra_args configuration option
2023-03-28 14:37:48 +03:00
allegroai
afffa83063
Fix git+ssh:// links inside installed packages not being converted properly to https authenticated links
2023-03-28 14:35:51 +03:00
allegroai
787c7d88bb
Fix additional poetry cwd support feature
2023-03-28 14:35:41 +03:00
allegroai
667c2ced3d
Fix very old pip version support (<20)
2023-03-28 14:34:19 +03:00
allegroai
7f5b3c8df4
Fix None config file in session causes k8s agent to raise exception
2023-03-28 14:33:55 +03:00
allegroai
46ded2864d
Fix restart feature should be tested against agent session
2023-03-28 14:33:33 +03:00
allegroai
40456be948
Black formatting
...
Refactor path support
2023-03-05 18:05:00 +02:00
allegroai
8d51aed679
Protect against cache folders without permission
2023-03-05 18:05:00 +02:00
allegroai
bfc4ba38cd
Fix torch inside nvidia containers to use preinstalled version (i.e. ==x.y.z.* matching)
2023-03-05 18:05:00 +02:00
Niels ten Boom
3cedc104df
Add poetry cwd support (#142)
...
Closes #138
2023-03-05 14:19:57 +02:00
allegroai
95e996bfda
Reintroduce `CLEARML_AGENT_SERVICES_DOCKER_RESTART` accidentally reverted by a previous merge
2023-02-05 10:34:38 +02:00
allegroai
b6d132b226
Fix build fails when target is relative path
2023-02-05 10:33:32 +02:00
allegroai
4f17a2c17d
Fix K8s glue does not delete pending pods if the tasks they represent were aborted
2023-02-05 10:32:16 +02:00
allegroai
00e8e9eb5a
Do not allow request exceptions (only on the initial login call)
2023-02-05 10:30:45 +02:00
allegroai
af6a77918f
Fix `_` is allowed in k8s label names
2023-02-05 10:29:48 +02:00
allegroai
855622fd30
Support custom service on `Worker.get()` calls
2023-02-05 10:29:09 +02:00
allegroai
8cd12810f3
Fix login uses GET with payload which breaks when trying to connect a server running in GCP
2023-02-05 10:28:41 +02:00
pollfly
85e1fadf9b
Fix typos (#131)
2022-12-28 19:39:59 +02:00
allegroai
249b51a31b
Version bump
2022-12-13 15:29:10 +02:00
allegroai
da19ef26c4
Fix pinging running task (and change default to once a minute)
2022-12-13 15:26:26 +02:00
allegroai
f69e16ea9d
Fix `clearml-agent build --docker` stuck on certain containers
2022-12-13 15:24:32 +02:00
allegroai
efa1f71dac
Version bump to v1.5.1
2022-12-10 22:18:21 +02:00
allegroai
ebdc215632
Remove `"` from pip commands in venv
2022-12-10 20:58:30 +02:00
allegroai
b2da639582
Add `CLEARML_AGENT_FORCE_SYSTEM_SITE_PACKAGES` env var (default true) to allow overriding default "system_site_packages: true" behavior when running tasks in containers (docker mode and k8s-glue)
2022-12-10 20:00:46 +02:00
allegroai
71fdb43f10
Version bump to v1.5.1rc0
2022-12-07 22:09:40 +02:00
allegroai
ca2791c65e
Fix pip support allowing multiple pip version constraints (by default, one for <PY3.10 and one for >=PY3.10)
2022-12-07 22:09:25 +02:00
allegroai
669fb1a6e5
Fix using deprecated types validator argument raises an error (deprecated even before jsonschema 3.0.0 and unsupported since 4.0.0)
2022-12-07 22:07:53 +02:00
allegroai
5d517c91b5
Add `agent.disable_task_docker_override` configuration option to disable docker override specified in executing tasks
2022-12-07 22:07:11 +02:00
allegroai
6be75abc86
Add default output URI selection to "clearml-agent init"
2022-12-07 22:06:10 +02:00
allegroai
4c777fa2ee
Version bump to v1.5.0
2022-12-05 16:42:44 +02:00
allegroai
dc5e0033c8
Remove support for kubectl run
...
Allow customizing pod name prefix and limit pod label
Return deleted pods from cleanup
Some refactoring
2022-12-05 11:40:19 +02:00
allegroai
3dd5973734
Filter by phase when detecting hanging pods
...
More debug print-outs
Use task session when possible
Push task into k8s scheduler queue only if running from the same tenant
Make sure we pass git_user/pass to the task pod
Fix cleanup command not issued when no pods exist in a multi-queue setup
2022-12-05 11:29:59 +02:00
allegroai
53d379205f
Support `raise_error` in `get_bash_output()`
2022-12-05 11:26:40 +02:00
allegroai
57cde21c48
Send `task.ping` for executing tasks every 120 seconds (set using the `agent.task_ping_interval_sec` configuration option)
2022-12-05 11:22:25 +02:00
allegroai
396abf13b6
Fix `get_task_session()` may cause an old copy of the `APIClient` to be used containing a reference to the previous session
2022-12-05 11:20:32 +02:00
allegroai
6e7fb5f331
Fix sending task logs fails when agent is not running in the same tenant
2022-12-05 11:19:14 +02:00
allegroai
1d5c118b70
Fix setting `CLEARML_API_DEFAULT_REQ_METHOD` raises an error
2022-12-05 11:18:12 +02:00
allegroai
76c533a2e8
Fix access to config object
2022-11-11 13:34:17 +02:00
Niels ten Boom
9eee213683
Add option to crash agent on exception using `agent.crash_on_exception` configuration setting (#123)
2022-11-06 17:15:39 +02:00
allegroai
e4861fc0fb
Add missing settings in clearml.conf
2022-11-06 12:36:01 +02:00
allegroai
26e62da1a8
version bump to 1.5.0rc0
2022-10-23 13:04:00 +03:00
allegroai
d2f3614ab0
Add support for CLEARML_AGENT_DOCKER_ARGS_HIDE_ENV environment variable (see `agent.hide_docker_command_env_vars` config option)
2022-10-23 13:04:00 +03:00
allegroai
c6d767bd64
Make venv caching the default behavior
2022-10-23 13:04:00 +03:00
allegroai
efb06891a8
Add support for PyTorch new extra_index_url repo support. We will find the correct index url based on the cuda version, and let pip do the rest.
2022-10-23 13:04:00 +03:00
allegroai
70771b12a9
Remove unused code
2022-10-23 13:04:00 +03:00
allegroai
3f7a4840cc
Add support for operator != in package version (mostly for pytorch resolving)
2022-10-23 13:04:00 +03:00
allegroai
e28048dc25
Change default pip version used to "pip<21" for better Python 3.10 support
2022-10-23 13:04:00 +03:00
allegroai
2ef5d38b32
Remove future (Python 2 is not supported for clearml-agent)
2022-10-23 13:03:59 +03:00
allegroai
0de10345f7
Moved pyhocon to internal packages
2022-10-23 13:03:59 +03:00
allegroai
a243fa211f
Improve venv cache disabled message
2022-10-23 13:03:59 +03:00
allegroai
d794b047be
Fix system_site_packages is not turned on in k8s glue
2022-10-23 13:03:59 +03:00
allegroai
f0fd62a28f
Fix docker extra args showing up in configuration printout
2022-10-23 13:03:59 +03:00
allegroai
e8493d3807
Refactor override configuration to a method
2022-10-23 13:03:58 +03:00
allegroai
ef47225d41
Version bump to v1.4.1
2022-10-07 15:27:49 +03:00
allegroai
e61accefb9
PEP8 + refactor
2022-10-07 15:26:31 +03:00
allegroai
5c1543d112
Add `agent.disable_ssh_mount` configuration option (same as `CLEARML_AGENT_DISABLE_SSH_MOUNT` env var)
2022-10-07 15:24:39 +03:00
allegroai
7ff6aee20c
Add warning if venv cache is disabled
2022-10-07 15:23:10 +03:00
allegroai
37ea381d98
Add support for docker args filters
2022-10-07 15:22:42 +03:00
allegroai
67fc884895
Fix `--gpus all` not reporting GPU stats on worker machine
2022-10-07 15:22:13 +03:00
allegroai
1e3646b57c
Fix docker command for monitoring child agents
2022-10-07 15:21:32 +03:00
allegroai
ba2db4e727
Version bump to v1.4.0
2022-09-29 18:21:04 +03:00
allegroai
077148be00
version bump
2022-09-16 17:29:42 +03:00
allegroai
594ee5842e
Allow overriding the pytorch lookup page: "agent.package_manager.torch_page / torch_nightly_page / torch_url_template_prefix"
2022-09-15 20:16:41 +03:00
allegroai
a69766bd8b
Add CLEARML_AGENT_CHILD_AGENTS_COUNT_CMD to allow overriding child agent count command in k8s
2022-09-15 20:16:01 +03:00
allegroai
857a750eb1
Fix GCP load balancer not forwarding GET request body; allow changing the default request action to Put/Post/Get (see api.http.default_method or CLEARML_API_DEFAULT_REQ_METHOD)
2022-09-15 20:15:42 +03:00
allegroai
26aa50f1b5
Fix k8s glue extra_bash_init_cmd location in initial bash script
2022-09-02 23:50:03 +03:00
allegroai
8b4f1eefc2
Add more debug printouts in k8s glue
2022-09-02 23:49:28 +03:00
allegroai
97c2e21dcc
Fix resolving k8s pending queue may cause a queue with a uuid name to be created
2022-09-02 23:49:28 +03:00
allegroai
918dd39b87
Add docker ssh_ro_folder (default: "/.ssh"), change docker ssh_folder (default: "~/.ssh")
2022-09-02 23:49:27 +03:00
allegroai
7776e906c4
Fix second .ssh temp mount fails if container changes the files inside
2022-09-02 23:49:27 +03:00
allegroai
1bf865ec08
Fix name not escaped as regex (all services "get_all" use regex for name)
2022-09-02 23:49:27 +03:00
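To illustrate why an unescaped name matters when the server matches names as regular expressions, here is a minimal sketch using Python's standard `re.escape`. The helper name `exact_name_pattern` and the anchoring with `^`/`$` are assumptions made for the example, not the agent's actual implementation.

```python
import re


def exact_name_pattern(name):
    """Hypothetical helper: build a regex that matches `name` literally."""
    # re.escape neutralizes characters such as ".", "+" and "(" that
    # would otherwise be interpreted as regex operators.
    return "^" + re.escape(name) + "$"


queue_name = "gpu.queue+1"

# Unescaped: "+" acts as a quantifier, so the literal name does not match.
unescaped_match = re.search("^" + queue_name + "$", queue_name)

# Escaped: the name matches itself and nothing else.
escaped_match = re.search(exact_name_pattern(queue_name), queue_name)
```

In this sketch `unescaped_match` is `None`, while `escaped_match` is a match object.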
allegroai
9006c2d28f
Add support for abort callback registration
2022-08-29 18:06:59 +03:00
allegroai
ec216198a0
Add agent.enable_git_ask_pass to improve passing user/pass to git commands
2022-08-29 18:06:26 +03:00
allegroai
fe6adbf110
Fix package @ file:// with quoted (url style) links should not be ignored
2022-08-29 18:06:09 +03:00
allegroai
2693c565ba
Fix docker mode use "~/.clearml/venvs-builds" as default for easier user-mode containers
2022-08-29 18:05:53 +03:00
allegroai
7292263f86
Add CLEARML_K8S_GLUE_START_AGENT_SCRIPT_PATH to allow customizing the agent startup script location for k8s glue agent
2022-08-23 23:16:36 +03:00
allegroai
f8a6cd697f
Add k8s agent debug env var
2022-08-23 23:15:53 +03:00
allegroai
ec9d027678
Add support for MIG devices, use 0:1 for GPU 0 slice 1 (or use 0.1)
2022-08-01 18:58:42 +03:00
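The `0:1` / `0.1` notation from the commit above could be parsed along these lines. This is a sketch under assumptions: the function name `parse_gpu_spec` and the `(gpu_index, mig_slice)` return shape are hypothetical and only illustrate the notation, not how the agent handles it internally.

```python
def parse_gpu_spec(spec):
    """Hypothetical parser: "0:1" or "0.1" -> (0, 1); plain "0" -> (0, None)."""
    for separator in (":", "."):
        if separator in spec:
            gpu_index, mig_slice = spec.split(separator, 1)
            return int(gpu_index), int(mig_slice)
    # No separator: a whole GPU, with no MIG slice selected.
    return int(spec), None
```

Both `parse_gpu_spec("0:1")` and `parse_gpu_spec("0.1")` give `(0, 1)`, that is, GPU 0 slice 1.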
allegroai
48a145a8bd
Fix messages
2022-08-01 18:57:36 +03:00
allegroai
12a8872b27
Fix Python 3.10+ support
2022-08-01 18:56:37 +03:00
allegroai
820ab4dc0c
Fix k8s glue debug mode, refactoring
2022-08-01 18:55:49 +03:00
allegroai
d96b8ff906
Fix template namespace should override default namespace
2022-07-22 22:44:32 +03:00
allegroai
e687418194
Refactor k8s glue template handling
2022-07-22 22:43:07 +03:00
allegroai
a5a797ec5e
Version bump to v1.3.0
2022-06-16 23:24:28 +03:00
allegroai
ff6cee4a44
Fix requirements --extra-index-url line with trailing comment
...
Fix --extra-index-url is added for different command line switches
2022-06-16 23:22:29 +03:00
allegroai
9acbad28f7
Fix repository URL contains credentials even when agent.force_git_ssh_protocol is true
2022-06-16 23:20:53 +03:00
allegroai
560e689ccd
Fix always make `pygobject` an optional package (i.e. if installation fails continue the Task package environment setup)
2022-06-16 23:18:55 +03:00
allegroai
f66e42ddb1
Fix optional priority packages always compare lower case package name
2022-06-16 23:18:31 +03:00
Niels ten Boom
24177cc5a9
Support private repos from requirements.txt file (#107)
...
* support private repos
* fix double indices
2022-06-15 10:26:24 +03:00
allegroai
51eb0a713c
Version bump
2022-05-12 23:31:54 +03:00
allegroai
249aa006cb
Make sure that if we have "setuptools" in the original required packages, we preserve the line in the pip freeze list
2022-05-12 23:31:32 +03:00
allegroai
c08e2ac0bb
Fix clearml.conf access in non-root containers
2022-05-05 12:23:11 +03:00
allegroai
335ef91d8e
Fix git unsafe directory issue (disable check on cached vcs folder)
2022-05-05 12:22:40 +03:00
allegroai
6c7a639673
Fix broken pytorch setuptools incompatibility (force setuptools < 59 if torch is below 1.11)
2022-05-05 12:22:13 +03:00
allegroai
5f77cad5ac
Fix error message
2022-04-27 15:36:39 +03:00
allegroai
0228ae0494
Set environment variables before expanding path
2022-04-27 15:14:16 +03:00
allegroai
165677e800
Version bump
2022-04-27 14:59:51 +03:00
allegroai
2e5298b737
Add support for use-owner-token in k8s glue
2022-04-27 14:59:27 +03:00
allegroai
c9ffb8a053
Version bump
2022-04-20 08:57:16 +03:00
allegroai
2466eed23f
Fix dynamic GPUs with "all" GPUs on the same worker
2022-04-20 08:56:22 +03:00
allegroai
6e31171d31
Version bump to v1.2.3
2022-04-14 22:39:38 +03:00
allegroai
e43f31eb80
Version bump
2022-04-13 10:02:25 +03:00
allegroai
f50ba005b5
Protect dynamic GPUs from failing to parse worker GPU index
2022-04-13 10:01:50 +03:00
allegroai
1011544533
Fix copy breaks agent and nulls the worker name
2022-04-13 10:01:12 +03:00