Compare commits

...

25 Commits

Author · SHA1 · Message · Date
clearml · 4158146420 · Version bump to v1.9.3 · 2025-01-19 16:17:56 +02:00
clearml · b9ef1a55cd · Fix dependency on windows · 2025-01-19 16:16:54 +02:00
clearml · 9fa8d72640 · Update github repo link · 2025-01-13 18:36:16 +02:00
clearml · e535390815 · Add win32file dependency on windows · 2025-01-13 18:34:48 +02:00
clearml · 91dfa09466 · Fix Python 3.13 support · 2025-01-05 12:14:24 +02:00
clearml · f110bbf5b4 · Remove Python 3.5 support · 2025-01-05 12:13:57 +02:00
clearml · 070919973b · Fix python 3.6 compatibility, no := operator · 2025-01-05 12:13:21 +02:00
clearml · 47d35ef48f · Fix managed python environment inside container (PEP 668): remove usr/lib/python3.*/EXTERNALLY-MANAGED · 2024-12-26 18:59:42 +02:00
clearml · 54ed234fca · Add agent.docker_args_filters to configuration docs · 2024-12-26 18:58:58 +02:00
clearml · a26860e79f · Fix default value handling in merge_dicts() · 2024-12-26 18:58:24 +02:00
clearml · fc1abbab0b · Refactor k8s glue · 2024-12-26 18:58:00 +02:00
clearml · 4fa61dde1f · Support ignoring kubectl errors · 2024-12-12 23:41:31 +02:00
clearml · 26d748a4d8 · Support creating queue with tags · 2024-12-12 23:40:57 +02:00
clearml · 5419fd84ae · Add support for Python 3.13 · 2024-12-12 23:39:11 +02:00
clearml · d8366dedc6 · Fix UV priority · 2024-12-12 23:38:42 +02:00
  Fix UV cache is disabled, UV handles its own cache
  Fix UV freeze
  Fix make sure we do not use pip cache if poetry/uv is used (even if we reverted to pip we can't know if someone changed the repository and now in a new version, a lock file exists)
mads-oestergaard · cc656e2969 · Add support for uv as package manager (#218) · 2024-11-27 13:44:55 +02:00
  * add uv as a package manager
  * update configs
  * update worker and defs
  * update environ
  * Update configs to highlight sync command
  * rename to sync_extra_args and set UV_CACHE_DIR
clearml · b65e5fed94 · Scan more Python 3 versions · 2024-11-17 13:55:51 +02:00
clearml · 3273f76b46 · Version bump to v1.9.2 · 2024-10-28 18:33:04 +02:00
clearml · 9af0f9fe41 · Fix reload method is found in the config object · 2024-10-28 18:12:22 +02:00
clearml · 205cd47cb9 · Fix use req_token_expiration_sec when creating a task session and not the default value · 2024-10-28 18:11:42 +02:00
clearml · 0ff428bb96 · Fix report index not advancing in resource monitoring causes more than one GPU not to be reported · 2024-10-28 18:11:00 +02:00
Matteo Destro · bf8d9c96e9 · Handle OSError when checking for is_file (#215) · 2024-10-13 10:08:03 +03:00
allegroai · a88487ff25 · Add support for pip legacy resolver for versions specified in the agent.package_manager.pip_legacy_resolver configuration option · 2024-09-22 22:36:06 +03:00
  Add skip existing packages
Jake Henning · 785e22dc87 · Version bump to v1.9.1 · 2024-09-02 01:04:49 +03:00
Jake Henning · 6a2b778d53 · Add default pip version support for Python 3.12 · 2024-09-02 01:03:52 +03:00
25 changed files with 744 additions and 123 deletions

View File

@@ -186,7 +186,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2019 allegro.ai
Copyright 2025 ClearML Inc
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

View File

@@ -1,15 +1,15 @@
<div align="center">
<img src="https://github.com/allegroai/clearml-agent/blob/master/docs/clearml_agent_logo.png?raw=true" width="250px">
<img src="https://github.com/clearml/clearml-agent/blob/master/docs/clearml_agent_logo.png?raw=true" width="250px">
**ClearML Agent - MLOps/LLMOps made easy
MLOps/LLMOps scheduler & orchestration solution supporting Linux, macOS and Windows**
[![GitHub license](https://img.shields.io/github/license/allegroai/clearml-agent.svg)](https://img.shields.io/github/license/allegroai/clearml-agent.svg)
[![GitHub license](https://img.shields.io/github/license/clearml/clearml-agent.svg)](https://img.shields.io/github/license/clearml/clearml-agent.svg)
[![PyPI pyversions](https://img.shields.io/pypi/pyversions/clearml-agent.svg)](https://img.shields.io/pypi/pyversions/clearml-agent.svg)
[![PyPI version shields.io](https://img.shields.io/pypi/v/clearml-agent.svg)](https://img.shields.io/pypi/v/clearml-agent.svg)
[![PyPI Downloads](https://pepy.tech/badge/clearml-agent/month)](https://pypi.org/project/clearml-agent/)
[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/allegroai)](https://artifacthub.io/packages/search?repo=allegroai)
[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/clearml)](https://artifacthub.io/packages/search?repo=clearml)
`🌟 ClearML is open-source - Leave a star to support the project! 🌟`
@@ -33,21 +33,21 @@ It is a zero configuration fire-and-forget execution agent, providing a full ML/
**Full Automation in 5 steps**
1. ClearML Server [self-hosted](https://github.com/allegroai/clearml-server)
1. ClearML Server [self-hosted](https://github.com/clearml/clearml-server)
or [free tier hosting](https://app.clear.ml)
2. `pip install clearml-agent` ([install](#installing-the-clearml-agent) the ClearML Agent on any GPU machine:
on-premises / cloud / ...)
3. Create a [job](https://clear.ml/docs/latest/docs/apps/clearml_task) or
add [ClearML](https://github.com/allegroai/clearml) to your code with just 2 lines of code
add [ClearML](https://github.com/clearml/clearml) to your code with just 2 lines of code
4. Change the [parameters](#using-the-clearml-agent) in the UI & schedule for [execution](#using-the-clearml-agent) (or
automate with an [AutoML pipeline](#automl-and-orchestration-pipelines-))
5. :chart_with_downwards_trend: :chart_with_upwards_trend: :eyes: :beer:
"All the Deep/Machine-Learning DevOps your research needs, and then some... Because ain't nobody got time for that"
**Try ClearML now** [Self Hosted](https://github.com/allegroai/clearml-server)
**Try ClearML now** [Self Hosted](https://github.com/clearml/clearml-server)
or [Free tier Hosting](https://app.clear.ml)
<a href="https://app.clear.ml"><img src="https://github.com/allegroai/clearml-agent/blob/master/docs/screenshots.gif?raw=true" width="100%"></a>
<a href="https://app.clear.ml"><img src="https://github.com/clearml/clearml-agent/blob/master/docs/screenshots.gif?raw=true" width="100%"></a>
### Simple, Flexible Experiment Orchestration
@@ -71,7 +71,7 @@ or [Free tier Hosting](https://app.clear.ml)
We think Kubernetes is awesome, but it is not a must to get started with remote execution agents and cluster management.
We designed `clearml-agent` so you can run both bare-metal and on top of Kubernetes, in any combination that fits your environment.
You can find the Dockerfiles in the [docker folder](./docker) and the helm Chart in https://github.com/allegroai/clearml-helm-charts
You can find the Dockerfiles in the [docker folder](./docker) and the helm Chart in https://github.com/clearml/clearml-helm-charts
#### Benefits of integrating existing Kubernetes cluster with ClearML
@@ -86,8 +86,8 @@ You can find the Dockerfiles in the [docker folder](./docker) and the helm Chart
- **Enterprise Features**: RBAC, vault, multi-tenancy, scheduler, quota management, fractional GPU support
**Run the agent in Kubernetes Glue mode and map ClearML jobs directly to K8s jobs:**
- Use the [ClearML Agent Helm Chart](https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml-agent) to spin an agent pod acting as a controller
- Or run the [clearml-k8s glue](https://github.com/allegroai/clearml-agent/blob/master/examples/k8s_glue_example.py) on
- Use the [ClearML Agent Helm Chart](https://github.com/clearml/clearml-helm-charts/tree/main/charts/clearml-agent) to spin an agent pod acting as a controller
- Or run the [clearml-k8s glue](https://github.com/clearml/clearml-agent/blob/master/examples/k8s_glue_example.py) on
a Kubernetes cpu node
- The clearml-k8s glue pulls jobs from the ClearML job execution queue and prepares a Kubernetes job (based on provided
yaml template)
@@ -151,7 +151,7 @@ The ClearML Agent executes experiments using the following process:
#### System Design & Flow
<img src="https://github.com/allegroai/clearml-agent/blob/master/docs/clearml_architecture.png" width="100%" alt="clearml-architecture">
<img src="https://github.com/clearml/clearml-agent/blob/master/docs/clearml_architecture.png" width="100%" alt="clearml-architecture">
#### Installing the ClearML Agent
@@ -279,7 +279,7 @@ clearml-agent daemon --detached --gpus 0 --queue default --docker nvidia/cuda:11
### How do I create an experiment on the ClearML Server? <a name="from-scratch"></a>
* Integrate [ClearML](https://github.com/allegroai/clearml) with your code
* Integrate [ClearML](https://github.com/clearml/clearml) with your code
* Execute the code on your machine (Manually / PyCharm / Jupyter Notebook)
* As your code is running, **ClearML** creates an experiment logging all the necessary execution information:
- Git repository link and commit ID (or an entire jupyter notebook)
@@ -326,21 +326,21 @@ The ClearML Agent can also be used to implement AutoML orchestration and Experim
ClearML package.
Sample AutoML & Orchestration examples can be found in the
ClearML [example/automation](https://github.com/allegroai/clearml/tree/master/examples/automation) folder.
ClearML [example/automation](https://github.com/clearml/clearml/tree/master/examples/automation) folder.
AutoML examples:
- [Toy Keras training experiment](https://github.com/allegroai/clearml/blob/master/examples/optimization/hyper-parameter-optimization/base_template_keras_simple.py)
- [Toy Keras training experiment](https://github.com/clearml/clearml/blob/master/examples/optimization/hyper-parameter-optimization/base_template_keras_simple.py)
- In order to create an experiment-template in the system, this code must be executed once manually
- [Random Search over the above Keras experiment-template](https://github.com/allegroai/clearml/blob/master/examples/automation/manual_random_param_search_example.py)
- [Random Search over the above Keras experiment-template](https://github.com/clearml/clearml/blob/master/examples/automation/manual_random_param_search_example.py)
- This example will create multiple copies of the Keras experiment-template, with different hyperparameter
combinations
Experiment Pipeline examples:
- [First step experiment](https://github.com/allegroai/clearml/blob/master/examples/automation/task_piping_example.py)
- [First step experiment](https://github.com/clearml/clearml/blob/master/examples/automation/task_piping_example.py)
- This example will "process data", and once done, will launch a copy of the 'second step' experiment-template
- [Second step experiment](https://github.com/allegroai/clearml/blob/master/examples/automation/toy_base_task.py)
- [Second step experiment](https://github.com/clearml/clearml/blob/master/examples/automation/toy_base_task.py)
- In order to create an experiment-template in the system, this code must be executed once manually
### License

View File

@@ -54,22 +54,26 @@
# docker_use_activated_venv: true
# select python package manager:
# currently supported: pip, conda and poetry
# currently supported: pip, conda, uv and poetry
# if "pip" or "conda" are used, the agent installs the required packages
# based on the "installed packages" section of the Task. If the "installed packages" is empty,
# it will revert to using `requirements.txt` from the repository's root directory.
# If Poetry is selected and the root repository contains `poetry.lock` or `pyproject.toml`,
# the "installed packages" section is ignored, and poetry is used.
# If Poetry is selected and no lock file is found, it reverts to "pip" package manager behaviour.
# If uv is selected and the root repository contains `uv.lock` or `pyproject.toml`,
# the "installed packages" section is ignored, and uv is used.
package_manager: {
# supported options: pip, conda, poetry
# supported options: pip, conda, poetry, uv
type: pip,
# specify pip version to use (examples "<20.2", "==19.3.1", "", empty string will install the latest version)
pip_version: ["<20.2 ; python_version < '3.10'", "<22.3 ; python_version >= '3.10'"],
pip_version: ["<20.2 ; python_version < '3.10'", "<22.3 ; python_version >= '3.10' and python_version <= '3.11'", ">=23,<24.3 ; python_version >= '3.12'"]
# specify poetry version to use (examples "<2", "==1.1.1", "", empty string will install the latest version)
# poetry_version: "<2",
# poetry_install_extra_args: ["-v"]
# uv_version: ">0.4",
# uv_sync_extra_args: ["--all-extras"]
# virtual environment inherits packages from system
system_site_packages: false,
@@ -80,6 +84,14 @@
# additional artifact repositories to use when installing python packages
# extra_index_url: ["https://allegroai.jfrog.io/clearml/api/pypi/public/simple"]
# turn on the "--use-deprecated=legacy-resolver" flag for pip, to avoid package dependency version mismatch
# if any of the version restrictions is matched, we add the "--use-deprecated=legacy-resolver" flag
# example: pip_legacy_resolver = [">=20.3,<24.3", ">99"]
# if pip==20.2 or pip==29.0 is installed we do nothing,
# if pip==21.1 or pip==101.1 is installed the flag is added
# disable the feature by passing an empty list
pip_legacy_resolver = [">=20.3,<24.3"]
# control the pytorch wheel resolving algorithm, options are: "pip", "direct", "none"
# Override with environment variable CLEARML_AGENT_PACKAGE_PYTORCH_RESOLVE
# "pip" (default): would automatically detect the cuda version, and supply pip with the correct
@@ -125,6 +137,10 @@
# if set to true, the agent will look for the "poetry.lock" file
# in the passed current working directory instead of the repository's root directory.
poetry_files_from_repo_working_dir: false
# if set to true, the agent will look for the "uv.lock" file
# in the passed current working directory instead of the repository's root directory.
uv_files_from_repo_working_dir: false
},
# target folder for virtual environments builds, created when executing experiment
@@ -184,6 +200,12 @@
# allows the following task docker args to be overridden by the extra_docker_arguments
# protected_docker_extra_args: ["privileged", "security-opt", "network", "ipc"]
# Enforce filter whitelist on docker arguments, allowing only those matching these filters to be used when running
# a task. These can also be provided using the CLEARML_AGENT_DOCKER_ARGS_FILTERS environment variable
# (using shlex.split whitespace-separated format).
# For example, allow only environment variables:
# docker_args_filters: ["^--env$", "^-e$"]
# optional shell script to run in docker when started before the experiment is started
# extra_docker_shell_script: ["apt-get install -y bindfs", ]
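The `pip_legacy_resolver` matching rule described in the comments above can be sketched as follows. This is a hypothetical standalone check, not the agent's actual implementation; it supports only the `>=`, `>`, `<` operators used in the config examples:

```python
def _ver(s):
    # "21.1" -> (21, 1): simple numeric tuple comparison, enough for pip versions
    return tuple(int(p) for p in s.split("."))

def _in_range(version, spec):
    # a spec like ">=20.3,<24.3" is a comma-separated AND of clauses
    for clause in (c.strip() for c in spec.split(",")):
        if clause.startswith(">="):
            ok = _ver(version) >= _ver(clause[2:])
        elif clause.startswith("<"):
            ok = _ver(version) < _ver(clause[1:])
        elif clause.startswith(">"):
            ok = _ver(version) > _ver(clause[1:])
        else:
            raise ValueError("unsupported clause: %s" % clause)
        if not ok:
            return False
    return True

def needs_legacy_resolver(pip_version, restrictions):
    # the flag is added when the installed pip matches ANY restriction range;
    # an empty restrictions list disables the feature
    return any(_in_range(pip_version, spec) for spec in restrictions)

restrictions = [">=20.3,<24.3", ">99"]
print(needs_legacy_resolver("20.2", restrictions))   # False
print(needs_legacy_resolver("21.1", restrictions))   # True
print(needs_legacy_resolver("29.0", restrictions))   # False
print(needs_legacy_resolver("101.1", restrictions))  # True
```

This reproduces the worked example from the config comment: pip 20.2 and 29.0 fall outside both ranges, while 21.1 and 101.1 each match one.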

View File

@@ -134,7 +134,7 @@ class BaseField(object):
def _validate_name(self):
if self.name is None:
return
if not re.match('^[A-Za-z_](([\w\-]*)?\w+)?$', self.name):
if not re.match(r'^[A-Za-z_](([\w\-]*)?\w+)?$', self.name):
raise ValueError('Wrong name', self.name)
def structue_name(self, default):
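The fix above only adds a raw-string prefix: without it, `\w` and `\-` are invalid escape sequences in an ordinary string literal, which newer Python versions flag (a SyntaxWarning as of Python 3.12). The validation behaviour itself is unchanged, e.g.:

```python
import re

# same pattern as above, as a raw string so "\w" and "\-" are not
# treated as (invalid) string-literal escapes
NAME_RE = re.compile(r'^[A-Za-z_](([\w\-]*)?\w+)?$')

print(bool(NAME_RE.match("my_field-1")))  # True: starts with a letter
print(bool(NAME_RE.match("1bad")))        # False: may not start with a digit
```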

View File

@@ -19,8 +19,19 @@ class Request(ApiModel):
_method = ENV_API_DEFAULT_REQ_METHOD.get(default="get")
def __init__(self, **kwargs):
if kwargs:
allow_extra_fields = kwargs.pop("_allow_extra_fields_", False)
if not allow_extra_fields and kwargs:
raise ValueError('Unsupported keyword arguments: %s' % ', '.join(kwargs.keys()))
elif allow_extra_fields and kwargs:
self._extra_fields = kwargs
else:
self._extra_fields = {}
def to_dict(self, *args, **kwargs):
res = super(Request, self).to_dict(*args, **kwargs)
if self._extra_fields:
res.update(self._extra_fields)
return res
@six.add_metaclass(abc.ABCMeta)
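Stripped to its essentials, the escape hatch added here works like this (a minimal sketch, not the full `ApiModel` request class):

```python
class Request(object):
    def __init__(self, **kwargs):
        # "_allow_extra_fields_" opts out of strict keyword validation and
        # stashes unknown fields so to_dict() can forward them to the API
        allow_extra_fields = kwargs.pop("_allow_extra_fields_", False)
        if not allow_extra_fields and kwargs:
            raise ValueError('Unsupported keyword arguments: %s' % ', '.join(kwargs.keys()))
        self._extra_fields = kwargs if allow_extra_fields else {}

    def to_dict(self):
        res = {}  # a real request would serialize its declared fields here
        res.update(self._extra_fields)
        return res

print(Request(_allow_extra_fields_=True, search_hidden=True).to_dict())
# {'search_hidden': True}
```

This is what lets `_resolve_name` pass `search_hidden=True` to `GetAll` requests whose generated schema does not declare that field.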

View File

@@ -327,7 +327,7 @@ class ServiceCommandSection(BaseCommandSection):
def get_service(self, service_class):
return service_class(config=self._session.config)
def _resolve_name(self, name, service=None):
def _resolve_name(self, name, service=None, search_hidden=False):
"""
Resolve an object name to an object ID.
Operation:
@@ -349,7 +349,11 @@ class ServiceCommandSection(BaseCommandSection):
except AttributeError:
raise NameResolutionError('Name resolution unavailable for {}'.format(service))
request = request_cls.from_dict(dict(name=re.escape(name), only_fields=['name', 'id']))
req_dict = {"name": re.escape(name), "only_fields": ['name', 'id']}
if search_hidden:
req_dict["_allow_extra_fields_"] = True
req_dict["search_hidden"] = True
request = request_cls.from_dict(req_dict)
# from_dict will ignore unrecognised keyword arguments - not all GetAll's have only_fields
response = getattr(self._session.send_api(request), service)
matches = [db_object for db_object in response if name.lower() == db_object.name.lower()]

View File

@@ -122,8 +122,10 @@ from clearml_agent.helper.package.external_req import ExternalRequirements, Only
from clearml_agent.helper.package.pip_api.system import SystemPip
from clearml_agent.helper.package.pip_api.venv import VirtualenvPip
from clearml_agent.helper.package.poetry_api import PoetryConfig, PoetryAPI
from clearml_agent.helper.package.uv_api import UvConfig, UvAPI
from clearml_agent.helper.package.post_req import PostRequirement
from clearml_agent.helper.package.priority_req import PriorityPackageRequirement, PackageCollectorRequirement
from clearml_agent.helper.package.priority_req import PriorityPackageRequirement, PackageCollectorRequirement, \
CachedPackageRequirement
from clearml_agent.helper.package.pytorch import PytorchRequirement
from clearml_agent.helper.package.requirements import (
RequirementsManager, )
@@ -755,6 +757,7 @@ class Worker(ServiceCommandSection):
self.is_venv_update = self._session.config.agent.venv_update.enabled
self.poetry = PoetryConfig(self._session)
self.uv = UvConfig(self._session)
self.docker_image_func = None
self._patch_docker_cmd_func = None
self._docker_image = None
@@ -1182,7 +1185,10 @@ class Worker(ServiceCommandSection):
Requires that agent session credentials will allow impersonation as task user
"""
def get_new_session(session, headers):
result = session.send(auth_api.LoginRequest(), headers=headers)
result = session.send(
auth_api.LoginRequest(expiration_sec=session.req_token_expiration_sec),
headers=headers
)
if not (result.ok() and result.response):
return
new_session = copy(session)
@@ -1688,7 +1694,7 @@ class Worker(ServiceCommandSection):
def reload_config(self):
try:
reloaded = self._session.reload()
reloaded = self._session.config.reload()
except Exception as ex:
self.log("Failed reloading config file")
self.log_traceback(ex)
@@ -2433,7 +2439,7 @@ class Worker(ServiceCommandSection):
OnlyExternalRequirements.cwd = package_api.cwd = cwd
package_api.requirements_manager = self._get_requirements_manager(
base_interpreter=package_api.requirements_manager.get_interpreter(),
requirement_substitutions=[OnlyExternalRequirements],
requirement_substitutions=[CachedPackageRequirement, OnlyExternalRequirements],
)
# manually update the current state,
# for the external git reference chance (in the replace callback)
@@ -2852,7 +2858,7 @@ class Worker(ServiceCommandSection):
OnlyExternalRequirements.cwd = package_api.cwd = cwd
package_api.requirements_manager = self._get_requirements_manager(
base_interpreter=package_api.requirements_manager.get_interpreter(),
requirement_substitutions=[OnlyExternalRequirements]
requirement_substitutions=[CachedPackageRequirement, OnlyExternalRequirements]
)
# manually update the current state,
# for the external git reference chance (in the replace callback)
@@ -2996,9 +3002,11 @@ class Worker(ServiceCommandSection):
# Add the script CWD to the python path
if repo_info and repo_info.root and self._session.config.get('agent.force_git_root_python_path', None):
python_path = get_python_path(repo_info.root, None, self.package_api, is_conda_env=self.is_conda)
python_path = get_python_path(repo_info.root, None, self.package_api,
is_conda_env=self.is_conda or self.uv.enabled)
else:
python_path = get_python_path(script_dir, execution.entry_point, self.package_api, is_conda_env=self.is_conda)
python_path = get_python_path(script_dir, execution.entry_point, self.package_api,
is_conda_env=self.is_conda or self.uv.enabled)
if ENV_TASK_EXTRA_PYTHON_PATH.get():
python_path = add_python_path(python_path, ENV_TASK_EXTRA_PYTHON_PATH.get())
if python_path:
@@ -3013,7 +3021,7 @@ class Worker(ServiceCommandSection):
ENV_TASK_EXECUTE_AS_USER.get())
use_execv = False
else:
use_execv = is_linux_platform() and not isinstance(self.package_api, (PoetryAPI, CondaAPI))
use_execv = is_linux_platform() and not isinstance(self.package_api, (PoetryAPI, UvAPI, CondaAPI))
self._session.api_client.tasks.started(
task=current_task.id,
@@ -3197,7 +3205,11 @@ class Worker(ServiceCommandSection):
execution.working_dir = execution.working_dir or "."
# fix our import patch (in case we have __future__)
if script_file and script_file.is_file():
try:
is_file = script_file and script_file.is_file()
except OSError:
is_file = False
if is_file:
fix_package_import_diff_patch(script_file.as_posix())
if is_literal_script and not has_repository:
@@ -3375,7 +3387,7 @@ class Worker(ServiceCommandSection):
# disable caching with poetry because we cannot make it install into a specific folder
# Todo: add support for poetry caching
if not self.poetry.enabled:
if not self.poetry.enabled and not self.uv.enabled:
# disable caching if we skipped the venv creation or the entire python setup
if add_venv_folder_cache and not self._standalone_mode and (
not ENV_AGENT_SKIP_PIP_VENV_INSTALL.get() and
@@ -3426,6 +3438,31 @@ class Worker(ServiceCommandSection):
except Exception as ex:
self.log.error("failed installing poetry requirements: {}".format(ex))
return None
def _install_uv_requirements(self, repo_info, working_dir=None):
# type: (Optional[RepoInfo], Optional[str]) -> Optional[UvAPI]
if not repo_info:
return None
files_from_working_dir = self._session.config.get(
"agent.package_manager.uv_files_from_repo_working_dir", False)
lockfile_path = Path(repo_info.root) / ((working_dir or "") if files_from_working_dir else "")
try:
if not self.uv.enabled:
return None
self.uv.initialize(cwd=lockfile_path)
api = self.uv.get_api(lockfile_path)
if api.enabled:
print('UV Enabled: Ignoring requested python packages, using repository uv lock file!')
api.install()
return api
print(f"Could not find pyproject.toml or uv.lock file in {lockfile_path} \n")
except Exception as ex:
self.log.error("failed installing uv requirements: {}".format(ex))
return None
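The lockfile lookup above mirrors the poetry behaviour: by default the agent searches the repository root, and only when `agent.package_manager.uv_files_from_repo_working_dir` is true does it search the task's working directory. A sketch of just the path resolution (helper name is hypothetical):

```python
from pathlib import Path

def uv_lockfile_dir(repo_root, working_dir, files_from_working_dir=False):
    # default: look for uv.lock / pyproject.toml in the repository root;
    # optionally look in the task's working directory instead
    return Path(repo_root) / ((working_dir or "") if files_from_working_dir else "")

print(uv_lockfile_dir("/repo", "services/app").as_posix())        # /repo
print(uv_lockfile_dir("/repo", "services/app", True).as_posix())  # /repo/services/app
```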
def install_requirements(
self, execution, repo_info, requirements_manager, cached_requirements=None, cwd=None, package_api=None
@@ -3455,6 +3492,9 @@ class Worker(ServiceCommandSection):
package_api.cwd = cwd
api = self._install_poetry_requirements(repo_info, execution.working_dir)
if not api:
api = self._install_uv_requirements(repo_info, execution.working_dir)
if api:
# update back the package manager, this hack should be fixed
if package_api == self.package_api:
@@ -3912,12 +3952,14 @@ class Worker(ServiceCommandSection):
'To accelerate spin-up time set `agent.venvs_cache.path=~/.clearml/venvs-cache` :::\n')
# check if we have a cached folder
if cached_requirements and not skip_pip_venv_install and self.package_api.get_cached_venv(
requirements=cached_requirements,
docker_cmd=execution_info.docker_cmd if execution_info else None,
python_version=self.package_api.python,
cuda_version=self._session.config.get("agent.cuda_version"),
destination_folder=Path(venv_dir)
if (cached_requirements and not skip_pip_venv_install and
not self.poetry.enabled and not self.uv.enabled and
self.package_api.get_cached_venv(
requirements=cached_requirements,
docker_cmd=execution_info.docker_cmd if execution_info else None,
python_version=self.package_api.python,
cuda_version=self._session.config.get("agent.cuda_version"),
destination_folder=Path(venv_dir))
):
print('::: Using Cached environment {} :::'.format(self.package_api.get_last_used_entry_cache()))
return venv_dir, requirements_manager, True
@@ -4574,11 +4616,12 @@ class Worker(ServiceCommandSection):
"cp -Rf {mount_ssh_ro} -T {mount_ssh}" if host_ssh_cache else "",
"[ ! -z $(which git) ] || export CLEARML_APT_INSTALL=\"$CLEARML_APT_INSTALL git\"",
"declare LOCAL_PYTHON",
"[ ! -z $LOCAL_PYTHON ] || for i in {{15..5}}; do which {python_single_digit}.$i && " +
"[ ! -z $LOCAL_PYTHON ] || for i in {{20..5}}; do which {python_single_digit}.$i && " +
"{python_single_digit}.$i -m pip --version && " +
"export LOCAL_PYTHON=$(which {python_single_digit}.$i) && break ; done",
"[ ! -z $LOCAL_PYTHON ] || export CLEARML_APT_INSTALL=\"$CLEARML_APT_INSTALL {python_single_digit}-pip\"", # noqa
"[ -z \"$CLEARML_APT_INSTALL\" ] || (apt-get update -y ; apt-get install -y $CLEARML_APT_INSTALL)",
"rm /usr/lib/python3.*/EXTERNALLY-MANAGED", # remove PEP 668
]
if preprocess_bash_script:
@@ -4795,7 +4838,7 @@ class Worker(ServiceCommandSection):
worker_name = '{}:cpu'.format(worker_name)
return worker_id, worker_name
def _resolve_queue_names(self, queues, create_if_missing=False):
def _resolve_queue_names(self, queues, create_if_missing=False, create_system_tags=None):
if not queues:
# try to look for queues with "default" tag
try:
@@ -4807,15 +4850,25 @@ class Worker(ServiceCommandSection):
queues = return_list(queues)
if not create_if_missing:
return [self._resolve_name(q if isinstance(q, str) else q.name, "queues") for q in queues]
return [
self._resolve_name(q if isinstance(q, str) else q.name, service="queues", search_hidden=True)
for q in queues
]
queue_ids = []
for q in queues:
# noinspection PyBroadException
try:
q_id = self._resolve_name(q if isinstance(q, str) else q.name, "queues")
q_id = self._resolve_name(
q if isinstance(q, str) else q.name, service="queues", search_hidden=True
)
except:
self._session.send_api(queues_api.CreateRequest(name=q if isinstance(q, str) else q.name))
q_id = self._resolve_name(q if isinstance(q, str) else q.name, "queues")
self._session.send_api(
queues_api.CreateRequest(name=q if isinstance(q, str) else q.name, system_tags=create_system_tags)
)
q_id = self._resolve_name(
q if isinstance(q, str) else q.name, service="queues", search_hidden=True
)
queue_ids.append(q_id)
return queue_ids

View File

@@ -161,6 +161,7 @@ ENV_AGENT_SKIP_PYTHON_ENV_INSTALL = EnvironmentConfig("CLEARML_AGENT_SKIP_PYTHON
ENV_AGENT_FORCE_CODE_DIR = EnvironmentConfig("CLEARML_AGENT_FORCE_CODE_DIR")
ENV_AGENT_FORCE_EXEC_SCRIPT = EnvironmentConfig("CLEARML_AGENT_FORCE_EXEC_SCRIPT")
ENV_AGENT_FORCE_POETRY = EnvironmentConfig("CLEARML_AGENT_FORCE_POETRY", type=bool)
ENV_AGENT_FORCE_UV = EnvironmentConfig("CLEARML_AGENT_FORCE_UV", type=bool)
ENV_AGENT_FORCE_TASK_INIT = EnvironmentConfig("CLEARML_AGENT_FORCE_TASK_INIT", type=bool)
ENV_DOCKER_SKIP_GPUS_FLAG = EnvironmentConfig("CLEARML_DOCKER_SKIP_GPUS_FLAG", "TRAINS_DOCKER_SKIP_GPUS_FLAG")
ENV_AGENT_GIT_USER = EnvironmentConfig("CLEARML_AGENT_GIT_USER", "TRAINS_AGENT_GIT_USER")

View File

@@ -1,3 +1,5 @@
import shlex
from clearml_agent.helper.environment import EnvEntry
ENV_START_AGENT_SCRIPT_PATH = EnvEntry("CLEARML_K8S_GLUE_START_AGENT_SCRIPT_PATH", default="~/__start_agent__.sh")
@@ -17,4 +19,14 @@ ENV_POD_USE_IMAGE_ENTRYPOINT = EnvEntry("K8S_GLUE_POD_USE_IMAGE_ENTRYPOINT", def
"""
Do not inject a cmd and args to the container's image when building the k8s template (depend on the built-in image
entrypoint)
"""
"""
ENV_KUBECTL_IGNORE_ERROR = EnvEntry("K8S_GLUE_IGNORE_KUBECTL_ERROR", default=None)
"""
Ignore kubectl errors matching this string pattern (allows ignoring warnings sent on stderr while
kubectl actually works and starts the pod)
"""
ENV_DEFAULT_SCHEDULER_QUEUE_TAGS = EnvEntry(
"K8S_GLUE_DEFAULT_SCHEDULER_QUEUE_TAGS", default=["k8s-glue"], converter=shlex.split
)
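The `converter=shlex.split` pattern used for `K8S_GLUE_DEFAULT_SCHEDULER_QUEUE_TAGS` (and described for `CLEARML_AGENT_DOCKER_ARGS_FILTERS` in the config diff above) turns a single environment-variable string into a list, honouring shell-style quoting:

```python
import shlex

# one env var string -> list of tags/filters, with shell-style quoting
print(shlex.split('k8s-glue "my tag"'))  # ['k8s-glue', 'my tag']
print(shlex.split('^--env$ ^-e$'))       # ['^--env$', '^-e$']
```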

View File

@@ -42,6 +42,8 @@ from clearml_agent.glue.definitions import (
ENV_DEFAULT_EXECUTION_AGENT_ARGS,
ENV_POD_AGENT_INSTALL_ARGS,
ENV_POD_USE_IMAGE_ENTRYPOINT,
ENV_KUBECTL_IGNORE_ERROR,
ENV_DEFAULT_SCHEDULER_QUEUE_TAGS,
)
@@ -83,10 +85,11 @@ class K8sIntegration(Worker):
for line in _CONTAINER_APT_SCRIPT_SECTION
),
"declare LOCAL_PYTHON",
"[ ! -z $LOCAL_PYTHON ] || for i in {{15..5}}; do which python3.$i && python3.$i -m pip --version && "
"[ ! -z $LOCAL_PYTHON ] || for i in {{20..5}}; do which python3.$i && python3.$i -m pip --version && "
"export LOCAL_PYTHON=$(which python3.$i) && break ; done",
'[ ! -z "$CLEARML_AGENT_SKIP_CONTAINER_APT" ] || [ ! -z "$LOCAL_PYTHON" ] || apt-get install -y python3-pip',
"[ ! -z $LOCAL_PYTHON ] || export LOCAL_PYTHON=python3",
"rm /usr/lib/python3.*/EXTERNALLY-MANAGED", # remove PEP 668
"{extra_bash_init_cmd}",
"[ ! -z $CLEARML_AGENT_NO_UPDATE ] || $LOCAL_PYTHON -m pip install clearml-agent{agent_install_args}",
"{extra_docker_bash_script}",
@@ -208,6 +211,10 @@ class K8sIntegration(Worker):
self._session.feature_set != "basic" and self._session.check_min_server_version("3.22.3")
)
self.ignore_kubectl_errors_re = (
re.compile(ENV_KUBECTL_IGNORE_ERROR.get()) if ENV_KUBECTL_IGNORE_ERROR.get() else None
)
@property
def agent_label(self):
return self._get_agent_label()
@@ -466,13 +473,34 @@ class K8sIntegration(Worker):
queue=self.k8s_pending_queue_id,
)
res = self._session.api_client.tasks.enqueue(
task_id,
queue=self.k8s_pending_queue_id,
status_reason='k8s pending scheduler',
)
if res.meta.result_code != 200:
raise Exception(res.meta.result_msg)
for attempt in range(2):
res = self._session.send_request(
"tasks",
"enqueue",
json={
"task": task_id,
"queue": self.k8s_pending_queue_id,
"status_reason": "k8s pending scheduler",
"update_execution_queue": False,
}
)
if res.ok:
break
# noinspection PyBroadException
try:
result_subcode = res.json()["meta"]["result_subcode"]
result_msg = res.json()["meta"]["result_msg"]
except Exception:
result_subcode = None
result_msg = res.text
if attempt == 0 and res.status_code == 400 and result_subcode == 701:
# Invalid queue ID, only retry once
self._ensure_pending_queue_exists()
continue
raise Exception(result_msg)
except Exception as e:
self.log.error("ERROR: Could not push back task [{}] to k8s pending queue {} [{}], error: {}".format(
task_id, self.k8s_pending_queue_name, self.k8s_pending_queue_id, e))
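The retry loop above can be reduced to this pattern. This is a hypothetical sketch with a stubbed-out transport, showing the single retry on `result_subcode` 701 (invalid queue id):

```python
def enqueue_with_retry(send_enqueue, ensure_pending_queue_exists):
    # send_enqueue() -> (ok, status_code, result_subcode, result_msg)
    for attempt in range(2):
        ok, status_code, result_subcode, result_msg = send_enqueue()
        if ok:
            return
        if attempt == 0 and status_code == 400 and result_subcode == 701:
            # invalid queue ID: recreate the pending queue, then retry once
            ensure_pending_queue_exists()
            continue
        raise RuntimeError(result_msg)

calls = []
responses = iter([(False, 400, 701, "invalid queue id"), (True, 200, None, "")])
enqueue_with_retry(lambda: next(responses), lambda: calls.append("recreated"))
print(calls)  # ['recreated']
```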
@@ -611,34 +639,76 @@ class K8sIntegration(Worker):
print("ERROR: no template for task {}, skipping".format(task_id))
return
pod_name = self.pod_name_prefix + str(task_id)
self.apply_template_and_handle_result(
pod_name=pod_name,
clearml_conf_create_script=clearml_conf_create_script,
labels=labels,
queue=queue,
task_id=task_id,
namespace=namespace,
template=template,
docker_image=container['image'],
docker_args=container.get('arguments'),
docker_bash=container.get('setup_shell_script'),
session=session,
task_session=task_session,
pod_number=pod_number,
queue_name=queue_name,
task_data=task_data,
ports_mode=ports_mode,
pod_count=pod_count,
)
def apply_template_and_handle_result(
self,
pod_name,
clearml_conf_create_script: List[str],
labels,
queue,
task_id,
namespace,
template,
docker_image,
docker_args,
docker_bash,
session,
task_session,
queue_name,
task_data,
ports_mode,
pod_count,
pod_number=None,
base_spec: dict = None,
):
"""Apply the provided template with all custom settings and handle bookkeeping for the reaults"""
output, error, pod_name = self._kubectl_apply(
pod_name=pod_name,
template=template,
pod_number=pod_number,
clearml_conf_create_script=clearml_conf_create_script,
labels=labels,
docker_image=container['image'],
docker_args=container.get('arguments'),
docker_bash=container.get('setup_shell_script'),
docker_image=docker_image,
docker_args=docker_args,
docker_bash=docker_bash,
task_id=task_id,
queue=queue,
namespace=namespace,
task_token=task_session.token.encode("ascii") if task_session else None,
base_spec=base_spec,
)
print('kubectl output:\n{}\n{}'.format(error, output))
if error:
send_log = "Running kubectl encountered an error: {}".format(error)
self.log.error(send_log)
self.send_logs(task_id, send_log.splitlines())
# Make sure to remove the task from our k8s pending queue
self._session.api_client.queues.remove_task(
task=task_id,
queue=self.k8s_pending_queue_id,
)
# Set task as failed
session.api_client.tasks.failed(task_id, force=True)
return
if self.ignore_kubectl_errors_re and self.ignore_kubectl_errors_re.match(error):
print(f"Ignoring error due to {ENV_KUBECTL_IGNORE_ERROR.key}")
else:
self._set_task_failed_while_applying(
session, task_id, f"Running kubectl encountered an error: {error}"
)
return
if pod_name:
self.resource_applied(
@@ -650,6 +720,19 @@ class K8sIntegration(Worker):
pod_number=pod_number, pod_count=pod_count, task_data=task_data
)
def _set_task_failed_while_applying(self, session, task_id: str, error: str):
send_log = "Running kubectl encountered an error: {}".format(error)
self.log.error(send_log)
self.send_logs(task_id, send_log.splitlines())
# Make sure to remove the task from our k8s pending queue
self._session.api_client.queues.remove_task(
task=task_id,
queue=self.k8s_pending_queue_id,
)
# Set task as failed
session.api_client.tasks.failed(task_id, force=True)
def set_task_info(
self, task_id: str, task_session, task_data, queue_name: str, ports_mode: bool, pod_number, pod_count
):
@@ -830,6 +913,7 @@ class K8sIntegration(Worker):
def _kubectl_apply(
self,
pod_name,
clearml_conf_create_script: List[str],
docker_image,
docker_args,
@@ -841,6 +925,7 @@ class K8sIntegration(Worker):
template,
pod_number=None,
task_token=None,
base_spec: dict = None, # base values for the spec (might be overridden)
):
if "apiVersion" not in template:
template["apiVersion"] = "batch/v1" if self.using_jobs else "v1"
@@ -855,8 +940,7 @@ class K8sIntegration(Worker):
template["kind"] = self.kind.capitalize()
metadata = template.setdefault('metadata', {})
name = self.pod_name_prefix + str(task_id)
metadata['name'] = name
metadata['name'] = pod_name
def place_labels(metadata_dict):
labels_dict = dict(pair.split('=', 1) for pair in labels)
@@ -876,13 +960,16 @@ class K8sIntegration(Worker):
spec = spec_template.setdefault('spec', {})
if base_spec:
merge_dicts(spec, base_spec)
containers = spec.setdefault('containers', [])
spec.setdefault('restartPolicy', 'Never')
task_worker_id = self.get_task_worker_id(template, task_id, name, namespace, queue)
task_worker_id = self.get_task_worker_id(template, task_id, pod_name, namespace, queue)
container = self._create_template_container(
pod_name=name,
pod_name=pod_name,
task_id=task_id,
docker_image=docker_image,
docker_args=docker_args,
@@ -928,7 +1015,7 @@ class K8sIntegration(Worker):
finally:
safe_remove_file(yaml_file)
return stringify_bash_output(output), stringify_bash_output(error), name
return stringify_bash_output(output), stringify_bash_output(error), pod_name
def _process_bash_lines_response(self, bash_cmd: str, raise_error=True):
res = get_bash_output(bash_cmd, raise_error=raise_error)
@@ -1013,7 +1100,7 @@ class K8sIntegration(Worker):
deleted_pods = defaultdict(list)
for namespace in namespaces:
if time() - self._last_pod_cleanup_per_ns[namespace] < self._min_cleanup_interval_per_ns_sec:
# Do not try to cleanup the same namespace too quickly
# Do not try to clean up the same namespace too quickly
continue
try:
@@ -1089,6 +1176,21 @@ class K8sIntegration(Worker):
def check_if_suspended(self) -> bool:
pass
def check_if_schedulable(self, queue: str) -> bool:
return True
def _ensure_pending_queue_exists(self):
resolved_ids = self._resolve_queue_names(
[self.k8s_pending_queue_name],
create_if_missing=True,
create_system_tags=ENV_DEFAULT_SCHEDULER_QUEUE_TAGS.get()
)
if not resolved_ids:
raise ValueError(
"Failed resolving or creating k8s pending queue {}".format(self.k8s_pending_queue_name)
)
self.k8s_pending_queue_id = resolved_ids[0]
def run_tasks_loop(self, queues: List[Text], worker_params, **kwargs):
"""
:summary: Pull and run tasks from queues.
@@ -1104,14 +1206,8 @@ class K8sIntegration(Worker):
events_service = self.get_service(Events)
# make sure we have a k8s pending queue
if not self.k8s_pending_queue_id:
resolved_ids = self._resolve_queue_names([self.k8s_pending_queue_name], create_if_missing=True)
if not resolved_ids:
raise ValueError(
"Failed resolving or creating k8s pending queue {}".format(self.k8s_pending_queue_name)
)
self.k8s_pending_queue_id = resolved_ids[0]
self._ensure_pending_queue_exists()
_last_machine_update_ts = 0
while True:
@@ -1143,6 +1239,9 @@ class K8sIntegration(Worker):
sleep(self._polling_interval)
break
if not self.check_if_schedulable(queue):
continue
# get next task in queue
try:
# print(f"debug> getting tasks for queue {queue}")

View File

@@ -1,11 +1,9 @@
from time import sleep
from typing import Dict, Tuple, Optional, List
from typing import Dict, List
from clearml_agent.backend_api.session import Request
from clearml_agent.glue.utilities import get_bash_output
from clearml_agent.helper.process import stringify_bash_output
from .daemon import K8sDaemon
from .utilities import get_path
from .errors import GetPodsError
@@ -38,7 +36,11 @@ class PendingPodsDaemon(K8sDaemon):
return get_path(pod, "metadata", "name")
def _get_task_id(self, pod: dict):
return self._get_k8s_resource_name(pod).rpartition('-')[-1]
prefix, _, value = self._get_k8s_resource_name(pod).rpartition('-')
if len(value) > 4:
return value
# we assume this is a multi-node rank x (>0) pod
return prefix.rpartition('-')[-1] or value
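The new multi-node-aware task-id extraction above can be sketched in isolation. This is an illustrative standalone copy, assuming pod names follow `<prefix>-<task_id>` for rank-0 pods and `<prefix>-<task_id>-<rank>` (with a short rank suffix) for rank>0 pods:

```python
def get_task_id_from_pod_name(name):
    # Mirrors PendingPodsDaemon._get_task_id: the last dash-separated token is
    # normally the task id; a short token is assumed to be a multi-node rank.
    prefix, _, value = name.rpartition('-')
    if len(value) > 4:
        return value
    # short suffix -> assume it is a rank index; the task id precedes it
    return prefix.rpartition('-')[-1] or value

get_task_id_from_pod_name("clearml-id-abcdef1234")    # rank-0 pod
get_task_id_from_pod_name("clearml-id-abcdef1234-1")  # rank-1 pod, same task
```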
@staticmethod
def _get_k8s_resource_namespace(pod: dict):
@@ -239,6 +241,11 @@ class PendingPodsDaemon(K8sDaemon):
result_msg = get_path(result.json(), 'meta', 'result_msg')
raise Exception(result_msg or result.text)
self._agent.send_logs(
task_id, ["Kubernetes Pod status: {}".format(msg)],
session=self._session
)
# update last msg for this task
self._last_tasks_msgs[task_id] = msg
except Exception as ex:

View File

@@ -14,7 +14,7 @@ def merge_dicts(dict1, dict2, custom_merge_func=None):
return dict2
for k in dict2:
if k in dict1:
res = None
res = _not_set
if custom_merge_func:
res = custom_merge_func(k, dict1[k], dict2[k], _not_set)
dict1[k] = merge_dicts(dict1[k], dict2[k], custom_merge_func) if res is _not_set else res
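The `merge_dicts()` fix above replaces `None` with a `_not_set` sentinel, so a custom merge function may legitimately return `None` as the merged value. A minimal self-contained sketch (the `else` branch and final `return` are assumptions, not shown in this hunk):

```python
_not_set = object()  # sentinel: "custom merge produced no result"

def merge_dicts(dict1, dict2, custom_merge_func=None):
    """Recursively merge dict2 into dict1; custom_merge_func may return the
    sentinel (its fourth argument) to fall back to the default merge."""
    if not isinstance(dict1, dict) or not isinstance(dict2, dict):
        return dict2
    for k in dict2:
        if k in dict1:
            res = _not_set  # the fix: None is now a legal merge result
            if custom_merge_func:
                res = custom_merge_func(k, dict1[k], dict2[k], _not_set)
            dict1[k] = merge_dicts(dict1[k], dict2[k], custom_merge_func) if res is _not_set else res
        else:
            dict1[k] = dict2[k]
    return dict1

# A custom function that deliberately merges a key to None now works:
merged = merge_dicts({"a": 1}, {"a": 2}, lambda k, v1, v2, default: None)
```

With the old `res = None` default, the `None` returned above would have been indistinguishable from "no custom result" and silently replaced by the recursive merge.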

View File

@@ -29,9 +29,12 @@ class PackageManager(object):
_config_cache_max_entries = 'agent.venvs_cache.max_entries'
_config_cache_free_space_threshold = 'agent.venvs_cache.free_space_threshold_gb'
_config_cache_lock_timeout = 'agent.venvs_cache.lock_timeout'
_config_pip_legacy_resolver = 'agent.package_manager.pip_legacy_resolver'
def __init__(self):
self._cache_manager = None
self._existing_packages = []
self._base_install_flags = []
@abc.abstractproperty
def bin(self):
@@ -79,6 +82,23 @@ class PackageManager(object):
# type: (Iterable[Text]) -> None
pass
def add_extra_install_flags(self, extra_flags): # type: (List[str]) -> None
if extra_flags:
extra_flags = [
e for e in extra_flags if e not in list(self._base_install_flags)
]
self._base_install_flags = list(self._base_install_flags) + list(extra_flags)
def remove_extra_install_flags(self, extra_flags): # type: (List[str]) -> bool
if extra_flags:
_base_install_flags = [
e for e in self._base_install_flags if e not in list(extra_flags)
]
if self._base_install_flags != _base_install_flags:
self._base_install_flags = _base_install_flags
return True
return False
def upgrade_pip(self):
result = self._install(
*select_for_platform(
@@ -87,19 +107,58 @@ class PackageManager(object):
),
"--upgrade"
)
packages = self.run_with_env(('list',), output=True).splitlines()
# p.split is ('pip', 'x.y.z')
pip = [p.split() for p in packages if len(p.split()) == 2 and p.split()[0] == 'pip']
if pip:
# noinspection PyBroadException
packages = (self.freeze(freeze_full_environment=True) or dict()).get("pip")
if packages:
from clearml_agent.helper.package.requirements import RequirementsManager
from .requirements import MarkerRequirement, SimpleVersion
# store existing packages so that we can check if we can skip preinstalled packages
# we will only check "@ file" "@ vcs" for exact match
self._existing_packages = RequirementsManager.parse_requirements_section_to_marker_requirements(
packages, skip_local_file_validation=True)
try:
from .requirements import MarkerRequirement
pip = pip[0][1].split('.')
MarkerRequirement.pip_new_version = bool(int(pip[0]) >= 20)
except Exception:
pass
pip_pkg = next(p for p in self._existing_packages if p.name == "pip")
except StopIteration:
pip_pkg = None
# check if we need to list the pip version as well
if pip_pkg:
MarkerRequirement.pip_new_version = SimpleVersion.compare_versions(pip_pkg.version, ">=", "20")
# add --use-deprecated=legacy-resolver to pip install to avoid mismatched packages issues
self._add_legacy_resolver_flag(pip_pkg.version)
return result
def _add_legacy_resolver_flag(self, pip_pkg_version):
if not self.session.config.get(self._config_pip_legacy_resolver, None):
return
from .requirements import SimpleVersion
match_versions = self.session.config.get(self._config_pip_legacy_resolver)
matched = False
for rule in match_versions:
matched = False
# make sure we match all the parts of the rule
for a_version in rule.split(","):
o, v = SimpleVersion.split_op_version(a_version.strip())
matched = SimpleVersion.compare_versions(pip_pkg_version, o, v)
if not matched:
break
# if the rule is fully matched we have a match
if matched:
break
legacy_resolver_flags = ["--use-deprecated=legacy-resolver"]
if matched:
print("INFO: Using legacy resolver for PIP to avoid inconsistency with package versions!")
self.add_extra_install_flags(legacy_resolver_flags)
elif self.remove_extra_install_flags(legacy_resolver_flags):
print("INFO: removing pip legacy resolver!")
def get_python_command(self, extra=()):
# type: (...) -> Executable
return Argv(self.bin, *extra)
@@ -149,6 +208,18 @@ class PackageManager(object):
return False
except Exception:
return False
try:
from .requirements import Requirement, MarkerRequirement
req = MarkerRequirement(Requirement.parse(package_name))
# if pip was part of the requirements, make sure we update the flags
# add --use-deprecated=legacy-resolver to pip install to avoid mismatched packages issues
if req.name == "pip" and req.version:
PackageManager._selected_manager._add_legacy_resolver_flag(req.version)
except Exception as e:
print("WARNING: Error while parsing pip version legacy [{}]".format(e))
return True
@classmethod

View File

@@ -97,7 +97,7 @@ class SystemPip(PackageManager):
return Argv(self.bin, '-m', 'pip', '--disable-pip-version-check', *command)
def install_flags(self):
indices_args = tuple(
base_args = tuple(self._base_install_flags or []) + tuple(
chain.from_iterable(('--extra-index-url', x) for x in PIP_EXTRA_INDICES)
)
@@ -105,7 +105,7 @@ class SystemPip(PackageManager):
ENV_PIP_EXTRA_INSTALL_FLAGS.get() or \
self.session.config.get("agent.package_manager.extra_pip_install_flags", None)
return (indices_args + tuple(extra_pip_flags)) if extra_pip_flags else indices_args
return (base_args + tuple(extra_pip_flags)) if extra_pip_flags else base_args
def download_flags(self):
indices_args = tuple(

View File

@@ -37,7 +37,9 @@ class VirtualenvPip(SystemPip, PackageManager):
def load_requirements(self, requirements):
if isinstance(requirements, dict) and requirements.get("pip"):
requirements["pip"] = self.requirements_manager.replace(requirements["pip"])
requirements["pip"] = self.requirements_manager.replace(
requirements["pip"], existing_packages=self._existing_packages
)
super(VirtualenvPip, self).load_requirements(requirements)
self.requirements_manager.post_install(self.session, package_manager=self)

View File

@@ -7,7 +7,7 @@ from .requirements import SimpleSubstitution
class PriorityPackageRequirement(SimpleSubstitution):
name = ("cython", "numpy", "setuptools", "pip", )
name = ("cython", "numpy", "setuptools", "pip", "uv", )
optional_package_names = tuple()
def __init__(self, *args, **kwargs):
@@ -53,12 +53,18 @@ class PriorityPackageRequirement(SimpleSubstitution):
if not self._replaced_packages:
return list_of_requirements
# we assume that both pip & setuptools are not in list_of_requirements, and we need to add them
if "pip" in self._replaced_packages:
full_freeze = PackageManager.out_of_scope_freeze(freeze_full_environment=True)
# now let's look for pip
pips = [line for line in full_freeze.get("pip", []) if line.split("==")[0] == "pip"]
if pips and "pip" in list_of_requirements:
list_of_requirements["pip"] = [pips[0]] + list_of_requirements["pip"]
if not full_freeze:
if "pip" in list_of_requirements:
list_of_requirements["pip"] = [self._replaced_packages["pip"]] + list_of_requirements["pip"]
else:
# now let's look for pip
pips = [line for line in full_freeze.get("pip", []) if str(line.split("==")[0]).strip() == "pip"]
if pips and "pip" in list_of_requirements:
list_of_requirements["pip"] = [pips[0]] + list_of_requirements["pip"]
if "setuptools" in self._replaced_packages:
try:
@@ -87,6 +93,20 @@ class PriorityPackageRequirement(SimpleSubstitution):
return list_of_requirements
class CachedPackageRequirement(PriorityPackageRequirement):
name = ("setuptools", "pip", )
optional_package_names = tuple()
def replace(self, req):
"""
Put the requirement in the list for later conversion
:raises: ValueError if version is pre-release
"""
self._replaced_packages[req.name] = req.line
return Text(req)
class PackageCollectorRequirement(SimpleSubstitution):
"""
This RequirementSubstitution class will allow you to have multiple instances of the same

View File

@@ -19,7 +19,7 @@ import logging
from clearml_agent.definitions import PIP_EXTRA_INDICES
from clearml_agent.helper.base import (
warning, is_conda, which, join_lines, is_windows_platform,
convert_cuda_version_to_int_10_base_str, )
convert_cuda_version_to_int_10_base_str, dump_yaml, )
from clearml_agent.helper.process import Argv, PathLike
from clearml_agent.helper.gpu.gpustat import get_driver_cuda_version
from clearml_agent.session import Session, normalize_cuda_version
@@ -94,6 +94,12 @@ class MarkerRequirement(object):
def __repr__(self):
return '{self.__class__.__name__}[{self}]'.format(self=self)
def __eq__(self, other):
return isinstance(other, MarkerRequirement) and str(self) == str(other)
def __hash__(self):
return str(self).__hash__()
def format_specs(self, num_parts=None, max_num_parts=None):
max_num_parts = max_num_parts or num_parts
if max_num_parts is None or not self.specs:
@@ -116,6 +122,10 @@ class MarkerRequirement(object):
def specs(self): # type: () -> List[Tuple[Text, Text]]
return self.req.specs
@property
def version(self): # type: () -> Text
return self.specs[0][1] if self.specs else ""
@specs.setter
def specs(self, value): # type: (List[Tuple[Text, Text]]) -> None
self.req.specs = value
@@ -143,6 +153,8 @@ class MarkerRequirement(object):
If the requested version is 1.2 the self.spec should be 1.2*
etc.
usage: it returns the value of the following comparison: requested_version "op" self.version
:param str requested_version:
:param str op: '==', '>', '>=', '<=', '<', '~='
:param int num_parts: number of parts to compare
@@ -152,7 +164,7 @@ class MarkerRequirement(object):
if not self.specs:
return True
version = self.specs[0][1]
version = self.version
op = (op or self.specs[0][0]).strip()
return SimpleVersion.compare_versions(
@@ -170,11 +182,21 @@ class MarkerRequirement(object):
self.req.local_file = False
return True
def validate_local_file_ref(self):
def is_local_package_ref(self):
# check whether this requirement references a local file package
if self.vcs or self.editable or self.path or not self.local_file or not self.name or \
not self.uri or not self.uri.startswith("file://"):
return False
return True
def is_vcs_ref(self):
return bool(self.vcs)
def validate_local_file_ref(self):
# if local file does not exist, remove the reference to it
if not self.is_local_package_ref():
return
local_path = Path(self.uri[len("file://"):])
if not local_path.exists():
local_path = Path(unquote(self.uri)[len("file://"):])
@@ -221,6 +243,19 @@ class SimpleVersion:
_local_version_separators = re.compile(r"[\._-]")
_regex = re.compile(r"^\s*" + VERSION_PATTERN + r"\s*$", re.VERBOSE | re.IGNORECASE)
@classmethod
def split_op_version(cls, line):
"""
Split a string in the form of ">=1.2.3" into an (op, version) tuple, i.e. (">=", "1.2.3")
Note: if called with only a version string (e.g. "1.2.3"), the default operator is "==",
which means you get ("==", "1.2.3")
:param line: string examples: "<=0.1.2"
:return: tuple of (op, version) example ("<=", "0.1.2")
"""
match = r"\s*([>=<~!]*)\s*(\S*)\s*"
groups = re.match(match, line).groups()
return groups[0] or "==", groups[1]
@classmethod
def compare_versions(cls, version_a, op, version_b, ignore_sub_versions=True, num_parts=3):
"""
@@ -624,14 +659,54 @@ class RequirementsManager(object):
return handler.replace(req)
return None
def replace(self, requirements): # type: (Text) -> Text
def replace(
self,
requirements, # type: Text
existing_packages=None, # type: List[MarkerRequirement]
pkg_skip_existing_local=True, # type: bool
pkg_skip_existing_vcs=True, # type: bool
pkg_skip_existing=True, # type: bool
): # type: (...) -> Text
parsed_requirements = self.parse_requirements_section_to_marker_requirements(
requirements=requirements, cwd=self._cwd)
requirements=requirements, cwd=self._cwd, skip_local_file_validation=True)
if parsed_requirements and existing_packages:
skipped_packages = None
if pkg_skip_existing:
skipped_packages = set(parsed_requirements) & set(existing_packages)
elif pkg_skip_existing_local or pkg_skip_existing_vcs:
existing_packages = [
p for p in existing_packages if (
(pkg_skip_existing_local and p.is_local_package_ref()) or
(pkg_skip_existing_vcs and p.is_vcs_ref())
)
]
skipped_packages = set(parsed_requirements) & set(existing_packages)
if skipped_packages:
# maintain order
num_skipped_packages = len(parsed_requirements)
parsed_requirements = [p for p in parsed_requirements if p not in skipped_packages]
num_skipped_packages -= len(parsed_requirements)
print("Skipping {} pre-installed packages:\n{}Remaining {} additional packages to install".format(
num_skipped_packages,
dump_yaml(sorted([str(p) for p in skipped_packages])),
len(parsed_requirements)
))
# nothing to install!
if not parsed_requirements:
return ""
# sanity check
if not parsed_requirements:
# return the original requirements just in case
return requirements
# remove local file references that do not exist
for p in parsed_requirements:
p.validate_local_file_ref()
def replace_one(i, req):
# type: (int, MarkerRequirement) -> Optional[Text]
try:
@@ -805,7 +880,7 @@ class RequirementsManager(object):
normalize_cuda_version(cudnn_version or 0))
@staticmethod
def parse_requirements_section_to_marker_requirements(requirements, cwd=None):
def parse_requirements_section_to_marker_requirements(requirements, cwd=None, skip_local_file_validation=False):
def safe_parse(req_str):
# noinspection PyBroadException
try:
@@ -815,7 +890,8 @@ class RequirementsManager(object):
def create_req(x):
r = MarkerRequirement(x)
r.validate_local_file_ref()
if not skip_local_file_validation:
r.validate_local_file_ref()
return r
if not requirements:

View File

@@ -0,0 +1,234 @@
from copy import deepcopy
from functools import wraps
import attr
import sys
import os
from pathlib2 import Path
from clearml_agent.definitions import ENV_AGENT_FORCE_UV
from clearml_agent.helper.base import select_for_platform
from clearml_agent.helper.process import Argv, DEVNULL, check_if_command_exists
from clearml_agent.session import Session, UV
def prop_guard(prop, log_prop=None):
assert isinstance(prop, property)
assert not log_prop or isinstance(log_prop, property)
def decorator(func):
message = "%s:%s calling {}, {} = %s".format(func.__name__, prop.fget.__name__)
@wraps(func)
def new_func(self, *args, **kwargs):
prop_value = prop.fget(self)
if log_prop:
log_prop.fget(self).debug(
message,
type(self).__name__,
"" if prop_value else " not",
prop_value,
)
if prop_value:
return func(self, *args, **kwargs)
return new_func
return decorator
class UvConfig:
def __init__(self, session):
# type: (Session, str) -> None
self.session = session
self._log = session.get_logger(__name__)
self._python = (
sys.executable
) # default, overwritten from session config in initialize()
self._initialized = False
@property
def log(self):
return self._log
@property
def enabled(self):
return (
ENV_AGENT_FORCE_UV.get()
or self.session.config["agent.package_manager.type"] == UV
)
_guard_enabled = prop_guard(enabled, log)
def run(self, *args, **kwargs):
func = kwargs.pop("func", Argv.get_output)
kwargs.setdefault("stdin", DEVNULL)
kwargs["env"] = deepcopy(os.environ)
if "VIRTUAL_ENV" in kwargs["env"] or "CONDA_PREFIX" in kwargs["env"]:
kwargs["env"].pop("VIRTUAL_ENV", None)
kwargs["env"].pop("CONDA_PREFIX", None)
kwargs["env"].pop("PYTHONPATH", None)
if hasattr(sys, "real_prefix") and hasattr(sys, "base_prefix"):
path = ":" + kwargs["env"]["PATH"]
path = path.replace(":" + sys.base_prefix, ":" + sys.real_prefix, 1)
kwargs["env"]["PATH"] = path
if self.session and self.session.config and args and args[0] == "sync":
# Set the cache dir to venvs dir
cache_dir = self.session.config.get("agent.venvs_dir", None)
if cache_dir is not None:
os.environ["UV_CACHE_DIR"] = cache_dir
extra_args = self.session.config.get(
"agent.package_manager.uv_sync_extra_args", None
)
if extra_args:
args = args + tuple(extra_args)
if check_if_command_exists("uv"):
argv = Argv("uv", *args)
else:
argv = Argv(self._python, "-m", "uv", *args)
self.log.debug("running: %s", argv)
return func(argv, **kwargs)
@_guard_enabled
def initialize(self, cwd=None):
if not self._initialized:
# use correct python version -- detected in Worker.install_virtualenv() and written to
# session
if self.session.config.get("agent.python_binary", None):
self._python = self.session.config.get("agent.python_binary")
if (
self.session.config.get("agent.package_manager.uv_version", None)
is not None
):
version = str(
self.session.config.get("agent.package_manager.uv_version")
)
# get uv version
version = version.replace(" ", "")
if (
("=" in version)
or ("~" in version)
or ("<" in version)
or (">" in version)
):
version = version
elif version:
version = "==" + version
# (we are not running it yet)
argv = Argv(
self._python,
"-m",
"pip",
"install",
"uv{}".format(version),
"--upgrade",
"--disable-pip-version-check",
)
# this is just for display purposes and checks, we already set the version in the Argv
if not version:
version = "latest"
else:
# mark to install uv if not already installed (we are not running it yet)
argv = Argv(
self._python,
"-m",
"pip",
"install",
"uv",
"--disable-pip-version-check",
)
version = ""
# first upgrade pip if we need to
try:
from clearml_agent.helper.package.pip_api.venv import VirtualenvPip
pip = VirtualenvPip(
session=self.session,
python=self._python,
requirements_manager=None,
path=None,
interpreter=self._python,
)
pip.upgrade_pip()
except Exception as ex:
self.log.warning("failed upgrading pip: {}".format(ex))
# if no specific version is required and uv is already found, skip installation
if not version and check_if_command_exists("uv"):
print(
"Notice: uv was found, no specific version required, skipping uv installation"
)
else:
print("Installing / Upgrading uv package to {}".format(version))
# now install uv
try:
print(argv.get_output())
except Exception as ex:
self.log.warning("failed installing uv: {}".format(ex))
# all done.
self._initialized = True
def get_api(self, path):
# type: (Path) -> UvAPI
return UvAPI(self, path)
@attr.s
class UvAPI(object):
config = attr.ib(type=UvConfig)
path = attr.ib(type=Path, converter=Path)
INDICATOR_FILES = "pyproject.toml", "uv.lock"
def install(self):
# type: () -> bool
if self.enabled:
self.config.run("sync", "--locked", cwd=str(self.path), func=Argv.check_call)
return True
return False
@property
def enabled(self):
return self.config.enabled and (
any((self.path / indicator).exists() for indicator in self.INDICATOR_FILES)
)
def freeze(self, freeze_full_environment=False):
python = Path(self.path) / ".venv" / select_for_platform(linux="bin/python", windows="scripts/python.exe")
lines = self.config.run("pip", "freeze", "--python", str(python), cwd=str(self.path)).splitlines()
# fix local filesystem reference in freeze
from clearml_agent.external.requirements_parser.requirement import Requirement
packages = [Requirement.parse(p) for p in lines]
for p in packages:
if p.local_file and p.editable:
p.path = str(Path(p.path).relative_to(self.path))
p.line = "-e {}".format(p.path)
return {
"pip": [p.line for p in packages]
}
def get_python_command(self, extra):
if check_if_command_exists("uv"):
return Argv("uv", "run", "python", *extra)
else:
return Argv(self.config._python, "-m", "uv", "run", "python", *extra)
def upgrade_pip(self, *args, **kwargs):
pass
def set_selected_package_manager(self, *args, **kwargs):
pass
def out_of_scope_install_package(self, *args, **kwargs):
pass
def install_from_file(self, *args, **kwargs):
pass

View File

@@ -401,6 +401,7 @@ class ResourceMonitor(object):
fractions = self._fractions_handler.fractions
stats["gpu_fraction_{}".format(report_index)] = \
(fractions[i] if i < len(fractions) else fractions[-1]) if fractions else 1.0
report_index += 1
except Exception as ex:
# something happened and we can't use gpu stats,

View File

@@ -24,6 +24,7 @@ from clearml_agent.helper.docker_args import DockerArgsSanitizer, sanitize_urls
from .version import __version__
POETRY = "poetry"
UV = "uv"
@attr.s

View File

@@ -1 +1 @@
__version__ = '1.9.0'
__version__ = '1.9.3'

View File

@@ -53,8 +53,9 @@ agent {
# select python package manager:
# currently supported pip and conda
# poetry is used if pip selected and repository contains poetry.lock file
# uv is used if pip selected and repository contains uv.lock file
package_manager: {
# supported options: pip, conda, poetry
# supported options: pip, conda, poetry, uv
type: pip,
# specify pip version to use (examples "<20.2", "==19.3.1", "", empty string will install the latest version)

View File

@@ -74,8 +74,10 @@ agent {
# If Poetry is selected and the root repository contains `poetry.lock` or `pyproject.toml`,
# the "installed packages" section is ignored, and poetry is used.
# If Poetry is selected and no lock file is found, it reverts to "pip" package manager behaviour.
# If uv is selected and the root repository contains `uv.lock` or `pyproject.toml`,
# the "installed packages" section is ignored, and uv is used.
package_manager: {
# supported options: pip, conda, poetry
# supported options: pip, conda, poetry, uv
type: pip,
# specify pip version to use (examples "<20.2", "==19.3.1", "", empty string will install the latest version)
@@ -83,6 +85,8 @@ agent {
# specify poetry version to use (examples "<2", "==1.1.1", "", empty string will install the latest version)
# poetry_version: "<2",
# poetry_install_extra_args: ["-v"]
# uv_version: ">0.4",
# uv_sync_extra_args: ["--all-extras"]
# virtual environment inherits packages from system
system_site_packages: false,

View File

@@ -9,7 +9,9 @@ python-dateutil>=2.4.2,<2.9.0
pyjwt>=2.4.0,<2.9.0
PyYAML>=3.12,<6.1
requests>=2.20.0,<=2.31.0
setuptools ; python_version > '3.11'
six>=1.13.0,<1.17.0
typing>=3.6.4,<3.8.0 ; python_version < '3.5'
urllib3>=1.21.1,<2
virtualenv>=16,<21
pywin32 ; sys_platform == 'win32'

View File

@@ -1,7 +1,7 @@
"""
ClearML - Artificial Intelligence Version Control
ClearML Inc.
CLEARML-AGENT DevOps for machine/deep learning
https://github.com/allegroai/clearml-agent
https://github.com/clearml/clearml-agent
"""
import os.path
@@ -39,9 +39,9 @@ setup(
long_description=long_description,
long_description_content_type='text/markdown',
# The project's main homepage.
url='https://github.com/allegroai/clearml-agent',
author='Allegroai',
author_email='clearml@allegro.ai',
url='https://github.com/clearml/clearml-agent',
author='clearml',
author_email='clearml@clearml.ai',
license='Apache License 2.0',
classifiers=[
'Development Status :: 5 - Production/Stable',
@@ -56,7 +56,6 @@ setup(
'Topic :: Scientific/Engineering :: Image Recognition',
'Topic :: System :: Logging',
'Topic :: System :: Monitoring',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
@@ -64,6 +63,7 @@ setup(
'Programming Language :: Python :: 3.10',
'Programming Language :: Python :: 3.11',
'Programming Language :: Python :: 3.12',
'Programming Language :: Python :: 3.13',
'License :: OSI Approved :: Apache Software License',
],