clearml-docs/ver_0_16.md at 9fdc89358271a267be6a78550131b49f2fe511c3

mirror of https://github.com/clearml/clearml-docs synced 2025-06-26 18:17:44 +00:00

pollfly e1b179ed54

CI / build (push) Has been cancelled

Details

2025-02-13 13:21:35 +02:00

title
Version 0.16

:::important Trains is now ClearML. :::

Features

conda:
- Add agent.package_manager.conda_env_as_base_docker allowing "docker_cmd" to contain link to a full pre-packaged conda environment (tar.gz created by conda-pack). Use TRAINS_CONDA_ENV_PACKAGE environment variable to specify conda tar.gz file.
- Add conda support for read-only pre-built environment (pass conda folder as docker_cmd on Task).
- Improve trying to find conda executable.
k8s glue:
- Add support for limited number of services exposing ports.
- Add support for k8s pod custom user properties.
- Allow selecting external trains.conf file for the pod itself.
- Allow providing pod template, extra bash init script, alternate SSH server port, gateway address (k8s ingress / ELB).
Allow specifying cudatoolkit version in the "installed packages" section when using conda as package manager (GitHub trains Issue 229).
Add agent.package_manager.force_repo_requirements_txt. If True, "Installed Packages" on Task are ignored, and only repository requirements.txt is used.
Pass TRAINS_DOCKER_IMAGE into docker for interactive sessions.
Add torchcsprng and torchtext to PyTorch resolving.

Bug Fixes

When logging suppress "\r" when reading a current chunk of a file / stream. Add agent.suppress_carriage_return (default True) to support previous behavior.
Make sure TRAINS_AGENT_K8S_HOST_MOUNT is used only once per mount.
Fix k8s glue script to trains-agent default docker script.
Fix apply git diff from submodule only.
conda:
- Fix conda pip freeze to be consistent with trains 0.16.3.
- Fix conda environment support for trains 0.16.3 full env. Add agent.package_manager.conda_full_env_update to allow conda to update back the requirements (default False, to preserve previous behavior).
- Fix running from conda environment - conda.sh not found in first conda PATH match.
Fix docker mode ubuntu / debian support by making sure not to ask for input (fix tzdata install).
Fix repository detection - ignore environment SSH_AUTH_SOCK, only check if git user/pass are configured.
git diff:
- Fix support for non-ascii diff.
- Fix diff with empty line at the end will cause corrupt diff apply message.
- Allow zero context diffs (useful when blind patching repository).
Fix daemon --stop when agent UID cannot be located.
Fix nvidia docker support on some linux distros (SUSE).
Fix nvidia pytorch dockers support.
Fix torch CUDA 11.1 support.
Fix requirements dict with null entry in pip should be considered None install from repository's requirements.txt.

Features

Add sdk.metrics.plot_max_num_digits configuration option to reduce plot storage size.
Add agent.package_manager.post_packages and agent.package_manager.post_optional_packages configuration options to control packages install order (e.g. horovod).
Add agent.git_host configuration option for limiting git credential usage for a specific host (overridable using TRAINS_AGENT_GIT_HOST environment variable).
Add agent.force_git_ssh_port configuration option to control HTTPS to SSH link conversion for non-standard SSH ports.
Add requirements detection features. Improve support for detecting new pip version (20+) supporting package @ scheme://link.

Bug Fixes

Fix pre-installed packages are ignored when installing a git package wheel. Reinstalling a git+http link is enough to make sure all requirements are met / installed (GitHub Issue #196).
Fix incorrect check for spaces in current execution folder.
Fix requirements detection:
- Update torch version after using downloaded / system pre-installed version.
- Do not install git packages twice when a new pip version is used (pip freeze will detect the correct git link version).

Features

Add agent.docker_init_bash_script configuration section to allow finer control over Docker startup script.
Changed default Docker image from nvidia/cuda to nvidia/cuda:10.1-runtime-ubuntu18.04 to support cudnn frameworks (e.g. TF).
Improve support for Dockers with preinstalled conda environment.
Improve trains-agent-docker spinning.
Add daemon --order-fairness for round-robin queue pulling.
Add daemon --stop to terminate a running agent (assuming other arguments are the same). If no additional arguments, Agents are terminated in lexicographical order.
Support cleanup of all log files on termination unless executed with --debug.
Add error message when Trains API Server is not accessible on startup.

Bug Fixes

Fix GPU Windows monitoring support (GitHub Issue #177).
Fix .git-credentials and .gitconfig mapping into docker.
Fix non-root docker image usage.
Fix docker to use UTF-8 encoding, so prints won't break it.
Fix --debug to set all loggers to DEBUG.
Fix task status change to queued should never happen during Task runtime.
Fix requirement_parser to support package @ git+http lines.
Fix GIT user/password in requirements and support for -e git+http lines.
Fix configuration wizard to generate trains.conf matching latest Trains definitions.