5.4 KiB
title |
---|
Version 0.16 |
:::important Trains is now ClearML. :::
Trains-Agent 0.16.2
Features
-
conda:
- Add
agent.package_manager.conda_env_as_base_docker
allowing "docker_cmd" to contain link to a full pre-packaged conda environment (tar.gz
created byconda-pack
). UseTRAINS_CONDA_ENV_PACKAGE
environment variable to specifyconda tar.gz
file. - Add conda support for read-only pre-built environment (pass conda folder as
docker_cmd
on Task). - Improve trying to find conda executable.
- Add
-
k8s glue:
- Add support for limited number of services exposing ports.
- Add support for k8s pod custom user properties.
- Allow selecting external
trains.conf
file for the pod itself. - Allow providing pod template, extra bash init script, alternate SSH server port, gateway address (k8s ingress / ELB).
-
Allow specifying
cudatoolkit
version in the "installed packages" section when using conda as package manager (GitHub trains Issue 229). -
Add
agent.package_manager.force_repo_requirements_txt
. If True, "Installed Packages" on Task are ignored, and only repositoryrequirements.txt
is used. -
Pass
TRAINS_DOCKER_IMAGE
into docker for interactive sessions. -
Add
torchcsprng
andtorchtext
to PyTorch resolving.
Bug Fixes
-
When logging suppress "\r" when reading a current chunk of a file / stream. Add
agent.suppress_carriage_return
(default True) to support previous behavior. -
Make sure
TRAINS_AGENT_K8S_HOST_MOUNT
is used only once per mount. -
Fix k8s glue script to trains-agent default docker script.
-
Fix apply git diff from submodule only.
-
conda:
- Fix conda pip freeze to be consistent with trains 0.16.3.
- Fix conda environment support for trains 0.16.3 full env. Add
agent.package_manager.conda_full_env_update
to allow conda to update back the requirements (default False, to preserve previous behavior). - Fix running from conda environment -
conda.sh
not found in first conda PATH match.
-
Fix docker mode ubuntu / debian support by making sure not to ask for input (fix
tzdata
install). -
Fix repository detection - ignore environment
SSH_AUTH_SOCK
, only check if git user/pass are configured. -
git diff:
- Fix support for non-ascii diff.
- Fix diff with empty line at the end will cause corrupt diff apply message.
- Allow zero context diffs (useful when blind patching repository).
-
Fix
daemon --stop
when agent UID cannot be located. -
Fix nvidia docker support on some linux distros (SUSE).
-
Fix nvidia pytorch dockers support.
-
Fix torch CUDA 11.1 support.
-
Fix requirements dict with null entry in
pip
should be considered None install from repository'srequirements.txt
.
Trains Agent 0.16.1
Features
- Add
sdk.metrics.plot_max_num_digits
configuration option to reduce plot storage size. - Add
agent.package_manager.post_packages
andagent.package_manager.post_optional_packages
configuration options to control packages install order (e.g. horovod). - Add
agent.git_host
configuration option for limiting git credential usage for a specific host (overridable usingTRAINS_AGENT_GIT_HOST
environment variable). - Add
agent.force_git_ssh_port
configuration option to control HTTPS to SSH link conversion for non-standard SSH ports. - Add requirements detection features. Improve support for detecting new pip version (20+) supporting
package @ scheme://link
.
Bug Fixes
- Fix pre-installed packages are ignored when installing a git package wheel. Reinstalling a
git+http
link is enough to make sure all requirements are met / installed (GitHub Issue #196). - Fix incorrect check for spaces in current execution folder.
- Fix requirements detection:
- Update torch version after using downloaded / system pre-installed version.
- Do not install git packages twice when a new pip version is used (pip freeze will detect the correct git link version).
Trains Agent 0.16.0
Features
- Add
agent.docker_init_bash_script
configuration section to allow finer control over Docker startup script. - Changed default Docker image from
nvidia/cuda
tonvidia/cuda:10.1-runtime-ubuntu18.04
to supportcudnn
frameworks (e.g. TF). - Improve support for Dockers with preinstalled
conda
environment. - Improve trains-agent-docker spinning.
- Add
daemon --order-fairness
for round-robin queue pulling. - Add
daemon --stop
to terminate a running agent (assuming other arguments are the same). If no additional arguments, Agents are terminated in lexicographical order. - Support cleanup of all log files on termination unless executed with
--debug
. - Add error message when Trains API Server is not accessible on startup.
Bug Fixes
- Fix GPU Windows monitoring support (GitHub Issue #177).
- Fix
.git-credentials
and.gitconfig
mapping into docker. - Fix non-root docker image usage.
- Fix docker to use
UTF-8
encoding, so prints won't break it. - Fix
--debug
to set all loggers toDEBUG
. - Fix task status change to
queued
should never happen during Task runtime. - Fix
requirement_parser
to supportpackage @ git+http
lines. - Fix GIT user/password in requirements and support for
-e git+http
lines. - Fix configuration wizard to generate
trains.conf
matching latest Trains definitions.