220 Commits

Author SHA1 Message Date
allegroai 8245293f7f Fix request endpoint constant version numbers 2020-05-31 13:52:53 +03:00
allegroai 6563ce70c8 Update README 2020-05-09 20:12:53 +03:00
allegroai 829b1d8f15 Use deep copy to clone configuration, always write configuration before launching a docker 2020-05-09 20:12:29 +03:00
allegroai f6be64a4b5 Print conda install output if running in debug mode, turn on debugging if --debug flag is used 2020-05-09 20:11:01 +03:00
allegroai 21f6a73f66 Include CUDA version in the pytorch package fail error 2020-05-09 20:09:18 +03:00
allegroai 77c4c79a2f Support pip 20.1 local/http package reference in pip freeze 2020-05-09 20:08:17 +03:00
allegroai 2ad929fa00 Add torch_nightly flag support (if torch wheel is not found on stable try the nightly builds), improve support for torch in freeze (add actually used HTTP link as comment to the original package) 2020-05-09 20:08:05 +03:00
allegroai 53f511f536 Improve docker host-mount support, use TRAINS_AGENT_DOCKER_HOST_MOUNT env var 2020-05-09 20:02:46 +03:00
allegroai 7c87797a40 Pass git credentials to dockerized task execution 2020-05-09 19:59:58 +03:00
allegroai 272fa07c29 Fix and enhance "build --docker" 2020-05-09 19:57:25 +03:00
  - Fix standalone docker execution
  - Add --install-globally option to install required packages in the docker's system python
  - Add --entry-point option to allow automatic task cloning when running the docker
allegroai 6ce9cf7c2a Fix version control links in requirements when using conda 2020-05-09 19:52:51 +03:00
allegroai abb30ac2b8 Move --gpus and --cpu-only to worker args (used by daemon, execute and build) 2020-05-09 19:51:45 +03:00
allegroai 5bb257c46c Add daemon --create-queue to automatically create a queue and use it if queue name doesn't exist in server 2020-05-09 19:50:53 +03:00
allegroai c65b28ed92 Update venv_update URL 2020-05-09 19:47:00 +03:00
allegroai fce8eb6782 Add OS environment configuration for git user/pass using TRAINS_AGENT_GIT_USER/TRAINS_AGENT_GIT_PASS 2020-05-09 19:46:46 +03:00
allegroai 9cb71b9526 Add daemon service mode to allow multiple tasks to be launched simultaneously on the same machine (--service-mode) 2020-05-09 19:45:14 +03:00
allegroai 38e02ca5cd Add worker command state enforcement conforming and verification callback 2020-05-09 19:42:51 +03:00
allegroai 06bfea80bc Fix read file scope 2020-04-09 11:27:04 +03:00
allegroai e660c7f2be Fix comments in config files 2020-04-09 11:23:45 +03:00
allegroai fc28467080 Improve error message when failing to locate a task 2020-04-09 11:23:13 +03:00
allegroai 8d47905982 Show host information when failing to obtain a task 2020-04-01 19:12:45 +03:00
allegroai a6a0b01f71 Remove deprecated OS environment variables 2020-04-01 19:11:37 +03:00
allegroai 2b561f6066 Version bump to v0.14.1 2020-03-24 20:37:18 +02:00
allegroai 61232d05dd Fix run as user support in Windows and add fall-back for created user folders 2020-03-22 19:16:11 +02:00
allegroai b3418e4496 Add daemon detached mode (--detached, -d) that runs agent in the background and returns immediately 2020-03-22 19:00:29 +02:00
allegroai 5ef627165c Fix PyTorch support to ignore minor versions when looking for package to install or to download 2020-03-20 10:48:48 +02:00
allegroai 98a983d9a2 Add TRAINS_AGENT_EXTRA_PYTHON_PATH to allow adding additional python path for task execution (helpful when using extra untracked modules) 2020-03-20 10:46:56 +02:00
allegroai 482007c4ce Fix run as user feature (TRAINS_AGENT_EXEC_USER) 2020-03-20 10:42:32 +02:00
allegroai 98198b8006 Auto mount ~/.git-credentials into docker container if file exists 2020-03-20 10:39:59 +02:00
allegroai 94bb11a81a Change message when using local torch 2020-03-20 10:37:42 +02:00
allegroai 4158d08f6f Fix test 2020-03-20 10:36:20 +02:00
allegroai 58ab67ea31 Fix execution output handling 2020-03-20 10:35:25 +02:00
allegroai ea0ed4807e Version bump to v0.14.0 2020-03-12 19:42:32 +02:00
allegroai 389600b91e Fix git checkout with submodules 2020-03-12 18:39:47 +02:00
allegroai 5fb2550212 Update to backend API v2.5 2020-03-12 18:39:10 +02:00
allegroai 15e9e6b778 Fix "execute --clone" support 2020-03-12 18:38:35 +02:00
allegroai aa75b92e46 Prefer docker image from command line over the one in the experiment 2020-03-12 18:35:49 +02:00
allegroai 757210d5b3 Add support for "execute --docker" and for cloning an experiment before execution 2020-03-12 18:33:07 +02:00
allegroai 00eb2f10ec Version bump to v0.13.3 2020-03-09 16:07:50 +02:00
allegroai 3393372b9c Do not share apt cache among agents on the same machine 2020-03-09 12:38:51 +02:00
allegroai f2d2d702de Fix k8s support to allow a specific network for the docker (do not use the parent daemon network definition) 2020-03-09 12:38:32 +02:00
allegroai e3d0680d39 Improve Unicode/UTF stdout handling 2020-03-09 12:34:48 +02:00
allegroai 618c2ac5c4 Add default storage environment vars to generated agent configuration 2020-03-09 12:33:03 +02:00
allegroai 0272c4c79c Add "--force-current-version" daemon command-line flag 2020-03-09 12:31:43 +02:00
allegroai ff8cf63abf Add "--force-current-version" daemon command-line flag 2020-03-09 12:27:39 +02:00
allegroai 2c7c7f5b44 Add K8s/trains glue service example 2020-03-05 14:10:08 +02:00
allegroai 01f57c1e44 Create missing queues when starting the AWS dynamic cluster management service 2020-03-05 14:08:32 +02:00
allegroai 47bcd3839a Pass correct GPU limit when skipping gpus flag in docker mode 2020-03-05 14:07:44 +02:00
allegroai 0a3a8a1c52 Add support for mounting dockerized experiment folders to host when running on K8s in daemon mode 2020-03-05 13:13:03 +02:00
allegroai 231a907cff Add support for running daemon inside a K8s pod in daemon mode 2020-03-05 13:03:36 +02:00