Commit Graph

475 Commits

Author SHA1 Message Date
allegroai
7f4b100042 Fix text encoding utf-8 and pr_curve broken in Tensorboard support 2020-04-16 16:40:14 +03:00
allegroai
4bb17ca420 Fix renaming/deleting model file right after saving will break async upload (trains/issues#123) 2020-04-13 19:03:15 +03:00
allegroai
12659307a8 Fix update_weights() to use model upload target file when passed 2020-04-13 19:00:35 +03:00
allegroai
4b9c5c235c Update docstrings 2020-04-13 18:58:39 +03:00
allegroai
648779380c Add media (audio) support for both Logger and Tensorboard bind 2020-04-09 13:14:14 +03:00
allegroai
7ac7e088a1 Add trace feature 2020-04-09 13:12:50 +03:00
allegroai
0df3d38862 Fix self references in configuration when environment variables exist 2020-04-09 13:11:21 +03:00
allegroai
3ac7dbdb49 Refactor shutdown sequence 2020-04-09 13:10:29 +03:00
allegroai
7dae058359 Optimize locking for TaskHandler, avoid lock when shutting down 2020-04-09 13:08:46 +03:00
allegroai
d9aa83380f Stop resource monitoring before signaling task stop 2020-04-09 13:07:26 +03:00
allegroai
ab263bb59f Raise ValueError if Task.get_logger() is called after task was closed 2020-04-09 13:06:06 +03:00
allegroai
3c4925d605 Fix resource monitor and check if task is valid 2020-04-09 13:00:24 +03:00
allegroai
7f00e45d6c Do not recreate logger after Task was closed/exited 2020-04-09 12:59:00 +03:00
allegroai
3f6fb5379a Revert fork patching as signal is not enough and is not called from forked processes 2020-04-09 12:57:50 +03:00
allegroai
5eb4ae6600 Use a daemon thread for the log 2020-04-09 12:56:55 +03:00
allegroai
1b901b7d13 Fix logger in case a packet was dropped before it was overwritten 2020-04-09 12:56:02 +03:00
allegroai
aa737e6b5b Make sure task is marked as started in remote execution (just in case) 2020-04-09 12:53:43 +03:00
allegroai
f61cbdeb39 Check if join was successful when waiting for std flush pool 2020-04-09 12:51:34 +03:00
allegroai
2f395cc76b Use sub-process fork signal hooks instead of os._exit patch 2020-04-09 12:50:04 +03:00
allegroai
004f925454 ThreadPool should be terminated, not closed, otherwise it might hang 2020-04-09 12:47:38 +03:00
allegroai
9916c93ce0 Add 10sec timeout for stdout/stderr flush at end of process 2020-04-09 12:46:30 +03:00
allegroai
1718aa20d4 Add thread_waited_join waited join for Thread/Process Pools 2020-04-09 12:45:06 +03:00
allegroai
23bd6097a8 Add nicer stdout log flush 2020-04-09 12:42:45 +03:00
allegroai
9a0a84a83e Do not wait for logs if we are aborting the task manually (i.e. ctrl-C) 2020-04-09 12:41:10 +03:00
allegroai
98ce0bbe43 Change TaskHandler.close() wait default to False as it should not wait for logs to flush 2020-04-09 12:39:09 +03:00
allegroai
b3c9872a3f Intercept SystemExit and do nothing so we could kill the thread 2020-04-09 12:33:16 +03:00
allegroai
5ec4d80493 Disconnect stdout/stderr logger on exit 2020-04-09 12:31:43 +03:00
allegroai
de9c88bc2d Do not try to wait for Lock 2020-04-09 12:30:42 +03:00
allegroai
337e60a376 Kill repo/package detection thread on exit 2020-04-09 12:28:57 +03:00
allegroai
b2c2002c40 Create dev task manually when constructing the Task 2020-04-09 12:27:13 +03:00
allegroai
11420adce7 Log reports at the end of the task 2020-04-09 12:24:37 +03:00
allegroai
ffedb219d5 Local modules (except trains) imported from a folder inside the git project should not be logged as "local packages", they should be ignored 2020-04-09 12:21:37 +03:00
allegroai
07daf8f5e6 Fix logger sometimes getting stuck at end of experiment 2020-04-09 12:05:56 +03:00
allegroai
e6f29428eb Add StorageManager 2020-04-09 12:03:41 +03:00
allegroai
e1fc9b3dc8 ThreadPool should be terminated, not closed, otherwise it might hang 2020-04-09 11:39:03 +03:00
allegroai
070fd8149a Store the version matching the Session API so we do not reload every time 2020-04-09 11:35:51 +03:00
allegroai
a425a70fc6 Add api.ssl_error_count_verbosity and make sure SSL retries are taken care of by the session 2020-04-09 11:33:55 +03:00
allegroai
101e5393d1 Fix TRAINS_VCS_ROOT path conversion 2020-04-01 19:06:30 +03:00
allegroai
41ca1a2e49 Fix requirements detection to make sure trains is detected even if we execute without actually being installed 2020-04-01 19:04:57 +03:00
allegroai
01772430d6 Ignore virtual-environment folder that might be inside the project's directory 2020-04-01 19:02:54 +03:00
allegroai
6de3d4b6fd Ignore local modules imported from a folder inside the git project 2020-04-01 19:01:21 +03:00
allegroai
172ed62d41 Add Task.get_tasks() filtering support 2020-04-01 18:54:16 +03:00
allegroai
581edf1098 Version bump to v0.14.1 2020-03-24 20:36:57 +02:00
allegroai
c4719f2e2f Add type annotations and fix docstrings 2020-03-23 23:26:46 +02:00
allegroai
766c8ab24f Add Task.models property 2020-03-23 23:25:55 +02:00
allegroai
0211d233d4 Deprecate Task.set_model_config(), Task.get_model_config_text() and Task.get_model_config_dict() 2020-03-23 23:25:16 +02:00
allegroai
023f1721c1 Add Task.get_models() retrieving stored models on previously executed tasks 2020-03-22 18:19:07 +02:00
allegroai
332e9e2f63 Fix Tensorflow direct V2.1 multiple FileWriters 2020-03-22 18:17:16 +02:00
allegroai
493cce443a Reuse Model objects if we are storing local files (reduce clutter) 2020-03-22 18:15:32 +02:00
allegroai
4e2564cd3a Support reusing Models. Use trains.Model as general purpose registered Model. 2020-03-22 18:13:56 +02:00
allegroai
63507c82f7 Fix Model.download_model_weights() to reuse previously downloaded file 2020-03-22 18:11:30 +02:00
allegroai
477665ee33 Fix storage_uri handling in Model.update() 2020-03-22 18:05:05 +02:00
allegroai
abc9b512f7 Fix logging typos 2020-03-22 18:03:25 +02:00
allegroai
7817ef5cda Fix joblib binding 2020-03-20 10:30:13 +02:00
allegroai
5db53ba643 Support multiple EventWriter in TensorFlow eager mode (TF 2.0+) 2020-03-20 10:29:18 +02:00
allegroai
b4050ecf25 Fix TensorFlow NaN/Inf values support 2020-03-20 10:27:52 +02:00
allegroai
babaf9f1ce Add OpenMPI/Slurm support 2020-03-20 10:23:00 +02:00
allegroai
0adbd79975 Fix StorageHelper upload on shutdown 2020-03-20 10:20:44 +02:00
allegroai
dc915d0241 Fix support for Task init/close multiple times 2020-03-20 10:20:06 +02:00
allegroai
667ddcab88 Fix import for services that do not exist in old versions 2020-03-20 10:16:48 +02:00
allegroai
3b1d2d3258 Version bump to v0.14.0 2020-03-12 19:42:48 +02:00
allegroai
afad6a42ea Add initial slurm support (multiple nodes sharing the same task id) 2020-03-12 18:12:16 +02:00
allegroai
5b29aa194c Make sure artifact temporary files names are valid file names 2020-03-12 18:10:03 +02:00
allegroai
84a34428b6 Add trains-init support for config file env override (as well as argument) 2020-03-12 18:09:03 +02:00
allegroai
b3dff9a4eb Support setting task initial iteration for continuing previous runs 2020-03-12 17:40:29 +02:00
allegroai
f3531c1af2 Allow Task.set_credentials() to override configuration file in dev mode 2020-03-12 17:22:09 +02:00
allegroai
5bc39271e3 Fix store uncommitted code configuration option 2020-03-12 17:17:39 +02:00
allegroai
461fbd9df0 Better warning messages for storage errors 2020-03-12 17:13:36 +02:00
allegroai
30cf6b4834 Fix HTTP link quoting in stored links 2020-03-12 17:04:31 +02:00
allegroai
98c9a95338 Add support for reporting tables 2020-03-10 13:30:42 +02:00
allegroai
9e0ea880ce Add missing import 2020-03-08 18:56:28 +02:00
allegroai
b7358d7fef Add portalocker for inter-process lock 2020-03-05 12:31:22 +02:00
allegroai
2e3820603a Allow argparser override values with command line even in remote execution (essential for sub-process support) 2020-03-05 12:28:36 +02:00
allegroai
1d9e70bd8b Fix signal hooking registration (cont.) 2020-03-05 12:26:56 +02:00
allegroai
181a0be0af Remove temporary file lock at the end of the execution or in Task.close() 2020-03-05 12:25:17 +02:00
allegroai
b0c602c832 Fix signal hooking registration 2020-03-05 12:24:14 +02:00
allegroai
bcf97afeb9 Forking processes should not pass along the original File based Lock 2020-03-05 12:22:14 +02:00
allegroai
888c53f67d Allow disabling repository detection when calling Task.init() 2020-03-05 12:19:40 +02:00
allegroai
4bca5ccf27 Always reload task section before editing parts of it 2020-03-05 12:11:55 +02:00
allegroai
a2ecb2c75d Only use file based locks for main task. Secondary tasks use traditional multiprocessing lock 2020-03-05 12:10:23 +02:00
allegroai
da804ca75f Add support for Popen subprocesses with task edit protection from multiple processes 2020-03-05 12:05:12 +02:00
allegroai
e3ae4f4e26 Optimize task refresh while pulling task status in local worker and last iteration for Resource Monitoring 2020-03-05 11:40:27 +02:00
Karthikeyan Singaravelan
a97850e5b6 Import ABC from collections.abc instead of collections for Python 3.9 compatibility. 2020-03-03 21:38:03 +05:30
allegroai
146da439e7 Integrate pigar into Trains 2020-03-01 17:12:28 +02:00
allegroai
8ee2bd1844 Retry sending console logs if session.send() fails (applicable only in local mode where we use the logging handler) 2020-02-26 17:07:07 +02:00
allegroai
cf850020fb Don't print empty line at end of process if there's no artifacts summary 2020-02-26 17:06:17 +02:00
allegroai
baf5fc9e54 version bump to v0.13.3 2020-02-23 11:20:21 +02:00
allegroai
8972c1f005 Add Task.[get/set]_parameters_as_dict() to allow interaction with non-main task parameters (no need to connect()) 2020-02-20 18:32:12 +02:00
allegroai
98e6c2004c Use standard os environment variables to obtain default credentials for AWS, Google and Azure 2020-02-20 18:29:53 +02:00
allegroai
b5168010e9 Make sure Task.connect() returns the same value it is passed 2020-02-18 11:26:52 +02:00
allegroai
14588e6dec Refactor utility function 2020-02-18 11:25:29 +02:00
allegroai
3ea570cadf Store python binary along with major.minor version in task script section 2020-02-18 11:24:04 +02:00
allegroai
9fd3b98b24 Fix session error to print the instance host and not the class host 2020-02-18 11:23:06 +02:00
allegroai
edc237dad4 Improve support for tensorboard.summarywriter.addscalars binding 2020-02-18 11:21:47 +02:00
allegroai
f5f13658c3 Add binding for tensorboard.summarywriter.addscalars as well as scalars grouping configuration option 2020-02-12 14:04:53 +02:00
allegroai
63ffc09ae0 Fix incorrect upgrade message 2020-02-12 14:03:24 +02:00
allegroai
0bc71fbcf4 Remove title/series naming restrictions (allow '$' and '.') 2020-02-10 10:30:57 +02:00
allegroai
3ee70beea2 Fix URL for uploaded files with '%' in their name to allow proper unquote during HTTP serving 2020-02-10 10:30:57 +02:00
allegroai
c6849985ea Add Task.set_base_docker() and Task.get_base_docker() 2020-02-10 10:30:57 +02:00
allegroai
8c2b36968b Prefer tensorflow_gpu over tensorflow when inspecting installed packages 2020-02-04 18:00:39 +02:00
allegroai
0c71889ca5 Fix printout during init 2020-02-04 17:59:50 +02:00
allegroai
34d9402abb version bump to v0.13.2 2020-01-27 19:45:56 +02:00
allegroai
7b9e7406ad Fix mutually_exclusive() use of at_least_one() 2020-01-27 15:41:19 +02:00
allegroai
9f8e814ca6 Support git repositories without ".git" suffix 2020-01-27 15:41:19 +02:00
allegroai
923e45bb17 Allow reporting a pre-uploaded image url in Logger.report_image using the url parameter 2020-01-26 15:29:35 +02:00
allegroai
8772bc2755 Version bump 2020-01-22 11:08:41 +02:00
allegroai
d03311764e Fix None type as default value in dictionary 2020-01-22 11:08:06 +02:00
allegroai
b50bfd5b63 Fix default argparser value handling when value is None 2020-01-22 11:06:52 +02:00
allegroai
af0b8f4c70 Fix type check in hyper-parameters argparser integration 2020-01-22 11:03:56 +02:00
allegroai
1e011e10a2 Version bump 2020-01-21 16:41:14 +02:00
allegroai
1cc0ea6cf3 Fix logs, events and jupyter flushing on exit 2020-01-21 16:41:01 +02:00
allegroai
f0a27127bf Fix matplotlib savefig patching 2020-01-21 16:37:26 +02:00
allegroai
c5dd762d9b Improve conda support 2020-01-21 16:32:57 +02:00
allegroai
9a3e130700 version bump to v0.13.1 2020-01-13 17:29:17 +02:00
allegroai
fcaff82980 Add support for pylab.savefig in matplotlib binding 2020-01-13 17:16:56 +02:00
allegroai
0ecd734fd1 Support multi-line paste of credentials in configuration wizard 2020-01-13 17:16:25 +02:00
allegroai
1c6be01e38 Add support for savefig in matplotlib binding 2020-01-13 12:06:47 +02:00
allegroai
66b251a62b Try to make sure tensorboard is available when using torch 2020-01-13 11:55:55 +02:00
Allegro AI
affd6050f6 Merge pull request #79 from danmalowany-allegro/master: Fix type hint in Logger.report_scatter3d() 2020-01-12 12:50:10 +02:00
allegroai
5fdb2398df Version bump 2020-01-10 13:42:24 +02:00
allegroai
8da8053726 Do not store keras model network design if it cannot be serialized 2020-01-10 13:41:06 +02:00
allegroai
073f4c308d Convert ndarray to histogram for axis to get rid of warning in tensorflow binding 2020-01-10 13:39:31 +02:00
allegroai
163ace8856 Display matplotlib low version warning only once 2020-01-10 13:36:27 +02:00
allegroai
70624f469b Fix matplotlib binding support 2020-01-10 13:35:07 +02:00
allegroai
f65ef3e757 Support broken Jupyter version on some conda installations (SageMaker) 2020-01-10 13:33:19 +02:00
danmalowany-allegro
b9ee824877 Changed List to list 2020-01-06 18:27:56 +02:00
danmalowany-allegro
457f0b71c8 Fixed scatter type in report_scatter3d to Union 2020-01-06 18:12:00 +02:00
allegroai
80f3dc6790 version bump to v0.13.0 2020-01-06 17:44:28 +02:00
Allegro AI
be8d100e33 Merge pull request #78 from szymonmaszke/patch-1: Add .pt file extension as PyTorch 2020-01-06 17:30:31 +02:00
allegroai
30eaed79ea Add warning when automatic argument parser binding cannot be turned off 2020-01-06 17:20:15 +02:00
allegroai
bc33ad0da3 Calculate data-audit artifact uniqueness by user-criteria 2020-01-06 17:19:44 +02:00
allegroai
a169b43885 Add Task.upload_artifact support for external URLs 2020-01-06 17:16:51 +02:00
allegroai
7820e0d14a Use an environment variable for setting a default docker image 2020-01-06 17:09:45 +02:00
allegroai
7b7b6e487e Fix argparser/subparser support and support unsynced connected hyper parameters in remote execution 2020-01-06 17:08:03 +02:00
allegroai
8585d7e134 Avoid retries when verifying invalid credentials 2020-01-06 17:05:37 +02:00
Szymon Maszke
6815dd5410 Add .pt file extension as PyTorch 2020-01-06 13:08:54 +01:00
Usually `.pt` is used as PyTorch's extension, see [this StackOverflow question](https://stackoverflow.com/questions/59095824/what-is-difference-between-pt-pth-and-pwf-extentions-in-pytorch).

Furthermore, `.pth` is used by Python to list additional package search paths (see [this PyTorch issue](https://github.com/pytorch/pytorch/issues/14864)), so IMO it might be worth reconsidering the existence of that extension. AFAIK `.pt` is advised and used throughout most projects.
allegroai
54ae340ccb Use source task id to determine cloned task parent 2020-01-02 12:01:03 +02:00
allegroai
62d5535351 Fix requests issue in python 2.7 that can cause a deadlock when importing netrc 2020-01-02 11:58:02 +02:00
allegroai
f4be527a21 Fix typo 2020-01-02 11:55:59 +02:00
allegroai
7110b938ae Fix matplotlib import binding when imported before trains 2019-12-30 18:37:17 +02:00
allegroai
ddeece1f57 Add support for API v2.5 2019-12-30 18:34:28 +02:00
allegroai
ba79471848 Update documentation 2019-12-24 18:20:16 +02:00
allegroai
e4024e01d5 Make sure ProxyDictPreWrite and ProxyDictPostWrite are pickled correctly 2019-12-21 18:33:15 +02:00
allegroai
4e0f711e39 Keep only the input artifacts when cloning a task 2019-12-21 18:30:24 +02:00
allegroai
0be981fbc1 Fix check_min_api_version to use a default session if none was created 2019-12-21 18:28:10 +02:00
allegroai
085eebc6b9 version bump 2019-12-15 15:13:44 +02:00
allegroai
14ce1e925e version bump 2019-12-15 00:11:26 +02:00
allegroai
3bd997c4dc Improve trains-init configuration wizard 2019-12-15 00:11:01 +02:00
allegroai
c1cc80ba1b Optimize artifacts threading 2019-12-15 00:10:34 +02:00
allegroai
a992591f3c Fix artifacts update in auxiliary task 2019-12-15 00:10:12 +02:00