# TRAINS FAQ
* [How to change the location of the TRAINS configuration file?](#change-config-path)
* [How to override TRAINS credentials from the OS environment?](#credentials-os-env)
* [How to sort models by a certain metric?](#custom-columns)
* [Can I store more information on the models?](#store-more-model-info)
* [Can I store the model configuration file as well?](#store-model-configuration)
* [I want to add more graphs, not just with TensorBoard. Is this supported?](#more-graph-types)
* [Is there a way to create a graph comparing hyper-parameters vs. model accuracy?](#compare-graph-parameters)
* [I noticed that all of my experiments appear as `Training`. Are there other options?](#other-experiment-types)
* [I noticed I keep getting the message `warning: uncommitted code`. What does it mean?](#uncommitted-code-warning)
* [Is there something TRAINS can do about uncommitted code running?](#help-uncommitted-code)
* [I read there is a feature for centralized model storage. How do I use it?](#centralized-model-storage)
* [I am training multiple models at the same time, but I only see one of them. What happened?](#only-last-model-appears)
* [Can I log input and output models manually?](#manually-log-models)
* [I am using Jupyter Notebook. Is this supported?](#jupyter-notebook)
* [I do not use argparse for hyper-parameters. Do you have a solution?](#dont-want-argparser)
* [Git is not well supported in Jupyter, so we just gave up on committing our code. Do you have a solution?](#commit-git-in-jupyter)
* [Can I use TRAINS with scikit-learn?](#use-scikit-learn)
* [When using PyCharm to remotely debug a machine, the git repo is not detected. Do you have a solution?](#pycharm-remote-debug-detect-git)
* [How do I know a new version came out?](#new-version-auto-update)
* [Sometimes I see experiments as running when in fact they are not. What's going on?](#experiment-running-but-stopped)
* [The first log lines are missing from the experiment log tab. Where did they go?](#first-log-lines-missing)
## How to change the location of the TRAINS configuration file? <a name="change-config-path"></a>
Set the `TRAINS_CONFIG_FILE` OS environment variable to override the default configuration file location:
```bash
export TRAINS_CONFIG_FILE="/home/user/mytrains.conf"
```
## How to override TRAINS credentials from the OS environment? <a name="credentials-os-env"></a>
Set the OS environment variables below to override the configuration file / defaults:
```bash
export TRAINS_API_ACCESS_KEY="key_here"
export TRAINS_API_SECRET_KEY="secret_here"
export TRAINS_API_HOST="http://localhost:8080"
```
## How to sort models by a certain metric? <a name="custom-columns"></a>
Models are associated with the experiments that created them.
To sort experiments by a specific metric, add a custom column in the experiments table:
<img src="https://github.com/allegroai/trains/blob/master/docs/screenshots/set_custom_column.png?raw=true" width="25%">
<img src="https://github.com/allegroai/trains/blob/master/docs/screenshots/custom_column.png?raw=true" width="25%">
## Can I store more information on the models? <a name="store-more-model-info"></a>
#### For example, can I store an enumeration of classes?
Yes! Use the `Task.set_model_label_enumeration()` method:
```python
Task.current_task().set_model_label_enumeration({"label": 0})
```
## Can I store the model configuration file as well? <a name="store-model-configuration"></a>
Yes! Use the `Task.set_model_design()` method:
```python
Task.current_task().set_model_design("a very long text with the configuration file's content")
```
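For instance, to store the contents of an actual configuration file (a minimal sketch; the file name is hypothetical):
```python
from trains import Task

# read the configuration file and attach its full text to the model design
with open('model.config', 'r') as f:
    Task.current_task().set_model_design(f.read())
```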
## I want to add more graphs, not just with TensorBoard. Is this supported? <a name="more-graph-types"></a>
Yes! Use a [Logger](https://github.com/allegroai/trains/blob/master/trains/logger.py) object. An instance can always be retrieved using the `Task.current_task().get_logger()` method:
```python
# Get a logger object
logger = Task.current_task().get_logger()

# Report a scalar value
logger.report_scalar("loss", "classification", iteration=42, value=1.337)
```
#### TRAINS supports:
* Scalars
* Plots
* 2D/3D Scatter Diagrams
* Histograms
* Surface Diagrams
* Confusion Matrices
* Images
* Text logs
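For instance, here is a minimal sketch covering two of the other report types (the values are random and purely illustrative; exact signatures may differ slightly between versions, so check the Logger source linked above):
```python
import numpy as np
from trains import Task

logger = Task.current_task().get_logger()

# report a confusion matrix (random values, for illustration only)
confusion = np.random.randint(10, size=(10, 10))
logger.report_confusion_matrix("performance", "confusion", iteration=42, matrix=confusion)

# report a 3D scatter plot: an array of (x, y, z) points
scatter_3d = np.random.rand(50, 3)
logger.report_scatter3d("example", "scatter_3d", iteration=42, scatter=scatter_3d)
```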
For a more detailed example, see [here](https://github.com/allegroai/trains/blob/master/examples/manual_reporting.py).
## Is there a way to create a graph comparing hyper-parameters vs. model accuracy? <a name="compare-graph-parameters"></a>
Yes! You can manually create a plot with a single point, using the X-axis for the hyper-parameter value
and the Y-axis for the accuracy. For example:
```python
number_layers = 10
accuracy = 0.95
Task.current_task().get_logger().report_scatter2d(
    "performance", "accuracy", iteration=0,
    mode='markers', scatter=[(number_layers, accuracy)])
```
Here, the hyper-parameter is `number_layers` with a current value of 10, and the trained model's accuracy is 0.95.
The experiment comparison graph then shows:
<img src="https://github.com/allegroai/trains/blob/master/docs/screenshots/compare_plots.png?raw=true" width="50%">
Another option is a histogram chart:
```python
number_layers = 10
accuracy = 0.95
Task.current_task().get_logger().report_vector(
    "performance", "accuracy", iteration=0, labels=['accuracy'],
    values=[accuracy], xlabels=['number_layers %d' % number_layers])
```
<img src="https://github.com/allegroai/trains/blob/master/docs/screenshots/compare_plots_hist.png?raw=true" width="50%">
## I noticed that all of my experiments appear as `Training`. Are there other options? <a name="other-experiment-types"></a>
Yes! When creating experiments and calling `Task.init`, you can provide an experiment type.
The currently supported types are `Task.TaskTypes.training` and `Task.TaskTypes.testing`. For example:
```python
task = Task.init(project_name, task_name, Task.TaskTypes.testing)
```
If you feel we should add a few more, let us know in the [issues](https://github.com/allegroai/trains/issues) section.
## I noticed I keep getting the message `warning: uncommitted code`. What does it mean? <a name="uncommitted-code-warning"></a>
TRAINS not only detects your current repository and git commit,
but also warns you if you are using uncommitted code. TRAINS does this
because uncommitted code means this experiment will be difficult to reproduce.
If that does not concern you, just ignore this message - it is merely a warning.
## Is there something TRAINS can do about uncommitted code running? <a name="help-uncommitted-code"></a>
Yes! TRAINS currently stores the git diff as part of the experiment's information.
The Web-App will present the git diff as well - this is coming very soon!
## I read there is a feature for centralized model storage. How do I use it? <a name="centralized-model-storage"></a>
When calling `Task.init()`, providing the `output_uri` parameter allows you to specify the location in which model snapshots will be stored.
For example, calling:
```python
task = Task.init(project_name, task_name, output_uri="/mnt/shared/folder")
```
will tell TRAINS to copy all stored snapshots into a sub-folder under `/mnt/shared/folder`.
The sub-folder's name will contain the experiment's ID.
Assuming the experiment's ID in this example is `6ea4f0b56d994320a713aeaf13a86d9d`, the following folder will be used:
`/mnt/shared/folder/task_6ea4f0b56d994320a713aeaf13a86d9d/models/`
TRAINS supports additional storage types for `output_uri`:
```python
# AWS S3 bucket
task = Task.init(project_name, task_name, output_uri="s3://bucket-name/folder")
```
```python
# Google Cloud Storage bucket
task = Task.init(project_name, task_name, output_uri="gs://bucket-name/folder")
```
**NOTE:** These require configuring the storage credentials in `~/trains.conf`.
For a more detailed example, see [here](https://github.com/allegroai/trains/blob/master/docs/trains.conf#L55).
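For instance, the S3 credentials section has roughly this shape (a sketch based on the linked sample; the key values are placeholders, and the linked `trains.conf` is the authoritative layout):
```
sdk {
    aws {
        s3 {
            # default credentials, used for any bucket not listed explicitly
            key: "my-access-key"
            secret: "my-secret-key"
            region: ""
        }
    }
}
```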
## I am training multiple models at the same time, but I only see one of them. What happened? <a name="only-last-model-appears"></a>
Although all models can be found under the project's **Models** tab, TRAINS currently shows only the last model associated with an experiment in the experiment's information panel.
This will be fixed in a future version.
## Can I log input and output models manually? <a name="manually-log-models"></a>
Yes! For example:
```python
from trains import Task, InputModel, OutputModel

# register a pre-existing model file as the experiment's input model
input_model = InputModel.import_model(link_to_initial_model_file)
Task.current_task().connect(input_model)

# register a newly trained model file as the experiment's output model
OutputModel(Task.current_task()).update_weights(link_to_new_model_file_here)
```
See [InputModel](https://github.com/allegroai/trains/blob/master/trains/model.py#L319) and [OutputModel](https://github.com/allegroai/trains/blob/master/trains/model.py#L539) for more information.
## I am using Jupyter Notebook. Is this supported? <a name="jupyter-notebook"></a>
Yes! Jupyter Notebook is supported. See [TRAINS Jupyter Plugin](https://github.com/allegroai/trains-jupyter-plugin).
## I do not use argparse for hyper-parameters. Do you have a solution? <a name="dont-want-argparser"></a>
Yes! TRAINS supports using a Python dictionary for hyper-parameter logging. Just connect the dictionary to the task:
```python
parameters_dict = {'base_lr': 0.0001, 'batch_size': 32}  # example values
parameters_dict = Task.current_task().connect(parameters_dict)
```
From this point onward, not only are the dictionary key/value pairs stored as part of the experiment, but any changes to the dictionary will be automatically updated in the task's information.
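Continuing the sketch above (the parameter names are illustrative):
```python
# later changes to the connected dictionary are also recorded on the task
parameters_dict['base_lr'] = 0.001
```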
## Git is not well supported in Jupyter, so we just gave up on committing our code. Do you have a solution? <a name="commit-git-in-jupyter"></a>
Yes! Check our [TRAINS Jupyter Plugin](https://github.com/allegroai/trains-jupyter-plugin). This plugin allows you to commit your notebook directly from Jupyter. It also saves the Python version of your code and creates an updated `requirements.txt`, so you know which packages you were using.
## Can I use TRAINS with scikit-learn? <a name="use-scikit-learn"></a>
Yes! `scikit-learn` is supported. Everything you do is logged.
**NOTE**: Models are not automatically logged, because in most cases scikit-learn simply pickles the model object to a file, so there is no underlying framework we can hook into.
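If you do want the model stored, you can register the pickled file manually, as in the manual model logging answer above. A minimal sketch, assuming a fitted estimator `clf` (hypothetical):
```python
import pickle

from trains import Task, OutputModel

# pickle the fitted scikit-learn estimator (clf is a hypothetical fitted model)
with open('model.pkl', 'wb') as f:
    pickle.dump(clf, f)

# register the pickled file as this experiment's output model
OutputModel(Task.current_task()).update_weights('model.pkl')
```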
## When using PyCharm to remotely debug a machine, the git repo is not detected. Do you have a solution? <a name="pycharm-remote-debug-detect-git"></a>
Yes! Since this is such a common occurrence, we created a PyCharm plugin that allows a remote debugger to grab your local repository / commit ID. See our [TRAINS PyCharm Plugin](https://github.com/allegroai/trains-pycharm-plugin) repository for instructions and the [latest release](https://github.com/allegroai/trains-pycharm-plugin/releases).
## How do I know a new version came out? <a name="new-version-auto-update"></a>
TRAINS does not yet support auto-update checks. We hope to add this feature soon.
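In the meantime, you can upgrade to the latest PyPI release manually:
```bash
pip install -U trains
```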
## Sometimes I see experiments as running when in fact they are not. What's going on? <a name="experiment-running-but-stopped"></a>
TRAINS monitors your Python process. When the process exits in an orderly fashion, TRAINS closes the experiment.
When the process crashes or terminates abnormally, the stop signal is sometimes missed. In such a case, you can safely right-click the experiment in the Web-App and stop it.
## The first log lines are missing from the experiment log tab. Where did they go? <a name="first-log-lines-missing"></a>
For speed and optimization reasons, we opted to display only the last several hundred log lines.
You can always download the full log as a file using the Web-App.