# Trains Server
## Auto-Magical Experiment Manager & Version Control for AI

[![GitHub license](https://img.shields.io/badge/license-SSPL-green.svg)](https://img.shields.io/badge/license-SSPL-green.svg)
[![Python versions](https://img.shields.io/badge/python-3.6%20%7C%203.7-blue.svg)](https://img.shields.io/badge/python-3.6%20%7C%203.7-blue.svg)
[![GitHub version](https://img.shields.io/github/release-pre/allegroai/trains-server.svg)](https://img.shields.io/github/release-pre/allegroai/trains-server.svg)
[![PyPI status](https://img.shields.io/badge/status-beta-yellow.svg)](https://img.shields.io/badge/status-beta-yellow.svg)

## Introduction

The **trains-server** is the backend service infrastructure for [Trains](https://github.com/allegroai/trains).
It allows multiple users to collaborate and manage their experiments.
By default, **Trains** is set up to work with the **Trains** demo server, which is open to anyone and resets periodically.
To host your own server, launch **trains-server** and point **Trains** to it.

**trains-server** contains the following components:

* The **Trains** Web-App, a single-page UI for experiment management and browsing
* RESTful API for:
    * Documenting and logging experiment information, statistics and results
    * Querying experiments history, logs and results
* Locally-hosted file server for storing images and models, making them easily accessible using the Web-App

You can quickly [deploy](#launching-trains-server) your **trains-server** using Docker, AWS EC2 AMI, or Kubernetes.

## System design

![trains-server system diagram](https://github.com/allegroai/trains/blob/master/docs/system_diagram.png?raw=true)

**trains-server** has two supported configurations:
- Single IP (domain) with the following open ports
    - Web application on port 8080
    - API service on port 8008
    - File storage service on port 8081

- Sub-Domain configuration with default http/s ports (80 or 443)
    - Web application on sub-domain: app.\*.\*
    - API service on sub-domain: api.\*.\*
    - File storage service on sub-domain: files.\*.\*
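
As a rough illustration of the two configurations above, the following shell sketch prints the three service URLs for a given host. The function name `trains_urls`, its `ports`/`subdomain` mode names, and the example hosts are hypothetical and not part of **trains-server**; only the ports and sub-domain prefixes come from the list above.

```bash
# Print the web, API and file-server URLs for a trains-server deployment.
# Usage: trains_urls MODE HOST [SCHEME]
#   MODE   "ports" (single IP/domain) or "subdomain" (hypothetical mode names)
#   HOST   IP or domain, for example localhost or example.com
#   SCHEME http (default) or https
trains_urls() {
    local mode="$1" host="$2" scheme="${3:-http}"
    case "$mode" in
        ports)
            # Single IP (domain): each service listens on its own port
            echo "web_server=${scheme}://${host}:8080"
            echo "api_server=${scheme}://${host}:8008"
            echo "files_server=${scheme}://${host}:8081"
            ;;
        subdomain)
            # Sub-domain setup: default http/s ports, one sub-domain per service
            echo "web_server=${scheme}://app.${host}"
            echo "api_server=${scheme}://api.${host}"
            echo "files_server=${scheme}://files.${host}"
            ;;
        *)
            echo "unknown mode: ${mode}" >&2
            return 1
            ;;
    esac
}
```

For example, `trains_urls ports localhost` prints the three `http://localhost:<port>` addresses, while `trains_urls subdomain example.com https` prints the `app.`, `api.` and `files.` addresses without port numbers.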
## Launching trains-server
### Prerequisites
The ports 8080/8081/8008 must be available for the **trains-server** services.
For example, to see if port `8080` is in use:

* Linux or Mac OS X:

        sudo lsof -Pn -i4 | grep :8080 | grep LISTEN

* Windows:

        netstat -an |find /i "8080"
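
To check all three ports in one go on Linux or Mac OS X, a sketch along these lines may help. It assumes a bash build with `/dev/tcp` redirection support and only tests whether something accepts connections on `127.0.0.1`, so it is a quicker but less thorough check than the `lsof` command above. The function names `port_in_use` and `check_trains_ports` are hypothetical.

```bash
# Succeeds (exit 0) if something accepts TCP connections on 127.0.0.1:PORT.
# Relies on bash's /dev/tcp pseudo-device (an assumption about your shell).
port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Report the state of each port trains-server needs.
# Returns 1 if at least one port is already taken, 0 otherwise.
check_trains_ports() {
    local port busy=0
    for port in 8080 8081 8008; do
        if port_in_use "$port"; then
            echo "port ${port}: in use"
            busy=1
        else
            echo "port ${port}: free"
        fi
    done
    return "$busy"
}
```

Running `check_trains_ports` prints one line per port, and its exit status can be used in a script to stop before launching if any port is taken.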
### Launching
Launch **trains-server** in any of the following formats:

- Pre-built [AWS EC2 AMI](https://github.com/allegroai/trains-server/blob/master/docs/install_aws.md)
- Pre-built Docker Image
    - [Linux](https://github.com/allegroai/trains-server/blob/master/docs/install_linux_mac.md)
    - [Mac OS X](https://github.com/allegroai/trains-server/blob/master/docs/install_linux_mac.md)
    - [Windows 10](https://github.com/allegroai/trains-server/blob/master/docs/install_win.md)
- Kubernetes
    - [Kubernetes Helm](https://github.com/allegroai/trains-server-helm#prerequisites)
    - Manual [Kubernetes installation](https://github.com/allegroai/trains-server-k8s#prerequisites)

## Connecting Trains to your trains-server
By default, the **Trains** client is set up to work with the [**Trains** demo server](https://demoapp.trains.allegro.ai/).
To have the **Trains** client use your **trains-server** instead:
- Run the `trains-init` command for an interactive setup.
- Or manually edit the `~/trains.conf` file, making sure the server settings (`api_server`, `web_server`, `files_server`) are configured correctly, for example:

        api {
            # API server on port 8008
            api_server: "http://localhost:8008"

            # web server on port 8080
            web_server: "http://localhost:8080"

            # file server on port 8081
            files_server: "http://localhost:8081"
        }

**Note**: If you have set up **trains-server** in a sub-domain configuration, there is no need to specify a port number;
it is inferred from the http/s scheme.
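
After editing the file by hand, a quick sanity check can confirm that all three server settings are present. The sketch below is an assumption-laden helper, not a **Trains** tool: the function name `check_trains_conf` is hypothetical, and it only looks for uncommented lines that start with each key followed by `:` or `=`, without validating the URLs.

```bash
# Report which of the three server keys are missing from a trains.conf file.
# Usage: check_trains_conf [FILE]   (FILE defaults to ~/trains.conf)
# Returns 0 if all keys are present, 1 if any is missing, 2 if FILE is unreadable.
check_trains_conf() {
    local conf="${1:-$HOME/trains.conf}" key missing=0
    if [ ! -r "$conf" ]; then
        echo "cannot read ${conf}" >&2
        return 2
    fi
    for key in api_server web_server files_server; do
        # Match "key:" or "key =" at the start of a line, ignoring indentation;
        # commented-out lines (starting with #) do not match.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*[:=]" "$conf"; then
            echo "missing: ${key}"
            missing=1
        fi
    done
    return "$missing"
}
```

A call such as `check_trains_conf ~/trains.conf` prints nothing when all three keys are found and lists each missing key otherwise.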

After launching the **trains-server** and configuring the **Trains** client to use it,
you can [use](https://github.com/allegroai/trains#using-trains) **Trains** in your experiments and view them in your **trains-server** web server,
for example http://localhost:8080.

For more information about the Trains client, see [**Trains**](https://github.com/allegroai/trains).

## Advanced Functionality
**trains-server** provides a few additional useful features, which can be manually enabled:
* [Web login authentication](https://github.com/allegroai/trains-server/blob/master/docs/faq.md#web-auth)
* [Non-responsive experiments watchdog](https://github.com/allegroai/trains-server/blob/master/docs/faq.md#watchdog-the-non-responsive-task-watchdog-settings)

## Restarting trains-server
To restart the **trains-server**, you must first stop the containers, and then restart them.
```bash
docker-compose -f docker-compose-unified.yml down
docker-compose -f docker-compose-unified.yml up
```
## Upgrading <a name="upgrade"></a>
**trains-server** releases are also reflected in the [docker compose configuration file](https://github.com/allegroai/trains-server/blob/master/docker-compose-unified.yml).
We strongly encourage you to keep your **trains-server** up to date with the current release.

**Note**: The following upgrade instructions use the Linux OS as an example.

To upgrade your existing **trains-server** deployment:

1. Shut down the docker containers
    ```bash
    docker-compose -f docker-compose-unified.yml down
    ```

1. We highly recommend backing up your data directory before upgrading.

    Assuming your data directory is `/opt/trains`, to archive all data into `~/trains_backup.tgz` execute:

    ```bash
    sudo tar czvf ~/trains_backup.tgz /opt/trains/data
    ```

    <details>
    <summary>Restore instructions:</summary>

    To restore this example backup, execute:
    ```bash
    sudo rm -R /opt/trains/data
    # tar stored the paths relative to / (opt/trains/data/...), so extract at the root
    sudo tar -xzf ~/trains_backup.tgz -C /
    ```
    </details>

1. Download the latest `docker-compose-unified.yml` file.

    ```bash
    curl https://raw.githubusercontent.com/allegroai/trains-server/master/docker-compose-unified.yml -o docker-compose-unified.yml
    ```

1. Spin up the docker containers. This automatically pulls the latest **trains-server** build.
    ```bash
    docker-compose -f docker-compose-unified.yml pull
    docker-compose -f docker-compose-unified.yml up
    ```
**\* If something went wrong along the way, check our FAQ: [Common Docker Upgrade Errors](https://github.com/allegroai/trains-server/blob/master/docs/faq.md#common-docker-upgrade-errors).**
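
The upgrade steps above can also be collected into a single script. The following is only a sketch under stated assumptions, not an official upgrade tool: the names `run`, `DRY_RUN`, `TRAINS_COMPOSE_FILE`, `TRAINS_COMPOSE_URL` and `upgrade_trains_server` are hypothetical, the timestamped backup file name is an addition, and `curl -fsSL` is used instead of plain `curl` so that a failed download stops the upgrade. It assumes you run it from the directory that holds your compose file.

```bash
TRAINS_COMPOSE_FILE="docker-compose-unified.yml"
TRAINS_COMPOSE_URL="https://raw.githubusercontent.com/allegroai/trains-server/master/${TRAINS_COMPOSE_FILE}"

# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: upgrade_trains_server [DATA_DIR] [BACKUP_FILE]
# Stops at the first failing step so a broken backup or download
# never leads to new containers being started.
upgrade_trains_server() {
    local data_dir="${1:-/opt/trains/data}"
    local backup="${2:-$HOME/trains_backup_$(date +%Y%m%d_%H%M%S).tgz}"

    # 1. Shut down the running containers
    run docker-compose -f "$TRAINS_COMPOSE_FILE" down || return 1
    # 2. Back up the data directory
    run sudo tar czf "$backup" "$data_dir" || return 1
    # 3. Download the latest compose file
    run curl -fsSL "$TRAINS_COMPOSE_URL" -o "$TRAINS_COMPOSE_FILE" || return 1
    # 4. Pull the latest images and start the containers
    run docker-compose -f "$TRAINS_COMPOSE_FILE" pull || return 1
    run docker-compose -f "$TRAINS_COMPOSE_FILE" up
}
```

Calling `DRY_RUN=1 upgrade_trains_server` prints the five commands without executing them, which is a reasonable way to review the plan before running `upgrade_trains_server` for real.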
## Community & Support

If you have any questions, see the Trains server [FAQ](https://github.com/allegroai/trains-server/blob/master/docs/faq.md), or
tag your questions on [Stack Overflow](https://stackoverflow.com/questions/tagged/trains) with the '**trains**' tag.

For feature requests or bug reports, please use [GitHub issues](https://github.com/allegroai/trains-server/issues).

Additionally, you can always find us at *trains@allegro.ai*

## License

[Server Side Public License v1.0](https://github.com/mongodb/mongo/blob/master/LICENSE-Community.txt)

**trains-server** relies on both [MongoDB](https://github.com/mongodb/mongo) and [ElasticSearch](https://github.com/elastic/elasticsearch).
With the recent changes in both MongoDB's and ElasticSearch's OSS licenses, we feel it is our responsibility as a
member of the community to support the projects we love and cherish.
We believe the cause for the license change in both cases is more than just,
and chose [SSPL](https://www.mongodb.com/licensing/server-side-public-license) because it is the more general and flexible of the two licenses.

This is our way to say - we support you guys!