Compare commits

13 Commits

Author SHA1 Message Date
allegroai d9bdebefc7 Update AWS AMIs 2020-05-14 17:54:30 +03:00
allegroai f29884f05a Version bump to v0.14.2 2020-05-14 17:53:56 +03:00
allegroai 0f72d662f8 Update GCP documentation 2020-05-04 17:31:11 +03:00
allegroai 6202219034 Update README 2020-05-03 11:08:21 +03:00
allegroai bb3218f65d Update GCP installation instructions 2020-04-06 12:59:29 +03:00
allegroai cbcaa7c789 Add MongoDB performance optimization 2020-04-01 19:20:53 +03:00
allegroai 427322a424 Update schema 2020-04-01 19:16:34 +03:00
allegroai 0e7d7d36a9 Update docs for GCP Custom Images 2020-03-30 15:51:58 +03:00
allegroai 06032a6d66 Update documentation 2020-03-20 10:51:43 +02:00
allegroai b48f4eb2eb Make sure time intervals are calculated in ms 2020-03-20 10:50:56 +02:00
Allegro AI 383b2666c4 Update AWS AMIs 2020-03-16 21:57:07 +02:00
allegroai 50c373cf0d Version bump to v0.14.1 2020-03-16 18:47:35 +02:00
allegroai 394a9de5fa Update docs with AMI IDs for v0.14.1 2020-03-16 18:47:20 +02:00
12 changed files with 226 additions and 63 deletions

View File

@@ -7,6 +7,8 @@
[![GitHub version](https://img.shields.io/github/release-pre/allegroai/trains-server.svg)](https://img.shields.io/github/release-pre/allegroai/trains-server.svg)
[![PyPI status](https://img.shields.io/badge/status-beta-yellow.svg)](https://img.shields.io/badge/status-beta-yellow.svg)
### Help improve Trains by filling our 2-min [user survey](https://allegro.ai/lp/trains-user-survey/)
## Introduction
The **trains-server** is the backend service infrastructure for [Trains](https://github.com/allegroai/trains).
@@ -61,6 +63,7 @@ For example, to see if port `8080` is in use:
Launch **trains-server** in any of the following formats:
- Pre-built [AWS EC2 AMI](https://github.com/allegroai/trains-server/blob/master/docs/install_aws.md)
- Pre-built [GCP Custom Image](https://github.com/allegroai/trains-server/blob/master/docs/install_gcp.md)
- Pre-built Docker Image
- [Linux](https://github.com/allegroai/trains-server/blob/master/docs/install_linux_mac.md)
- [macOS](https://github.com/allegroai/trains-server/blob/master/docs/install_linux_mac.md)

View File

@@ -26,7 +26,7 @@ The minimum recommended amount of RAM is 8GB. For example, **t3.large** or **t3a
To upgrade **trains-server** on an existing EC2 instance based on one of these AMIs, SSH into the instance and follow the [upgrade instructions](../README.md#upgrade) for **trains-server**.
### Upgrading AMIs to v0.12
### Note on upgrading AMIs to v0.12
This upgrade includes the automatically updated AMI in Version 0.12. It also adds a Redis Docker container to the **trains-server** setup.
@@ -50,26 +50,64 @@ To upgrade the AMI:
The following sections contain lists of AMI Image IDs, per region, for each released **trains-server** version.
### Latest version AMI - v0.14.0 (auto update)<a name="autoupdate"></a>
### Latest version AMI - v0.14.2 (auto update)<a name="autoupdate"></a>
For easier upgrades, the following AMIs automatically update to the latest release on every reboot:
* **eu-north-1** : ami-050c24cc0099e9512
* **ap-south-1** : ami-07bb33de49e319d73
* **eu-west-3** : ami-00ecdf092af972d24
* **eu-west-2** : ami-09ace28116ad33dd9
* **eu-west-1** : ami-01d85e00c7741d69b
* **ap-northeast-2** : ami-0ccc3d85996362545
* **ap-northeast-1** : ami-06abda05aa2407b1a
* **sa-east-1** : ami-0ce3597b116cfdd79
* **ca-central-1** : ami-0cb2d22a74007fa14
* **ap-southeast-1** : ami-06a9784d792a7c30f
* **ap-southeast-2** : ami-012ab6092f28f62b6
* **eu-central-1** : ami-04443efac619cac6d
* **us-east-2** : ami-05391549da2d5e38c
* **us-west-1** : ami-0444959077f5f7310
* **us-west-2** : ami-029b979c20d7f16f3
* **us-east-1** : ami-024ab496fe05a4b4d
* **eu-north-1** : ami-095cc888970c06e09
* **ap-south-1** : ami-07019e7b3febea37e
* **eu-west-3** : ami-0433d76badf430c16
* **eu-west-2** : ami-05794c2b23ff79990
* **eu-west-1** : ami-03e3bcabd1863d666
* **ap-northeast-2** : ami-00f14188b66a5803e
* **ap-northeast-1** : ami-005c93e30c99dab0c
* **sa-east-1** : ami-0d819231779e7d264
* **ca-central-1** : ami-0eff2fd400939d960
* **ap-southeast-1** : ami-049b21bfa0d35c21c
* **ap-southeast-2** : ami-0318b96a72d5da068
* **eu-central-1** : ami-0cdb9d794340b9704
* **us-east-2** : ami-0d846a080fc5a9345
* **us-west-1** : ami-0ef970342625159bf
* **us-west-2** : ami-04f3d13b75c642506
* **us-east-1** : ami-01bef4da91280a322
### v0.14.2 (static update)
* **eu-north-1** : ami-006d491e9e8869248
* **ap-south-1** : ami-0e55ec221687f98e7
* **eu-west-3** : ami-06ad9cf3c05c83e91
* **eu-west-2** : ami-0d05839268e748cff
* **eu-west-1** : ami-0d14c297789ce0d7a
* **ap-northeast-2** : ami-0d7fd775f0e76cc6f
* **ap-northeast-1** : ami-0c0a6e1daeb3f7a9c
* **sa-east-1** : ami-01e0c5e30e94ec887
* **ca-central-1** : ami-07a31896832734897
* **ap-southeast-1** : ami-0886d5b2d4b7fccd5
* **ap-southeast-2** : ami-0397d5a2db3c356fe
* **eu-central-1** : ami-0629f26eea22f5c17
* **us-east-2** : ami-0499c3d7bb45a1a6e
* **us-west-1** : ami-02fa8a961a4daf9f0
* **us-west-2** : ami-05c711cfab4342468
* **us-east-1** : ami-0b97d99a08012c726
### v0.14.1 (static update)
* **eu-north-1** : ami-036defe1885dced2e
* **ap-south-1** : ami-0b403aa1da6a5dc17
* **eu-west-3** : ami-0d30c2d330d1255c4
* **eu-west-2** : ami-06f0e8d075e50a029
* **eu-west-1** : ami-0da721d874f282b6d
* **ap-northeast-2** : ami-03bffe94675dd5f8c
* **ap-northeast-1** : ami-0f96520d646423673
* **sa-east-1** : ami-0c2f706a3b7d97282
* **ca-central-1** : ami-0da74525dcfd74e32
* **ap-southeast-1** : ami-066368a21cf6d232b
* **ap-southeast-2** : ami-0bfd09170067f7318
* **eu-central-1** : ami-06aa99b1c41492986
* **us-east-2** : ami-065c1880f59d03272
* **us-west-1** : ami-0b7f6b896f5058eba
* **us-west-2** : ami-0041e10ca68eef29a
* **us-east-1** : ami-0b7125e4305bbd7eb
### v0.14.0 (static update)
* **eu-north-1** : ami-02de71586ec496e38

docs/install_gcp.md Normal file
View File

@@ -0,0 +1,58 @@
# Deploying Trains Server on Google Cloud Platform
To easily deploy Trains Server on GCP, use one of our pre-built GCP Custom Images.
We provide Custom Images for each released version of Trains Server; see [Released versions](#released-versions) below.
Once your GCP instance is up and running using our Custom Image, [configure the Trains client](https://github.com/allegroai/trains/blob/master/README.md#configuration) to use your **trains-server**.
The service port numbers on our Trains Server GCP Custom Image are:
- Web application: `8080`
- API Server: `8008`
- File Server: `8081`
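Once an instance is running, a quick way to confirm that these three ports are reachable is a plain TCP connection test. The following is a minimal sketch using only the Python standard library; the helper names (`port_is_open`, `check_trains_ports`) and the host you pass in are illustrative assumptions, not part of Trains.

```python
import socket

# Port numbers are taken from the list above.
TRAINS_PORTS = {"web": 8080, "api": 8008, "files": 8081}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def check_trains_ports(host: str) -> dict:
    """Map each service name to whether its port accepts connections on host."""
    return {name: port_is_open(host, port) for name, port in TRAINS_PORTS.items()}
```

For example, `check_trains_ports("203.0.113.10")` (a placeholder address) returns a dictionary with one boolean per service. A closed port may also mean that a firewall rule is blocking access, so a `False` result does not by itself show that the service is down.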
The persistent storage configuration:
- MongoDB: `/opt/trains/data/mongo/`
- ElasticSearch: `/opt/trains/data/elastic/`
- File Server: `/mnt/fileserver/`
For examples and use cases, check the [Trains usage examples](https://github.com/allegroai/trains/blob/master/docs/trains_examples.md).
## Importing the Custom Image to your GCP account
In order to launch an instance using the Trains Server GCP Custom Image, you'll need to import the image to your custom images list.
**Note:** There's **no need** to upload the image file to Google Cloud Storage. We already provide links to image files stored in Google Cloud Storage.
To import the image to your custom images list:
1. In the Cloud Console, go to the [Images](https://console.cloud.google.com/compute/images) page.
1. At the top of the page, click **Create image**.
1. In the **Name** field, specify a unique name for the image.
1. Optionally, specify an image family for your new image, or configure specific encryption settings for the image.
1. Click the **Source** menu and select **Cloud Storage file**.
1. Enter the Trains Server image bucket path (see [Trains Server GCP Custom Image](#released-versions)), for example:
`allegro-files/trains-server/trains-server.tar.gz`
1. Click the **Create** button to import the image. The process can take several minutes depending on the size of the boot disk image.
For more information, see [Import the image to your custom images list](https://cloud.google.com/compute/docs/import/import-existing-image#import_image) in the [Compute Engine Documentation](https://cloud.google.com/compute/docs).
## Launching an instance with a Custom Image
For instructions on launching an instance using a GCP Custom Image, see [Manually importing virtual disks](https://cloud.google.com/compute/docs/import/import-existing-image#overview) in the [Compute Engine Documentation](https://cloud.google.com/compute/docs).
For more information on Custom Images, see [Custom Images](https://cloud.google.com/compute/docs/images#custom_images) in the Compute Engine Documentation.
The minimum recommended requirements for Trains Server are:
- 2 vCPUs
- 7.5GB RAM
## Upgrading
To upgrade **trains-server** on an existing GCP instance based on one of these Custom Images, SSH into the instance and follow the [upgrade instructions](../README.md#upgrade) for **trains-server**.
## Released versions
The following sections contain lists of Custom Image URLs (exported in different formats) for each released **trains-server** version.
### Latest version image (v0.14.1)
- https://storage.googleapis.com/allegro-files/trains-server/trains-server.tar.gz

View File

@@ -111,7 +111,7 @@ class TimestampKey(ScalarKey):
self.name: {
"date_histogram": {
"field": "timestamp",
"interval": interval,
"interval": f"{interval}ms",
"min_doc_count": 1,
}
}
@@ -150,7 +150,7 @@ class ISOTimeKey(ScalarKey):
self.name: {
"date_histogram": {
"field": "timestamp",
"interval": interval,
"interval": f"{interval}ms",
"min_doc_count": 1,
"format": "strict_date_time",
}
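Both hunks above make the same change: the histogram interval is sent with an explicit `ms` suffix instead of as a bare number. A minimal standalone sketch of that idea follows; the function name `timestamp_histogram` is hypothetical and only mirrors the dictionary shape shown in the diff.

```python
def timestamp_histogram(name: str, interval_ms: int) -> dict:
    """Build a date_histogram aggregation whose interval carries a millisecond unit."""
    return {
        name: {
            "date_histogram": {
                "field": "timestamp",
                # The explicit "ms" suffix states the unit instead of leaving it implicit.
                "interval": f"{interval_ms}ms",
                "min_doc_count": 1,
            }
        }
    }
```

Calling `timestamp_histogram("timestamp", 1000)` yields an aggregation with `"interval": "1000ms"`.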

View File

@@ -12,35 +12,32 @@ from database.model.user import User
class Model(DbModelMixin, Document):
meta = {
'db_alias': Database.backend,
'strict': strict,
'indexes': [
"db_alias": Database.backend,
"strict": strict,
"indexes": [
"parent",
"project",
"task",
("company", "name"),
{
'name': '%s.model.main_text_index' % Database.backend,
'fields': [
'$name',
'$id',
'$comment',
'$parent',
'$task',
'$project',
],
'default_language': 'english',
'weights': {
'name': 10,
'id': 10,
'comment': 10,
'parent': 5,
'task': 3,
'project': 3,
}
}
"name": "%s.model.main_text_index" % Database.backend,
"fields": ["$name", "$id", "$comment", "$parent", "$task", "$project"],
"default_language": "english",
"weights": {
"name": 10,
"id": 10,
"comment": 10,
"parent": 5,
"task": 3,
"project": 3,
},
},
],
}
id = StringField(primary_key=True)
name = StrippedStringField(user_set_allowed=True, min_length=3)
parent = StringField(reference_field='Model', required=False)
parent = StringField(reference_field="Model", required=False)
user = StringField(required=True, reference_field=User)
company = StringField(required=True, reference_field=Company)
project = StringField(reference_field=Project, user_set_allowed=True)
@@ -49,9 +46,11 @@ class Model(DbModelMixin, Document):
comment = StringField(user_set_allowed=True)
tags = ListField(StringField(required=True), user_set_allowed=True)
system_tags = ListField(StringField(required=True), user_set_allowed=True)
uri = StrippedStringField(default='', user_set_allowed=True)
uri = StrippedStringField(default="", user_set_allowed=True)
framework = StringField()
design = SafeDictField()
labels = ModelLabels()
ready = BooleanField(required=True)
ui_cache = SafeDictField(default=dict, user_set_allowed=True, exclude_by_default=True)
ui_cache = SafeDictField(
default=dict, user_set_allowed=True, exclude_by_default=True
)

View File

@@ -17,12 +17,13 @@ class Project(AttributedDocument):
"db_alias": Database.backend,
"strict": strict,
"indexes": [
("company", "name"),
{
"name": "%s.project.main_text_index" % Database.backend,
"fields": ["$name", "$id", "$description"],
"default_language": "english",
"weights": {"name": 10, "id": 10, "description": 10},
}
},
],
}

View File

@@ -110,6 +110,12 @@ class Task(AttributedDocument):
"created",
"started",
"completed",
"parent",
"project",
("company", "name"),
("company", "type", "system_tags", "status"),
("company", "project", "type", "system_tags", "status"),
("status", "last_update"), # for maintenance tasks
{
"name": "%s.task.main_text_index" % Database.backend,
"fields": [

View File

@@ -258,6 +258,7 @@
properties {
added { type: integer }
errors { type: integer }
errors_info { type: object }
}
}
}
@@ -362,7 +363,7 @@
}
navigate_earlier {
type: boolean
description: "If set then events are retrieved from later iterations to earlier ones. Otherwise from earlier iterations to the later. The default is True"
description: "If set then events are retrieved from latest iterations to earliest ones. Otherwise from earliest iterations to the latest. The default is True"
}
refresh {
type: boolean
@@ -529,6 +530,59 @@
}
}
}
"2.7" {
description: "Get 'log' events for this task"
request {
type: object
required: [
task
]
properties {
task {
type: string
description: "Task ID"
}
batch_size {
type: integer
description: "The amount of log events to return"
}
navigate_earlier {
type: boolean
description: "If set then log events are retrieved from the latest to the earliest ones (in timestamp descending order). Otherwise from the earliest to the latest ones (in timestamp ascending order). The default is True"
}
refresh {
type: boolean
description: "If set then scroll will be moved to the latest logs (if 'navigate_earlier' is set to True) or to the earliest (otherwise)"
}
scroll_id {
type: string
description: "Scroll ID of previous call (used for getting more results)"
}
}
}
response {
type: object
properties {
events {
type: array
items { type: object }
description: "Log items list"
}
returned {
type: integer
description: "Number of log events returned"
}
total {
type: number
description: "Total number of log events available for this query"
}
scroll_id {
type: string
description: "Scroll ID for getting more results"
}
}
}
}
}
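The request and response above describe scroll-style paging: each response carries a `scroll_id` that is passed back in the next request to get more results. A minimal client-side sketch of that loop follows. It assumes a hypothetical `fetch_page` callable that sends one request and returns the response as a dictionary; how the request is actually sent (URL, authentication) is outside the schema and is not shown.

```python
from typing import Callable, Iterator, Optional


def iter_log_events(
    fetch_page: Callable[..., dict], task: str, batch_size: int = 100
) -> Iterator[dict]:
    """Yield log events page by page, passing each scroll_id to the next call."""
    scroll_id: Optional[str] = None
    while True:
        # fetch_page is a hypothetical transport function supplied by the caller.
        response = fetch_page(task=task, batch_size=batch_size, scroll_id=scroll_id)
        events = response.get("events", [])
        if not events:
            break  # an empty page is treated as the end of the results
        yield from events
        scroll_id = response.get("scroll_id")
        if scroll_id is None:
            break  # without a scroll ID there is nothing to continue from
```

Stopping on the first empty page is a design choice of this sketch; a caller could instead compare the running count against the `total` field in the response.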
get_task_events {
"2.1" {

View File

@@ -261,7 +261,7 @@
type: string
}
uri {
description: "URI for the model"
description: "URI for the model. Exactly one of uri or override_model_id is required."
type: string
}
name {
@@ -283,7 +283,7 @@
items {type: string}
}
override_model_id {
description: "Override model ID. If provided, this model is updated in the task."
description: "Override model ID. If provided, this model is updated in the task. Exactly one of override_model_id or uri is required."
type: string
}
iteration {

View File

@@ -33,8 +33,7 @@ create_fields = {
}
get_all_query_options = Project.QueryParameterOptions(
pattern_fields=("name", "description"),
list_fields=("tags", "system_tags", "id"),
pattern_fields=("name", "description"), list_fields=("tags", "system_tags", "id"),
)
@@ -58,7 +57,7 @@ def get_by_id(call):
call.result.data = {"project": project_dict}
def make_projects_get_all_pipelines(project_ids, specific_state=None):
def make_projects_get_all_pipelines(company_id, project_ids, specific_state=None):
archived = EntityVisibility.archived.value
def ensure_valid_fields():
@@ -74,15 +73,18 @@ def make_projects_get_all_pipelines(project_ids, specific_state=None):
"else": "$system_tags",
}
},
"status": {
"$ifNull": ["$status", "unknown"]
}
"status": {"$ifNull": ["$status", "unknown"]},
}
}
status_count_pipeline = [
# count tasks per project per status
{"$match": {"project": {"$in": project_ids}}},
{
"$match": {
"company": {"$in": [None, "", company_id]},
"project": {"$in": project_ids},
}
},
ensure_valid_fields(),
{
"$group": {
@@ -153,7 +155,10 @@ def make_projects_get_all_pipelines(project_ids, specific_state=None):
{
"$match": {
"type": {"$in": ["training", "testing", "annotation"]},
"company": {"$in": [None, "", company_id]},
"project": {"$in": project_ids},
}
},
ensure_valid_fields(),
@@ -195,7 +200,7 @@ def get_all_ex(call: APICall):
ids = [project["id"] for project in projects]
status_count_pipeline, runtime_pipeline = make_projects_get_all_pipelines(
ids, specific_state=specific_state
call.identity.company, ids, specific_state=specific_state
)
default_counts = dict.fromkeys(get_options(TaskStatus), 0)
@@ -205,7 +210,7 @@ def get_all_ex(call: APICall):
status_count = defaultdict(lambda: {})
key = itemgetter(EntityVisibility.archived.value)
for result in Task.aggregate(*status_count_pipeline):
for result in Task.aggregate(status_count_pipeline):
for k, group in groupby(sorted(result["counts"], key=key), key):
section = (
EntityVisibility.archived if k else EntityVisibility.active
@@ -219,7 +224,7 @@ def get_all_ex(call: APICall):
runtime = {
result["_id"]: {k: v for k, v in result.items() if k != "_id"}
for result in Task.aggregate(*runtime_pipeline)
for result in Task.aggregate(runtime_pipeline)
}
def safe_get(obj, path, default=None):
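The main change in this file is that the task aggregation pipelines now filter by company as well as by project. A minimal sketch of such a `$match` stage as a standalone function follows; the name `project_tasks_match` is hypothetical, and reading `None` and `""` as "tasks with no company set" is an assumption based on the values in the diff.

```python
from typing import Sequence


def project_tasks_match(company_id: str, project_ids: Sequence[str]) -> dict:
    """Build a $match stage limited to the given projects and company."""
    return {
        "$match": {
            # Assumption: None and "" cover tasks that have no company set.
            "company": {"$in": [None, "", company_id]},
            "project": {"$in": list(project_ids)},
        }
    }
```

Both conditions sit at the top level of `$match`, so a task must satisfy the company filter and the project filter to be counted.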

View File

@@ -750,8 +750,7 @@ class CleanupResult(object):
deleted_models = attr.ib(type=int)
def cleanup_task(task, force=False):
# type: (Task, bool) -> CleanupResult
def cleanup_task(task: Task, force: bool = False):
"""
Validate task deletion and delete/modify all its output.
:param task: task object

View File

@@ -1 +1 @@
__version__ = "0.14.0"
__version__ = "0.14.2"