Commit Graph

1962 Commits

Author SHA1 Message Date
Evan Lezar
a02f7f8f6f Merge branch 'CNT-1554/docker-swarm' into 'master'
Fix bug where docker swarm device selection is overridden by NVIDIA_VISIBLE_DEVICES

See merge request nvidia/container-toolkit/container-toolkit!31
2021-06-08 05:31:05 +00:00
Evan Lezar
2a92d6acb7 Fix bug where docker swarm device selection is overridden by NVIDIA_VISIBLE_DEVICES
This change fixes a bug where the value of NVIDIA_VISIBLE_DEVICES would be used to
select devices even if the `swarm-resource` config option is specified.

Note that this does not change the value of NVIDIA_VISIBLE_DEVICES in the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 14:10:08 +02:00
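The precedence described in this commit can be sketched as follows. This is a minimal illustration, not the toolkit's actual code: the function name `selectDevices` is hypothetical, and it assumes that `swarm-resource` names the environment variable carrying the swarm-assigned devices (here `DOCKER_RESOURCE_GPU`, chosen for the example) and that there is no fallback to NVIDIA_VISIBLE_DEVICES once the option is set.

```go
package main

import "fmt"

// envVisibleDevices is the envvar named in the commit message.
const envVisibleDevices = "NVIDIA_VISIBLE_DEVICES"

// selectDevices returns the device list for a container and whether one was found.
// swarmResource is the (hypothetical) parsed value of the `swarm-resource` config
// option; an empty string means the option is not specified.
// Assumption: when the option is set, only the envvar it names is consulted.
func selectDevices(env map[string]string, swarmResource string) (string, bool) {
	name := envVisibleDevices
	if swarmResource != "" {
		name = swarmResource
	}
	value := env[name]
	return value, value != ""
}

func main() {
	env := map[string]string{
		envVisibleDevices:     "all",
		"DOCKER_RESOURCE_GPU": "GPU-1234", // example name and value only
	}
	devices, ok := selectDevices(env, "DOCKER_RESOURCE_GPU")
	fmt.Println(devices, ok)
}
```

The environment map itself is never modified, which matches the note that the value of NVIDIA_VISIBLE_DEVICES in the container does not change.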
Evan Lezar
602eaf0e60 Use require package for tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 13:31:41 +02:00
Evan Lezar
b930487dc5 Add coverage to go tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 13:21:28 +02:00
Evan Lezar
9aac07fe64 Update vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 13:20:34 +02:00
Evan Lezar
825990ba41 Merge branch 'CNT-1334-publish-tags-to-artifactory' into 'master'
Add artifactory publish step

See merge request nvidia/container-toolkit/container-toolkit!30
2021-05-18 17:25:07 +00:00
Evan Lezar
03d9c1d698 Update to Golang 1.16.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-05-18 11:25:52 +02:00
Evan Lezar
de172674b1 Add artifactory publish step
This change simplifies the build process by only targeting ubuntu20.04-amd64
and adds logic to push tagged builds to artifactory.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-05-18 11:25:48 +02:00
Kevin Klues
b71a9ed153 Merge branch 'upstream-bump-v1.5.0' into 'master'
Bump version to 1.5.0

See merge request nvidia/container-toolkit/container-toolkit!29
2021-04-29 14:08:23 +00:00
Kevin Klues
dde7159e11 Bump version to 1.5.0
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-04-29 10:16:44 +00:00
Evan Lezar
46de426cc4 Merge branch 'CNT-1330/jenkins-ci' into 'master'
Add Jenkins file for CI build steps

See merge request nvidia/container-toolkit/container-toolkit!28
2021-03-18 10:06:44 +00:00
Evan Lezar
1c7d6a233a Add golang check targets
This change adds check targets for Golang to the Makefile. These are also
added as stages to the Jenkinsfile definition, and the GitLab CI
is modified to use them too.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-03-17 16:58:39 +01:00
Evan Lezar
635aeb8343 Add Jenkinsfile definition for build targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-03-17 13:52:19 +01:00
Evan Lezar
ec9d296afe Move docker.mk to docker folder
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-03-17 13:52:14 +01:00
Evan Lezar
ff44395b31 Merge branch 'upstream-bump-v1.4.2' into 'master'
Bump version to 1.4.2

See merge request nvidia/container-toolkit/container-toolkit!27
2021-02-05 12:47:01 +00:00
Kevin Klues
8571e5ac5d Bump version to 1.4.2
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-02-05 10:26:10 +00:00
Kevin Klues
108c99bb9b Merge branch 'upstream-bump-v1.4.1' into 'master'
Bump version to 1.4.1

See merge request nvidia/container-toolkit/container-toolkit!26
2021-01-25 13:35:42 +00:00
Kevin Klues
dfb5daf200 Bump version to 1.4.1
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-01-25 10:42:32 +00:00
Kevin Klues
e8aa3cc8c3 Merge branch 'ignore-nvidia-visible-devices' into 'master'
Ignore NVIDIA_VISIBLE_DEVICES for containers with insufficient privileges

See merge request nvidia/container-toolkit/container-toolkit!25
2021-01-25 10:25:00 +00:00
Evan Lezar
fc408a32c7 Add utility function to get config name from struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-01-22 16:08:45 +01:00
Evan Lezar
f6b1b1afad Ignore NVIDIA_VISIBLE_DEVICES for containers with insufficient privileges
This change ignores the value of NVIDIA_VISIBLE_DEVICES instead of
raising an error when launching a container with insufficient permissions.

This changes the behaviour under the following conditions:

NVIDIA_VISIBLE_DEVICES is set
and
accept-nvidia-visible-devices-envvar-when-unprivileged = false (default: true)
or
privileged = false (default: false)

This means that a user need not explicitly clear the NVIDIA_VISIBLE_DEVICES
environment variable if no GPUs are to be used in unprivileged containers.
Note that this envvar is set to 'all' by default in many CUDA images that
are used as base images.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-01-22 15:34:52 +01:00
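One reading of the conditions in this commit is sketched below. It is an illustration under stated assumptions, not the toolkit's code: the function name `devicesFromEnvvar` and its parameters are hypothetical, and it assumes the envvar is honoured when the container is privileged or the config option is true, and silently ignored otherwise.

```go
package main

import "fmt"

const envVisibleDevices = "NVIDIA_VISIBLE_DEVICES"

// devicesFromEnvvar returns the requested devices and whether the request is honoured.
// acceptWhenUnprivileged mirrors accept-nvidia-visible-devices-envvar-when-unprivileged;
// privileged reports whether the container runs with elevated privileges.
// Assumption: the request is honoured if either flag is true.
func devicesFromEnvvar(env map[string]string, acceptWhenUnprivileged, privileged bool) (string, bool) {
	devices := env[envVisibleDevices]
	if devices == "" {
		return "", false // nothing was requested
	}
	if privileged || acceptWhenUnprivileged {
		return devices, true
	}
	// Insufficient privileges: ignore the value instead of raising an error.
	return "", false
}

func main() {
	// Many CUDA base images set the envvar to "all" by default.
	env := map[string]string{envVisibleDevices: "all"}
	devices, ok := devicesFromEnvvar(env, false, false)
	fmt.Printf("devices=%q honoured=%v\n", devices, ok)
}
```

Returning "no devices" rather than an error is what lets an unprivileged container built on such a base image start without the user clearing the variable first.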
Kevin Klues
97516467c0 Merge branch 'upstream-bump-v1.4.0' into 'master'
Bump version to 1.4.0

See merge request nvidia/container-toolkit/container-toolkit!24
2020-12-14 14:41:02 +00:00
Kevin Klues
01063c0433 Bump version to 1.4.0
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-12-11 18:05:49 +00:00
Kevin Klues
119f75dcf8 Merge branch 'upstream-add-compute-to-default-capabilities' into 'master'
Add 'compute' capability to list of defaults.

See merge request nvidia/container-toolkit/container-toolkit!23
2020-12-08 11:31:27 +00:00
Kevin Klues
20604621e4 Add 'compute' capability to list of defaults.
For most practical purposes, it should be fine to set
NVIDIA_DRIVER_CAPABILITIES=all nowadays.

Historically, these different capabilities exist because they were added
incrementally, with varying degrees of stability. It's fairly common to
run with GPUs in containers today, but a few years ago the driver didn't
support them very well, and it was important to make sure the libraries
being injected into the container actually worked in a containerized
environment. When they didn't, it was common to get information leaks,
crashes, or even silent failures.

In the past, whenever a new set of libraries was being vetted for
injection, a new capability was added to make sure that users had control
to explicitly include only those libraries they were comfortable having
injected into their containers.

The idea is that whoever puts together a container image for use with
GPUs should have the knowledge of what capabilities the software in that
container image requires, and can set the NVIDIA_DRIVER_CAPABILITIES
envvar in that image appropriately.

After some back and forth, we've decided it doesn't quite make sense to
set it to "all" just yet, but we should set it to "utility, compute"
instead of just "utility", so that at least the core CUDA libraries work
by default (once installed in the container).

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-12-07 12:10:23 +00:00
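The default described in this commit can be illustrated with a short sketch. The function name `driverCapabilities` is hypothetical, and treating an empty value the same as an unset one is an assumption; only the envvar name and the "utility, compute" default come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	envDriverCapabilities = "NVIDIA_DRIVER_CAPABILITIES"
	// defaultCapabilities is the new default named in the commit message.
	defaultCapabilities = "utility,compute"
)

// driverCapabilities returns the capabilities requested by the image,
// falling back to the default when the envvar is unset or empty (assumption).
func driverCapabilities(env map[string]string) []string {
	raw := env[envDriverCapabilities]
	if raw == "" {
		raw = defaultCapabilities
	}
	var caps []string
	for _, c := range strings.Split(raw, ",") {
		// Tolerate spaces around commas and skip empty entries.
		if c = strings.TrimSpace(c); c != "" {
			caps = append(caps, c)
		}
	}
	return caps
}

func main() {
	fmt.Println(driverCapabilities(map[string]string{}))
	fmt.Println(driverCapabilities(map[string]string{envDriverCapabilities: "all"}))
}
```

An image that needs more than the core CUDA libraries would still set the envvar explicitly, as the commit message recommends.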
Kevin Klues
8cfb3c29f6 Merge branch 'upstream-bump-v1.3.0' into 'master'
Bump to version 1.3.0

See merge request nvidia/container-toolkit/container-toolkit!22
2020-09-16 13:34:37 +00:00
Kevin Klues
98e202d0d8 Bump to version 1.3.0
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-09-16 11:45:31 +00:00
Kevin Klues
26668097c4 Merge branch 'upstream-bump-1.3.0-rc.2' into 'master'
Bump to version 1.3.0 rc.2

See merge request nvidia/container-toolkit/container-toolkit!21
2020-08-10 15:33:25 +00:00
Kevin Klues
caf2792463 Update changelogs for 1.3.0-rc.2
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-08-10 13:08:17 +00:00
Kevin Klues
b2be0b08ac Bump version to 1.3.0-rc.2
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-08-10 13:03:00 +00:00
Kevin Klues
edc5041636 Merge branch 'upstream-update-devices-from-volume-mounts-semantics' into 'master'
Refactor accepting device lists from volume mounts as a boolean

See merge request nvidia/container-toolkit/container-toolkit!20
2020-08-07 18:40:56 +00:00
Kevin Klues
2c1809475c Add more tests for new semantics with device list from volume mounts
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-08-07 16:30:31 +00:00
Kevin Klues
7c00385797 Refactor accepting device lists from volume mounts as a boolean
Also hard code the "root" path where these volume mounts will be looked
for rather than making it configurable.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-08-07 16:30:19 +00:00
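A rough sketch of reading a device list from volume mounts under a hard-coded root, gated by a boolean, is shown below. Everything concrete here is an assumption made for illustration: the commit does not state the root path, so `deviceListRoot` is a placeholder value, and the `Mount` type, the `devicesFromMounts` name, and the convention that only mounts of `/dev/null` count are likewise hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Mount is a minimal stand-in for a container mount entry (hypothetical type).
type Mount struct {
	Source      string
	Destination string
}

// deviceListRoot is a placeholder for the hard-coded root; the commit does not name it.
const deviceListRoot = "/var/run/nvidia-container-devices"

// devicesFromMounts collects device names from mounts placed under deviceListRoot.
// enabled models the boolean config option; when false, mounts are not consulted.
func devicesFromMounts(enabled bool, mounts []Mount) []string {
	if !enabled {
		return nil
	}
	prefix := deviceListRoot + "/"
	var devices []string
	for _, m := range mounts {
		// Assumption: only placeholder mounts of /dev/null denote a device request.
		if path.Clean(m.Source) != "/dev/null" {
			continue
		}
		dest := path.Clean(m.Destination)
		if !strings.HasPrefix(dest, prefix) {
			continue
		}
		devices = append(devices, strings.TrimPrefix(dest, prefix))
	}
	return devices
}

func main() {
	mounts := []Mount{
		{Source: "/dev/null", Destination: deviceListRoot + "/GPU0"},
		{Source: "/home/user/data", Destination: "/data"},
	}
	fmt.Println(devicesFromMounts(true, mounts))
}
```

Hard-coding the root keeps the lookup to a single prefix check, at the cost of the configurability that the commit deliberately removes.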
Kevin Klues
322006c361 Merge branch 'upstream-bump-1.3.0-rc.1' into 'master'
Bump version to 1.3.0-rc.1

See merge request nvidia/container-toolkit/container-toolkit!19
2020-07-24 20:36:38 +00:00
Kevin Klues
a25017fb8a Merge branch 'upstream-build-prerelease' into 'master'
Update build system to accept a TAG variable for things like rc.x

See merge request nvidia/container-toolkit/container-toolkit!18
2020-07-24 20:22:00 +00:00
Kevin Klues
928905ce94 Update changelogs for 1.3.0-rc.1
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 20:10:42 +00:00
Kevin Klues
7ed17bb9ca Bump version to 1.3.0-rc.1
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 20:03:48 +00:00
Kevin Klues
b50d86c174 Update build system to accept a TAG variable for things like rc.x
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 19:54:29 +00:00
Kevin Klues
bf342fb4c9 Merge branch 'upstream-fix-ci' into 'master'
Generalize CI variables

See merge request nvidia/container-toolkit/container-toolkit!17
2020-07-24 14:28:49 +00:00
Kevin Klues
1791372f22 Generalize CI variables
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 14:01:39 +00:00
Kevin Klues
4448319605 Merge branch 'upstream-add-alternate-device-list' into 'master'
Add the ability to pull the device list from mounted files instead of just Envvars

See merge request nvidia/container-toolkit/container-toolkit!15
2020-07-24 13:18:53 +00:00
Kevin Klues
2ea3150b60 Merge branch 'upstream-simplify-nvidia-config-generation' into 'master'
Simplify logic for `nvidiaConfig` generation

See merge request nvidia/container-toolkit/container-toolkit!14
2020-07-24 13:18:35 +00:00
Kevin Klues
32b4b09bc9 Add tests to verify priority of device list from mounts vs. envvar
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 12:50:05 +00:00
Kevin Klues
cc0a22a6d9 Consolidate logic for building nvidiaConfig into a single function
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 12:50:05 +00:00
Kevin Klues
e48d23d107 Add test for getDevicesFromMounts()
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 12:50:05 +00:00
Kevin Klues
430dda41e9 Remove getNvidiaConfigLegacy() function
A subsequent commit will add equivalent functionality back in

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 12:50:05 +00:00
Kevin Klues
8bcd02ee5d Add logic implementing getDevicesFromMounts()
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 12:50:05 +00:00
Kevin Klues
4791fab747 Simplify getMigConfigDevices() and getMigMonitorDevices()
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 12:50:05 +00:00
Kevin Klues
7313069d4c Update getDevices() to account for getting the devices list from mounts
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 12:50:05 +00:00
Kevin Klues
a24b0c8b4e Split isLegacyCUDAImage() into its own helper function
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-07-24 12:50:05 +00:00