Compare commits

98 Commits

Author SHA1 Message Date
Evan Lezar
1d0fd7475c Merge branch 'packaging-fix' into 'master'
Update nvidia-docker submodule

See merge request nvidia/container-toolkit/container-toolkit!71
2021-11-05 14:09:47 +00:00
Evan Lezar
40032edc3b Update submodules for packaging
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-05 15:03:46 +01:00
Evan Lezar
f2d2991651 Update nvidia-docker submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-05 14:04:28 +01:00
Evan Lezar
3d5be45349 Merge branch 'prep-for-release' into 'master'
Specify toolkit version when building runtime and docker packages

See merge request nvidia/container-toolkit/container-toolkit!70
2021-11-05 12:27:54 +00:00
Evan Lezar
4d945e96f3 Auto update debian changelog and release date
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-05 12:06:29 +01:00
Evan Lezar
14c641377f Update nvidia-docker submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 17:14:36 +01:00
Evan Lezar
988e067091 Forward nvidia-container-toolkit versions to dependants
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 16:47:37 +01:00
Evan Lezar
98168ea16c Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 16:40:09 +01:00
Evan Lezar
d6a2733557 Update nvidia-container-runtime submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 16:39:40 +01:00
Evan Lezar
ee6545fbab Merge branch 'update-oci-spec' into 'master'
Add basic test for preservation of OCI spec under modification

See merge request nvidia/container-toolkit/container-toolkit!69
2021-11-04 13:29:03 +00:00
Evan Lezar
e8cc95c53b Update imported OCI runtime spec
This change updates the imported OCI runtime spec to a3c33d663ebc which includes
the ability to override the return code for syscalls. This is used by docker for
the clone3 syscall, for example.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 13:42:46 +01:00
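For context on the spec feature: runtime-spec seccomp syscall entries gained an `errnoRet` field that lets a profile choose the errno returned for a blocked syscall. A minimal sketch of a profile fragment in that shape (the file name and exact values are illustrative; Docker's default profile uses this mechanism to return ENOSYS for `clone3`):

```sh
# Hypothetical seccomp profile fragment using the new errnoRet field:
# block clone3 but return ENOSYS (38) so callers fall back to clone.
cat > clone3-errnoret.json <<'EOF'
{
  "names": ["clone3"],
  "action": "SCMP_ACT_ERRNO",
  "errnoRet": 38
}
EOF
```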
Evan Lezar
8afd89676f Add basic test for preservation of OCI spec under modification
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 13:42:46 +01:00
Evan Lezar
dd5c0a94ad Merge branch 'pulse-ci' into 'master'
[ci] use pulse instead of contamer for scans

See merge request nvidia/container-toolkit/container-toolkit!65
2021-11-02 18:50:42 +00:00
Christopher Desiniotis
93ecf3aeaf [ci] use pulse instead of contamer for scans
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2021-11-01 11:09:10 -07:00
Evan Lezar
55328126c6 Merge branch 'fix-devel-release-version' into 'master'
Rename RELEASE_DEVEL_TAG for consistency

See merge request nvidia/container-toolkit/container-toolkit!64
2021-10-27 11:52:57 +00:00
Evan Lezar
c2b35da111 Rename RELEASE_DEVEL_TAG for consistency
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-27 13:25:12 +02:00
Evan Lezar
2c210ebe21 Merge branch 'CNT-1874/add-internal-ci' into 'master'
Add release CI for toolkit-container

See merge request nvidia/container-toolkit/container-toolkit!58
2021-10-27 08:23:05 +00:00
Evan Lezar
1f0064525c Merge branch 'empty-bundle' into 'master'
Bump version to 1.6.0-rc.2

See merge request nvidia/container-toolkit/container-toolkit!63
2021-10-26 12:57:43 +00:00
Evan Lezar
c301bde4f4 Remove rule for merge requests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-26 14:29:28 +02:00
Evan Lezar
5996379fcc Add changelog entry for config.json path changes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-26 14:29:28 +02:00
Evan Lezar
23bdcbc818 Bump version to 1.6.0-rc.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-26 14:29:26 +02:00
Evan Lezar
ee7206ef29 Merge branch 'master' into 'master'
skip error when bundleDir does not exist

See merge request nvidia/container-toolkit/container-toolkit!62
2021-10-26 11:10:46 +00:00
wenjun gao
350c8893fb skip error when bundleDir does not exist 2021-10-26 17:51:12 +08:00
Evan Lezar
5b1a6765c6 Remove rule for merge requests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-25 12:13:06 +02:00
Evan Lezar
cd1540300e Add internal CI definition for release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-25 12:13:06 +02:00
Evan Lezar
52f52d5376 Merge branch 'CNT-1874/build-toolkit-container' into 'master'
Build container-toolkit images from nvidia-container-toolkit repository

See merge request nvidia/container-toolkit/container-toolkit!52
2021-10-22 11:09:02 +00:00
Evan Lezar
c35444c76c Add CI to build toolkit-container image
This change adds CI definitions for building the toolkit-container
images. This modifies the existing CI and replaces the build-one
stage with multiple stages that do the following:
* perform the standard golang checks
* build the packages required by the images
* build the images for supported platforms
* release the images (currently to the CI staging registry)

The build-all stage is included as a final step in the CI. This is
run after the release stage as the target platforms are not required
from an imaging perspective. The build-all stage is only run on
MRs or tagged builds.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-22 11:57:55 +02:00
Evan Lezar
0b3bc13b32 Add dockerfile and makefile to build toolkit-container
This change adds platform-specific Dockerfiles and a Makefile
to build the toolkit-container images.

This image builds the container-config commands from the tools
directory and installs the components of the NVIDIA Container Toolkit
directly from the nvidia-container-toolkit and libnvidia-container*
packages in the dist directory.

This includes make targets for the centos7, centos8, ubuntu18.04,
and ubi8 container-toolkit images, as well as the container test
make targets implemented in the container-config repository.

Files adapted from:
383587f766

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-22 11:57:55 +02:00
Evan Lezar
f2c93363ab Copy container test scripts from container-config
Files copied from:

383587f766

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-22 11:57:55 +02:00
Evan Lezar
7d76243783 Copy cmd from container-config
This change copies the code from container-config/cmd to
tools/container. This allows the code to be built and
added to the container image without additional refactoring.

As the configuration utilities are incorporated into the cmds
of the nvidia-container-toolkit, the code will be moved from tools.

Files copied from:

383587f766

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-22 11:57:51 +02:00
Evan Lezar
7bf5c25831 Update go vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-21 12:07:42 +02:00
Evan Lezar
266b752b02 Merge branch 'al2-aarch64' into 'master'
Add aarch64 build for Amazon Linux 2

See merge request nvidia/container-toolkit/container-toolkit!55
2021-10-19 14:31:51 +00:00
Evan Lezar
7fc33d02b4 Update submodules
* libnvidia-container: @1fa138a694b3667fd89ac89dfdb26fcd06ab0bb9
* nvidia-container-runtime: @cd6aef41126b5409c2329b66803b278a697aaaf3
* nvidia-docker: @4613cdae34c3e106ef124c9b86e4cf998569bbd6

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-19 15:30:29 +02:00
Evan Lezar
9be9b89f9f Add aarch64 build for Amazon Linux 2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-18 12:03:22 +02:00
Evan Lezar
a036a83afa Merge branch 'improve-release-tests' into 'master'
Extend release testing tooling to allow for upgrade testing

See merge request nvidia/container-toolkit/container-toolkit!54
2021-10-14 17:23:49 +00:00
Evan Lezar
ee0b908613 Extend release testing tooling to allow for upgrade testing
This change allows upgrade workflows to be tested in the
release test containers. To achieve this, a script is added
to configure the test repositories, leaving the defaults installed
initially.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-14 17:10:43 +02:00
Evan Lezar
28f6b7c02c Merge branch 'update-ci' into 'master'
Update CI to use build image directly

See merge request nvidia/container-toolkit/container-toolkit!53
2021-10-12 12:55:48 +00:00
Evan Lezar
f7e9d1ca45 Use build image directly in CI
This change uses the build image directly in CI instead of
using dind and invoking the docker-* make targets.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-12 13:53:11 +02:00
Evan Lezar
229f9c3730 Update DEVELOPMENT.md
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-12 13:53:06 +02:00
Evan Lezar
845701447c Merge branch 'build-from-repo' into 'master'
Add logic to build all packages from the nvidia-container-toolkit repo

See merge request nvidia/container-toolkit/container-toolkit!49
2021-10-07 14:30:13 +00:00
Evan Lezar
1ad98df39f Update submodules for packaging fixes
This change updates the git submodules for nvidia-docker and
nvidia-container-runtime to contain the package fixes and
code cleanup.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-07 15:28:52 +02:00
Evan Lezar
22a958fae7 Add docker-based tests for package installation workflows
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-24 14:58:17 +02:00
Evan Lezar
f10fa7b292 FIXUP: Update development
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 21:38:03 +02:00
Evan Lezar
fa7dc8cb31 Require at least a matching libnvidia-container-tools version
This change ensures that at least the same libnvidia-container-tools
version is required when installing nvidia-container-toolkit.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 21:34:09 +02:00
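One way to check the resulting constraint on a built package is to read its `Depends` field; a sketch (the package path and file name are illustrative):

```sh
# Confirm the libnvidia-container-tools (>= <version>) relationship
# in a freshly built package.
dpkg-deb --field dist/ubuntu18.04/amd64/nvidia-container-toolkit_1.6.0~rc.2-1_amd64.deb Depends
```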
Evan Lezar
3fef6bb5ab Merge branch 'update-readme' into 'master'
Add README to nvidia-container-toolkit repository

See merge request nvidia/container-toolkit/container-toolkit!50
2021-09-23 19:07:22 +00:00
Evan Lezar
2dc85de5d4 Use consistent package revisions for all rpm-based packages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 15:36:27 +02:00
Evan Lezar
bb6f4745e9 Fix nvidia-container-runtime breaks / replaces dependency
The relationship between packages also considers the package revision
when determining validity. This means that 3.5.0-1 is considered
greater than 3.5.0. This change adds the package revision to the
nvidia-container-runtime breaks / replaces relationship.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:56:55 +02:00
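The ordering described here can be verified directly with dpkg's version comparison:

```sh
# Debian version ordering: an explicit revision sorts above a missing one.
dpkg --compare-versions 3.5.0-1 gt 3.5.0 && echo "3.5.0-1 > 3.5.0"
```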
Evan Lezar
77740c2a80 Add release script for specific targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
f0fb4739ff Remove docker-all target from Makefile
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
5ee2150eaa Add basic version checks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
34e023361b Add nvidia-docker as a git submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
2ed7d86709 Add nvidia-container-runtime as a git submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
e729e74fe5 Add libnvidia-container as a git submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
b551d0f4f4 Apply edits for the NVIDIA container toolkit
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-22 16:49:58 +02:00
Evan Lezar
1d674783b0 Copy README.md from nvidia-docker
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-22 15:22:53 +02:00
Evan Lezar
cc9c3c0d28 Copy scripts from nvidia-container-toolkit-release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-22 13:42:32 +02:00
Evan Lezar
78f137a5ef Bump version for 1.6.0 development
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-22 13:42:32 +02:00
Evan Lezar
00258f14fb Merge branch 'include-nvidia-container-runtime' into 'master'
Include nvidia-container-runtime executable in nvidia-container-toolkit

See merge request nvidia/container-toolkit/container-toolkit!45
2021-09-20 14:54:06 +00:00
Evan Lezar
e828697f90 Update debian and rpm package definitions
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-20 14:54:54 +02:00
Evan Lezar
923344d376 Add PREFIX make variable to control command output
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-13 17:15:34 +02:00
Evan Lezar
35c6559013 Make all commands and copy executables
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-13 17:15:19 +02:00
Evan Lezar
eb67968911 Add cmds target to makefile to build all go commands
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-13 17:08:31 +02:00
Evan Lezar
6e1436cefb Update go vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-07 13:13:03 +02:00
Evan Lezar
10cd42273e Update package references
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-07 13:13:03 +02:00
Evan Lezar
b6a585c77d Copy code from nvidia-container-runtime
This change copies the cmd/nvidia-container-runtime, internal, and test
folders from github.com/NVIDIA/nvidia-container-runtime@8a63b4b34f3ce3b4167f0516aa3f7207ca280dfb

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-07 13:13:03 +02:00
Evan Lezar
58e707fed6 Bump version for post 1.5.1 development
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-07 13:12:25 +02:00
Evan Lezar
28ee3d5fd5 Merge branch 'revert-nvlink' into 'master'
Revert support for NVIDIA_FABRIC_DEVICES

See merge request nvidia/container-toolkit/container-toolkit!41
2021-08-20 08:39:49 +00:00
Evan Lezar
c2ac6db43b Revert "Add support for NVIDIA_FABRIC_DEVICES"
This reverts commit f828efcf64.
2021-08-18 15:17:59 +02:00
Evan Lezar
620bd806e8 Revert "Bump version to 1.6.0~rc.1"
This reverts commit 2001d66f9b.
2021-08-18 15:17:37 +02:00
Evan Lezar
afe0f8b61f Merge branch 'release-1.6.0-rc.1' into 'master'
Bump version to 1.6.0~rc.1

See merge request nvidia/container-toolkit/container-toolkit!40
2021-08-16 10:27:11 +00:00
Evan Lezar
2001d66f9b Bump version to 1.6.0~rc.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-08-13 14:55:28 +02:00
Evan Lezar
7626578b8e Merge branch 'nvlink' into 'master'
Add support for NVIDIA_FABRIC_DEVICES

See merge request nvidia/container-toolkit/container-toolkit!39
2021-08-12 14:04:10 +00:00
Evan Lezar
f828efcf64 Add support for NVIDIA_FABRIC_DEVICES
This change adds support for the NVIDIA_FABRIC_DEVICES envvar. The (non-empty)
value of this envvar is passed to the NVIDIA Container CLI using the --fabric-device
command line flag and allows for nvswitch and nvlink devices to be mounted
into the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-08-11 10:33:55 +02:00
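A hedged usage sketch based on the description above (the accepted value format is an assumption, and this support was reverted in c2ac6db43b before release):

```sh
# Request fabric (nvswitch/nvlink) devices via the envvar; the runtime
# forwards the value to the NVIDIA Container CLI as --fabric-device.
docker run --rm --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_FABRIC_DEVICES=all \
  nvidia/cuda:11.4.2-base-ubuntu18.04 nvidia-smi
```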
Evan Lezar
faf0df66c7 Merge branch 'improve-ci' into 'master'
Improve CI for container toolkit

See merge request nvidia/container-toolkit/container-toolkit!38
2021-07-15 15:40:56 +00:00
Evan Lezar
1ef4b1a14a Use extends keyword for build-one and build-all
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-07-15 16:27:17 +02:00
Evan Lezar
3df0969349 Improve CI for container toolkit
This change improves the CI for the container toolkit. The go targets are
executed in a docker container which allows for reproducible behaviour on
local systems as well as CI. The Makefile is updated to facilitate this.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-07-15 16:27:15 +02:00
Evan Lezar
22fcd022f3 Merge branch 'upstream-bump-v1.5.1' into 'master'
Bump to version 1.5.1

See merge request nvidia/container-toolkit/container-toolkit!35
2021-06-14 18:35:45 +00:00
Evan Lezar
492905de38 Bump to version 1.5.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-14 17:16:44 +02:00
Evan Lezar
17e76cad4d Merge branch 'make-binary-goos-explicit' into 'master'
Explicitly set GOOS when building binary

See merge request nvidia/container-toolkit/container-toolkit!33
2021-06-14 10:12:24 +00:00
Evan Lezar
c728bf4b1e Explicitly set GOOS when building binary
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-10 10:40:06 +02:00
Evan Lezar
f05e4e81c5 Merge branch 'fix-go-install' into 'master'
Move pkg to cmd/nvidia-container-toolkit

See merge request nvidia/container-toolkit/container-toolkit!32
2021-06-08 15:21:06 +00:00
Evan Lezar
14cd7c1833 Add coverage step to CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-08 15:23:19 +02:00
Evan Lezar
f72b79cc2a Move pkg to cmd/nvidia-container-toolkit
This change moves the pkg folder to `cmd/nvidia-container-toolkit` to
better match go best practices. This allows, for example,
`cmd/nvidia-container-toolkit` to be installed with `go install`.

The only package included in `pkg` was `main`.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-08 15:20:59 +02:00
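With `main` living under `cmd/`, the standard install flow applies; a sketch (module path as declared in the Makefile's `MODULE` variable; the version selector is illustrative):

```sh
go install github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-container-toolkit@latest
```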
Evan Lezar
f25698e96e Run go mod vendor and go mod tidy
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-08 10:51:33 +02:00
Evan Lezar
a02f7f8f6f Merge branch 'CNT-1554/docker-swarm' into 'master'
Fix bug where docker swarm device selection is overridden by NVIDIA_VISIBLE_DEVICES

See merge request nvidia/container-toolkit/container-toolkit!31
2021-06-08 05:31:05 +00:00
Evan Lezar
2a92d6acb7 Fix bug where docker swarm device selection is overridden by NVIDIA_VISIBLE_DEVICES
This change fixes a bug where the value of NVIDIA_VISIBLE_DEVICES would be used to
select devices even if the `swarm-resource` config option is specified.

Note that this does not change the value of NVIDIA_VISIBLE_DEVICES in the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 14:10:08 +02:00
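For reference, `swarm-resource` lives in the runtime's config file; a sketch of the relevant snippet (the value shown is the conventional Docker generic-resource env prefix and is an assumption here):

```sh
# Let swarm generic resources, rather than NVIDIA_VISIBLE_DEVICES,
# drive device selection.
cat >> /etc/nvidia-container-runtime/config.toml <<'EOF'
swarm-resource = "DOCKER_RESOURCE_GPU"
EOF
```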
Evan Lezar
602eaf0e60 Use require package for tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 13:31:41 +02:00
Evan Lezar
b930487dc5 Add coverage to go tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 13:21:28 +02:00
Evan Lezar
9aac07fe64 Update vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 13:20:34 +02:00
Evan Lezar
825990ba41 Merge branch 'CNT-1334-publish-tags-to-artifactory' into 'master'
Add artifactory publish step

See merge request nvidia/container-toolkit/container-toolkit!30
2021-05-18 17:25:07 +00:00
Evan Lezar
03d9c1d698 Update to Golang 1.16.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-05-18 11:25:52 +02:00
Evan Lezar
de172674b1 Add artifactory publish step
This change simplifies the build process by only targeting ubuntu20.04-amd64
and adds logic to push tagged builds to artifactory.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-05-18 11:25:48 +02:00
Kevin Klues
b71a9ed153 Merge branch 'upstream-bump-v1.5.0' into 'master'
Bump version to 1.5.0

See merge request nvidia/container-toolkit/container-toolkit!29
2021-04-29 14:08:23 +00:00
Kevin Klues
dde7159e11 Bump version to 1.5.0
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-04-29 10:16:44 +00:00
Evan Lezar
46de426cc4 Merge branch 'CNT-1330/jenkins-ci' into 'master'
Add Jenkins file for CI build steps

See merge request nvidia/container-toolkit/container-toolkit!28
2021-03-18 10:06:44 +00:00
Evan Lezar
1c7d6a233a Add golang check targets
This change adds check targets for Golang to the Makefile. These are also
added as stages to the Jenkinsfile definition, and the GitLab CI
is modified to use them too.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-03-17 16:58:39 +01:00
Evan Lezar
635aeb8343 Add Jenkinsfile definition for build targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-03-17 13:52:19 +01:00
Evan Lezar
ec9d296afe Move docker.mk to docker folder
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-03-17 13:52:14 +01:00
1126 changed files with 450242 additions and 253 deletions

.common-ci.yml

@@ -0,0 +1,409 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
default:
image: docker:stable
services:
- name: docker:stable-dind
command: ["--experimental"]
variables:
GIT_SUBMODULE_STRATEGY: recursive
BUILDIMAGE: "${CI_REGISTRY_IMAGE}/build:${CI_COMMIT_SHORT_SHA}"
stages:
- image
- lint
- go-checks
- go-build
- unit-tests
- package-build
- image-build
- test
- scan
- release
- build-all
build-dev-image:
stage: image
script:
- apk --no-cache add make bash
- make .build-image
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
- make .push-build-image
.requires-build-image:
image: "${BUILDIMAGE}"
.go-check:
extends:
- .requires-build-image
stage: go-checks
fmt:
extends:
- .go-check
script:
- make assert-fmt
vet:
extends:
- .go-check
script:
- make vet
lint:
extends:
- .go-check
script:
- make lint
allow_failure: true
ineffassign:
extends:
- .go-check
script:
- make ineffassign
allow_failure: true
misspell:
extends:
- .go-check
script:
- make misspell
go-build:
extends:
- .requires-build-image
stage: go-build
script:
- make build
unit-tests:
extends:
- .requires-build-image
stage: unit-tests
script:
- make coverage
# Define the distribution targets
.dist-centos7:
variables:
DIST: centos7
.dist-centos8:
variables:
DIST: centos8
.dist-ubi8:
variables:
DIST: ubi8
.dist-ubuntu18.04:
variables:
DIST: ubuntu18.04
.arch-aarch64:
variables:
ARCH: aarch64
.arch-amd64:
variables:
ARCH: amd64
.arch-arm64:
variables:
ARCH: arm64
.arch-ppc64le:
variables:
ARCH: ppc64le
.arch-x86_64:
variables:
ARCH: x86_64
# Define the package build helpers
.multi-arch-build:
before_script:
- apk add --no-cache coreutils build-base sed git bash make
- '[[ -n "${SKIP_QEMU_SETUP}" ]] || docker run --rm --privileged multiarch/qemu-user-static --reset -p yes -c yes'
.package-artifacts:
variables:
ARTIFACTS_NAME: "toolkit-container-${CI_PIPELINE_ID}"
ARTIFACTS_ROOT: "toolkit-container-${CI_PIPELINE_ID}"
DIST_DIR: ${CI_PROJECT_DIR}/${ARTIFACTS_ROOT}
.package-build:
extends:
- .multi-arch-build
- .package-artifacts
stage: package-build
script:
- ./scripts/release.sh ${DIST}-${ARCH}
artifacts:
name: ${ARTIFACTS_NAME}
paths:
- ${ARTIFACTS_ROOT}
# Define the package build targets
package-ubuntu18.04-amd64:
extends:
- .package-build
- .dist-ubuntu18.04
- .arch-amd64
package-ubuntu18.04-arm64:
extends:
- .package-build
- .dist-ubuntu18.04
- .arch-arm64
package-ubuntu18.04-ppc64le:
extends:
- .package-build
- .dist-ubuntu18.04
- .arch-ppc64le
package-centos7-x86_64:
extends:
- .package-build
- .dist-centos7
- .arch-x86_64
package-centos8-x86_64:
extends:
- .package-build
- .dist-centos8
- .arch-x86_64
# Define the image build targets
.image-build:
stage: image-build
variables:
IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
VERSION: "${CI_COMMIT_SHORT_SHA}"
before_script:
- apk add --no-cache bash make
- 'echo "Logging in to CI registry ${CI_REGISTRY}"'
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
image-centos7:
extends:
- .image-build
- .package-artifacts
- .dist-centos7
needs:
- package-centos7-x86_64
script:
- make -f build/container/Makefile build-${DIST}
- make -f build/container/Makefile push-${DIST}
image-centos8:
extends:
- .image-build
- .package-artifacts
- .dist-centos8
needs:
- package-centos8-x86_64
script:
- make -f build/container/Makefile build-${DIST}
- make -f build/container/Makefile push-${DIST}
image-ubi8:
extends:
- .image-build
- .package-artifacts
- .dist-ubi8
needs:
# Note: The ubi8 image currently uses the centos7 packages
- package-centos7-x86_64
script:
- make -f build/container/Makefile build-${DIST}
- make -f build/container/Makefile push-${DIST}
image-ubuntu18.04:
extends:
- .image-build
- .package-artifacts
- .dist-ubuntu18.04
needs:
- package-ubuntu18.04-amd64
# TODO: These will be required once we generate multi-arch images
# - package-ubuntu18.04-arm64
# - package-ubuntu18.04-ppc64le
script:
- make -f build/container/Makefile build-${DIST}
- make -f build/container/Makefile push-${DIST}
# Define test helpers
.integration:
stage: test
variables:
IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
VERSION: "${CI_COMMIT_SHORT_SHA}"
before_script:
- apk add --no-cache make bash jq
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
- docker pull "${IMAGE_NAME}:${VERSION}-${DIST}"
script:
- make -f build/container/Makefile test-${DIST}
.test:toolkit:
extends:
- .integration
variables:
TEST_CASES: "toolkit"
.test:docker:
extends:
- .integration
variables:
TEST_CASES: "docker"
.test:containerd:
# TODO: The containerd tests fail due to issues with SIGHUP.
# Until this is resolved, we retry up to twice and allow failure here.
retry: 2
allow_failure: true
extends:
- .integration
variables:
TEST_CASES: "containerd"
.test:crio:
extends:
- .integration
variables:
TEST_CASES: "crio"
# Define the test targets
test-toolkit-ubuntu18.04:
extends:
- .test:toolkit
- .dist-ubuntu18.04
needs:
- image-ubuntu18.04
test-containerd-ubuntu18.04:
extends:
- .test:containerd
- .dist-ubuntu18.04
needs:
- image-ubuntu18.04
test-crio-ubuntu18.04:
extends:
- .test:crio
- .dist-ubuntu18.04
needs:
- image-ubuntu18.04
test-docker-ubuntu18.04:
extends:
- .test:docker
- .dist-ubuntu18.04
needs:
- image-ubuntu18.04
# .release forms the base of the deployment jobs which push images to the CI registry.
# This is extended with the version to be deployed (e.g. the SHA or TAG) and the
# target os.
.release:
stage: release
variables:
# Define the source image for the release
IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
VERSION: "${CI_COMMIT_SHORT_SHA}"
# OUT_IMAGE_VERSION is overridden for external releases
OUT_IMAGE_VERSION: "${CI_COMMIT_SHORT_SHA}"
before_script:
# We ensure that the OUT_IMAGE_VERSION is set
- 'echo Version: ${OUT_IMAGE_VERSION} ; [[ -n "${OUT_IMAGE_VERSION}" ]] || exit 1'
# In the case where we are deploying a different version to the CI_COMMIT_SHA, we
# need to tag the image.
# Note: a leading 'v' is stripped from the version if present
- apk add --no-cache make bash
- 'echo "Logging in to CI registry ${CI_REGISTRY}"'
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
- docker pull "${IMAGE_NAME}:${VERSION}-${DIST}"
script:
- docker tag "${IMAGE_NAME}:${VERSION}-${DIST}" "${OUT_IMAGE_NAME}:${OUT_IMAGE_VERSION}-${DIST}"
# Log in to the "output" registry, tag the image and push the image
- 'echo "Logging in to output registry ${OUT_REGISTRY}"'
- docker logout
- docker login -u "${OUT_REGISTRY_USER}" -p "${OUT_REGISTRY_TOKEN}" "${OUT_REGISTRY}"
- make IMAGE_NAME=${OUT_IMAGE_NAME} VERSION=${OUT_IMAGE_VERSION} -f build/container/Makefile push-${DIST}
# Define a staging release step that pushes an image to an internal "staging" repository
# This is triggered for all pipelines (i.e. not only tags) to test the pipeline steps
# outside of the release process.
.release:staging:
extends:
- .release
variables:
OUT_REGISTRY_USER: "${CI_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${CI_REGISTRY_PASSWORD}"
OUT_REGISTRY: "${CI_REGISTRY}"
OUT_IMAGE_NAME: "${CI_REGISTRY_IMAGE}/staging/container-toolkit"
# Define an external release step that pushes an image to an external repository.
# This includes a development image off master.
.release:external:
extends:
- .release
rules:
- if: $CI_COMMIT_TAG
variables:
OUT_IMAGE_VERSION: "${CI_COMMIT_TAG}"
- if: $CI_COMMIT_BRANCH == $RELEASE_DEVEL_BRANCH
variables:
OUT_IMAGE_VERSION: "${DEVEL_RELEASE_IMAGE_VERSION}"
# Define the release jobs
release:staging-centos7:
extends:
- .release:staging
- .dist-centos7
needs:
- image-centos7
release:staging-centos8:
extends:
- .release:staging
- .dist-centos8
needs:
- image-centos8
release:staging-ubi8:
extends:
- .release:staging
- .dist-ubi8
needs:
- image-ubi8
release:staging-ubuntu18.04:
extends:
- .release:staging
- .dist-ubuntu18.04
needs:
- test-toolkit-ubuntu18.04
- test-containerd-ubuntu18.04
- test-crio-ubuntu18.04
- test-docker-ubuntu18.04


@@ -1,2 +1,2 @@
.git
dist
/shared-*

.gitignore

@@ -1,3 +1,8 @@
dist
*.swp
*.swo
/coverage.out
/test/output/
/nvidia-container-runtime
/nvidia-container-toolkit
/shared-*


@@ -1,161 +1,60 @@
# Build packages for all supported OS / ARCH combinations
stages:
- tests
- build-one
- build-all
.tests-setup: &tests-setup
image: golang:1.14.4
rules:
- when: always
variables:
GITHUB_ROOT: "github.com/NVIDIA"
PROJECT_GOPATH: "${GITHUB_ROOT}/nvidia-container-toolkit"
before_script:
- mkdir -p ${GOPATH}/src/${GITHUB_ROOT}
- ln -s ${CI_PROJECT_DIR} ${GOPATH}/src/${PROJECT_GOPATH}
.build-setup: &build-setup
image: docker:19.03.8
services:
- name: docker:19.03.8-dind
command: ["--experimental"]
before_script:
- apk update
- apk upgrade
- apk add coreutils build-base sed git bash make
- docker run --rm --privileged multiarch/qemu-user-static --reset -p yes -c yes
# Run a series of sanity-check tests over the code
lint:
<<: *tests-setup
stage: tests
script:
- go get -u golang.org/x/lint/golint
- golint -set_exit_status ${PROJECT_GOPATH}/pkg
vet:
<<: *tests-setup
stage: tests
script:
- go vet ${PROJECT_GOPATH}/pkg
unit_test:
<<: *tests-setup
stage: tests
script:
- go test ${PROJECT_GOPATH}/pkg
fmt:
<<: *tests-setup
stage: tests
script:
- res=$(gofmt -l pkg/*.go)
- echo "$res"
- test -z "$res"
ineffassign:
<<: *tests-setup
stage: tests
script:
- go get -u github.com/gordonklaus/ineffassign
- ineffassign pkg/*.go
misspell:
<<: *tests-setup
stage: tests
script:
- go get -u github.com/client9/misspell/cmd/misspell
- misspell pkg/*.go
# build-one jobs build packages for a single OS / ARCH combination.
# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# They are run during the first stage of the pipeline as a smoke test to ensure
# that we can successfully build packages on all of our architectures for a
# single OS. They are triggered on any change to an MR. No artifacts are
# produced as part of build-one jobs.
.build-one-setup: &build-one-setup
<<: *build-setup
stage: build-one
only:
- merge_requests
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
include:
- .common-ci.yml
# build-all jobs build packages for every OS / ARCH combination we support.
#
# They are run under two conditions:
# 1) Automatically whenever a new tag is pushed to the repo (e.g. v1.1.0)
# 2) Manually by a reviewer just before merging a MR.
#
# Unlike build-one jobs, it takes a long time to build the full suite
# OS / ARCH combinations, so this is optimized to only run once per MR
# (assuming it all passes). A full set of artifacts including the packages
# built for each OS / ARCH are produced as a result of these jobs.
.build-all-setup: &build-all-setup
<<: *build-setup
.build-all-for-arch:
variables:
# Setting DIST=docker invokes the docker- release targets
DIST: docker
extends:
- .package-build
stage: build-all
timeout: 2h 30m
rules:
- if: $CI_COMMIT_TAG
when: always
- if: $CI_MERGE_REQUEST_ID
when: always
variables:
ARTIFACTS_NAME: "${CI_PROJECT_NAME}-${CI_COMMIT_REF_SLUG}-${CI_JOB_NAME}-artifacts-${CI_PIPELINE_ID}"
ARTIFACTS_DIR: "${CI_PROJECT_NAME}-${CI_COMMIT_REF_SLUG}-artifacts-${CI_PIPELINE_ID}"
DIST_DIR: "${CI_PROJECT_DIR}/${ARTIFACTS_DIR}"
artifacts:
name: ${ARTIFACTS_NAME}
paths:
- ${ARTIFACTS_DIR}
# The full set of build-one jobs is organized to build
# ubuntu18.04 in parallel on each of our supported ARCHs.
build-one-amd64:
<<: *build-one-setup
script:
- make ubuntu18.04-amd64
build-one-ppc64le:
<<: *build-one-setup
script:
- make ubuntu18.04-ppc64le
build-one-arm64:
<<: *build-one-setup
script:
- make ubuntu18.04-arm64
# The full set of build-all jobs is organized so that
# the builds for each ARCH run in parallel.
build-all-amd64:
<<: *build-all-setup
script:
- make docker-amd64
extends:
- .build-all-for-arch
- .arch-amd64
build-all-x86_64:
<<: *build-all-setup
script:
- make docker-x86_64
extends:
- .build-all-for-arch
- .arch-x86_64
build-all-ppc64le:
<<: *build-all-setup
script:
- make docker-ppc64le
extends:
- .build-all-for-arch
- .arch-ppc64le
build-all-arm64:
<<: *build-all-setup
script:
- make docker-arm64
extends:
- .build-all-for-arch
- .arch-arm64
build-all-aarch64:
<<: *build-all-setup
script:
- make docker-aarch64
extends:
- .build-all-for-arch
- .arch-aarch64

.gitmodules

@@ -0,0 +1,9 @@
[submodule "third_party/libnvidia-container"]
path = third_party/libnvidia-container
url = https://gitlab.com/nvidia/container-toolkit/libnvidia-container.git
[submodule "third_party/nvidia-container-runtime"]
path = third_party/nvidia-container-runtime
url = https://gitlab.com/nvidia/container-toolkit/container-runtime.git
[submodule "third_party/nvidia-docker"]
path = third_party/nvidia-docker
url = https://gitlab.com/nvidia/container-toolkit/nvidia-docker.git

.nvidia-ci.yml

@@ -0,0 +1,174 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
include:
- local: '.common-ci.yml'
default:
tags:
- cnt
- container-dev
- docker/multi-arch
- docker/privileged
- os/linux
- type/docker
variables:
DOCKER_DRIVER: overlay2
DOCKER_TLS_CERTDIR: "/certs"
# Release "devel"-tagged images off the master branch
RELEASE_DEVEL_BRANCH: "master"
DEVEL_RELEASE_IMAGE_VERSION: "devel"
# On the multi-arch builder we don't need the qemu setup.
SKIP_QEMU_SETUP: "1"
# We skip the integration tests for the internal CI:
.integration:
stage: test
before_script:
- echo "Skipped in internal CI"
script:
- echo "Skipped in internal CI"
# The .scan step forms the base of the image scan operation performed before releasing
# images.
.scan:
stage: scan
image: "${PULSE_IMAGE}"
variables:
IMAGE: "${CI_REGISTRY_IMAGE}/container-toolkit:${CI_COMMIT_SHORT_SHA}-${DIST}"
IMAGE_ARCHIVE: "container-toolkit.tar"
rules:
- if: $CI_COMMIT_MESSAGE =~ /\[skip[ _-]scans?\]/i
when: never
- if: $SKIP_SCANS
when: never
- if: $CI_COMMIT_TAG == null && $CI_COMMIT_BRANCH != $RELEASE_DEVEL_BRANCH
allow_failure: true
before_script:
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
# TODO: We should specify the architecture here and scan all architectures
- docker pull "${IMAGE}"
- docker save "${IMAGE}" -o "${IMAGE_ARCHIVE}"
- AuthHeader=$(echo -n $SSA_CLIENT_ID:$SSA_CLIENT_SECRET | base64 -w0)
- >
export SSA_TOKEN=$(curl --request POST --header "Authorization: Basic $AuthHeader" --header "Content-Type: application/x-www-form-urlencoded" ${SSA_ISSUER_URL} | jq ".access_token" | tr -d '"')
- if [ -z "$SSA_TOKEN" ]; then exit 1; else echo "SSA_TOKEN set!"; fi
script:
- pulse-cli -n $NSPECT_ID --pss $PSS_URL --ssa $SSA_TOKEN scan -i $IMAGE_ARCHIVE -p $CONTAINER_POLICY -o
artifacts:
when: always
expire_in: 1 week
paths:
- pulse-cli.log
- licenses.json
- sbom.json
- vulns.json
- policy_evaluation.json
# Define the scan targets
scan-centos7:
extends:
- .scan
- .dist-centos7
needs:
- image-centos7
scan-centos8:
extends:
- .scan
- .dist-centos8
needs:
- image-centos8
scan-ubuntu18.04:
extends:
- .scan
- .dist-ubuntu18.04
needs:
- image-ubuntu18.04
scan-ubi8:
extends:
- .scan
- .dist-ubi8
needs:
- image-ubi8
# Define external release helpers
.release:ngc:
extends:
- .release:external
variables:
OUT_REGISTRY_USER: "${NGC_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${NGC_REGISTRY_TOKEN}"
OUT_REGISTRY: "${NGC_REGISTRY}"
OUT_IMAGE_NAME: "${NGC_REGISTRY_IMAGE}"
# TODO: For now we disable external releases
DOCKER: echo
.release:dockerhub:
extends:
- .release:external
variables:
OUT_REGISTRY_USER: "${REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${REGISTRY_TOKEN}"
OUT_REGISTRY: "${DOCKERHUB_REGISTRY}"
OUT_IMAGE_NAME: "${REGISTRY_IMAGE}"
# TODO: For now we disable external releases
DOCKER: echo
# Define the external release targets
# Release to NGC
release:ngc-centos7:
extends:
- .release:ngc
- .dist-centos7
release:ngc-centos8:
extends:
- .release:ngc
- .dist-centos8
release:ngc-ubuntu18:
extends:
- .release:ngc
- .dist-ubuntu18.04
release:ngc-ubi8:
extends:
- .release:ngc
- .dist-ubi8
# Release to Dockerhub
release:dockerhub-centos7:
extends:
- .release:dockerhub
- .dist-centos7
release:dockerhub-centos8:
extends:
- .release:dockerhub
- .dist-centos8
release:dockerhub-ubuntu18:
extends:
- .release:dockerhub
- .dist-ubuntu18.04
release:dockerhub-ubi8:
extends:
- .release:dockerhub
- .dist-ubi8

DEVELOPMENT.md

@@ -0,0 +1,45 @@
# NVIDIA Container Toolkit Release Tooling
This repository allows for the components of the NVIDIA container stack to be
built and released as the NVIDIA Container Toolkit from a single repository. The components:
* `libnvidia-container`
* `nvidia-container-runtime`
* `nvidia-docker`
are included as submodules in the `third_party` folder.
The `nvidia-container-toolkit` resides in this repo directly.
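To work with the submodules after cloning, initialize them first; a sketch mirroring the CI's `GIT_SUBMODULE_STRATEGY: recursive` (repository URL inferred from the merge-request path):
```sh
git clone --recurse-submodules https://gitlab.com/nvidia/container-toolkit/container-toolkit.git
# or, in an existing checkout:
git submodule update --init --recursive
```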
## Building
In order to build the packages, the following command is executed:
```sh
./scripts/build-all-components.sh TARGET
```
where `TARGET` is a make target that is valid for each of the sub-components.
These include:
* `ubuntu18.04-amd64`
* `centos8-x86_64`
The packages are generated in the `dist` folder.
## Testing local changes
In order to use the same build logic to generate packages with local changes,
the locations of the individual components can be overridden using the `LIBNVIDIA_CONTAINER_ROOT`,
`NVIDIA_CONTAINER_TOOLKIT_ROOT`, `NVIDIA_CONTAINER_RUNTIME_ROOT`, and `NVIDIA_DOCKER_ROOT`
environment variables.
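For example, to build against a local `libnvidia-container` checkout (the path is illustrative):
```sh
LIBNVIDIA_CONTAINER_ROOT=$HOME/src/libnvidia-container \
  ./scripts/build-all-components.sh ubuntu18.04-amd64
```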
## Testing packages locally
The [test/release](./test/release/) folder contains documentation on how the installation of local or staged packages can be tested.
## Releasing
A utility script [`scripts/release.sh`](./scripts/release.sh) is provided to build
packages required for release. If run without arguments, all supported distribution-architecture combinations are built. A specific distribution-architecture pair can also be provided
```sh
./scripts/release.sh ubuntu18.04-amd64
```
where the `amd64` builds for `ubuntu18.04` are provided as an example.

Jenkinsfile

@@ -0,0 +1,142 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
podTemplate (cloud:'sw-gpu-cloudnative',
containers: [
containerTemplate(name: 'docker', image: 'docker:dind', ttyEnabled: true, privileged: true),
containerTemplate(name: 'golang', image: 'golang:1.16.3', ttyEnabled: true)
]) {
node(POD_LABEL) {
def scmInfo
stage('checkout') {
scmInfo = checkout(scm)
}
stage('dependencies') {
container('golang') {
sh 'GO111MODULE=off go get -u github.com/client9/misspell/cmd/misspell'
sh 'GO111MODULE=off go get -u github.com/gordonklaus/ineffassign'
sh 'GO111MODULE=off go get -u golang.org/x/lint/golint'
}
container('docker') {
sh 'apk add --no-cache make bash git'
}
}
stage('check') {
parallel (
getGolangStages(["assert-fmt", "lint", "vet", "ineffassign", "misspell"])
)
}
stage('test') {
parallel (
getGolangStages(["test"])
)
}
def versionInfo
stage('version') {
container('docker') {
versionInfo = getVersionInfo(scmInfo)
println "versionInfo=${versionInfo}"
}
}
def dist = 'ubuntu20.04'
def arch = 'amd64'
def stageLabel = "${dist}-${arch}"
stage('build-one') {
container('docker') {
stage (stageLabel) {
sh "make ${dist}-${arch}"
}
}
}
stage('release') {
container('docker') {
stage (stageLabel) {
def component = 'main'
def repository = 'sw-gpu-cloudnative-debian-local/pool/main/'
def uploadSpec = """{
"files":
[ {
"pattern": "./dist/${dist}/${arch}/*.deb",
"target": "${repository}",
"props": "deb.distribution=${dist};deb.component=${component};deb.architecture=${arch}"
}
]
}"""
sh "echo starting release with versionInfo=${versionInfo}"
if (versionInfo.isTag) {
// upload to artifactory repository
def server = Artifactory.server 'sw-gpu-artifactory'
server.upload spec: uploadSpec
} else {
sh "echo skipping release for non-tagged build"
}
}
}
}
}
}
def getGolangStages(def targets) {
stages = [:]
for (t in targets) {
stages[t] = getLintClosure(t)
}
return stages
}
def getLintClosure(def target) {
return {
container('golang') {
stage(target) {
sh "make ${target}"
}
}
}
}
// getVersionInfo returns a hash of version info
def getVersionInfo(def scmInfo) {
def versionInfo = [
isTag: isTag(scmInfo.GIT_BRANCH)
]
scmInfo.each { k, v -> versionInfo[k] = v }
return versionInfo
}
def isTag(def branch) {
if (!branch.startsWith('v')) {
return false
}
def version = shOutput('git describe --all --exact-match --always')
return version == "tags/${branch}"
}
// shOutput returns the trimmed stdout of a shell command.
def shOutput(def script) {
return sh(script: script, returnStdout: true).trim()
}

Makefile

@@ -1,19 +1,147 @@
# Copyright (c) 2017-2020, NVIDIA CORPORATION. All rights reserved.
# Copyright (c) 2017-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
DOCKER ?= docker
MKDIR ?= mkdir
DIST_DIR ?= $(CURDIR)/dist
LIB_NAME := nvidia-container-toolkit
LIB_VERSION := 1.4.2
LIB_TAG ?=
LIB_VERSION := 1.6.0
LIB_TAG ?= rc.2
GOLANG_VERSION := 1.14.2
GOLANG_PKG_PATH := github.com/NVIDIA/nvidia-container-toolkit/pkg
GOLANG_VERSION := 1.16.3
MODULE := github.com/NVIDIA/nvidia-container-toolkit
# By default run all native docker-based targets
docker-native:
include $(CURDIR)/docker.mk
include $(CURDIR)/docker/docker.mk
binary:
go build -ldflags "-s -w" -o "$(LIB_NAME)" $(GOLANG_PKG_PATH)
ifeq ($(IMAGE_NAME),)
REGISTRY ?= nvidia
IMAGE_NAME = $(REGISTRY)/container-toolkit
endif
BUILDIMAGE_TAG ?= golang$(GOLANG_VERSION)
BUILDIMAGE ?= $(IMAGE_NAME)-build:$(BUILDIMAGE_TAG)
EXAMPLES := $(patsubst ./examples/%/,%,$(sort $(dir $(wildcard ./examples/*/))))
EXAMPLE_TARGETS := $(patsubst %,example-%, $(EXAMPLES))
CMDS := $(patsubst ./cmd/%/,%,$(sort $(dir $(wildcard ./cmd/*/))))
CMD_TARGETS := $(patsubst %,cmd-%, $(CMDS))
$(info CMD_TARGETS=$(CMD_TARGETS))
CHECK_TARGETS := assert-fmt vet lint ineffassign misspell
MAKE_TARGETS := binaries build check fmt lint-internal test examples cmds coverage generate $(CHECK_TARGETS)
TARGETS := $(MAKE_TARGETS) $(EXAMPLE_TARGETS) $(CMD_TARGETS)
DOCKER_TARGETS := $(patsubst %,docker-%, $(TARGETS))
.PHONY: $(TARGETS) $(DOCKER_TARGETS)
GOOS ?= linux
binaries: cmds
ifneq ($(PREFIX),)
cmd-%: COMMAND_BUILD_OPTIONS = -o $(PREFIX)/$(*)
endif
cmds: $(CMD_TARGETS)
$(CMD_TARGETS): cmd-%:
GOOS=$(GOOS) go build -ldflags "-s -w" $(COMMAND_BUILD_OPTIONS) $(MODULE)/cmd/$(*)
build:
GOOS=$(GOOS) go build ./...
examples: $(EXAMPLE_TARGETS)
$(EXAMPLE_TARGETS): example-%:
GOOS=$(GOOS) go build ./examples/$(*)
all: check test build binary
check: $(CHECK_TARGETS)
# Apply go fmt to the codebase
fmt:
go list -f '{{.Dir}}' $(MODULE)/... \
| xargs gofmt -s -l -w
assert-fmt:
go list -f '{{.Dir}}' $(MODULE)/... \
| xargs gofmt -s -l > fmt.out
@if [ -s fmt.out ]; then \
echo "\nERROR: The following files are not formatted:\n"; \
cat fmt.out; \
rm fmt.out; \
exit 1; \
else \
rm fmt.out; \
fi
ineffassign:
ineffassign $(MODULE)/...
lint:
# We use `go list -f '{{.Dir}}' $(MODULE)/...` to skip the `vendor` folder.
go list -f '{{.Dir}}' $(MODULE)/... | grep -v /internal/ | xargs golint -set_exit_status
lint-internal:
# We use `go list -f '{{.Dir}}' $(MODULE)/...` to skip the `vendor` folder.
go list -f '{{.Dir}}' $(MODULE)/internal/... | xargs golint -set_exit_status
misspell:
misspell $(MODULE)/...
vet:
go vet $(MODULE)/...
COVERAGE_FILE := coverage.out
test: build cmds
go test -v -coverprofile=$(COVERAGE_FILE) $(MODULE)/...
coverage: test
cat $(COVERAGE_FILE) | grep -v "_mock.go" > $(COVERAGE_FILE).no-mocks
go tool cover -func=$(COVERAGE_FILE).no-mocks
generate:
go generate $(MODULE)/...
# Generate an image for containerized builds
# Note: This image is local only
.PHONY: .build-image .pull-build-image .push-build-image
.build-image: docker/Dockerfile.devel
if [ x"$(SKIP_IMAGE_BUILD)" = x"" ]; then \
$(DOCKER) build \
--progress=plain \
--build-arg GOLANG_VERSION="$(GOLANG_VERSION)" \
--tag $(BUILDIMAGE) \
-f $(^) \
docker; \
fi
.pull-build-image:
$(DOCKER) pull $(BUILDIMAGE)
.push-build-image:
$(DOCKER) push $(BUILDIMAGE)
$(DOCKER_TARGETS): docker-%: .build-image
@echo "Running 'make $(*)' in docker container $(BUILDIMAGE)"
$(DOCKER) run \
--rm \
-e GOCACHE=/tmp/.cache \
-v $(PWD):$(PWD) \
-w $(PWD) \
--user $$(id -u):$$(id -g) \
$(BUILDIMAGE) \
make $(*)
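The `docker-%` pattern above gives every make target a containerized twin that runs in the pinned build image; for example:

```sh
# Run the test suite inside the build image instead of on the host.
make docker-test
# Build all commands the same way.
make docker-cmds
```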

README.md

@@ -0,0 +1,31 @@
# NVIDIA Container Toolkit
[![GitHub license](https://img.shields.io/github/license/NVIDIA/nvidia-container-toolkit?style=flat-square)](https://raw.githubusercontent.com/NVIDIA/nvidia-container-toolkit/master/LICENSE)
[![Documentation](https://img.shields.io/badge/documentation-wiki-blue.svg?style=flat-square)](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/overview.html)
[![Package repository](https://img.shields.io/badge/packages-repository-b956e8.svg?style=flat-square)](https://nvidia.github.io/libnvidia-container)
![nvidia-container-stack](https://cloud.githubusercontent.com/assets/3028125/12213714/5b208976-b632-11e5-8406-38d379ec46aa.png)
## Introduction
The NVIDIA Container Toolkit allows users to build and run GPU accelerated containers. The toolkit includes a container runtime [library](https://github.com/NVIDIA/libnvidia-container) and utilities to automatically configure containers to leverage NVIDIA GPUs.
Product documentation including an architecture overview, platform support, and installation and usage guides can be found in the [documentation repository](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/overview.html).
## Getting Started
**Make sure you have installed the [NVIDIA driver](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#nvidia-drivers) for your Linux Distribution**
**Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver needs to be installed**
For instructions on getting started with the NVIDIA Container Toolkit, refer to the [installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installation-guide).
## Usage
The [user guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html) provides information on the configuration and command line options available when running GPU containers with Docker.
## Issues and Contributing
[Check out the Contributing document!](CONTRIBUTING.md)
* Please let us know by [filing a new issue](https://github.com/NVIDIA/nvidia-container-toolkit/issues/new)
* You can contribute by creating a [merge request](https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/merge_requests/new) to our public GitLab repository


@@ -0,0 +1,76 @@
# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASE_DIST
ARG CUDA_VERSION
ARG GOLANG_VERSION=x.x.x
ARG VERSION="N/A"
# NOTE: In cases where the libc version is a concern, we would have to use an
# image based on the target OS to build the golang executables here -- especially
# if cgo code is included.
FROM golang:${GOLANG_VERSION} as build
# We override the GOPATH to ensure that the binaries are installed to
# /artifacts/bin
ARG GOPATH=/artifacts
# Install the experimental nvidia-container-runtime
# NOTE: This will be integrated into the nvidia-container-toolkit package / repo
ARG NVIDIA_CONTAINER_RUNTIME_EXPERIMENTAL_VERSION=experimental
RUN GOPATH=/artifacts go install github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-container-runtime.experimental@${NVIDIA_CONTAINER_RUNTIME_EXPERIMENTAL_VERSION}
WORKDIR /build
COPY . .
# NOTE: Until the config utilities are properly integrated into the
# nvidia-container-toolkit repository, these are built from the `tools` folder
# and not `cmd`.
RUN GOPATH=/artifacts go install -ldflags="-s -w -X 'main.Version=${VERSION}'" ./tools/...
FROM nvidia/cuda:${CUDA_VERSION}-base-${BASE_DIST}
ENV NVIDIA_DISABLE_REQUIRE="true"
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=utility
WORKDIR /artifacts/packages
ARG ARTIFACTS_DIR
COPY ${ARTIFACTS_DIR}/* /artifacts/packages/
ARG PACKAGE_VERSION
RUN yum localinstall -y \
libnvidia-container1-${PACKAGE_VERSION}*.rpm \
libnvidia-container-tools-${PACKAGE_VERSION}*.rpm \
nvidia-container-toolkit-${PACKAGE_VERSION}*.rpm
WORKDIR /work
COPY --from=build /artifacts/bin /work
ENV PATH=/work:$PATH
LABEL io.k8s.display-name="NVIDIA Container Runtime Config"
LABEL name="NVIDIA Container Runtime Config"
LABEL vendor="NVIDIA"
LABEL version="${VERSION}"
LABEL release="N/A"
LABEL summary="Automatically Configure your Container Runtime for GPU support."
LABEL description="See summary"
COPY ./LICENSE /licenses/LICENSE
ENTRYPOINT ["/work/nvidia-toolkit"]


@@ -0,0 +1,82 @@
# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASE_DIST
ARG CUDA_VERSION
ARG GOLANG_VERSION=x.x.x
ARG VERSION="N/A"
# NOTE: In cases where the libc version is a concern, we would have to use an
# image based on the target OS to build the golang executables here -- especially
# if cgo code is included.
FROM golang:${GOLANG_VERSION} as build
# We override the GOPATH to ensure that the binaries are installed to
# /artifacts/bin
ARG GOPATH=/artifacts
# Install the experimental nvidia-container-runtime
# NOTE: This will be integrated into the nvidia-container-toolkit package / repo
ARG NVIDIA_CONTAINER_RUNTIME_EXPERIMENTAL_VERSION=experimental
RUN GOPATH=/artifacts go install github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-container-runtime.experimental@${NVIDIA_CONTAINER_RUNTIME_EXPERIMENTAL_VERSION}
WORKDIR /build
COPY . .
# NOTE: Until the config utilities are properly integrated into the
# nvidia-container-toolkit repository, these are built from the `tools` folder
# and not `cmd`.
RUN GOPATH=/artifacts go install -ldflags="-s -w -X 'main.Version=${VERSION}'" ./tools/...
FROM nvidia/cuda:${CUDA_VERSION}-base-${BASE_DIST}
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
libcap2 \
&& \
rm -rf /var/lib/apt/lists/*
ENV NVIDIA_DISABLE_REQUIRE="true"
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=utility
WORKDIR /artifacts/packages
ARG ARTIFACTS_DIR
COPY ${ARTIFACTS_DIR}/* /artifacts/packages/
ARG PACKAGE_VERSION
RUN dpkg -i \
libnvidia-container1_${PACKAGE_VERSION}*.deb \
libnvidia-container-tools_${PACKAGE_VERSION}*.deb \
nvidia-container-toolkit_${PACKAGE_VERSION}*.deb
WORKDIR /work
COPY --from=build /artifacts/bin /work/
ENV PATH=/work:$PATH
LABEL io.k8s.display-name="NVIDIA Container Runtime Config"
LABEL name="NVIDIA Container Runtime Config"
LABEL vendor="NVIDIA"
LABEL version="${VERSION}"
LABEL release="N/A"
LABEL summary="Automatically Configure your Container Runtime for GPU support."
LABEL description="See summary"
COPY ./LICENSE /licenses/LICENSE
ENTRYPOINT ["/work/nvidia-toolkit"]

build/container/Makefile

@@ -0,0 +1,110 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
DOCKER ?= docker
MKDIR ?= mkdir
DIST_DIR ?= $(CURDIR)/dist
##### Global variables #####
# TODO: These should be defined ONCE and currently duplicate the version in the
# toolkit makefile.
LIB_VERSION := 1.6.0
LIB_TAG := rc.2
VERSION ?= $(LIB_VERSION)$(if $(LIB_TAG),-$(LIB_TAG))
CUDA_VERSION ?= 11.4.2
GOLANG_VERSION ?= 1.16.4
ifeq ($(IMAGE_NAME),)
REGISTRY ?= nvidia
IMAGE_NAME := $(REGISTRY)/container-toolkit
endif
IMAGE_TAG ?= $(VERSION)-$(DIST)
IMAGE = $(IMAGE_NAME):$(IMAGE_TAG)
##### Public rules #####
DEFAULT_PUSH_TARGET := ubuntu18.04
TARGETS := ubuntu20.04 ubuntu18.04 ubi8 centos7 centos8
BUILD_TARGETS := $(patsubst %, build-%, $(TARGETS))
PUSH_TARGETS := $(patsubst %, push-%, $(TARGETS))
TEST_TARGETS := $(patsubst %, test-%, $(TARGETS))
.PHONY: $(TARGETS) $(PUSH_TARGETS) $(BUILD_TARGETS) $(TEST_TARGETS)
$(PUSH_TARGETS): push-%:
$(DOCKER) push "$(IMAGE_NAME):$(IMAGE_TAG)"
# For the default push target we also push a short tag equal to the version.
# We skip this for the development release
DEVEL_RELEASE_IMAGE_VERSION ?= devel
ifneq ($(strip $(VERSION)),$(DEVEL_RELEASE_IMAGE_VERSION))
push-$(DEFAULT_PUSH_TARGET): push-short
endif
push-short:
$(DOCKER) tag "$(IMAGE_NAME):$(VERSION)-$(DEFAULT_PUSH_TARGET)" "$(IMAGE_NAME):$(VERSION)"
$(DOCKER) push "$(IMAGE_NAME):$(VERSION)"
build-%: DIST = $(*)
build-%: DOCKERFILE = $(CURDIR)/build/container/Dockerfile.$(DOCKERFILE_SUFFIX)
# Use a generic build target to build the relevant images
$(BUILD_TARGETS): build-%: $(ARTIFACTS_DIR)
$(DOCKER) build --pull \
--tag $(IMAGE) \
--build-arg ARTIFACTS_DIR="$(ARTIFACTS_DIR)" \
--build-arg BASE_DIST="$(BASE_DIST)" \
--build-arg CUDA_VERSION="$(CUDA_VERSION)" \
--build-arg GOLANG_VERSION="$(GOLANG_VERSION)" \
--build-arg PACKAGE_VERSION="$(PACKAGE_VERSION)" \
--build-arg VERSION="$(VERSION)" \
-f $(DOCKERFILE) \
$(CURDIR)
ARTIFACTS_ROOT ?= $(shell realpath --relative-to=$(CURDIR) $(DIST_DIR))
build-ubuntu%: DOCKERFILE_SUFFIX := ubuntu
build-ubuntu%: ARTIFACTS_DIR = $(ARTIFACTS_ROOT)/$(*)/amd64
build-ubuntu%: PACKAGE_VERSION := $(LIB_VERSION)$(if $(LIB_TAG),~$(LIB_TAG))
build-ubuntu18.04: BASE_DIST := ubuntu18.04
build-ubuntu20.04: BASE_DIST := ubuntu20.04
build-ubi8: DOCKERFILE_SUFFIX := centos
# TODO: Update this to use the centos8 packages
build-ubi8: ARTIFACTS_DIR = $(ARTIFACTS_ROOT)/centos7/x86_64
build-ubi8: PACKAGE_VERSION := $(LIB_VERSION)-$(if $(LIB_TAG),0.1.$(LIB_TAG),1)
build-ubi8: BASE_DIST := ubi8
build-centos%: DOCKERFILE_SUFFIX := centos
build-centos%: ARTIFACTS_DIR = $(ARTIFACTS_ROOT)/$(*)/x86_64
build-centos%: PACKAGE_VERSION := $(LIB_VERSION)-$(if $(LIB_TAG),0.1.$(LIB_TAG),1)
build-centos7: BASE_DIST := centos7
build-centos8: BASE_DIST := centos8
# Test targets
test-%: DIST = $(*)
TEST_CASES ?= toolkit docker crio containerd
$(TEST_TARGETS): test-%:
TEST_CASES="$(TEST_CASES)" bash -x $(CURDIR)/test/container/main.sh run \
$(CURDIR)/shared-$(*) \
$(IMAGE) \
--no-cleanup-on-error
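As a usage illustration (hypothetical invocation, assuming the repository root as the working directory): running "make -f build/container/Makefile build-ubuntu18.04" builds nvidia/container-toolkit:1.6.0-rc.2-ubuntu18.04 from the packages staged under dist/, and "make -f build/container/Makefile test-ubuntu18.04" then drives the toolkit, docker, crio, and containerd test cases from test/container/main.sh against that image.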


@@ -0,0 +1,4 @@
# NVIDIA Container Toolkit Container
This folder contains the Makefile and Dockerfiles used to build the NVIDIA Container Toolkit container image.


@@ -0,0 +1,79 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"fmt"
"io"
"os"
"github.com/sirupsen/logrus"
"github.com/tsaikd/KDGoLib/logrusutil"
)
// Logger wraps a logrus.Logger, adding management of an optional log-file output
type Logger struct {
*logrus.Logger
previousOutput io.Writer
logFile *os.File
}
// NewLogger constructs a Logger with a predefined formatter
func NewLogger() *Logger {
logrusLogger := logrus.New()
formatter := &logrusutil.ConsoleLogFormatter{
TimestampFormat: "2006/01/02 15:04:05", // Go reference layout; "05" is the seconds token
Flag: logrusutil.Ltime,
}
logger := &Logger{
Logger: logrusLogger,
}
logger.SetFormatter(formatter)
return logger
}
// LogToFile opens the specified file for appending and sets the logger to
// output to the opened file. A reference to the file pointer is stored to
// allow this to be closed.
func (l *Logger) LogToFile(filename string) error {
logFile, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
if err != nil {
return fmt.Errorf("error opening debug log file: %v", err)
}
l.logFile = logFile
l.previousOutput = l.Out
l.SetOutput(logFile)
return nil
}
// CloseFile closes the log file (if any) and resets the logger output to what it
// was before LogToFile was called.
func (l *Logger) CloseFile() error {
if l.logFile == nil {
return nil
}
logFile := l.logFile
l.SetOutput(l.previousOutput)
l.logFile = nil
return logFile.Close()
}
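
As an illustrative sketch (the helper below is hypothetical and not part of the change), the intended Logger lifecycle is: construct, redirect output to a file, log, and restore the previous writer.

// exampleLoggerUsage demonstrates the Logger lifecycle sketched above.
func exampleLoggerUsage() error {
	l := NewLogger()
	// Redirect all subsequent log output to a file.
	if err := l.LogToFile("/tmp/nvidia-container-runtime.log"); err != nil {
		return err
	}
	// CloseFile restores the previous output writer and closes the file.
	defer l.CloseFile()
	l.Infof("runtime invoked with args: %v", os.Args)
	return nil
}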


@@ -0,0 +1,89 @@
package main
import (
"fmt"
"os"
"path"
"github.com/pelletier/go-toml"
)
const (
configOverride = "XDG_CONFIG_HOME"
configFilePath = "nvidia-container-runtime/config.toml"
hookDefaultFilePath = "/usr/bin/nvidia-container-runtime-hook"
)
var (
configDir = "/etc/"
)
var logger = NewLogger()
func main() {
err := run(os.Args)
if err != nil {
logger.Errorf("Error running %v: %v", os.Args, err)
os.Exit(1)
}
}
// run is an entry point that allows for idiomatic handling of errors
// when calling from the main function.
func run(argv []string) (err error) {
cfg, err := getConfig()
if err != nil {
return fmt.Errorf("error loading config: %v", err)
}
err = logger.LogToFile(cfg.debugFilePath)
if err != nil {
return fmt.Errorf("error opening debug log file: %v", err)
}
defer func() {
// We capture and log a returning error before closing the log file.
if err != nil {
logger.Errorf("Error running %v: %v", argv, err)
}
logger.CloseFile()
}()
r, err := newRuntime(argv)
if err != nil {
return fmt.Errorf("error creating runtime: %v", err)
}
logger.Printf("Running %s\n", argv[0])
return r.Exec(argv)
}
type config struct {
debugFilePath string
}
// getConfig sets up the config struct. Values are read from a toml file
// or set via the environment.
func getConfig() (*config, error) {
cfg := &config{}
if XDGConfigDir := os.Getenv(configOverride); len(XDGConfigDir) != 0 {
configDir = XDGConfigDir
}
configFilePath := path.Join(configDir, configFilePath)
tomlContent, err := os.ReadFile(configFilePath)
if err != nil {
return nil, err
}
toml, err := toml.Load(string(tomlContent))
if err != nil {
return nil, err
}
cfg.debugFilePath = toml.GetDefault("nvidia-container-runtime.debug", "/dev/null").(string)
return cfg, nil
}
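
For illustration, a minimal sketch of how the debug path is resolved (the helper and inline document are hypothetical; the TOML mirrors the shape of the nvidia-container-runtime config.toml):

// exampleDebugPath resolves the same key that getConfig reads, with the
// same /dev/null default, from an inline TOML document.
func exampleDebugPath() (string, error) {
	tree, err := toml.Load(`
[nvidia-container-runtime]
debug = "/var/log/nvidia-container-runtime.log"
`)
	if err != nil {
		return "", err
	}
	return tree.GetDefault("nvidia-container-runtime.debug", "/dev/null").(string), nil
}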


@@ -0,0 +1,293 @@
package main
import (
"bytes"
"encoding/json"
"fmt"
"io/ioutil"
"os"
"os/exec"
"path/filepath"
"runtime"
"strings"
"testing"
"github.com/opencontainers/runtime-spec/specs-go"
"github.com/stretchr/testify/require"
)
const (
nvidiaRuntime = "nvidia-container-runtime"
nvidiaHook = "nvidia-container-runtime-hook"
bundlePathSuffix = "test/output/bundle/"
specFile = "config.json"
unmodifiedSpecFileSuffix = "test/input/test_spec.json"
)
type testConfig struct {
root string
binPath string
}
var cfg *testConfig
func TestMain(m *testing.M) {
// TEST SETUP
// Determine the module root and the test binary path
var err error
moduleRoot, err := getModuleRoot()
if err != nil {
logger.Fatalf("error in test setup: could not get module root: %v", err)
}
testBinPath := filepath.Join(moduleRoot, "test", "bin")
testInputPath := filepath.Join(moduleRoot, "test", "input")
// Set the environment variables for the test
os.Setenv("PATH", prependToPath(testBinPath, moduleRoot))
os.Setenv("XDG_CONFIG_HOME", testInputPath)
// Confirm that the environment is configured correctly
runcPath, err := exec.LookPath(runcExecutableName)
if err != nil || filepath.Join(testBinPath, runcExecutableName) != runcPath {
logger.Fatalf("error in test setup: mock runc path set incorrectly in TestMain(): %v", err)
}
hookPath, err := exec.LookPath(nvidiaHook)
if err != nil || filepath.Join(testBinPath, nvidiaHook) != hookPath {
logger.Fatalf("error in test setup: mock hook path set incorrectly in TestMain(): %v", err)
}
// Store the root and binary paths in the test Config
cfg = &testConfig{
root: moduleRoot,
binPath: testBinPath,
}
// RUN TESTS
exitCode := m.Run()
// TEST CLEANUP
os.Remove(specFile)
os.Exit(exitCode)
}
func getModuleRoot() (string, error) {
_, filename, _, _ := runtime.Caller(0)
return hasGoMod(filename)
}
func hasGoMod(dir string) (string, error) {
if dir == "" || dir == "/" {
return "", fmt.Errorf("module root not found")
}
_, err := os.Stat(filepath.Join(dir, "go.mod"))
if err != nil {
return hasGoMod(filepath.Dir(dir))
}
return dir, nil
}
func prependToPath(additionalPaths ...string) string {
paths := strings.Split(os.Getenv("PATH"), ":")
paths = append(additionalPaths, paths...)
return strings.Join(paths, ":")
}
// case 1) nvidia-container-runtime run --bundle
// case 2) nvidia-container-runtime create --bundle
// - Confirm the runtime handles bad input correctly
func TestBadInput(t *testing.T) {
err := cfg.generateNewRuntimeSpec()
if err != nil {
t.Fatal(err)
}
cmdRun := exec.Command(nvidiaRuntime, "run", "--bundle")
t.Logf("executing: %s\n", strings.Join(cmdRun.Args, " "))
output, err := cmdRun.CombinedOutput()
require.Errorf(t, err, "runtime should return an error", "output=%v", string(output))
cmdCreate := exec.Command(nvidiaRuntime, "create", "--bundle")
t.Logf("executing: %s\n", strings.Join(cmdCreate.Args, " "))
err = cmdCreate.Run()
require.Error(t, err, "runtime should return an error")
}
// case 1) nvidia-container-runtime run --bundle <bundle-name> <ctr-name>
// - Confirm the runtime runs with no errors
// case 2) nvidia-container-runtime create --bundle <bundle-name> <ctr-name>
// - Confirm the runtime inserts the NVIDIA prestart hook correctly
func TestGoodInput(t *testing.T) {
err := cfg.generateNewRuntimeSpec()
if err != nil {
t.Fatalf("error generating runtime spec: %v", err)
}
cmdRun := exec.Command(nvidiaRuntime, "run", "--bundle", cfg.bundlePath(), "testcontainer")
t.Logf("executing: %s\n", strings.Join(cmdRun.Args, " "))
output, err := cmdRun.CombinedOutput()
require.NoErrorf(t, err, "runtime should not return an error", "output=%v", string(output))
// Check config.json and confirm there are no hooks
spec, err := cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.Empty(t, spec.Hooks, "there should be no hooks in config.json")
cmdCreate := exec.Command(nvidiaRuntime, "create", "--bundle", cfg.bundlePath(), "testcontainer")
t.Logf("executing: %s\n", strings.Join(cmdCreate.Args, " "))
err = cmdCreate.Run()
require.NoError(t, err, "runtime should not return an error")
// Check config.json for NVIDIA prestart hook
spec, err = cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.NotEmpty(t, spec.Hooks, "there should be hooks in config.json")
require.Equal(t, 1, nvidiaHookCount(spec.Hooks), "exactly one nvidia prestart hook should be inserted correctly into config.json")
}
// NVIDIA prestart hook already present in config file
func TestDuplicateHook(t *testing.T) {
err := cfg.generateNewRuntimeSpec()
if err != nil {
t.Fatal(err)
}
var spec specs.Spec
spec, err = cfg.getRuntimeSpec()
if err != nil {
t.Fatal(err)
}
t.Logf("inserting nvidia prestart hook to config.json")
if err = addNVIDIAHook(&spec); err != nil {
t.Fatal(err)
}
jsonOutput, err := json.MarshalIndent(spec, "", "\t")
if err != nil {
t.Fatal(err)
}
jsonFile, err := os.OpenFile(cfg.specFilePath(), os.O_RDWR, 0644)
if err != nil {
t.Fatal(err)
}
_, err = jsonFile.WriteAt(jsonOutput, 0)
if err != nil {
t.Fatal(err)
}
// Test how runtime handles already existing prestart hook in config.json
cmdCreate := exec.Command(nvidiaRuntime, "create", "--bundle", cfg.bundlePath(), "testcontainer")
t.Logf("executing: %s\n", strings.Join(cmdCreate.Args, " "))
output, err := cmdCreate.CombinedOutput()
require.NoErrorf(t, err, "runtime should not return an error", "output=%v", string(output))
// Check config.json for NVIDIA prestart hook
spec, err = cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.NotEmpty(t, spec.Hooks, "there should be hooks in config.json")
require.Equal(t, 1, nvidiaHookCount(spec.Hooks), "exactly one nvidia prestart hook should be inserted correctly into config.json")
}
// addNVIDIAHook is a basic wrapper for nvidiaContainerRuntime.addNVIDIAHook that is used for
// testing.
func addNVIDIAHook(spec *specs.Spec) error {
r := nvidiaContainerRuntime{logger: logger.Logger}
return r.addNVIDIAHook(spec)
}
func (c testConfig) getRuntimeSpec() (specs.Spec, error) {
filePath := c.specFilePath()
var spec specs.Spec
jsonFile, err := os.OpenFile(filePath, os.O_RDWR, 0644)
if err != nil {
return spec, err
}
defer jsonFile.Close()
jsonContent, err := ioutil.ReadAll(jsonFile)
if err != nil {
return spec, err
} else if json.Valid(jsonContent) {
err = json.Unmarshal(jsonContent, &spec)
if err != nil {
return spec, err
}
} else {
err = json.NewDecoder(bytes.NewReader(jsonContent)).Decode(&spec)
if err != nil {
return spec, err
}
}
return spec, err
}
func (c testConfig) bundlePath() string {
return filepath.Join(c.root, bundlePathSuffix)
}
func (c testConfig) specFilePath() string {
return filepath.Join(c.bundlePath(), specFile)
}
func (c testConfig) unmodifiedSpecFile() string {
return filepath.Join(c.root, unmodifiedSpecFileSuffix)
}
func (c testConfig) generateNewRuntimeSpec() error {
var err error
err = os.MkdirAll(c.bundlePath(), 0755)
if err != nil {
return err
}
cmd := exec.Command("cp", c.unmodifiedSpecFile(), c.specFilePath())
err = cmd.Run()
if err != nil {
return err
}
return nil
}
// Return number of valid NVIDIA prestart hooks in runtime spec
func nvidiaHookCount(hooks *specs.Hooks) int {
if hooks == nil {
return 0
}
count := 0
for _, hook := range hooks.Prestart {
if strings.Contains(hook.Path, nvidiaHook) {
count++
}
}
return count
}
func TestGetConfigWithCustomConfig(t *testing.T) {
wd, err := os.Getwd()
require.NoError(t, err)
// Override the default (disabled) debug logging with an explicit log file
contents := []byte("[nvidia-container-runtime]\ndebug = \"/nvidia-container-toolkit.log\"")
testDir := filepath.Join(wd, "test")
filename := filepath.Join(testDir, configFilePath)
os.Setenv(configOverride, testDir)
require.NoError(t, os.MkdirAll(filepath.Dir(filename), 0766))
require.NoError(t, ioutil.WriteFile(filename, contents, 0766))
defer func() { require.NoError(t, os.RemoveAll(testDir)) }()
cfg, err := getConfig()
require.NoError(t, err)
require.Equal(t, cfg.debugFilePath, "/nvidia-container-toolkit.log")
}


@@ -0,0 +1,145 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"fmt"
"os"
"os/exec"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
"github.com/opencontainers/runtime-spec/specs-go"
log "github.com/sirupsen/logrus"
)
// nvidiaContainerRuntime encapsulates the NVIDIA Container Runtime. It wraps the specified runtime, conditionally
// modifying the specified OCI specification before invoking the runtime.
type nvidiaContainerRuntime struct {
logger *log.Logger
runtime oci.Runtime
ociSpec oci.Spec
}
var _ oci.Runtime = (*nvidiaContainerRuntime)(nil)
// newNvidiaContainerRuntimeWithLogger is a constructor for a standard runtime shim.
func newNvidiaContainerRuntimeWithLogger(logger *log.Logger, runtime oci.Runtime, ociSpec oci.Spec) (oci.Runtime, error) {
r := nvidiaContainerRuntime{
logger: logger,
runtime: runtime,
ociSpec: ociSpec,
}
return &r, nil
}
// Exec defines the entrypoint for the NVIDIA Container Runtime. A check is performed to see whether modifications
// to the OCI spec are required -- and applicable modifications are applied. The supplied arguments are then
// forwarded to the underlying runtime's Exec method.
func (r nvidiaContainerRuntime) Exec(args []string) error {
if r.modificationRequired(args) {
err := r.modifyOCISpec()
if err != nil {
return fmt.Errorf("error modifying OCI spec: %v", err)
}
}
r.logger.Println("Forwarding command to runtime")
return r.runtime.Exec(args)
}
// modificationRequired checks the input arguments to determine whether a modification
// to the OCI spec is required.
func (r nvidiaContainerRuntime) modificationRequired(args []string) bool {
var previousWasBundle bool
for _, a := range args {
// We check for '--bundle create' explicitly to ensure that we
// don't inadvertently trigger a modification if the bundle directory
// is specified as `create`
if !previousWasBundle && isBundleFlag(a) {
previousWasBundle = true
continue
}
if !previousWasBundle && a == "create" {
r.logger.Infof("'create' command detected; modification required")
return true
}
previousWasBundle = false
}
r.logger.Infof("No modification required")
return false
}
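// Illustrative behaviour of modificationRequired (hypothetical inputs,
// consistent with the cases exercised in the accompanying tests):
//
//	{"create"}              -> true  (bare "create" command detected)
//	{"--bundle", "create"}  -> false ("create" is the bundle directory)
//	{"--bundle=create"}     -> false (no bare "create" argument remains)
//	{"run", "--bundle=/b"}  -> false (only "create" triggers modification)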
// modifyOCISpec loads and modifies the OCI spec specified in the nvidiaContainerRuntime
// struct. The spec is modified in-place and written to the same file as the input after
// modifications are applied.
func (r nvidiaContainerRuntime) modifyOCISpec() error {
err := r.ociSpec.Load()
if err != nil {
return fmt.Errorf("error loading OCI specification for modification: %v", err)
}
err = r.ociSpec.Modify(r.addNVIDIAHook)
if err != nil {
return fmt.Errorf("error injecting NVIDIA Container Runtime hook: %v", err)
}
err = r.ociSpec.Flush()
if err != nil {
return fmt.Errorf("error writing modified OCI specification: %v", err)
}
return nil
}
// addNVIDIAHook modifies the specified OCI specification in-place, inserting a
// prestart hook.
func (r nvidiaContainerRuntime) addNVIDIAHook(spec *specs.Spec) error {
path, err := exec.LookPath("nvidia-container-runtime-hook")
if err != nil {
path = hookDefaultFilePath
_, err = os.Stat(path)
if err != nil {
return err
}
}
r.logger.Printf("prestart hook path: %s\n", path)
args := []string{path}
if spec.Hooks == nil {
spec.Hooks = &specs.Hooks{}
} else if len(spec.Hooks.Prestart) != 0 {
for _, hook := range spec.Hooks.Prestart {
if !strings.Contains(hook.Path, "nvidia-container-runtime-hook") {
continue
}
r.logger.Println("existing nvidia prestart hook in OCI spec file")
return nil
}
}
spec.Hooks.Prestart = append(spec.Hooks.Prestart, specs.Hook{
Path: path,
Args: append(args, "prestart"),
})
return nil
}
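
For reference (illustrative only; assuming the hook resolves to the default location checked above), the entry appended by addNVIDIAHook is equivalent to the following literal:

specs.Hook{
	Path: "/usr/bin/nvidia-container-runtime-hook",
	Args: []string{"/usr/bin/nvidia-container-runtime-hook", "prestart"},
}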


@@ -0,0 +1,226 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"fmt"
"strings"
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
"github.com/opencontainers/runtime-spec/specs-go"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
)
func TestArgsGetConfigFilePath(t *testing.T) {
testCases := []struct {
bundleDir string
ociSpecPath string
}{
{
ociSpecPath: "config.json",
},
{
bundleDir: "/foo/bar",
ociSpecPath: "/foo/bar/config.json",
},
{
bundleDir: "/foo/bar/",
ociSpecPath: "/foo/bar/config.json",
},
}
for i, tc := range testCases {
cp, err := getOCISpecFilePath(tc.bundleDir)
require.NoErrorf(t, err, "%d: %v", i, tc)
require.Equalf(t, tc.ociSpecPath, cp, "%d: %v", i, tc)
}
}
func TestAddNvidiaHook(t *testing.T) {
logger, logHook := testlog.NewNullLogger()
shim := nvidiaContainerRuntime{
logger: logger,
}
testCases := []struct {
spec *specs.Spec
errorPrefix string
shouldNotAdd bool
}{
{
spec: &specs.Spec{},
},
{
spec: &specs.Spec{
Hooks: &specs.Hooks{},
},
},
{
spec: &specs.Spec{
Hooks: &specs.Hooks{
Prestart: []specs.Hook{{
Path: "some-hook",
}},
},
},
},
{
spec: &specs.Spec{
Hooks: &specs.Hooks{
Prestart: []specs.Hook{{
Path: "nvidia-container-runtime-hook",
}},
},
},
shouldNotAdd: true,
},
}
for i, tc := range testCases {
logHook.Reset()
var numPrestartHooks int
if tc.spec.Hooks != nil {
numPrestartHooks = len(tc.spec.Hooks.Prestart)
}
err := shim.addNVIDIAHook(tc.spec)
if tc.errorPrefix == "" {
require.NoErrorf(t, err, "%d: %v", i, tc)
} else {
require.Truef(t, strings.HasPrefix(err.Error(), tc.errorPrefix), "%d: %v", i, tc)
}
require.NotNilf(t, tc.spec.Hooks, "%d: %v", i, tc)
require.Equalf(t, 1, nvidiaHookCount(tc.spec.Hooks), "%d: %v", i, tc)
if tc.shouldNotAdd {
// The prestart hook count must be unchanged if a hook was already present.
require.Equal(t, numPrestartHooks, len(tc.spec.Hooks.Prestart), "%d: %v", i, tc)
} else {
require.Equal(t, numPrestartHooks+1, len(tc.spec.Hooks.Prestart), "%d: %v", i, tc)
nvidiaHook := tc.spec.Hooks.Prestart[len(tc.spec.Hooks.Prestart)-1]
// TODO: This assumes that the hook has been set up in the makefile
expectedPath := "/usr/bin/nvidia-container-runtime-hook"
require.Equalf(t, expectedPath, nvidiaHook.Path, "%d: %v", i, tc)
require.Equalf(t, []string{expectedPath, "prestart"}, nvidiaHook.Args, "%d: %v", i, tc)
require.Emptyf(t, nvidiaHook.Env, "%d: %v", i, tc)
require.Nilf(t, nvidiaHook.Timeout, "%d: %v", i, tc)
}
}
}
func TestNvidiaContainerRuntime(t *testing.T) {
logger, hook := testlog.NewNullLogger()
testCases := []struct {
shim nvidiaContainerRuntime
shouldModify bool
args []string
modifyError error
writeError error
}{
{
shim: nvidiaContainerRuntime{},
shouldModify: false,
},
{
shim: nvidiaContainerRuntime{},
args: []string{"create"},
shouldModify: true,
},
{
shim: nvidiaContainerRuntime{},
args: []string{"--bundle=create"},
shouldModify: false,
},
{
shim: nvidiaContainerRuntime{},
args: []string{"--bundle", "create"},
shouldModify: false,
},
{
shim: nvidiaContainerRuntime{},
args: []string{"create"},
shouldModify: true,
},
{
shim: nvidiaContainerRuntime{},
args: []string{"create"},
modifyError: fmt.Errorf("error modifying"),
shouldModify: true,
},
{
shim: nvidiaContainerRuntime{},
args: []string{"create"},
writeError: fmt.Errorf("error writing"),
shouldModify: true,
},
}
for i, tc := range testCases {
tc.shim.logger = logger
hook.Reset()
spec := &specs.Spec{}
ociMock := oci.NewMockSpec(spec, tc.writeError, tc.modifyError)
require.Equal(t, tc.shouldModify, tc.shim.modificationRequired(tc.args), "%d: %v", i, tc)
tc.shim.ociSpec = ociMock
tc.shim.runtime = &MockShim{}
err := tc.shim.Exec(tc.args)
if tc.modifyError != nil || tc.writeError != nil {
require.Error(t, err, "%d: %v", i, tc)
} else {
require.NoError(t, err, "%d: %v", i, tc)
}
if tc.shouldModify {
require.Equal(t, 1, ociMock.MockModify.Callcount, "%d: %v", i, tc)
require.Equal(t, 1, nvidiaHookCount(spec.Hooks), "%d: %v", i, tc)
} else {
require.Equal(t, 0, ociMock.MockModify.Callcount, "%d: %v", i, tc)
require.Nil(t, spec.Hooks, "%d: %v", i, tc)
}
writeExpected := tc.shouldModify && tc.modifyError == nil
if writeExpected {
require.Equal(t, 1, ociMock.MockFlush.Callcount, "%d: %v", i, tc)
} else {
require.Equal(t, 0, ociMock.MockFlush.Callcount, "%d: %v", i, tc)
}
}
}
type MockShim struct {
called bool
args []string
returnError error
}
func (m *MockShim) Exec(args []string) error {
m.called = true
m.args = args
return m.returnError
}


@@ -0,0 +1,166 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"fmt"
"os/exec"
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
)
const (
ociSpecFileName = "config.json"
dockerRuncExecutableName = "docker-runc"
runcExecutableName = "runc"
)
// newRuntime is a factory method that constructs a runtime based on the selected configuration.
func newRuntime(argv []string) (oci.Runtime, error) {
ociSpec, err := newOCISpec(argv)
if err != nil {
return nil, fmt.Errorf("error constructing OCI specification: %v", err)
}
runc, err := newRuncRuntime()
if err != nil {
return nil, fmt.Errorf("error constructing runc runtime: %v", err)
}
r, err := newNvidiaContainerRuntimeWithLogger(logger.Logger, runc, ociSpec)
if err != nil {
return nil, fmt.Errorf("error constructing NVIDIA Container Runtime: %v", err)
}
return r, nil
}
// newOCISpec constructs an OCI spec for the provided arguments
func newOCISpec(argv []string) (oci.Spec, error) {
bundlePath, err := getBundlePath(argv)
if err != nil {
return nil, fmt.Errorf("error parsing command line arguments: %v", err)
}
ociSpecPath, err := getOCISpecFilePath(bundlePath)
if err != nil {
return nil, fmt.Errorf("error getting OCI specification file path: %v", err)
}
ociSpec := oci.NewSpecFromFile(ociSpecPath)
return ociSpec, nil
}
// newRuncRuntime locates the runc binary and wraps it in a SyscallExecRuntime
func newRuncRuntime() (oci.Runtime, error) {
runtimePath, err := findRunc()
if err != nil {
return nil, fmt.Errorf("error locating runtime: %v", err)
}
runc, err := oci.NewSyscallExecRuntimeWithLogger(logger.Logger, runtimePath)
if err != nil {
return nil, fmt.Errorf("error constructing runtime: %v", err)
}
return runc, nil
}
// getBundlePath checks the specified slice of strings (argv) for a 'bundle' flag as allowed by runc.
// The following are supported:
// --bundle{{SEP}}BUNDLE_PATH
// -bundle{{SEP}}BUNDLE_PATH
// -b{{SEP}}BUNDLE_PATH
// where {{SEP}} is either ' ' or '='
func getBundlePath(argv []string) (string, error) {
var bundlePath string
for i := 0; i < len(argv); i++ {
param := argv[i]
parts := strings.SplitN(param, "=", 2)
if !isBundleFlag(parts[0]) {
continue
}
// The flag has the format --bundle=/path
if len(parts) == 2 {
bundlePath = parts[1]
continue
}
// The flag has the format --bundle /path
if i+1 < len(argv) {
bundlePath = argv[i+1]
i++
continue
}
// --bundle / -b was the last element of argv
return "", fmt.Errorf("bundle option requires an argument")
}
return bundlePath, nil
}
// findRunc locates runc in the path, returning the full path to the
// binary or an error.
func findRunc() (string, error) {
runtimeCandidates := []string{
dockerRuncExecutableName,
runcExecutableName,
}
return findRuntime(runtimeCandidates)
}
func findRuntime(runtimeCandidates []string) (string, error) {
for _, candidate := range runtimeCandidates {
logger.Infof("Looking for runtime binary '%v'", candidate)
runcPath, err := exec.LookPath(candidate)
if err == nil {
logger.Infof("Found runtime binary '%v'", runcPath)
return runcPath, nil
}
logger.Warnf("Runtime binary '%v' not found: %v", candidate, err)
}
return "", fmt.Errorf("no runtime binary found from candidate list: %v", runtimeCandidates)
}
func isBundleFlag(arg string) bool {
if !strings.HasPrefix(arg, "-") {
return false
}
trimmed := strings.TrimLeft(arg, "-")
return trimmed == "b" || trimmed == "bundle"
}
// getOCISpecFilePath returns the expected path to the OCI specification file for the given
// bundle directory. If the bundle directory is empty, only `config.json` is returned.
func getOCISpecFilePath(bundleDir string) (string, error) {
logger.Infof("Using bundle directory: %v", bundleDir)
OCISpecFilePath := filepath.Join(bundleDir, ociSpecFileName)
logger.Infof("Using OCI specification file path: %v", OCISpecFilePath)
return OCISpecFilePath, nil
}


@@ -0,0 +1,192 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"path/filepath"
"testing"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
)
func TestConstructor(t *testing.T) {
shim, err := newRuntime([]string{})
require.NoError(t, err)
require.NotNil(t, shim)
}
func TestGetBundlePath(t *testing.T) {
type expected struct {
bundle string
isError bool
}
testCases := []struct {
argv []string
expected expected
}{
{
argv: []string{},
},
{
argv: []string{"create"},
},
{
argv: []string{"--bundle"},
expected: expected{
isError: true,
},
},
{
argv: []string{"-b"},
expected: expected{
isError: true,
},
},
{
argv: []string{"--bundle", "/foo/bar"},
expected: expected{
bundle: "/foo/bar",
},
},
{
argv: []string{"--not-bundle", "/foo/bar"},
},
{
argv: []string{"--"},
},
{
argv: []string{"-bundle", "/foo/bar"},
expected: expected{
bundle: "/foo/bar",
},
},
{
argv: []string{"--bundle=/foo/bar"},
expected: expected{
bundle: "/foo/bar",
},
},
{
argv: []string{"-b=/foo/bar"},
expected: expected{
bundle: "/foo/bar",
},
},
{
argv: []string{"-b=/foo/=bar"},
expected: expected{
bundle: "/foo/=bar",
},
},
{
argv: []string{"-b", "/foo/bar"},
expected: expected{
bundle: "/foo/bar",
},
},
{
argv: []string{"create", "-b", "/foo/bar"},
expected: expected{
bundle: "/foo/bar",
},
},
{
argv: []string{"-b", "create", "create"},
expected: expected{
bundle: "create",
},
},
{
argv: []string{"-b=create", "create"},
expected: expected{
bundle: "create",
},
},
{
argv: []string{"-b", "create"},
expected: expected{
bundle: "create",
},
},
}
for i, tc := range testCases {
bundle, err := getBundlePath(tc.argv)
if tc.expected.isError {
require.Errorf(t, err, "%d: %v", i, tc)
} else {
require.NoErrorf(t, err, "%d: %v", i, tc)
}
require.Equalf(t, tc.expected.bundle, bundle, "%d: %v", i, tc)
}
}
func TestFindRunc(t *testing.T) {
testLogger, _ := testlog.NewNullLogger()
logger.Logger = testLogger
runcPath, err := findRunc()
require.NoError(t, err)
require.Equal(t, filepath.Join(cfg.binPath, runcExecutableName), runcPath)
}
func TestFindRuntime(t *testing.T) {
testLogger, _ := testlog.NewNullLogger()
logger.Logger = testLogger
testCases := []struct {
candidates []string
expectedPath string
}{
{
candidates: []string{},
},
{
candidates: []string{"not-runc"},
},
{
candidates: []string{"not-runc", "also-not-runc"},
},
{
candidates: []string{runcExecutableName},
expectedPath: filepath.Join(cfg.binPath, runcExecutableName),
},
{
candidates: []string{runcExecutableName, "not-runc"},
expectedPath: filepath.Join(cfg.binPath, runcExecutableName),
},
{
candidates: []string{"not-runc", runcExecutableName},
expectedPath: filepath.Join(cfg.binPath, runcExecutableName),
},
}
for i, tc := range testCases {
runcPath, err := findRuntime(tc.candidates)
if tc.expectedPath == "" {
require.Error(t, err, "%d: %v", i, tc)
} else {
require.NoError(t, err, "%d: %v", i, tc)
}
require.Equal(t, tc.expectedPath, runcPath, "%d: %v", i, tc)
}
}


@@ -216,6 +216,7 @@ func getDevicesFromEnvvar(env map[string]string, legacyImage bool) *string {
for _, envVar := range envVars {
if devs, ok := env[envVar]; ok {
devices = &devs
break
}
}


@@ -2,8 +2,9 @@ package main
import (
"path/filepath"
"reflect"
"testing"
"github.com/stretchr/testify/require"
)
func TestGetNvidiaConfig(t *testing.T) {
@@ -414,7 +415,7 @@ func TestGetNvidiaConfig(t *testing.T) {
// For any tests that are expected to panic, make sure they do.
if tc.expectedPanic {
mustPanic(t, getConfig)
require.Panics(t, getConfig)
return
}
@@ -422,31 +423,20 @@ func TestGetNvidiaConfig(t *testing.T) {
getConfig()
// And start comparing the test results to the expected results.
if config == nil && tc.expectedConfig == nil {
if tc.expectedConfig == nil {
require.Nil(t, config, tc.description)
return
}
if config != nil && tc.expectedConfig != nil {
if !reflect.DeepEqual(config.Devices, tc.expectedConfig.Devices) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !reflect.DeepEqual(config.MigConfigDevices, tc.expectedConfig.MigConfigDevices) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !reflect.DeepEqual(config.MigMonitorDevices, tc.expectedConfig.MigMonitorDevices) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !reflect.DeepEqual(config.DriverCapabilities, tc.expectedConfig.DriverCapabilities) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !elementsMatch(config.Requirements, tc.expectedConfig.Requirements) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !reflect.DeepEqual(config.DisableRequire, tc.expectedConfig.DisableRequire) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
return
}
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
require.NotNil(t, config, tc.description)
require.Equal(t, tc.expectedConfig.Devices, config.Devices)
require.Equal(t, tc.expectedConfig.MigConfigDevices, config.MigConfigDevices)
require.Equal(t, tc.expectedConfig.MigMonitorDevices, config.MigMonitorDevices)
require.Equal(t, tc.expectedConfig.DriverCapabilities, config.DriverCapabilities)
require.ElementsMatch(t, tc.expectedConfig.Requirements, config.Requirements)
require.Equal(t, tc.expectedConfig.DisableRequire, config.DisableRequire)
})
}
}
@@ -524,9 +514,7 @@ func TestGetDevicesFromMounts(t *testing.T) {
for _, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
devices := getDevicesFromMounts(tc.mounts)
if !reflect.DeepEqual(devices, tc.expectedDevices) {
t.Errorf("Unexpected devices (got: %v, wanted: %v)", *devices, *tc.expectedDevices)
}
require.Equal(t, tc.expectedDevices, devices)
})
}
}
@@ -639,36 +627,198 @@ func TestDeviceListSourcePriority(t *testing.T) {
// For all other tests, just grab the devices and check the results
getDevices()
if !reflect.DeepEqual(devices, tc.expectedDevices) {
t.Errorf("Unexpected devices (got: %v, wanted: %v)", *devices, *tc.expectedDevices)
}
require.Equal(t, tc.expectedDevices, devices)
})
}
}
func elementsMatch(slice0, slice1 []string) bool {
map0 := make(map[string]int)
map1 := make(map[string]int)
func TestGetDevicesFromEnvvar(t *testing.T) {
all := "all"
empty := ""
envDockerResourceGPUs := "DOCKER_RESOURCE_GPUS"
gpuID := "GPU-12345"
anotherGPUID := "GPU-67890"
for _, e := range slice0 {
map0[e]++
var tests = []struct {
description string
envSwarmGPU *string
env map[string]string
legacyImage bool
expectedDevices *string
}{
{
description: "empty env returns nil for non-legacy image",
},
{
description: "blank NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "",
},
},
{
description: "'void' NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "void",
},
},
{
description: "'none' NVIDIA_VISIBLE_DEVICES returns empty for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "none",
},
expectedDevices: &empty,
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for non-legacy image",
env: map[string]string{
envNVVisibleDevices: gpuID,
},
expectedDevices: &gpuID,
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for legacy image",
env: map[string]string{
envNVVisibleDevices: gpuID,
},
legacyImage: true,
expectedDevices: &gpuID,
},
{
description: "empty env returns all for legacy image",
legacyImage: true,
expectedDevices: &all,
},
// Add the `DOCKER_RESOURCE_GPUS` envvar and ensure that this is ignored when
// not enabled
{
description: "missing NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "blank NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "",
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "'void' NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "void",
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "'none' NVIDIA_VISIBLE_DEVICES returns empty for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "none",
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: &empty,
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for non-legacy image",
env: map[string]string{
envNVVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: &gpuID,
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for legacy image",
env: map[string]string{
envNVVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
},
legacyImage: true,
expectedDevices: &gpuID,
},
{
description: "empty env returns all for legacy image",
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
},
legacyImage: true,
expectedDevices: &all,
},
// Add the `DOCKER_RESOURCE_GPUS` envvar and ensure that this is selected when
// enabled
{
description: "empty env returns nil for non-legacy image",
envSwarmGPU: &envDockerResourceGPUs,
},
{
description: "blank DOCKER_RESOURCE_GPUS returns nil for non-legacy image",
envSwarmGPU: &envDockerResourceGPUs,
env: map[string]string{
envDockerResourceGPUs: "",
},
},
{
description: "'void' DOCKER_RESOURCE_GPUS returns nil for non-legacy image",
envSwarmGPU: &envDockerResourceGPUs,
env: map[string]string{
envDockerResourceGPUs: "void",
},
},
{
description: "'none' DOCKER_RESOURCE_GPUS returns empty for non-legacy image",
envSwarmGPU: &envDockerResourceGPUs,
env: map[string]string{
envDockerResourceGPUs: "none",
},
expectedDevices: &empty,
},
{
description: "DOCKER_RESOURCE_GPUS set returns value for non-legacy image",
envSwarmGPU: &envDockerResourceGPUs,
env: map[string]string{
envDockerResourceGPUs: gpuID,
},
expectedDevices: &gpuID,
},
{
description: "DOCKER_RESOURCE_GPUS set returns value for legacy image",
envSwarmGPU: &envDockerResourceGPUs,
env: map[string]string{
envDockerResourceGPUs: gpuID,
},
legacyImage: true,
expectedDevices: &gpuID,
},
{
description: "DOCKER_RESOURCE_GPUS is selected if present",
envSwarmGPU: &envDockerResourceGPUs,
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: &anotherGPUID,
},
{
description: "DOCKER_RESOURCE_GPUS overrides NVIDIA_VISIBLE_DEVICES if present",
envSwarmGPU: &envDockerResourceGPUs,
env: map[string]string{
envNVVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: &anotherGPUID,
},
}
for _, e := range slice1 {
map1[e]++
}
for i, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
envSwarmGPU = tc.envSwarmGPU
devices := getDevicesFromEnvvar(tc.env, tc.legacyImage)
if tc.expectedDevices == nil {
require.Nil(t, devices, "%d: %v", i, tc)
return
}
for k0, v0 := range map0 {
if map1[k0] != v0 {
return false
}
require.NotNil(t, devices, "%d: %v", i, tc)
require.Equal(t, *tc.expectedDevices, *devices, "%d: %v", i, tc)
})
}
for k1, v1 := range map1 {
if map0[k1] != v1 {
return false
}
}
return true
}


@@ -3,6 +3,8 @@ package main
import (
"encoding/json"
"testing"
"github.com/stretchr/testify/require"
)
func TestParseCudaVersionValid(t *testing.T) {
@@ -16,24 +18,15 @@ func TestParseCudaVersionValid(t *testing.T) {
{"9.0.116", [3]uint32{9, 0, 116}},
{"4294967295.4294967295.4294967295", [3]uint32{4294967295, 4294967295, 4294967295}},
}
for _, c := range tests {
for i, c := range tests {
vmaj, vmin, vpatch := parseCudaVersion(c.version)
if vmaj != c.expected[0] || vmin != c.expected[1] || vpatch != c.expected[2] {
t.Errorf("parseCudaVersion(%s): %d.%d.%d (expected: %v)", c.version, vmaj, vmin, vpatch, c.expected)
}
version := [3]uint32{vmaj, vmin, vpatch}
require.Equal(t, c.expected, version, "%d: %v", i, c)
}
}
func mustPanic(t *testing.T, f func()) {
defer func() {
if err := recover(); err == nil {
t.Error("Test didn't panic!")
}
}()
f()
}
func TestParseCudaVersionInvalid(t *testing.T) {
var tests = []string{
"foo",
@@ -53,10 +46,9 @@ func TestParseCudaVersionInvalid(t *testing.T) {
"-9.-1.-116",
}
for _, c := range tests {
mustPanic(t, func() {
t.Logf("parseCudaVersion(%s)", c)
require.Panics(t, func() {
parseCudaVersion(c)
})
}, "parseCudaVersion(%v)", c)
}
}
@@ -132,12 +124,11 @@ func TestIsPrivileged(t *testing.T) {
false,
},
}
for _, tc := range tests {
for i, tc := range tests {
var spec Spec
_ = json.Unmarshal([]byte(tc.spec), &spec)
privileged := isPrivileged(&spec)
if privileged != tc.expected {
t.Errorf("isPrivileged() returned unexpectred value (privileged: %v, tc.expected: %v)", privileged, tc.expected)
}
require.Equal(t, tc.expected, privileged, "%d: %v", i, tc)
}
}


@@ -40,8 +40,7 @@ RUN mkdir -p $DIST_DIR /dist
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
RUN make PREFIX=${DIST_DIR} cmds
COPY config/config.toml.amzn $DIST_DIR/config.toml
@@ -60,6 +59,7 @@ CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
-D "_topdir $PWD" \
-D "version $VERSION" \
-D "libnvidia_container_version ${VERSION}-${RELEASE}" \
-D "release $RELEASE" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist


@@ -40,8 +40,7 @@ RUN mkdir -p $DIST_DIR /dist
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
RUN make PREFIX=${DIST_DIR} cmds
COPY config/config.toml.centos $DIST_DIR/config.toml
@@ -58,6 +57,7 @@ CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
-D "_topdir $PWD" \
-D "version $VERSION" \
-D "libnvidia_container_version ${VERSION}-${RELEASE}" \
-D "release $RELEASE" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist


@@ -48,8 +48,7 @@ RUN mkdir -p $DIST_DIR /dist
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
RUN make PREFIX=${DIST_DIR} cmds
COPY config/config.toml.debian $DIST_DIR/config.toml
@@ -62,8 +61,11 @@ WORKDIR $DIST_DIR
COPY packaging/debian ./debian
RUN sed -i "s;@VERSION@;${REVISION};" debian/changelog && \
dch --changelog debian/changelog --append "Bump libnvidia-container dependency to ${REVISION}" && \
dch --changelog debian/changelog -r "" && \
if [ "$REVISION" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
CMD export DISTRIB="$(lsb_release -cs)" && \
debuild -eDISTRIB -eSECTION --dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
debuild -eDISTRIB -eSECTION -eLIBNVIDIA_CONTAINER_VERSION="${REVISION}" \
--dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
mv /tmp/nvidia-container-toolkit_*.deb /dist

docker/Dockerfile.devel Normal file

@@ -0,0 +1,20 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG GOLANG_VERSION=x.x.x
FROM golang:${GOLANG_VERSION}
RUN go get -u golang.org/x/lint/golint
RUN go get -u github.com/matryer/moq
RUN go get -u github.com/gordonklaus/ineffassign
RUN go get -u github.com/client9/misspell/cmd/misspell


@@ -39,8 +39,7 @@ RUN mkdir -p $DIST_DIR /dist
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
RUN make PREFIX=${DIST_DIR} cmds
# Hook for Project Atomic's fork of Docker: https://github.com/projectatomic/docker/tree/docker-1.13.1-rhel#add-dockerhooks-exec-custom-hooks-for-prestartpoststop-containerspatch
COPY oci-nvidia-hook $DIST_DIR/oci-nvidia-hook
@@ -57,6 +56,7 @@ CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
-D "_topdir $PWD" \
-D "version $VERSION" \
-D "libnvidia_container_version ${VERSION}-${RELEASE}" \
-D "release $RELEASE" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist


@@ -46,8 +46,7 @@ RUN mkdir -p $DIST_DIR /dist
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
RUN make PREFIX=${DIST_DIR} cmds
COPY config/config.toml.ubuntu $DIST_DIR/config.toml
@@ -55,8 +54,11 @@ WORKDIR $DIST_DIR
COPY packaging/debian ./debian
RUN sed -i "s;@VERSION@;${REVISION};" debian/changelog && \
dch --changelog debian/changelog --append "Bump libnvidia-container dependency to ${REVISION}" && \
dch --changelog debian/changelog -r "" && \
if [ "$REVISION" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
CMD export DISTRIB="$(lsb_release -cs)" && \
debuild -eDISTRIB -eSECTION --dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
debuild -eDISTRIB -eSECTION -eLIBNVIDIA_CONTAINER_VERSION="${REVISION}" \
--dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
mv /tmp/*.deb /dist


@@ -1,11 +1,23 @@
# Copyright (c) 2017-2020, NVIDIA CORPORATION. All rights reserved.
# Copyright (c) 2017-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Supported OSs by architecture
AMD64_TARGETS := ubuntu20.04 ubuntu18.04 ubuntu16.04 debian10 debian9
X86_64_TARGETS := centos7 centos8 rhel7 rhel8 amazonlinux1 amazonlinux2 opensuse-leap15.1
PPC64LE_TARGETS := ubuntu18.04 ubuntu16.04 centos7 centos8 rhel7 rhel8
ARM64_TARGETS := ubuntu20.04 ubuntu18.04
AARCH64_TARGETS := centos8 rhel8
AARCH64_TARGETS := centos8 rhel8 amazonlinux2
# Define top-level build targets
docker%: SHELL:=/bin/bash
@@ -85,11 +97,11 @@ docker-all: $(AMD64_TARGETS) $(X86_64_TARGETS) \
# private centos target
--centos%: OS := centos
--centos%: PKG_REV := $(if $(LIB_TAG),0.1.$(LIB_TAG),2)
--centos%: PKG_REV := $(if $(LIB_TAG),0.1.$(LIB_TAG),1)
# private amazonlinux target
--amazonlinux%: OS := amazonlinux
--amazonlinux%: PKG_REV = $(if $(LIB_TAG),0.1.$(LIB_TAG).amzn$(VERSION),2.amzn$(VERSION))
--amazonlinux%: PKG_REV := $(if $(LIB_TAG),0.1.$(LIB_TAG),1)
# private opensuse-leap target
--opensuse-leap%: OS = opensuse-leap
@@ -98,7 +110,7 @@ docker-all: $(AMD64_TARGETS) $(X86_64_TARGETS) \
# private rhel target (actually built on centos)
--rhel%: OS := centos
--rhel%: PKG_REV := $(if $(LIB_TAG),0.1.$(LIB_TAG),2)
--rhel%: PKG_REV := $(if $(LIB_TAG),0.1.$(LIB_TAG),1)
--rhel%: VERSION = $(patsubst rhel%-$(ARCH),%,$(TARGET_PLATFORM))
--rhel%: ARTIFACTS_DIR = $(DIST_DIR)/rhel$(VERSION)/$(ARCH)
@@ -108,7 +120,7 @@ docker-build-%:
DOCKER_BUILDKIT=1 \
$(DOCKER) build \
--progress=plain \
--build-arg BASEIMAGE=$(BASEIMAGE) \
--build-arg BASEIMAGE="$(BASEIMAGE)" \
--build-arg GOLANG_VERSION="$(GOLANG_VERSION)" \
--build-arg PKG_VERS="$(LIB_VERSION)" \
--build-arg PKG_REV="$(PKG_REV)" \

go.mod

@@ -4,6 +4,14 @@ go 1.14
require (
github.com/BurntSushi/toml v0.3.1
github.com/stretchr/testify v1.6.0
github.com/containerd/containerd v1.5.7
github.com/containers/podman/v2 v2.2.1
github.com/opencontainers/runtime-spec v1.0.3-0.20211101234015-a3c33d663ebc
github.com/pelletier/go-toml v1.9.3
github.com/sirupsen/logrus v1.8.1
github.com/stretchr/testify v1.7.0
github.com/tsaikd/KDGoLib v0.0.0-20191001134900-7f3cf518e07d
github.com/urfave/cli/v2 v2.3.0
golang.org/x/mod v0.3.0
golang.org/x/sys v0.0.0-20210426230700-d19ff857e887
)

go.sum

File diff suppressed because it is too large

internal/oci/runtime.go Normal file

@@ -0,0 +1,23 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package oci
// Runtime is an interface for a runtime shim. The Exec method accepts a list
// of command line arguments, and returns an error / nil.
type Runtime interface {
Exec([]string) error
}


@@ -0,0 +1,79 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package oci
import (
"fmt"
"os"
"syscall"
log "github.com/sirupsen/logrus"
)
// SyscallExecRuntime wraps the path to a binary and defines the semantics for how to exec into it.
// This can be used to wrap an OCI-compliant low-level runtime binary, allowing it to be used through the
// Runtime interface.
type SyscallExecRuntime struct {
logger *log.Logger
path string
// exec is used for testing. This defaults to syscall.Exec
exec func(argv0 string, argv []string, envv []string) error
}
var _ Runtime = (*SyscallExecRuntime)(nil)
// NewSyscallExecRuntime creates a SyscallExecRuntime for the specified path with the standard logger
func NewSyscallExecRuntime(path string) (Runtime, error) {
return NewSyscallExecRuntimeWithLogger(log.StandardLogger(), path)
}
// NewSyscallExecRuntimeWithLogger creates a SyscallExecRuntime for the specified logger and path
func NewSyscallExecRuntimeWithLogger(logger *log.Logger, path string) (Runtime, error) {
info, err := os.Stat(path)
if err != nil {
return nil, fmt.Errorf("invalid path '%v': %v", path, err)
}
if info.IsDir() || info.Mode()&0111 == 0 {
return nil, fmt.Errorf("specified path '%v' is not an executable file", path)
}
shim := SyscallExecRuntime{
logger: logger,
path: path,
exec: syscall.Exec,
}
return &shim, nil
}
// Exec execs into the binary at the path from the SyscallExecRuntime struct, passing it the supplied arguments
// after ensuring that the first argument is the path of the target binary.
func (s SyscallExecRuntime) Exec(args []string) error {
runtimeArgs := []string{s.path}
if len(args) > 1 {
runtimeArgs = append(runtimeArgs, args[1:]...)
}
err := s.exec(s.path, runtimeArgs, os.Environ())
if err != nil {
return fmt.Errorf("could not exec '%v': %v", s.path, err)
}
// syscall.Exec is not expected to return. This is an error state regardless of whether
// err is nil or not.
return fmt.Errorf("unexpected return from exec '%v'", s.path)
}
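
A short usage sketch (hypothetical helper; /bin/sh stands in for a real low-level runtime such as runc). On success the exec replaces the current process, so the call never returns:

// exampleExec constructs a shim around /bin/sh and execs into it.
// The first argument is replaced with the shim's own path; the rest
// are forwarded unchanged.
func exampleExec() error {
	r, err := NewSyscallExecRuntime("/bin/sh")
	if err != nil {
		return err
	}
	return r.Exec([]string{"ignored-argv0", "-c", "echo hello"})
}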


@@ -0,0 +1,100 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package oci
import (
"fmt"
"strings"
"testing"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
)
func TestSyscallExecConstructor(t *testing.T) {
r, err := NewSyscallExecRuntime("////an/invalid/path")
require.Error(t, err)
require.Nil(t, r)
r, err = NewSyscallExecRuntime("/tmp")
require.Error(t, err)
require.Nil(t, r)
r, err = NewSyscallExecRuntime("/dev/null")
require.Error(t, err)
require.Nil(t, r)
r, err = NewSyscallExecRuntime("/bin/sh")
require.NoError(t, err)
f, ok := r.(*SyscallExecRuntime)
require.True(t, ok)
require.Equal(t, "/bin/sh", f.path)
}
func TestSyscallExecForwardsArgs(t *testing.T) {
logger, _ := testlog.NewNullLogger()
f := SyscallExecRuntime{
logger: logger,
path: "runtime",
}
testCases := []struct {
returnError error
args []string
errorPrefix string
}{
{
returnError: nil,
errorPrefix: "unexpected return from exec",
},
{
returnError: fmt.Errorf("error from exec"),
errorPrefix: "could not exec",
},
{
returnError: nil,
args: []string{"otherargv0"},
errorPrefix: "unexpected return from exec",
},
{
returnError: nil,
args: []string{"otherargv0", "arg1", "arg2", "arg3"},
errorPrefix: "unexpected return from exec",
},
}
for i, tc := range testCases {
execMock := WithMockExec(f, tc.returnError)
err := execMock.Exec(tc.args)
require.Errorf(t, err, "%d: %v", i, tc)
require.Truef(t, strings.HasPrefix(err.Error(), tc.errorPrefix), "%d: %v", i, tc)
if tc.returnError != nil {
require.Truef(t, strings.HasSuffix(err.Error(), tc.returnError.Error()), "%d: %v", i, tc)
}
require.Equalf(t, f.path, execMock.argv0, "%d: %v", i, tc)
require.Equalf(t, f.path, execMock.argv[0], "%d: %v", i, tc)
require.LessOrEqualf(t, len(tc.args), len(execMock.argv), "%d: %v", i, tc)
if len(tc.args) > 1 {
require.Equalf(t, tc.args[1:], execMock.argv[1:], "%d: %v", i, tc)
}
}
}


@@ -0,0 +1,49 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package oci
// MockExecRuntime wraps a SyscallExecRuntime, intercepting the exec call for testing
type MockExecRuntime struct {
SyscallExecRuntime
execMock
}
// WithMockExec wraps a specified SyscallExecRuntime with a mocked exec function for testing
func WithMockExec(e SyscallExecRuntime, execResult error) *MockExecRuntime {
m := MockExecRuntime{
SyscallExecRuntime: e,
execMock: execMock{result: execResult},
}
// override the exec function with the mocked exec function.
m.SyscallExecRuntime.exec = m.execMock.exec
return &m
}
type execMock struct {
argv0 string
argv []string
envv []string
result error
}
func (m *execMock) exec(argv0 string, argv []string, envv []string) error {
m.argv0 = argv0
m.argv = argv
m.envv = envv
return m.result
}

internal/oci/spec.go Normal file

@@ -0,0 +1,102 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package oci
import (
"encoding/json"
"fmt"
"os"
oci "github.com/opencontainers/runtime-spec/specs-go"
)
// SpecModifier is a function that accepts a pointer to an OCI Spec and returns an
// error. The intention is that the function would modify the spec in-place.
type SpecModifier func(*oci.Spec) error
// Spec defines the operations to be performed on an OCI specification
type Spec interface {
Load() error
Flush() error
Modify(SpecModifier) error
}
type fileSpec struct {
*oci.Spec
path string
}
var _ Spec = (*fileSpec)(nil)
// NewSpecFromFile creates an object that encapsulates a file-backed OCI spec.
// This can be used to read from the file, modify the spec, and write to the
// same file.
func NewSpecFromFile(filepath string) Spec {
oci := fileSpec{
path: filepath,
}
return &oci
}
// Load reads the contents of an OCI spec from file to be referenced internally.
// The file is opened "read-only"
func (s *fileSpec) Load() error {
specFile, err := os.Open(s.path)
if err != nil {
return fmt.Errorf("error opening OCI specification file: %v", err)
}
defer specFile.Close()
decoder := json.NewDecoder(specFile)
var spec oci.Spec
err = decoder.Decode(&spec)
if err != nil {
return fmt.Errorf("error reading OCI specification from file: %v", err)
}
s.Spec = &spec
return nil
}
// Modify applies the specified SpecModifier to the stored OCI specification.
func (s *fileSpec) Modify(f SpecModifier) error {
if s.Spec == nil {
return fmt.Errorf("no spec loaded for modification")
}
return f(s.Spec)
}
// Flush writes the stored OCI specification to the filepath specified by the path member.
// The file is truncated upon opening, overwriting any existing contents.
func (s fileSpec) Flush() error {
specFile, err := os.Create(s.path)
if err != nil {
return fmt.Errorf("error opening OCI specification file: %v", err)
}
defer specFile.Close()
encoder := json.NewEncoder(specFile)
err = encoder.Encode(s.Spec)
if err != nil {
return fmt.Errorf("error writing OCI specification to file: %v", err)
}
return nil
}
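
A minimal round-trip sketch (hypothetical helper, using only the API defined in this file): load a config.json, apply an in-place modification, and flush it back to disk.

// exampleModifySpec shows the intended Load / Modify / Flush sequence.
func exampleModifySpec(path string) error {
	spec := NewSpecFromFile(path)
	if err := spec.Load(); err != nil {
		return err
	}
	modify := func(s *oci.Spec) error {
		s.Hostname = "gpu-container" // illustrative modification only
		return nil
	}
	if err := spec.Modify(modify); err != nil {
		return err
	}
	return spec.Flush()
}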

internal/oci/spec_mock.go Normal file

@@ -0,0 +1,70 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package oci
import (
oci "github.com/opencontainers/runtime-spec/specs-go"
)
// MockSpec provides a simple mock for an OCI spec to be used in testing.
// It also implements the SpecModifier interface.
type MockSpec struct {
*oci.Spec
MockLoad mockFunc
MockFlush mockFunc
MockModify mockFunc
}
var _ Spec = (*MockSpec)(nil)
// NewMockSpec constructs a MockSpec to be used in testing as a Spec
func NewMockSpec(spec *oci.Spec, flushResult error, modifyResult error) *MockSpec {
s := MockSpec{
Spec: spec,
MockFlush: mockFunc{result: flushResult},
MockModify: mockFunc{result: modifyResult},
}
return &s
}
// Load invokes the mocked Load function to return the predefined error / result
func (s *MockSpec) Load() error {
return s.MockLoad.call()
}
// Flush invokes the mocked Flush function to return the predefined error / result
func (s *MockSpec) Flush() error {
return s.MockFlush.call()
}
// Modify applies the specified SpecModifier to the spec and invokes the
// mocked modify function to return the predefined error / result.
func (s *MockSpec) Modify(f SpecModifier) error {
f(s.Spec)
return s.MockModify.call()
}
type mockFunc struct {
Callcount int
result error
}
func (m *mockFunc) call() error {
m.Callcount++
return m.result
}
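A hypothetical test showing `MockSpec` standing in for a `Spec` (the version strings are illustrative; the usual testing and testify imports and the `specs` alias for the runtime-spec package are assumed):

```go
func TestMockSpecAppliesModifier(t *testing.T) {
	spec := NewMockSpec(&specs.Spec{Version: "1.0.2"}, nil, nil)

	err := spec.Modify(func(s *specs.Spec) error {
		s.Version = "1.0.2-modified"
		return nil
	})

	require.NoError(t, err)
	require.Equal(t, "1.0.2-modified", spec.Spec.Version)
	// The call counter lets tests assert how often Modify was invoked.
	require.Equal(t, 1, spec.MockModify.Callcount)
}
```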

58
internal/oci/spec_test.go Normal file

@@ -0,0 +1,58 @@
package oci
import (
"fmt"
"os"
"path/filepath"
"runtime"
"testing"
"github.com/stretchr/testify/require"
)
func TestMaintainSpec(t *testing.T) {
moduleRoot, err := getModuleRoot()
require.NoError(t, err)
files := []string{
"config.clone3.json",
}
for _, f := range files {
inputSpecPath := filepath.Join(moduleRoot, "test/input", f)
spec := NewSpecFromFile(inputSpecPath).(*fileSpec)
require.NoError(t, spec.Load())
outputSpecPath := filepath.Join(moduleRoot, "test/output", f)
spec.path = outputSpecPath
require.NoError(t, spec.Flush())
inputContents, err := os.ReadFile(inputSpecPath)
require.NoError(t, err)
outputContents, err := os.ReadFile(outputSpecPath)
require.NoError(t, err)
require.JSONEq(t, string(inputContents), string(outputContents))
}
}
// getModuleRoot returns the root directory of the Go module that contains this file.
func getModuleRoot() (string, error) {
_, filename, _, _ := runtime.Caller(0)
return hasGoMod(filename)
}
// hasGoMod walks up from the given path until it finds a directory containing a
// go.mod file, returning an error if the filesystem root is reached first.
func hasGoMod(dir string) (string, error) {
if dir == "" || dir == "/" {
return "", fmt.Errorf("module root not found")
}
_, err := os.Stat(filepath.Join(dir, "go.mod"))
if err != nil {
return hasGoMod(filepath.Dir(dir))
}
return dir, nil
}


@@ -1,3 +1,36 @@
nvidia-container-toolkit (1.6.0~rc.2-1) experimental; urgency=medium
* Use relative path to OCI specification file (config.json) if bundle path is not specified as an argument to the nvidia-container-runtime
-- NVIDIA CORPORATION <cudatools@nvidia.com> Tue, 26 Oct 2021 12:24:05 +0200
nvidia-container-toolkit (1.6.0~rc.1-1) experimental; urgency=medium
* Add AARCH64 package for Amazon Linux 2
* Include nvidia-container-runtime into nvidia-container-toolkit package
-- NVIDIA CORPORATION <cudatools@nvidia.com> Mon, 06 Sep 2021 12:24:05 +0200
nvidia-container-toolkit (1.5.1-1) UNRELEASED; urgency=medium
* Fix bug where Docker Swarm device selection is ignored if
NVIDIA_VISIBLE_DEVICES is also set
* Improve unit testing by using require package and adding coverage reports
* Remove unneeded go dependencies by running go mod tidy
* Move contents of pkg directory to cmd for CLI tools
* Ensure make binary target explicitly sets GOOS
-- NVIDIA CORPORATION <cudatools@nvidia.com> Mon, 14 Jun 2021 09:00:00 -0700
nvidia-container-toolkit (1.5.0-1) UNRELEASED; urgency=medium
* Add dependence on libnvidia-container-tools >= 1.4.0
* Add golang check targets to Makefile
* Add Jenkinsfile definition for build targets
* Move docker.mk to docker folder
-- NVIDIA CORPORATION <cudatools@nvidia.com> Thu, 29 Apr 2021 03:12:43 -0700
nvidia-container-toolkit (1.4.2-1) UNRELEASED; urgency=medium
* Add dependence on libnvidia-container-tools >= 1.3.3
@@ -76,7 +109,7 @@ nvidia-container-toolkit (1.1.0-1) UNRELEASED; urgency=medium
* 4e4de762 Update build system to support multi-arch builds
* fcc1d116 Add support for MIG (Multi-Instance GPUs)
* d4ff0416 Add ability to merge envars of the form NVIDIA_VISIBLE_DEVICES_*
* 60f165ad Add no-pivot option to toolkit
-- NVIDIA CORPORATION <cudatools@nvidia.com> Fri, 15 May 2020 12:05:32 -0700


@@ -10,8 +10,8 @@ Build-Depends: debhelper (>= 9)
Package: nvidia-container-toolkit
Architecture: any
Depends: ${misc:Depends}, libnvidia-container-tools (>= 1.3.3), libnvidia-container-tools (<< 2.0.0)
Breaks: nvidia-container-runtime (<< 2.0.0), nvidia-container-runtime-hook
Replaces: nvidia-container-runtime (<< 2.0.0), nvidia-container-runtime-hook
Depends: ${misc:Depends}, libnvidia-container-tools (>= @LIBNVIDIA_CONTAINER_VERSION@), libnvidia-container-tools (<< 2.0.0), libseccomp2
Breaks: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook
Replaces: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook
Description: NVIDIA container runtime hook
Provides an OCI hook to enable GPU support in containers.


@@ -1,2 +1,3 @@
config.toml /etc/nvidia-container-runtime
nvidia-container-toolkit /usr/bin
nvidia-container-runtime /usr/bin


@@ -3,6 +3,7 @@
set -e
sed -i "s;@SECTION@;${SECTION:+$SECTION/};g" debian/control
sed -i "s;@LIBNVIDIA_CONTAINER_VERSION@;${LIBNVIDIA_CONTAINER_VERSION:+$LIBNVIDIA_CONTAINER_VERSION};g" debian/control
if [ -n "$DISTRIB" ]; then
sed -i "s;UNRELEASED;$DISTRIB;" debian/changelog


@@ -11,24 +11,34 @@ URL: https://github.com/NVIDIA/nvidia-container-runtime
License: Apache-2.0
Source0: nvidia-container-toolkit
Source1: config.toml
Source2: oci-nvidia-hook
Source3: oci-nvidia-hook.json
Source4: LICENSE
Source1: nvidia-container-runtime
Source2: config.toml
Source3: oci-nvidia-hook
Source4: oci-nvidia-hook.json
Source5: LICENSE
Obsoletes: nvidia-container-runtime < 2.0.0, nvidia-container-runtime-hook
Obsoletes: nvidia-container-runtime <= 3.5.0-1, nvidia-container-runtime-hook
Provides: nvidia-container-runtime
Provides: nvidia-container-runtime-hook
Requires: libnvidia-container-tools >= 1.3.2, libnvidia-container-tools < 2.0.0
Requires: libnvidia-container-tools >= %{libnvidia_container_version}, libnvidia-container-tools < 2.0.0
%if 0%{?suse_version}
Requires: libseccomp2
Requires: libapparmor1
%else
Requires: libseccomp
%endif
%description
Provides an OCI hook to enable GPU support in containers.
%prep
cp %{SOURCE0} %{SOURCE1} %{SOURCE2} %{SOURCE3} %{SOURCE4} .
cp %{SOURCE0} %{SOURCE1} %{SOURCE2} %{SOURCE3} %{SOURCE4} %{SOURCE5} .
%install
mkdir -p %{buildroot}%{_bindir}
install -m 755 -t %{buildroot}%{_bindir} nvidia-container-toolkit
install -m 755 -t %{buildroot}%{_bindir} nvidia-container-runtime
mkdir -p %{buildroot}/etc/nvidia-container-runtime
install -m 644 -t %{buildroot}/etc/nvidia-container-runtime config.toml
@@ -48,11 +58,36 @@ rm -f %{_bindir}/nvidia-container-runtime-hook
%files
%license LICENSE
%{_bindir}/nvidia-container-toolkit
%{_bindir}/nvidia-container-runtime
%config /etc/nvidia-container-runtime/config.toml
/usr/libexec/oci/hooks.d/oci-nvidia-hook
/usr/share/containers/oci/hooks.d/oci-nvidia-hook.json
%changelog
* Tue Oct 26 2021 NVIDIA CORPORATION <cudatools@nvidia.com> 1.6.0-0.1.rc.2
- Use relative path to OCI specification file (config.json) if bundle path is not specified as an argument to the nvidia-container-runtime
* Mon Sep 06 2021 NVIDIA CORPORATION <cudatools@nvidia.com> 1.6.0-0.1.rc.1
- Add AARCH64 package for Amazon Linux 2
- Include nvidia-container-runtime into nvidia-container-toolkit package
* Mon Jun 14 2021 NVIDIA CORPORATION <cudatools@nvidia.com> 1.5.1-1
- Fix bug where Docker Swarm device selection is ignored if NVIDIA_VISIBLE_DEVICES is also set
- Improve unit testing by using require package and adding coverage reports
- Remove unneeded go dependencies by running go mod tidy
- Move contents of pkg directory to cmd for CLI tools
- Ensure make binary target explicitly sets GOOS
* Thu Apr 29 2021 NVIDIA CORPORATION <cudatools@nvidia.com> 1.5.0-1
- Add dependence on libnvidia-container-tools >= 1.4.0
- Add golang check targets to Makefile
- Add Jenkinsfile definition for build targets
- Move docker.mk to docker folder
* Fri Feb 05 2021 NVIDIA CORPORATION <cudatools@nvidia.com> 1.4.2-1
- Add dependence on libnvidia-container-tools >= 1.3.3
@@ -100,5 +135,5 @@ rm -f %{_bindir}/nvidia-container-runtime-hook
* Fri May 15 2020 NVIDIA CORPORATION <cudatools@nvidia.com> 1.1.0-1
- 4e4de762 Update build system to support multi-arch builds
- fcc1d116 Add support for MIG (Multi-Instance GPUs)
- d4ff0416 Add ability to merge envars of the form NVIDIA_VISIBLE_DEVICES_*
- 60f165ad Add no-pivot option to toolkit

73
scripts/build-all-components.sh Executable file

@@ -0,0 +1,73 @@
#!/usr/bin/env bash
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This script is used to build the packages for the components of the NVIDIA
# Container Stack. These include the nvidia-container-toolkit in this repository
# as well as the components included in the third_party folder.
# All required packages are generated in the specified dist folder.
function assert_usage() {
echo "Missing argument $1"
echo "$(basename ${BASH_SOURCE[0]}) TARGET"
exit 1
}
set -e -x
SCRIPTS_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )"/../scripts && pwd )"
PROJECT_ROOT="$( cd ${SCRIPTS_DIR}/.. && pwd )"
if [[ $# -ne 1 ]]; then
assert_usage "TARGET"
fi
TARGET=$1
: ${DIST_DIR:=${PROJECT_ROOT}/dist}
export DIST_DIR
echo "Building ${TARGET} for all packages to ${DIST_DIR}"
: ${LIBNVIDIA_CONTAINER_ROOT:=${PROJECT_ROOT}/third_party/libnvidia-container}
: ${NVIDIA_CONTAINER_TOOLKIT_ROOT:=${PROJECT_ROOT}}
: ${NVIDIA_CONTAINER_RUNTIME_ROOT:=${PROJECT_ROOT}/third_party/nvidia-container-runtime}
: ${NVIDIA_DOCKER_ROOT:=${PROJECT_ROOT}/third_party/nvidia-docker}
${SCRIPTS_DIR}/get-component-versions.sh
# Build libnvidia-container
make -C ${LIBNVIDIA_CONTAINER_ROOT} -f mk/docker.mk ${TARGET}
# Build nvidia-container-toolkit
make -C ${NVIDIA_CONTAINER_TOOLKIT_ROOT} ${TARGET}
if [[ -z ${NVIDIA_CONTAINER_TOOLKIT_VERSION} ]]; then
eval $(${SCRIPTS_DIR}/get-component-versions.sh)
fi
# We set the TOOLKIT_VERSION for the nvidia-container-runtime and nvidia-docker targets
# Build nvidia-container-runtime
make -C ${NVIDIA_CONTAINER_RUNTIME_ROOT} \
TOOLKIT_VERSION="${NVIDIA_CONTAINER_TOOLKIT_VERSION}" \
TOOLKIT_TAG="${NVIDIA_CONTAINER_TOOLKIT_TAG}" \
${TARGET}
# Build nvidia-docker2
make -C ${NVIDIA_DOCKER_ROOT} \
TOOLKIT_VERSION="${NVIDIA_CONTAINER_TOOLKIT_VERSION}" \
TOOLKIT_TAG="${NVIDIA_CONTAINER_TOOLKIT_TAG}" \
${TARGET}
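As an illustrative invocation (the target name and `DIST_DIR` value below are examples, not defaults used by CI):

```bash
# Build all packages for a single distribution-architecture pair, placing
# the results under ./artifacts instead of the default dist/ folder.
DIST_DIR="$(pwd)/artifacts" ./scripts/build-all-components.sh ubuntu18.04-amd64
```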


@@ -0,0 +1,63 @@
#!/usr/bin/env bash
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This script determines the versions (and tags) of the components of the NVIDIA
# Container Stack: the nvidia-container-toolkit in this repository as well as the
# components included in the third_party folder.
# The detected versions are echoed as KEY=VALUE pairs so that callers can eval them.
function assert_usage() {
exit 1
}
set -e
SCRIPTS_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )"/../scripts && pwd )"
PROJECT_ROOT="$( cd ${SCRIPTS_DIR}/.. && pwd )"
: ${LIBNVIDIA_CONTAINER_ROOT:=${PROJECT_ROOT}/third_party/libnvidia-container}
: ${NVIDIA_CONTAINER_TOOLKIT_ROOT:=${PROJECT_ROOT}}
: ${NVIDIA_CONTAINER_RUNTIME_ROOT:=${PROJECT_ROOT}/third_party/nvidia-container-runtime}
: ${NVIDIA_DOCKER_ROOT:=${PROJECT_ROOT}/third_party/nvidia-docker}
# Get version for libnvidia-container
libnvidia_container_version_tag=$(grep "#define NVC_VERSION" ${LIBNVIDIA_CONTAINER_ROOT}/src/nvc.h \
| sed -e 's/#define NVC_VERSION[[:space:]]"\(.*\)"/\1/')
# Get version for nvidia-container-toolkit
nvidia_container_toolkit_version=$(grep -m 1 "^LIB_VERSION := " ${NVIDIA_CONTAINER_TOOLKIT_ROOT}/Makefile | sed -e 's/LIB_VERSION :=[[:space:]]\(.*\)[[:space:]]*/\1/')
nvidia_container_toolkit_tag=$(grep -m 1 "^LIB_TAG .= " ${NVIDIA_CONTAINER_TOOLKIT_ROOT}/Makefile | sed -e 's/LIB_TAG .=[[:space:]]\(.*\)[[:space:]]*/\1/')
nvidia_container_toolkit_version_tag="${nvidia_container_toolkit_version}${nvidia_container_toolkit_tag:+~${nvidia_container_toolkit_tag}}"
# Get version for nvidia-container-runtime
nvidia_container_runtime_version=$(grep -m 1 "^LIB_VERSION := " ${NVIDIA_CONTAINER_RUNTIME_ROOT}/Makefile | sed -e 's/LIB_VERSION :=[[:space:]]\(.*\)[[:space:]]*/\1/')
nvidia_container_runtime_tag=$(grep -m 1 "^LIB_TAG .= " ${NVIDIA_CONTAINER_RUNTIME_ROOT}/Makefile | sed -e 's/LIB_TAG .=[[:space:]]\(.*\)[[:space:]]*/\1/')
nvidia_container_runtime_version_tag="${nvidia_container_runtime_version}${nvidia_container_runtime_tag:+~${nvidia_container_runtime_tag}}"
# Get version for nvidia-docker
nvidia_docker_version=$(grep -m 1 "^LIB_VERSION := " ${NVIDIA_DOCKER_ROOT}/Makefile | sed -e 's/LIB_VERSION :=[[:space:]]\(.*\)[[:space:]]*/\1/')
nvidia_docker_tag=$(grep -m 1 "^LIB_TAG .= " ${NVIDIA_DOCKER_ROOT}/Makefile | sed -e 's/LIB_TAG .=[[:space:]]\(.*\)[[:space:]]*/\1/')
nvidia_docker_version_tag="${nvidia_docker_version}${nvidia_docker_tag:+~${nvidia_docker_tag}}"
echo "LIBNVIDIA_CONTAINER_VERSION=${libnvidia_container_version_tag}"
echo "NVIDIA_CONTAINER_TOOLKIT_VERSION=${nvidia_container_toolkit_version}"
echo "NVIDIA_CONTAINER_TOOLKIT_TAG=${nvidia_container_toolkit_tag}"
if [[ "${libnvidia_container_version_tag}" != "${nvidia_container_toolkit_version_tag}" ]]; then
>&2 echo "WARNING: The libnvidia-container and nvidia-container-toolkit versions do not match"
fi
echo "NVIDIA_CONTAINER_RUNTIME_VERSION=${nvidia_container_runtime_version}"
echo "NVIDIA_DOCKER_VERSION=${nvidia_docker_version}"

61
scripts/release.sh Executable file

@@ -0,0 +1,61 @@
#!/usr/bin/env bash
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This script is used to build the packages for the components of the NVIDIA
# Container Stack. These include the nvidia-container-toolkit in this repository
# as well as the components included in the third_party folder.
# All required packages are generated in the specified dist folder.
set -e -x
SCRIPTS_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )"/../scripts && pwd )"
PROJECT_ROOT="$( cd ${SCRIPTS_DIR}/.. && pwd )"
# This list represents the distribution-architecture pairs that are actually published
# to the relevant repositories. The targets forwarded to the build-all-components
# script can be overridden by specifying command line arguments.
all=(
amazonlinux1-x86_64
amazonlinux2-aarch64
amazonlinux2-x86_64
centos7-ppc64le
centos7-x86_64
centos8-aarch64
centos8-ppc64le
centos8-x86_64
debian10-amd64
debian9-amd64
opensuse-leap15.1-x86_64
ubuntu16.04-amd64
ubuntu16.04-ppc64le
ubuntu18.04-amd64
ubuntu18.04-arm64
ubuntu18.04-ppc64le
)
if [[ $# -gt 0 ]]; then
targets=($*)
else
targets=("${all[@]}")
fi
eval $(${SCRIPTS_DIR}/get-component-versions.sh)
export NVIDIA_CONTAINER_TOOLKIT_VERSION
export NVIDIA_CONTAINER_TOOLKIT_TAG
for target in ${targets[@]}; do
${SCRIPTS_DIR}/build-all-components.sh ${target}
done
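For example, using target names from the list above:

```bash
# Build packages for every published distribution-architecture pair:
./scripts/release.sh

# Or restrict the run to selected targets:
./scripts/release.sh ubuntu18.04-amd64 centos8-x86_64
```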


@@ -0,0 +1,2 @@
#!/bin/bash
echo mock hook

2
test/bin/runc Executable file

@@ -0,0 +1,2 @@
#!/bin/bash
echo mock runc

117
test/container/common.sh Normal file

@@ -0,0 +1,117 @@
#! /bin/bash
# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
readonly CRIO_HOOKS_DIR="/usr/share/containers/oci/hooks.d"
readonly CRIO_HOOK_FILENAME="oci-nvidia-hook.json"
# shellcheck disable=SC2015
[ -t 2 ] && readonly LOG_TTY=1 || readonly LOG_NO_TTY=1
if [ "${LOG_TTY-0}" -eq 1 ] && [ "$(tput colors)" -ge 15 ]; then
readonly FMT_BOLD=$(tput bold)
readonly FMT_RED=$(tput setaf 1)
readonly FMT_YELLOW=$(tput setaf 3)
readonly FMT_BLUE=$(tput setaf 12)
readonly FMT_CLEAR=$(tput sgr0)
fi
log() {
local -r level="$1"; shift
local -r message="$*"
local fmt_on="${FMT_CLEAR-}"
local -r fmt_off="${FMT_CLEAR-}"
case "${level}" in
INFO) fmt_on="${FMT_BLUE-}" ;;
WARN) fmt_on="${FMT_YELLOW-}" ;;
ERROR) fmt_on="${FMT_RED-}" ;;
esac
printf "%s[%s]%s %b\n" "${fmt_on}" "${level}" "${fmt_off}" "${message}" >&2
}
with_retry() {
local max_attempts="$1"
local delay="$2"
local count=0
local rc
shift 2
while true; do
set +e
"$@"; rc="$?"
set -e
count="$((count+1))"
if [[ "${rc}" -eq 0 ]]; then
return 0
fi
if [[ "${max_attempts}" -le 0 ]] || [[ "${count}" -lt "${max_attempts}" ]]; then
sleep "${delay}"
else
break
fi
done
return 1
}
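A representative call in the style of the docker and containerd tests later in this change (the `docker info` command is illustrative; the `5s` delay relies on GNU sleep accepting a unit suffix):

```bash
# Retry the command up to 5 times, sleeping 5 seconds between attempts;
# returns 1 once the attempts are exhausted.
with_retry 5 5s docker info
```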
testing::setup() {
cp -Rp ${basedir}/shared ${shared_dir}
mkdir -p "${shared_dir}/etc/containerd"
mkdir -p "${shared_dir}/etc/docker"
mkdir -p "${shared_dir}/run/docker/containerd"
mkdir -p "${shared_dir}/run/nvidia"
mkdir -p "${shared_dir}/usr/local/nvidia"
mkdir -p "${shared_dir}${CRIO_HOOKS_DIR}"
}
testing::cleanup() {
if [[ "${CLEANUP}" == "false" ]]; then
echo "Skipping cleanup: CLEANUP=${CLEANUP}"
return 0
fi
if [[ -e "${shared_dir}" ]]; then
docker run --rm \
-v "${shared_dir}:/work" \
alpine sh -c 'rm -rf /work/*'
rmdir "${shared_dir}"
fi
if [[ "${test_cases:-""}" == "" ]]; then
echo "No test cases defined. Skipping test case cleanup"
return 0
fi
for tc in ${test_cases}; do
testing::${tc}::cleanup
done
}
testing::docker_run::toolkit::shell() {
docker run --rm --privileged \
--entrypoint sh \
-v "${shared_dir}/etc/containerd:/etc/containerd" \
-v "${shared_dir}/etc/docker:/etc/docker" \
-v "${shared_dir}/run/docker/containerd:/run/docker/containerd" \
-v "${shared_dir}/run/nvidia:/run/nvidia" \
-v "${shared_dir}/usr/local/nvidia:/usr/local/nvidia" \
-v "${shared_dir}${CRIO_HOOKS_DIR}:${CRIO_HOOKS_DIR}" \
"${toolkit_container_image}" "-c" "$*"
}

147
test/container/containerd_test.sh Executable file

@@ -0,0 +1,147 @@
#! /bin/bash
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
readonly containerd_dind_ctr="container-config-containerd-dind-ctr-name"
readonly containerd_test_ctr="container-config-containerd-test-ctr-name"
readonly containerd_dind_socket="/run/nvidia/docker.sock"
readonly containerd_dind_containerd_dir="/run/docker/containerd"
testing::containerd::dind::setup() {
# Docker creates /etc/docker when starting
# by default there isn't any config in this directory (even after the daemon starts)
docker run -d --rm --privileged \
-v "${shared_dir}/etc/docker:/etc/docker" \
-v "${shared_dir}/run/nvidia:/run/nvidia" \
-v "${shared_dir}/usr/local/nvidia:/usr/local/nvidia" \
-v "${shared_dir}/run/docker/containerd:/run/docker/containerd" \
--name "${containerd_dind_ctr}" \
docker:stable-dind -H unix://${containerd_dind_socket}
}
testing::containerd::dind::exec() {
docker exec "${containerd_dind_ctr}" sh -c "$*"
}
testing::containerd::toolkit::run() {
local version=${1}
# We run ctr image list to ensure that containerd has successfully started in the docker-in-docker container
with_retry 5 5s testing::containerd::dind::exec " \
ctr --address=${containerd_dind_containerd_dir}/containerd.sock image list -q"
# Ensure that we can run some non GPU containers from within dind
with_retry 3 5s testing::containerd::dind::exec " \
ctr --address=${containerd_dind_containerd_dir}/containerd.sock image pull nvcr.io/nvidia/cuda:11.1-base; \
ctr --address=${containerd_dind_containerd_dir}/containerd.sock run --rm --runtime=io.containerd.runtime.v1.linux nvcr.io/nvidia/cuda:11.1-base cuda echo foo"
# Share the volumes so that we can edit the config file and point to the new runtime
# Share the pid so that we can ask docker to reload its config
docker run --rm --privileged \
--volumes-from "${containerd_dind_ctr}" \
-v "${shared_dir}/etc/containerd/config_${version}.toml:${containerd_dind_containerd_dir}/containerd.toml" \
--pid "container:${containerd_dind_ctr}" \
-e "RUNTIME=containerd" \
-e "RUNTIME_ARGS=--config=${containerd_dind_containerd_dir}/containerd.toml --socket=${containerd_dind_containerd_dir}/containerd.sock" \
--name "${containerd_test_ctr}" \
"${toolkit_container_image}" "/usr/local/nvidia" "--no-daemon"
# We run ctr image list to ensure that containerd has successfully started in the docker-in-docker container
with_retry 5 5s testing::containerd::dind::exec " \
ctr --address=${containerd_dind_containerd_dir}/containerd.sock image list -q"
# Ensure that we haven't broken non GPU containers
with_retry 3 5s testing::containerd::dind::exec " \
ctr --address=${containerd_dind_containerd_dir}/containerd.sock image pull nvcr.io/nvidia/cuda:11.1-base; \
ctr --address=${containerd_dind_containerd_dir}/containerd.sock run --rm --runtime=io.containerd.runtime.v1.linux nvcr.io/nvidia/cuda:11.1-base cuda echo foo"
}
# This test runs containerd setup and containerd cleanup in succession to ensure that the
# config is restored correctly.
testing::containerd::toolkit::test_config() {
local version=${1}
# We run ctr image list to ensure that containerd has successfully started in the docker-in-docker container
with_retry 5 5s testing::containerd::dind::exec " \
ctr --address=${containerd_dind_containerd_dir}/containerd.sock image list -q"
local input_config="${shared_dir}/etc/containerd/config_${version}.toml"
local output_config="${shared_dir}/output/config_${version}.toml"
local output_dir=$(dirname ${output_config})
mkdir -p ${output_dir}
cp -p "${input_config}" "${output_config}"
docker run --rm --privileged \
--volumes-from "${containerd_dind_ctr}" \
-v "${output_dir}:${output_dir}" \
--name "${containerd_test_ctr}" \
--entrypoint sh \
"${toolkit_container_image}" -c "containerd setup \
--config=${output_config} \
--socket=${containerd_dind_containerd_dir}/containerd.sock \
--restart-mode=NONE \
/usr/local/nvidia/toolkit"
# As a basic test we check that the config has changed
if diff "${input_config}" "${output_config}" > /dev/null; then
echo "ERROR: the setup command did not modify the config"
false
fi
grep -q -E "^version = [0-9]" "${output_config}"
grep -q -E "default_runtime_name = \"nvidia\"" "${output_config}"
docker run --rm --privileged \
--volumes-from "${containerd_dind_ctr}" \
-v "${output_dir}:${output_dir}" \
--name "${containerd_test_ctr}" \
--entrypoint sh \
"${toolkit_container_image}" -c "containerd cleanup \
--config=${output_config} \
--socket=${containerd_dind_containerd_dir}/containerd.sock \
--restart-mode=NONE \
/usr/local/nvidia/toolkit"
if [[ -s "${input_config}" ]]; then
# Compare the input and output config. These should be the same.
diff "${input_config}" "${output_config}"
else
# If the input config is empty, the output should not exist.
test ! -e "${output_config}"
fi
}
testing::containerd::main() {
testing::containerd::dind::setup
testing::containerd::toolkit::test_config empty
testing::containerd::toolkit::test_config v1
testing::containerd::toolkit::test_config v2
testing::containerd::cleanup
testing::containerd::dind::setup
testing::containerd::toolkit::run empty
testing::containerd::cleanup
testing::containerd::dind::setup
testing::containerd::toolkit::run v1
testing::containerd::cleanup
testing::containerd::dind::setup
testing::containerd::toolkit::run v2
testing::containerd::cleanup
}
testing::containerd::cleanup() {
docker kill "${containerd_dind_ctr}" &> /dev/null || true
docker kill "${containerd_test_ctr}" &> /dev/null || true
}


@@ -0,0 +1,42 @@
#! /bin/bash
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
testing::crio::hook_created() {
testing::docker_run::toolkit::shell 'crio setup /run/nvidia/toolkit'
test ! -z "$(ls -A "${shared_dir}${CRIO_HOOKS_DIR}")"
cat "${shared_dir}${CRIO_HOOKS_DIR}/${CRIO_HOOK_FILENAME}" | \
jq -r '.hook.path' | grep -q "/run/nvidia/toolkit/"
test $? -eq 0
cat "${shared_dir}${CRIO_HOOKS_DIR}/${CRIO_HOOK_FILENAME}" | \
jq -r '.hook.env[0]' | grep -q ":/run/nvidia/toolkit"
test $? -eq 0
}
testing::crio::hook_cleanup() {
testing::docker_run::toolkit::shell 'crio cleanup'
test -z "$(ls -A "${shared_dir}${CRIO_HOOKS_DIR}")"
}
testing::crio::main() {
testing::crio::hook_created
testing::crio::hook_cleanup
}
testing::crio::cleanup() {
:
}

57
test/container/docker_test.sh Executable file

@@ -0,0 +1,57 @@
#! /bin/bash
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
readonly docker_dind_ctr="container-config-docker-dind-ctr-name"
readonly docker_test_ctr="container-config-docker-test-ctr-name"
readonly docker_dind_socket="/run/nvidia/docker.sock"
testing::docker::dind::setup() {
# Docker creates /etc/docker when starting
# by default there isn't any config in this directory (even after the daemon starts)
docker run -d --rm --privileged \
-v "${shared_dir}/etc/docker:/etc/docker" \
-v "${shared_dir}/run/nvidia:/run/nvidia" \
-v "${shared_dir}/usr/local/nvidia:/usr/local/nvidia" \
--name "${docker_dind_ctr}" \
docker:stable-dind -H unix://${docker_dind_socket}
}
testing::docker::dind::exec() {
docker exec "${docker_dind_ctr}" sh -c "$*"
}
testing::docker::toolkit::run() {
# Share the volumes so that we can edit the config file and point to the new runtime
# Share the pid so that we can ask docker to reload its config
docker run -d --rm --privileged \
--volumes-from "${docker_dind_ctr}" \
--pid "container:${docker_dind_ctr}" \
-e "RUNTIME_ARGS=--socket ${docker_dind_socket}" \
--name "${docker_test_ctr}" \
"${toolkit_container_image}" "/usr/local/nvidia" "--no-daemon"
# Ensure that we haven't broken non GPU containers
with_retry 3 5s testing::docker::dind::exec docker run -t alpine echo foo
}
testing::docker::main() {
testing::docker::dind::setup
testing::docker::toolkit::run
}
testing::docker::cleanup() {
docker kill "${docker_dind_ctr}" &> /dev/null || true
docker kill "${docker_test_ctr}" &> /dev/null || true
}

77
test/container/main.sh Normal file

@@ -0,0 +1,77 @@
#! /bin/bash
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -eEuo pipefail
shopt -s lastpipe
readonly basedir="$(dirname "$(realpath "$0")")"
source "${basedir}/common.sh"
source "${basedir}/toolkit_test.sh"
source "${basedir}/docker_test.sh"
source "${basedir}/crio_test.sh"
source "${basedir}/containerd_test.sh"
: ${CLEANUP:=true}
usage() {
cat >&2 <<EOF
Usage: $0 COMMAND [ARG...]
Commands:
run SHARED_DIR TOOLKIT_CONTAINER_IMAGE [-c | --no-cleanup-on-error ]
clean SHARED_DIR
EOF
}
if [ $# -lt 2 ]; then usage; exit 1; fi
# We define shared_dir here so that it can be used in cleanup
readonly command=${1}; shift
readonly shared_dir="${1}"; shift;
case "${command}" in
clean) testing::cleanup; exit 0;;
run) ;;
*) usage; exit 0;;
esac
if [ $# -eq 0 ]; then usage; exit 1; fi
readonly toolkit_container_image="${1}"; shift
options=$(getopt -l no-cleanup-on-error -o c -- "$@") || { usage; exit 1; }
# set options to positional parameters
eval set -- "${options}"
for opt in ${options}; do
case "${opt}" in
c | --no-cleanup-on-error) CLEANUP=false; shift;;
--) shift; break;;
esac
done
trap '"$CLEANUP" && testing::cleanup' ERR
readonly test_cases="${TEST_CASES:-toolkit docker crio containerd}"
testing::cleanup
for tc in ${test_cases}; do
log INFO "=================Testing ${tc}================="
testing::setup
testing::${tc}::main "$@"
testing::cleanup
done


@@ -0,0 +1,92 @@
oom_score = 0
root = "/var/lib/containerd"
state = "/run/containerd"
[cgroup]
path = ""
[debug]
address = "/var/run/docker/containerd/containerd-debug.sock"
gid = 0
level = ""
uid = 0
[grpc]
address = "/var/run/docker/containerd/containerd.sock"
gid = 0
max_recv_message_size = 16777216
max_send_message_size = 16777216
uid = 0
[metrics]
address = ""
grpc_histogram = false
[plugins]
[plugins.cgroups]
no_prometheus = false
[plugins.cri]
disable_proc_mount = false
enable_selinux = false
enable_tls_streaming = false
max_container_log_line_size = 16384
sandbox_image = "k8s.gcr.io/pause:3.1"
stats_collect_period = 10
stream_server_address = "127.0.0.1"
stream_server_port = "0"
systemd_cgroup = false
[plugins.cri.cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
conf_template = ""
[plugins.cri.containerd]
no_pivot = false
snapshotter = "overlayfs"
[plugins.cri.containerd.default_runtime]
runtime_engine = ""
runtime_root = ""
runtime_type = "io.containerd.runtime.v1.linux"
[plugins.cri.containerd.untrusted_workload_runtime]
runtime_engine = ""
runtime_root = ""
runtime_type = ""
[plugins.cri.registry]
[plugins.cri.registry.mirrors]
[plugins.cri.registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
[plugins.cri.x509_key_pair_streaming]
tls_cert_file = ""
tls_key_file = ""
[plugins.diff-service]
default = ["walking"]
[plugins.linux]
no_shim = false
runtime = "runc"
runtime_root = "/var/lib/docker/runc"
shim = "containerd-shim"
shim_debug = false
[plugins.opt]
path = "/opt/containerd"
[plugins.restart]
interval = "10s"
[plugins.scheduler]
deletion_threshold = 0
mutation_threshold = 100
pause_threshold = 0.02
schedule_delay = "0s"
startup_delay = "100ms"


@@ -0,0 +1,139 @@
disabled_plugins = []
oom_score = 0
plugin_dir = ""
required_plugins = []
root = "/var/lib/containerd"
state = "/run/containerd"
version = 2
[cgroup]
path = ""
[debug]
address = "/var/run/docker/containerd/containerd-debug.sock"
gid = 0
level = ""
uid = 0
[grpc]
address = "/var/run/docker/containerd/containerd.sock"
gid = 0
max_recv_message_size = 16777216
max_send_message_size = 16777216
tcp_address = ""
tcp_tls_cert = ""
tcp_tls_key = ""
uid = 0
[metrics]
address = ""
grpc_histogram = false
[plugins]
[plugins."io.containerd.gc.v1.scheduler"]
deletion_threshold = 0
mutation_threshold = 100
pause_threshold = 0.02
schedule_delay = "0s"
startup_delay = "100ms"
[plugins."io.containerd.grpc.v1.cri"]
disable_apparmor = false
disable_cgroup = false
disable_proc_mount = false
disable_tcp_service = true
enable_selinux = false
enable_tls_streaming = false
max_concurrent_downloads = 3
max_container_log_line_size = 16384
restrict_oom_score_adj = false
sandbox_image = "k8s.gcr.io/pause:3.1"
stats_collect_period = 10
stream_idle_timeout = "4h0m0s"
stream_server_address = "127.0.0.1"
stream_server_port = "0"
systemd_cgroup = false
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
conf_template = ""
max_conf_num = 1
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
no_pivot = false
snapshotter = "overlayfs"
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
privileged_without_host_devices = false
runtime_engine = ""
runtime_root = ""
runtime_type = ""
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
privileged_without_host_devices = false
runtime_engine = ""
runtime_root = ""
runtime_type = "io.containerd.runc.v1"
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
privileged_without_host_devices = false
runtime_engine = ""
runtime_root = ""
runtime_type = ""
[plugins."io.containerd.grpc.v1.cri".registry]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
[plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
tls_cert_file = ""
tls_key_file = ""
[plugins."io.containerd.internal.v1.opt"]
path = "/opt/containerd"
[plugins."io.containerd.internal.v1.restart"]
interval = "10s"
[plugins."io.containerd.metadata.v1.bolt"]
content_sharing_policy = "shared"
[plugins."io.containerd.monitor.v1.cgroups"]
no_prometheus = false
[plugins."io.containerd.runtime.v1.linux"]
no_shim = false
runtime = "runc"
runtime_root = "/var/lib/docker/runc"
shim = "containerd-shim"
shim_debug = false
[plugins."io.containerd.runtime.v2.task"]
platforms = ["linux/amd64"]
[plugins."io.containerd.service.v1.diff-service"]
default = ["walking"]
[plugins."io.containerd.snapshotter.v1.devmapper"]
base_image_size = ""
pool_name = ""
root_path = ""
[timeouts]
"io.containerd.timeout.shim.cleanup" = "5s"
"io.containerd.timeout.shim.load" = "5s"
"io.containerd.timeout.shim.shutdown" = "3s"
"io.containerd.timeout.task.state" = "2s"
[ttrpc]
address = ""
gid = 0
uid = 0


@@ -0,0 +1,3 @@
{
"registry-mirrors": ["https://mirror.gcr.io"]
}


@@ -0,0 +1 @@
# This is a dummy lib file to test nvidia-runtime-experimental


@@ -0,0 +1,79 @@
#! /bin/bash
# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
testing::toolkit::install() {
local -r uid=$(id -u)
local -r gid=$(id -g)
local READLINK="readlink"
local -r platform=$(uname)
if [[ "${platform}" == "Darwin" ]]; then
READLINK="greadlink"
fi
testing::docker_run::toolkit::shell 'toolkit install /usr/local/nvidia/toolkit'
docker run --rm -v "${shared_dir}:/work" alpine sh -c "chown -R ${uid}:${gid} /work/"
# Ensure toolkit dir is correctly setup
test ! -z "$(ls -A "${shared_dir}/usr/local/nvidia/toolkit")"
test -L "${shared_dir}/usr/local/nvidia/toolkit/libnvidia-container.so.1"
test -e "$(${READLINK} -f "${shared_dir}/usr/local/nvidia/toolkit/libnvidia-container.so.1")"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-cli"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-toolkit"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime"
grep -q -E "nvidia driver modules are not yet loaded, invoking runc directly" "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime"
grep -q -E "exec runc \".@\"" "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-cli.real"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-toolkit.real"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime.real"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime.experimental"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime-experimental"
grep -q -E "nvidia driver modules are not yet loaded, invoking runc directly" "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime-experimental"
grep -q -E "exec runc \".@\"" "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime-experimental"
grep -q -E "LD_LIBRARY_PATH=/run/nvidia/driver/usr/lib64:\\\$LD_LIBRARY_PATH " "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime-experimental"
test -e "${shared_dir}/usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml"
# Ensure that the config file has the required contents.
# NOTE: This assumes that RUN_DIR is '/run/nvidia'
local -r nvidia_run_dir="/run/nvidia"
grep -q -E "^\s*ldconfig = \"@${nvidia_run_dir}/driver/sbin/ldconfig(.real)?\"" "${shared_dir}/usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml"
grep -q -E "^\s*root = \"${nvidia_run_dir}/driver\"" "${shared_dir}/usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml"
grep -q -E "^\s*path = \"/usr/local/nvidia/toolkit/nvidia-container-cli\"" "${shared_dir}/usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml"
}
testing::toolkit::delete() {
testing::docker_run::toolkit::shell 'mkdir -p /usr/local/nvidia/delete-toolkit'
testing::docker_run::toolkit::shell 'touch /usr/local/nvidia/delete-toolkit/test.file'
testing::docker_run::toolkit::shell 'toolkit delete /usr/local/nvidia/delete-toolkit'
test ! -z "$(ls -A "${shared_dir}/usr/local/nvidia")"
test ! -e "${shared_dir}/usr/local/nvidia/delete-toolkit"
}
testing::toolkit::main() {
testing::toolkit::install
testing::toolkit::delete
}
testing::toolkit::cleanup() {
:
}


@@ -0,0 +1,784 @@
{
"ociVersion": "1.0.2-dev",
"process": {
"terminal": true,
"user": {
"uid": 0,
"gid": 0
},
"args": [
"sleep",
"60"
],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"HOSTNAME=8de5efc6a95c",
"TERM=xterm"
],
"cwd": "/",
"capabilities": {
"bounding": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FSETID",
"CAP_FOWNER",
"CAP_MKNOD",
"CAP_NET_RAW",
"CAP_SETGID",
"CAP_SETUID",
"CAP_SETFCAP",
"CAP_SETPCAP",
"CAP_NET_BIND_SERVICE",
"CAP_SYS_CHROOT",
"CAP_KILL",
"CAP_AUDIT_WRITE"
],
"effective": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FSETID",
"CAP_FOWNER",
"CAP_MKNOD",
"CAP_NET_RAW",
"CAP_SETGID",
"CAP_SETUID",
"CAP_SETFCAP",
"CAP_SETPCAP",
"CAP_NET_BIND_SERVICE",
"CAP_SYS_CHROOT",
"CAP_KILL",
"CAP_AUDIT_WRITE"
],
"inheritable": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FSETID",
"CAP_FOWNER",
"CAP_MKNOD",
"CAP_NET_RAW",
"CAP_SETGID",
"CAP_SETUID",
"CAP_SETFCAP",
"CAP_SETPCAP",
"CAP_NET_BIND_SERVICE",
"CAP_SYS_CHROOT",
"CAP_KILL",
"CAP_AUDIT_WRITE"
],
"permitted": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FSETID",
"CAP_FOWNER",
"CAP_MKNOD",
"CAP_NET_RAW",
"CAP_SETGID",
"CAP_SETUID",
"CAP_SETFCAP",
"CAP_SETPCAP",
"CAP_NET_BIND_SERVICE",
"CAP_SYS_CHROOT",
"CAP_KILL",
"CAP_AUDIT_WRITE"
]
},
"apparmorProfile": "docker-default",
"oomScoreAdj": 0
},
"root": {
"path": "/var/lib/docker/overlay2/fbf92f54592ddb439159bc7eb25c865b9347a2d71d63b41b7b4e4a471847c84f/merged"
},
"hostname": "8de5efc6a95c",
"mounts": [
{
"destination": "/proc",
"type": "proc",
"source": "proc",
"options": [
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/dev",
"type": "tmpfs",
"source": "tmpfs",
"options": [
"nosuid",
"strictatime",
"mode=755",
"size=65536k"
]
},
{
"destination": "/dev/pts",
"type": "devpts",
"source": "devpts",
"options": [
"nosuid",
"noexec",
"newinstance",
"ptmxmode=0666",
"mode=0620",
"gid=5"
]
},
{
"destination": "/sys",
"type": "sysfs",
"source": "sysfs",
"options": [
"nosuid",
"noexec",
"nodev",
"ro"
]
},
{
"destination": "/sys/fs/cgroup",
"type": "cgroup",
"source": "cgroup",
"options": [
"ro",
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/dev/mqueue",
"type": "mqueue",
"source": "mqueue",
"options": [
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/dev/shm",
"type": "tmpfs",
"source": "shm",
"options": [
"nosuid",
"noexec",
"nodev",
"mode=1777",
"size=67108864"
]
},
{
"destination": "/etc/resolv.conf",
"type": "bind",
"source": "/var/lib/docker/containers/8de5efc6a95c4ddae36dde7beb656ed5c7de912ecec2f628c42dbd0ef7bbeec6/resolv.conf",
"options": [
"rbind",
"rprivate"
]
},
{
"destination": "/etc/hostname",
"type": "bind",
"source": "/var/lib/docker/containers/8de5efc6a95c4ddae36dde7beb656ed5c7de912ecec2f628c42dbd0ef7bbeec6/hostname",
"options": [
"rbind",
"rprivate"
]
},
{
"destination": "/etc/hosts",
"type": "bind",
"source": "/var/lib/docker/containers/8de5efc6a95c4ddae36dde7beb656ed5c7de912ecec2f628c42dbd0ef7bbeec6/hosts",
"options": [
"rbind",
"rprivate"
]
}
],
"hooks": {
"prestart": [
{
"path": "/proc/593/exe",
"args": [
"libnetwork-setkey",
"-exec-root=/var/run/docker",
"8de5efc6a95c4ddae36dde7beb656ed5c7de912ecec2f628c42dbd0ef7bbeec6",
"9967b9f7c4d4"
]
}
]
},
"linux": {
"sysctl": {
"net.ipv4.ip_unprivileged_port_start": "0"
},
"resources": {
"devices": [
{
"allow": false,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 1,
"minor": 5,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 1,
"minor": 3,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 1,
"minor": 9,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 1,
"minor": 8,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 5,
"minor": 0,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 5,
"minor": 1,
"access": "rwm"
},
{
"allow": false,
"type": "c",
"major": 10,
"minor": 229,
"access": "rwm"
}
],
"memory": {
"disableOOMKiller": false
},
"cpu": {
"shares": 0
},
"blockIO": {
"weight": 0
}
},
"cgroupsPath": "/docker/8de5efc6a95c4ddae36dde7beb656ed5c7de912ecec2f628c42dbd0ef7bbeec6",
"namespaces": [
{
"type": "mount"
},
{
"type": "network"
},
{
"type": "uts"
},
{
"type": "pid"
},
{
"type": "ipc"
}
],
"seccomp": {
"defaultAction": "SCMP_ACT_ERRNO",
"architectures": [
"SCMP_ARCH_X86_64",
"SCMP_ARCH_X86",
"SCMP_ARCH_X32"
],
"syscalls": [
{
"names": [
"accept",
"accept4",
"access",
"adjtimex",
"alarm",
"bind",
"brk",
"capget",
"capset",
"chdir",
"chmod",
"chown",
"chown32",
"clock_adjtime",
"clock_adjtime64",
"clock_getres",
"clock_getres_time64",
"clock_gettime",
"clock_gettime64",
"clock_nanosleep",
"clock_nanosleep_time64",
"close",
"close_range",
"connect",
"copy_file_range",
"creat",
"dup",
"dup2",
"dup3",
"epoll_create",
"epoll_create1",
"epoll_ctl",
"epoll_ctl_old",
"epoll_pwait",
"epoll_pwait2",
"epoll_wait",
"epoll_wait_old",
"eventfd",
"eventfd2",
"execve",
"execveat",
"exit",
"exit_group",
"faccessat",
"faccessat2",
"fadvise64",
"fadvise64_64",
"fallocate",
"fanotify_mark",
"fchdir",
"fchmod",
"fchmodat",
"fchown",
"fchown32",
"fchownat",
"fcntl",
"fcntl64",
"fdatasync",
"fgetxattr",
"flistxattr",
"flock",
"fork",
"fremovexattr",
"fsetxattr",
"fstat",
"fstat64",
"fstatat64",
"fstatfs",
"fstatfs64",
"fsync",
"ftruncate",
"ftruncate64",
"futex",
"futex_time64",
"futimesat",
"getcpu",
"getcwd",
"getdents",
"getdents64",
"getegid",
"getegid32",
"geteuid",
"geteuid32",
"getgid",
"getgid32",
"getgroups",
"getgroups32",
"getitimer",
"getpeername",
"getpgid",
"getpgrp",
"getpid",
"getppid",
"getpriority",
"getrandom",
"getresgid",
"getresgid32",
"getresuid",
"getresuid32",
"getrlimit",
"get_robust_list",
"getrusage",
"getsid",
"getsockname",
"getsockopt",
"get_thread_area",
"gettid",
"gettimeofday",
"getuid",
"getuid32",
"getxattr",
"inotify_add_watch",
"inotify_init",
"inotify_init1",
"inotify_rm_watch",
"io_cancel",
"ioctl",
"io_destroy",
"io_getevents",
"io_pgetevents",
"io_pgetevents_time64",
"ioprio_get",
"ioprio_set",
"io_setup",
"io_submit",
"io_uring_enter",
"io_uring_register",
"io_uring_setup",
"ipc",
"kill",
"lchown",
"lchown32",
"lgetxattr",
"link",
"linkat",
"listen",
"listxattr",
"llistxattr",
"_llseek",
"lremovexattr",
"lseek",
"lsetxattr",
"lstat",
"lstat64",
"madvise",
"membarrier",
"memfd_create",
"mincore",
"mkdir",
"mkdirat",
"mknod",
"mknodat",
"mlock",
"mlock2",
"mlockall",
"mmap",
"mmap2",
"mprotect",
"mq_getsetattr",
"mq_notify",
"mq_open",
"mq_timedreceive",
"mq_timedreceive_time64",
"mq_timedsend",
"mq_timedsend_time64",
"mq_unlink",
"mremap",
"msgctl",
"msgget",
"msgrcv",
"msgsnd",
"msync",
"munlock",
"munlockall",
"munmap",
"nanosleep",
"newfstatat",
"_newselect",
"open",
"openat",
"openat2",
"pause",
"pidfd_open",
"pidfd_send_signal",
"pipe",
"pipe2",
"poll",
"ppoll",
"ppoll_time64",
"prctl",
"pread64",
"preadv",
"preadv2",
"prlimit64",
"pselect6",
"pselect6_time64",
"pwrite64",
"pwritev",
"pwritev2",
"read",
"readahead",
"readlink",
"readlinkat",
"readv",
"recv",
"recvfrom",
"recvmmsg",
"recvmmsg_time64",
"recvmsg",
"remap_file_pages",
"removexattr",
"rename",
"renameat",
"renameat2",
"restart_syscall",
"rmdir",
"rseq",
"rt_sigaction",
"rt_sigpending",
"rt_sigprocmask",
"rt_sigqueueinfo",
"rt_sigreturn",
"rt_sigsuspend",
"rt_sigtimedwait",
"rt_sigtimedwait_time64",
"rt_tgsigqueueinfo",
"sched_getaffinity",
"sched_getattr",
"sched_getparam",
"sched_get_priority_max",
"sched_get_priority_min",
"sched_getscheduler",
"sched_rr_get_interval",
"sched_rr_get_interval_time64",
"sched_setaffinity",
"sched_setattr",
"sched_setparam",
"sched_setscheduler",
"sched_yield",
"seccomp",
"select",
"semctl",
"semget",
"semop",
"semtimedop",
"semtimedop_time64",
"send",
"sendfile",
"sendfile64",
"sendmmsg",
"sendmsg",
"sendto",
"setfsgid",
"setfsgid32",
"setfsuid",
"setfsuid32",
"setgid",
"setgid32",
"setgroups",
"setgroups32",
"setitimer",
"setpgid",
"setpriority",
"setregid",
"setregid32",
"setresgid",
"setresgid32",
"setresuid",
"setresuid32",
"setreuid",
"setreuid32",
"setrlimit",
"set_robust_list",
"setsid",
"setsockopt",
"set_thread_area",
"set_tid_address",
"setuid",
"setuid32",
"setxattr",
"shmat",
"shmctl",
"shmdt",
"shmget",
"shutdown",
"sigaltstack",
"signalfd",
"signalfd4",
"sigprocmask",
"sigreturn",
"socket",
"socketcall",
"socketpair",
"splice",
"stat",
"stat64",
"statfs",
"statfs64",
"statx",
"symlink",
"symlinkat",
"sync",
"sync_file_range",
"syncfs",
"sysinfo",
"tee",
"tgkill",
"time",
"timer_create",
"timer_delete",
"timer_getoverrun",
"timer_gettime",
"timer_gettime64",
"timer_settime",
"timer_settime64",
"timerfd_create",
"timerfd_gettime",
"timerfd_gettime64",
"timerfd_settime",
"timerfd_settime64",
"times",
"tkill",
"truncate",
"truncate64",
"ugetrlimit",
"umask",
"uname",
"unlink",
"unlinkat",
"utime",
"utimensat",
"utimensat_time64",
"utimes",
"vfork",
"vmsplice",
"wait4",
"waitid",
"waitpid",
"write",
"writev"
],
"action": "SCMP_ACT_ALLOW"
},
{
"names": [
"ptrace"
],
"action": "SCMP_ACT_ALLOW"
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 0,
"op": "SCMP_CMP_EQ"
}
]
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 8,
"op": "SCMP_CMP_EQ"
}
]
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 131072,
"op": "SCMP_CMP_EQ"
}
]
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 131080,
"op": "SCMP_CMP_EQ"
}
]
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 4294967295,
"op": "SCMP_CMP_EQ"
}
]
},
{
"names": [
"arch_prctl"
],
"action": "SCMP_ACT_ALLOW"
},
{
"names": [
"modify_ldt"
],
"action": "SCMP_ACT_ALLOW"
},
{
"names": [
"clone"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 2114060288,
"op": "SCMP_CMP_MASKED_EQ"
}
]
},
{
"names": [
"clone3"
],
"action": "SCMP_ACT_ERRNO",
"errnoRet": 38
},
{
"names": [
"chroot"
],
"action": "SCMP_ACT_ALLOW"
}
]
},
"maskedPaths": [
"/proc/asound",
"/proc/acpi",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/proc/scsi",
"/sys/firmware"
],
"readonlyPaths": [
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
}
}

178
test/input/test_spec.json Normal file

@@ -0,0 +1,178 @@
{
"ociVersion": "1.0.1-dev",
"process": {
"terminal": true,
"user": {
"uid": 0,
"gid": 0
},
"args": [
"sh"
],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm"
],
"cwd": "/",
"capabilities": {
"bounding": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"effective": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"inheritable": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"permitted": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"ambient": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
]
},
"rlimits": [
{
"type": "RLIMIT_NOFILE",
"hard": 1024,
"soft": 1024
}
],
"noNewPrivileges": true
},
"root": {
"path": "rootfs",
"readonly": true
},
"hostname": "runc",
"mounts": [
{
"destination": "/proc",
"type": "proc",
"source": "proc"
},
{
"destination": "/dev",
"type": "tmpfs",
"source": "tmpfs",
"options": [
"nosuid",
"strictatime",
"mode=755",
"size=65536k"
]
},
{
"destination": "/dev/pts",
"type": "devpts",
"source": "devpts",
"options": [
"nosuid",
"noexec",
"newinstance",
"ptmxmode=0666",
"mode=0620",
"gid=5"
]
},
{
"destination": "/dev/shm",
"type": "tmpfs",
"source": "shm",
"options": [
"nosuid",
"noexec",
"nodev",
"mode=1777",
"size=65536k"
]
},
{
"destination": "/dev/mqueue",
"type": "mqueue",
"source": "mqueue",
"options": [
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/sys",
"type": "sysfs",
"source": "sysfs",
"options": [
"nosuid",
"noexec",
"nodev",
"ro"
]
},
{
"destination": "/sys/fs/cgroup",
"type": "cgroup",
"source": "cgroup",
"options": [
"nosuid",
"noexec",
"nodev",
"relatime",
"ro"
]
}
],
"linux": {
"resources": {
"devices": [
{
"allow": false,
"access": "rwm"
}
]
},
"namespaces": [
{
"type": "pid"
},
{
"type": "network"
},
{
"type": "ipc"
},
{
"type": "uts"
},
{
"type": "mount"
}
],
"maskedPaths": [
"/proc/kcore",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/sys/firmware",
"/proc/scsi"
],
"readonlyPaths": [
"/proc/asound",
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
}
}

59
test/release/Makefile Normal file

@@ -0,0 +1,59 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
WORKFLOW ?= nvidia-docker
TEST_REPO ?= elezar.github.io
DISTRIBUTIONS := ubuntu18.04 centos8
IMAGE_TARGETS := $(patsubst %,image-%, $(DISTRIBUTIONS))
RUN_TARGETS := $(patsubst %,run-%, $(DISTRIBUTIONS))
RELEASE_TARGETS := $(patsubst %,release-%, $(DISTRIBUTIONS))
LOCAL_TARGETS := $(patsubst %,local-%, $(DISTRIBUTIONS))
.PHONY: $(IMAGE_TARGETS)
image-%: DOCKERFILE = docker/$(*)/Dockerfile
images: $(IMAGE_TARGETS)
$(IMAGE_TARGETS): image-%:
docker build \
--build-arg WORKFLOW="$(WORKFLOW)" \
--build-arg TEST_REPO="$(TEST_REPO)" \
-t nvidia-container-toolkit-repo-test:$(*) \
-f $(DOCKERFILE) \
$(shell dirname $(DOCKERFILE))
%-ubuntu18.04: ARCH = amd64
%-centos8: ARCH = x86_64
RELEASE_TEST_DIR := $(patsubst %/,%,$(dir $(abspath $(lastword $(MAKEFILE_LIST)))))
PROJECT_ROOT := $(RELEASE_TEST_DIR)/../..
LOCAL_PACKAGE_ROOT := $(PROJECT_ROOT)/dist
local-%: DIST = $(*)
local-%: LOCAL_REPO_ARGS = -v $(LOCAL_PACKAGE_ROOT)/$(DIST)/$(ARCH):/local-repository
$(LOCAL_TARGETS): local-%: release-% run-% | release-%
run-%: DIST = $(*)
$(RUN_TARGETS): run-%:
docker run --rm -ti \
$(LOCAL_REPO_ARGS) \
nvidia-container-toolkit-repo-test:$(*)
# Ensure that the local package root exists
$(RELEASE_TARGETS): release-%: $(LOCAL_PACKAGE_ROOT)/$(*)/$(ARCH)
$(PROJECT_ROOT)/scripts/release.sh $(*)-$(ARCH)
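The `WORKFLOW` and `TEST_REPO` variables can be overridden on the command line, for example:

```bash
# Build the test images against the production repository rather than the
# default elezar.github.io test repository.
make WORKFLOW=nvidia-container-runtime TEST_REPO=nvidia.github.io images
```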

107
test/release/README.md Normal file

@@ -0,0 +1,107 @@
# Testing Packaging Workflows
## Building the Docker images:
```bash
make images
```
This assumes that the `nvidia-docker` workflow is being tested and that the test packages have been published to the `elezar.github.io` GitHub Pages site. Both can be overridden
using make variables.
Valid values for `WORKFLOW` are:
* `nvidia-docker`
* `nvidia-container-runtime`
This follows the instructions for setting up the [`nvidia-docker` repository](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#setting-up-nvidia-container-toolkit).
## Testing local package changes
Running:
```bash
make local-ubuntu18.04
```
will build the `ubuntu18.04-amd64` packages for release and launch a docker container with these packages added to a local APT repository.
The various `apt` workflows can then be tested as if the packages were released to the `libnvidia-container` experimental repository.
The `local-centos8` make target is available for testing the `centos8-x86_64` workflows as representative of `yum`-based installation workflows.
### Example
In the `centos8`-based container above we see the following `nvidia-docker2` packages available with the `local-repository` disabled:
```bash
$ yum list --showduplicates nvidia-docker2 | tail -2
nvidia-docker2.noarch 2.5.0-1 nvidia-docker
nvidia-docker2.noarch 2.6.0-1 nvidia-docker
```
Running:
```bash
$ yum install -y nvidia-docker2
```
installs the following packages:
```bash
$ yum list installed | grep nvidia
libnvidia-container-tools.x86_64 1.5.1-1 @libnvidia-container
libnvidia-container1.x86_64 1.5.1-1 @libnvidia-container
nvidia-container-runtime.x86_64 3.5.0-1 @nvidia-container-runtime
nvidia-container-toolkit.x86_64 1.5.1-2 @nvidia-container-runtime
nvidia-docker2.noarch 2.6.0-1 @nvidia-docker
```
Note the repositories from which these packages were installed.
We now enable the `local-repository` to simulate the new packages being published to the `libnvidia-container` experimental repository and check the available `nvidia-docker2` versions:
```bash
$ yum-config-manager --enable local-repository
$ yum list --showduplicates nvidia-docker2 | tail -2
nvidia-docker2.noarch 2.6.0-1 nvidia-docker
nvidia-docker2.noarch 2.6.1-0.1.rc.1 local-repository
```
This shows that the new version is available from the local repository.
Running:
```
$ yum install nvidia-docker2
Last metadata expiration check: 0:01:15 ago on Fri Sep 24 12:49:20 2021.
Package nvidia-docker2-2.6.0-1.noarch is already installed.
Dependencies resolved.
===============================================================================================================================
Package Architecture Version Repository Size
===============================================================================================================================
Upgrading:
libnvidia-container-tools x86_64 1.6.0-0.1.rc.1 local-repository 48 k
libnvidia-container1 x86_64 1.6.0-0.1.rc.1 local-repository 95 k
nvidia-container-toolkit x86_64 1.6.0-0.1.rc.1 local-repository 1.5 M
replacing nvidia-container-runtime.x86_64 3.5.0-1
nvidia-docker2 noarch 2.6.1-0.1.rc.1 local-repository 13 k
Transaction Summary
===============================================================================================================================
Upgrade 4 Packages
Total size: 1.7 M
```
This shows that all components of the stack will be updated with versions from the `local-repository`.
After installation, the installed packages are:
```bash
$ yum list installed | grep nvidia
libnvidia-container-tools.x86_64 1.6.0-0.1.rc.1 @local-repository
libnvidia-container1.x86_64 1.6.0-0.1.rc.1 @local-repository
nvidia-container-toolkit.x86_64 1.6.0-0.1.rc.1 @local-repository
nvidia-docker2.noarch 2.6.1-0.1.rc.1 @local-repository
```
This shows that:
1. All versions have been installed from the same repository.
2. The `nvidia-container-runtime` package was removed as it is no longer required.
The `nvidia-container-runtime` executable is, however, still present on the system:
```bash
# ls -l /usr/bin/nvidia-container-runtime
-rwxr-xr-x 1 root root 2256280 Sep 24 12:42 /usr/bin/nvidia-container-runtime
```


@@ -0,0 +1,35 @@
FROM centos:8
RUN yum install -y \
yum-utils \
ruby-devel \
gcc \
make \
rpm-build \
rubygems \
createrepo
RUN gem install --no-document fpm
# We create and install a dummy docker package since these dependencies are out of
# scope for the tests performed here.
RUN fpm -s empty \
-t rpm \
--description "A dummy package for docker-ce_18.06.3.ce-3.el7" \
-n docker-ce --version 18.06.3.ce-3.el7 \
-p /tmp/docker.rpm \
&& \
yum localinstall -y /tmp/docker.rpm \
&& \
rm -f /tmp/docker.rpm
ARG WORKFLOW=nvidia-docker
ARG TEST_REPO=nvidia.github.io
ENV TEST_REPO ${TEST_REPO}
RUN curl -s -L https://nvidia.github.io/${WORKFLOW}/centos8/nvidia-docker.repo \
| tee /etc/yum.repos.d/nvidia-docker.repo
COPY entrypoint.sh /
ENTRYPOINT [ "/entrypoint.sh" ]


@@ -0,0 +1,42 @@
#!/usr/bin/env bash
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This script is the entrypoint for the package test container. If a local
# package repository is mounted at /local-repository it is indexed and enabled;
# otherwise the nvidia-docker repo definition is pointed at the specified
# TEST_REPO and the experimental repository is enabled.
: ${LOCAL_REPO_DIRECTORY:=/local-repository}
if [[ -d ${LOCAL_REPO_DIRECTORY} ]]; then
echo "Setting up local-repository"
createrepo /local-repository
cat >/etc/yum.repos.d/local.repo <<EOL
[local-repository]
name=NVIDIA Container Toolkit Local Packages
baseurl=file:///local-repository
enabled=0
gpgcheck=0
protect=1
EOL
yum-config-manager --enable local-repository
else
echo "Setting up TEST repo: ${TEST_REPO}"
sed -i -e "s#nvidia\.github\.io/libnvidia-container#${TEST_REPO}/libnvidia-container#g" /etc/yum.repos.d/nvidia-docker.repo
yum-config-manager --enable libnvidia-container-experimental
fi
exec bash "$@"


@@ -0,0 +1,50 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
FROM ubuntu:18.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install --no-install-recommends -y \
curl \
gnupg2 \
apt-transport-https \
ca-certificates \
apt-utils \
ruby ruby-dev rubygems build-essential
RUN gem install --no-document fpm
# We create and install a dummy docker package since these dependencies are out of
# scope for the tests performed here.
RUN fpm -s empty \
-t deb \
--description "A dummy package for docker.io_18.06.0" \
-n docker.io --version 18.06.0 \
-p /tmp/docker.deb \
--deb-no-default-config-files \
&& \
dpkg -i /tmp/docker.deb \
&& \
rm -f /tmp/docker.deb
ARG WORKFLOW=nvidia-docker
RUN curl -s -L https://nvidia.github.io/${WORKFLOW}/gpgkey | apt-key add - \
&& curl -s -L https://nvidia.github.io/${WORKFLOW}/ubuntu18.04/nvidia-docker.list | tee /etc/apt/sources.list.d/nvidia-docker.list \
&& apt-get update
COPY entrypoint.sh /
COPY install_repo.sh /
ENTRYPOINT [ "/entrypoint.sh" ]


@@ -0,0 +1,34 @@
#!/usr/bin/env bash
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This script is the entrypoint for the package test container. If a local
# package repository is mounted at /local-repository it is indexed and added
# to the apt sources; otherwise the test repository specified via TEST_REPO
# is configured using install_repo.sh.
: ${LOCAL_REPO_DIRECTORY:=/local-repository}
if [[ -d ${LOCAL_REPO_DIRECTORY} ]]; then
echo "Setting up local-repository"
echo "deb [trusted=yes] file:/local-repository ./" > /etc/apt/sources.list.d/local.list
(cd /local-repository && apt-ftparchive packages . > Packages)
elif [[ -n ${TEST_REPO} ]]; then
./install_repo.sh ${TEST_REPO}
else
echo "Skipping repo setup"
fi
apt-get update
exec bash "$@"


@@ -0,0 +1,25 @@
#!/usr/bin/env bash
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This script points the apt sources for the NVIDIA container stack at the
# specified test repository and enables the experimental packages.
test_repo=$1
echo "Setting up TEST repo: ${test_repo}"
sed -i -e "s#nvidia\.github\.io/libnvidia-container#${test_repo}/libnvidia-container#g" /etc/apt/sources.list.d/nvidia-docker.list
sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-docker.list

third_party/nvidia-docker vendored Submodule

tools/container/README.md Normal file

@@ -0,0 +1,77 @@
## Introduction
This repository contains tools that allow docker, containerd, or cri-o to be configured to use the NVIDIA Container Toolkit.
*Note*: These were copied from the [`container-config` repository](https://gitlab.com/nvidia/container-toolkit/container-config/-/tree/383587f766a55177ede0e39e3810a974043e503e) and are being migrated to commands installed with the NVIDIA Container Toolkit.
These will be migrated into an upcoming `nvidia-ctk` CLI as required.
### Docker
After building the `docker` binary, run:
```bash
docker setup \
--runtime-name NAME \
/run/nvidia/toolkit
```
This configures the `nvidia-container-runtime` as a docker runtime named `NAME`. If the `--runtime-name` flag is not specified, the runtime is called `nvidia`. A runtime named `nvidia-experimental` will also be configured using the `nvidia-container-runtime-experimental` OCI-compliant runtime shim.
Since `--set-as-default` is enabled by default, the specified runtime name will also be set as the default docker runtime. This can be disabled by explicitly specifying `--set-as-default=false`.
**Note**: If `--runtime-name` is specified as `nvidia-experimental` explicitly, the `nvidia-experimental` runtime will be configured as the default runtime, with the `nvidia` runtime still configured and available for use.
The following table describes the behaviour for different `--runtime-name` and `--set-as-default` flag combinations.
| Flags | Installed Runtimes | Default Runtime |
|-------------------------------------------------------------|:--------------------------------|:----------------------|
| **NONE SPECIFIED** | `nvidia`, `nvidia-experimental` | `nvidia` |
| `--runtime-name nvidia` | `nvidia`, `nvidia-experimental` | `nvidia` |
| `--runtime-name NAME` | `NAME`, `nvidia-experimental` | `NAME` |
| `--runtime-name nvidia-experimental` | `nvidia`, `nvidia-experimental` | `nvidia-experimental` |
| `--set-as-default` | `nvidia`, `nvidia-experimental` | `nvidia` |
| `--set-as-default --runtime-name nvidia` | `nvidia`, `nvidia-experimental` | `nvidia` |
| `--set-as-default --runtime-name NAME` | `NAME`, `nvidia-experimental` | `NAME` |
| `--set-as-default --runtime-name nvidia-experimental` | `nvidia`, `nvidia-experimental` | `nvidia-experimental` |
| `--set-as-default=false` | `nvidia`, `nvidia-experimental` | **NOT SET** |
| `--set-as-default=false --runtime-name NAME` | `NAME`, `nvidia-experimental` | **NOT SET** |
| `--set-as-default=false --runtime-name nvidia` | `nvidia`, `nvidia-experimental` | **NOT SET** |
| `--set-as-default=false --runtime-name nvidia-experimental` | `nvidia`, `nvidia-experimental` | **NOT SET** |
These combinations also hold for the environment variables that map to the command line flags: `DOCKER_RUNTIME_NAME`, `DOCKER_SET_AS_DEFAULT`.
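For example, the last row of the table above can be expressed with the environment variables rather than flags (a sketch; the values are illustrative):
```bash
DOCKER_RUNTIME_NAME=nvidia-experimental DOCKER_SET_AS_DEFAULT=false docker setup /run/nvidia/toolkit
```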
### Containerd
After building the `containerd` binary, run:
```bash
containerd setup \
--runtime-class NAME \
/run/nvidia/toolkit
```
This configures the `nvidia-container-runtime` as a runtime class named `NAME`. If the `--runtime-class` flag is not specified, the runtime class is called `nvidia`. A runtime class named `nvidia-experimental` will also be configured using the `nvidia-container-runtime-experimental` OCI-compliant runtime shim.
Adding the `--set-as-default` flag as follows:
```bash
containerd setup \
--runtime-class NAME \
--set-as-default \
/run/nvidia/toolkit
```
will set the runtime class `NAME` (or `nvidia` if not specified) as the default runtime class.
**Note**: If `--runtime-class` is specified as `nvidia-experimental` explicitly and `--set-as-default` is specified, the `nvidia-experimental` runtime will be configured as the default runtime class, with the `nvidia` runtime class still configured and available for use.
The following table describes the behaviour for different `--runtime-class` and `--set-as-default` flag combinations.
| Flags | Installed Runtime Classes | Default Runtime Class |
|--------------------------------------------------------|:--------------------------------|:----------------------|
| **NONE SPECIFIED** | `nvidia`, `nvidia-experimental` | **NOT SET** |
| `--runtime-class NAME` | `NAME`, `nvidia-experimental` | **NOT SET** |
| `--runtime-class nvidia` | `nvidia`, `nvidia-experimental` | **NOT SET** |
| `--runtime-class nvidia-experimental` | `nvidia`, `nvidia-experimental` | **NOT SET** |
| `--set-as-default` | `nvidia`, `nvidia-experimental` | `nvidia` |
| `--set-as-default --runtime-class NAME` | `NAME`, `nvidia-experimental` | `NAME` |
| `--set-as-default --runtime-class nvidia` | `nvidia`, `nvidia-experimental` | `nvidia` |
| `--set-as-default --runtime-class nvidia-experimental` | `nvidia`, `nvidia-experimental` | `nvidia-experimental` |
These combinations also hold for the environment variables that map to the command line flags: `CONTAINERD_RUNTIME_CLASS`, `CONTAINERD_SET_AS_DEFAULT`.
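For example, setting a runtime class named `NAME` as the default via the environment variables (a sketch; the values are illustrative):
```bash
CONTAINERD_RUNTIME_CLASS=NAME CONTAINERD_SET_AS_DEFAULT=true containerd setup /run/nvidia/toolkit
```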


@@ -0,0 +1,116 @@
/**
# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"github.com/pelletier/go-toml"
)
// UpdateReverter defines the interface for applying and reverting configurations
type UpdateReverter interface {
Update(o *options) error
Revert(o *options) error
}
type config struct {
*toml.Tree
version int64
cri string
binaryKey string
}
// update adds the specified runtime class to the containerd config.
// If setAsDefault is specified, the runtime class is also set as the
// default runtime.
func (config *config) update(runtimeClass string, runtimeType string, runtimeBinary string, setAsDefault bool) {
config.Set("version", config.version)
runcPath := config.runcPath()
runtimeClassPath := config.runtimeClassPath(runtimeClass)
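// If a runc runtime is already configured, start from a copy of its config
// so that the new runtime class inherits the runc settings.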
switch runc := config.GetPath(runcPath).(type) {
case *toml.Tree:
runc, _ = toml.Load(runc.String())
config.SetPath(runtimeClassPath, runc)
}
config.initRuntime(runtimeClassPath, runtimeType, runtimeBinary)
if setAsDefault {
defaultRuntimeNamePath := config.defaultRuntimeNamePath()
config.SetPath(defaultRuntimeNamePath, runtimeClass)
}
}
// revert removes the configuration applied in an update call.
func (config *config) revert(runtimeClass string) {
runtimeClassPath := config.runtimeClassPath(runtimeClass)
defaultRuntimeNamePath := config.defaultRuntimeNamePath()
config.DeletePath(runtimeClassPath)
if runtime, ok := config.GetPath(defaultRuntimeNamePath).(string); ok {
if runtimeClass == runtime {
config.DeletePath(defaultRuntimeNamePath)
}
}
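// Prune any parent tables that were left empty by the removal.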
for i := 0; i < len(runtimeClassPath); i++ {
if runtimes, ok := config.GetPath(runtimeClassPath[:len(runtimeClassPath)-i]).(*toml.Tree); ok {
if len(runtimes.Keys()) == 0 {
config.DeletePath(runtimeClassPath[:len(runtimeClassPath)-i])
}
}
}
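// If only the version key remains, the config is semantically empty;
// drop it so that the empty config file can be removed on flush.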
if len(config.Keys()) == 1 && config.Keys()[0] == "version" {
config.Delete("version")
}
}
// initRuntime creates a runtime config if it does not exist and ensures that the
// runtime's binary path is specified.
func (config *config) initRuntime(path []string, runtimeType string, binary string) {
if config.GetPath(path) == nil {
config.SetPath(append(path, "runtime_type"), runtimeType)
config.SetPath(append(path, "runtime_root"), "")
config.SetPath(append(path, "runtime_engine"), "")
config.SetPath(append(path, "privileged_without_host_devices"), false)
}
binaryPath := append(path, "options", config.binaryKey)
config.SetPath(binaryPath, binary)
}
func (config config) runcPath() []string {
return config.runtimeClassPath("runc")
}
func (config config) runtimeClassBinaryPath(runtimeClass string) []string {
return append(config.runtimeClassPath(runtimeClass), "options", config.binaryKey)
}
func (config config) runtimeClassPath(runtimeClass string) []string {
return append(config.containerdPath(), "runtimes", runtimeClass)
}
func (config config) defaultRuntimeNamePath() []string {
return append(config.containerdPath(), "default_runtime_name")
}
func (config config) containerdPath() []string {
return []string{"plugins", config.cri, "containerd"}
}


@@ -0,0 +1,126 @@
/**
# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"path"
"github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
)
// configV1 represents a V1 containerd config
type configV1 struct {
config
}
func newConfigV1(cfg *toml.Tree) UpdateReverter {
c := configV1{
config: config{
Tree: cfg,
version: 1,
cri: "cri",
binaryKey: "Runtime",
},
}
return &c
}
// Update performs an update specific to v1 of the containerd config
func (config *configV1) Update(o *options) error {
// For v1 configs, the `default_runtime_name` setting is only supported
// for containerd versions v1.3 and later
supportsDefaultRuntimeName := !o.useLegacyConfig
defaultRuntime := o.getDefaultRuntime()
for runtimeClass, runtimeBinary := range o.getRuntimeBinaries() {
isDefaultRuntime := runtimeClass == defaultRuntime
config.update(runtimeClass, o.runtimeType, runtimeBinary, isDefaultRuntime && supportsDefaultRuntimeName)
if !isDefaultRuntime {
continue
}
if supportsDefaultRuntimeName {
defaultRuntimePath := append(config.containerdPath(), "default_runtime")
if config.GetPath(defaultRuntimePath) != nil {
log.Warnf("The setting of default_runtime (%v) in containerd is deprecated", defaultRuntimePath)
}
continue
}
log.Warnf("Setting default_runtime is deprecated")
defaultRuntimePath := append(config.containerdPath(), "default_runtime")
config.initRuntime(defaultRuntimePath, o.runtimeType, runtimeBinary)
}
return nil
}
// Revert performs a revert specific to v1 of the containerd config
func (config *configV1) Revert(o *options) error {
defaultRuntimePath := append(config.containerdPath(), "default_runtime")
defaultRuntimeOptionsPath := append(defaultRuntimePath, "options")
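// Only remove the legacy default_runtime binary setting if it points at one
// of the runtimes managed by this tool.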
if runtime, ok := config.GetPath(append(defaultRuntimeOptionsPath, "Runtime")).(string); ok {
for _, runtimeBinary := range o.getRuntimeBinaries() {
if path.Base(runtimeBinary) == path.Base(runtime) {
config.DeletePath(append(defaultRuntimeOptionsPath, "Runtime"))
break
}
}
}
if options, ok := config.GetPath(defaultRuntimeOptionsPath).(*toml.Tree); ok {
if len(options.Keys()) == 0 {
config.DeletePath(defaultRuntimeOptionsPath)
}
}
if runtime, ok := config.GetPath(defaultRuntimePath).(*toml.Tree); ok {
fields := []string{"runtime_type", "runtime_root", "runtime_engine", "privileged_without_host_devices"}
if len(runtime.Keys()) <= len(fields) {
matches := []string{}
for _, f := range fields {
e := runtime.Get(f)
if e != nil {
matches = append(matches, f)
}
}
if len(matches) == len(runtime.Keys()) {
for _, m := range matches {
runtime.Delete(m)
}
}
}
}
for i := 0; i < len(defaultRuntimePath); i++ {
if runtimes, ok := config.GetPath(defaultRuntimePath[:len(defaultRuntimePath)-i]).(*toml.Tree); ok {
if len(runtimes.Keys()) == 0 {
config.DeletePath(defaultRuntimePath[:len(defaultRuntimePath)-i])
}
}
}
for runtimeClass := range nvidiaRuntimeBinaries {
config.revert(runtimeClass)
}
return nil
}


@@ -0,0 +1,365 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"testing"
"github.com/pelletier/go-toml"
"github.com/stretchr/testify/require"
)
func TestUpdateV1ConfigDefaultRuntime(t *testing.T) {
const runtimeDir = "/test/runtime/dir"
testCases := []struct {
legacyConfig bool
setAsDefault bool
runtimeClass string
expectedDefaultRuntimeName interface{}
expectedDefaultRuntimeBinary interface{}
}{
{},
{
legacyConfig: true,
setAsDefault: false,
expectedDefaultRuntimeName: nil,
expectedDefaultRuntimeBinary: nil,
},
{
legacyConfig: true,
setAsDefault: true,
expectedDefaultRuntimeName: nil,
expectedDefaultRuntimeBinary: "/test/runtime/dir/nvidia-container-runtime",
},
{
legacyConfig: true,
setAsDefault: true,
runtimeClass: "NAME",
expectedDefaultRuntimeName: nil,
expectedDefaultRuntimeBinary: "/test/runtime/dir/nvidia-container-runtime",
},
{
legacyConfig: true,
setAsDefault: true,
runtimeClass: "nvidia-experimental",
expectedDefaultRuntimeName: nil,
expectedDefaultRuntimeBinary: "/test/runtime/dir/nvidia-container-runtime-experimental",
},
{
legacyConfig: false,
setAsDefault: false,
expectedDefaultRuntimeName: nil,
expectedDefaultRuntimeBinary: nil,
},
{
legacyConfig: false,
setAsDefault: true,
expectedDefaultRuntimeName: "nvidia",
expectedDefaultRuntimeBinary: nil,
},
{
legacyConfig: false,
setAsDefault: true,
runtimeClass: "NAME",
expectedDefaultRuntimeName: "NAME",
expectedDefaultRuntimeBinary: nil,
},
{
legacyConfig: false,
setAsDefault: true,
runtimeClass: "nvidia-experimental",
expectedDefaultRuntimeName: "nvidia-experimental",
expectedDefaultRuntimeBinary: nil,
},
}
for i, tc := range testCases {
o := &options{
useLegacyConfig: tc.legacyConfig,
setAsDefault: tc.setAsDefault,
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err, "%d: %v", i, tc)
err = UpdateV1Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
defaultRuntimeName := config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime_name"})
require.EqualValues(t, tc.expectedDefaultRuntimeName, defaultRuntimeName, "%d: %v", i, tc)
defaultRuntime := config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime"})
if tc.expectedDefaultRuntimeBinary == nil {
require.Nil(t, defaultRuntime, "%d: %v", i, tc)
} else {
expected, err := runtimeTomlConfigV1(tc.expectedDefaultRuntimeBinary.(string))
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(defaultRuntime.(*toml.Tree))
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v", i, tc)
}
}
}
func TestUpdateV1Config(t *testing.T) {
const runtimeDir = "/test/runtime/dir"
const expectedVersion = int64(1)
expectedBinaries := []string{
"/test/runtime/dir/nvidia-container-runtime",
"/test/runtime/dir/nvidia-container-runtime-experimental",
}
testCases := []struct {
runtimeClass string
expectedRuntimes []string
}{
{
runtimeClass: "nvidia",
expectedRuntimes: []string{"nvidia", "nvidia-experimental"},
},
{
runtimeClass: "NAME",
expectedRuntimes: []string{"NAME", "nvidia-experimental"},
},
{
runtimeClass: "nvidia-experimental",
expectedRuntimes: []string{"nvidia", "nvidia-experimental"},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err, "%d: %v", i, tc)
err = UpdateV1Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
version, ok := config.Get("version").(int64)
require.True(t, ok)
require.EqualValues(t, expectedVersion, version)
runtimes, ok := config.GetPath([]string{"plugins", "cri", "containerd", "runtimes"}).(*toml.Tree)
require.True(t, ok)
runtimeClasses := runtimes.Keys()
require.ElementsMatch(t, tc.expectedRuntimes, runtimeClasses, "%d: %v", i, tc)
for i, r := range tc.expectedRuntimes {
runtimeConfig := runtimes.Get(r)
expected, err := runtimeTomlConfigV1(expectedBinaries[i])
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(runtimeConfig)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, r, tc)
}
}
}
func TestUpdateV1ConfigWithRuncPresent(t *testing.T) {
const runcBinary = "/runc-binary"
const runtimeDir = "/test/runtime/dir"
const expectedVersion = int64(1)
expectedBinaries := []string{
runcBinary,
"/test/runtime/dir/nvidia-container-runtime",
"/test/runtime/dir/nvidia-container-runtime-experimental",
}
testCases := []struct {
runtimeClass string
expectedRuntimes []string
}{
{
runtimeClass: "nvidia",
expectedRuntimes: []string{"runc", "nvidia", "nvidia-experimental"},
},
{
runtimeClass: "NAME",
expectedRuntimes: []string{"runc", "NAME", "nvidia-experimental"},
},
{
runtimeClass: "nvidia-experimental",
expectedRuntimes: []string{"runc", "nvidia", "nvidia-experimental"},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(runcConfigMapV1(runcBinary))
require.NoError(t, err, "%d: %v", i, tc)
err = UpdateV1Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
version, ok := config.Get("version").(int64)
require.True(t, ok)
require.EqualValues(t, expectedVersion, version)
runtimes, ok := config.GetPath([]string{"plugins", "cri", "containerd", "runtimes"}).(*toml.Tree)
require.True(t, ok)
runtimeClasses := runtimes.Keys()
require.ElementsMatch(t, tc.expectedRuntimes, runtimeClasses, "%d: %v", i, tc)
for i, r := range tc.expectedRuntimes {
runtimeConfig := runtimes.Get(r)
expected, err := toml.TreeFromMap(runcRuntimeConfigMapV1(expectedBinaries[i]))
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(runtimeConfig)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, r, tc)
}
}
}
func TestRevertV1Config(t *testing.T) {
testCases := []struct {
config map[string]interface{}
expected map[string]interface{}
}{
{},
{
config: map[string]interface{}{
"version": int64(1),
},
},
{
config: map[string]interface{}{
"version": int64(1),
"plugins": map[string]interface{}{
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime"),
"nvidia-experimental": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime-experimental"),
},
},
},
},
},
},
{
config: map[string]interface{}{
"version": int64(1),
"plugins": map[string]interface{}{
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime"),
"nvidia-experimental": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime-experimental"),
},
"default_runtime": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime"),
"default_runtime_name": "nvidia",
},
},
},
},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: "nvidia",
}
config, err := toml.TreeFromMap(tc.config)
require.NoError(t, err, "%d: %v", i, tc)
expected, err := toml.TreeFromMap(tc.expected)
require.NoError(t, err, "%d: %v", i, tc)
err = RevertV1Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(config)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v", i, tc)
}
}
func runtimeTomlConfigV1(binary string) (*toml.Tree, error) {
return toml.TreeFromMap(runtimeMapV1(binary))
}
func runtimeMapV1(binary string) map[string]interface{} {
return map[string]interface{}{
"runtime_type": runtimeType,
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"options": map[string]interface{}{
"Runtime": binary,
},
}
}
func runcConfigMapV1(binary string) map[string]interface{} {
return map[string]interface{}{
"plugins": map[string]interface{}{
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": runcRuntimeConfigMapV1(binary),
},
},
},
},
}
}
func runcRuntimeConfigMapV1(binary string) map[string]interface{} {
return map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"Runtime": binary,
},
}
}


@@ -0,0 +1,59 @@
/**
# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"github.com/pelletier/go-toml"
)
// configV2 represents a V2 containerd config
type configV2 struct {
config
}
func newConfigV2(cfg *toml.Tree) UpdateReverter {
c := configV2{
config: config{
Tree: cfg,
version: 2,
cri: "io.containerd.grpc.v1.cri",
binaryKey: "BinaryName",
},
}
return &c
}
// Update performs an update specific to v2 of the containerd config
func (config *configV2) Update(o *options) error {
defaultRuntime := o.getDefaultRuntime()
for runtimeClass, runtimeBinary := range o.getRuntimeBinaries() {
setAsDefault := defaultRuntime == runtimeClass
config.update(runtimeClass, o.runtimeType, runtimeBinary, setAsDefault)
}
return nil
}
// Revert performs a revert specific to v2 of the containerd config
func (config *configV2) Revert(o *options) error {
for runtimeClass := range o.getRuntimeBinaries() {
config.revert(runtimeClass)
}
return nil
}


@@ -0,0 +1,329 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"testing"
"github.com/pelletier/go-toml"
"github.com/stretchr/testify/require"
)
const (
runtimeType = "runtime_type"
)
func TestUpdateV2ConfigDefaultRuntime(t *testing.T) {
const runtimeDir = "/test/runtime/dir"
testCases := []struct {
setAsDefault bool
runtimeClass string
expectedDefaultRuntimeName interface{}
}{
{},
{
setAsDefault: false,
runtimeClass: "nvidia",
expectedDefaultRuntimeName: nil,
},
{
setAsDefault: false,
runtimeClass: "NAME",
expectedDefaultRuntimeName: nil,
},
{
setAsDefault: false,
runtimeClass: "nvidia-experimental",
expectedDefaultRuntimeName: nil,
},
{
setAsDefault: true,
runtimeClass: "nvidia",
expectedDefaultRuntimeName: "nvidia",
},
{
setAsDefault: true,
runtimeClass: "NAME",
expectedDefaultRuntimeName: "NAME",
},
{
setAsDefault: true,
runtimeClass: "nvidia-experimental",
expectedDefaultRuntimeName: "nvidia-experimental",
},
}
for i, tc := range testCases {
o := &options{
setAsDefault: tc.setAsDefault,
runtimeClass: tc.runtimeClass,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err, "%d: %v", i, tc)
err = UpdateV2Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
defaultRuntimeName := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"})
require.EqualValues(t, tc.expectedDefaultRuntimeName, defaultRuntimeName, "%d: %v", i, tc)
}
}
func TestUpdateV2Config(t *testing.T) {
const runtimeDir = "/test/runtime/dir"
const expectedVersion = int64(2)
expectedBinaries := []string{
"/test/runtime/dir/nvidia-container-runtime",
"/test/runtime/dir/nvidia-container-runtime-experimental",
}
testCases := []struct {
runtimeClass string
expectedRuntimes []string
}{
{
runtimeClass: "nvidia",
expectedRuntimes: []string{"nvidia", "nvidia-experimental"},
},
{
runtimeClass: "NAME",
expectedRuntimes: []string{"NAME", "nvidia-experimental"},
},
{
runtimeClass: "nvidia-experimental",
expectedRuntimes: []string{"nvidia", "nvidia-experimental"},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err, "%d: %v", i, tc)
err = UpdateV2Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
version, ok := config.Get("version").(int64)
require.True(t, ok)
require.EqualValues(t, expectedVersion, version, "%d: %v", i, tc)
runtimes, ok := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes"}).(*toml.Tree)
require.True(t, ok)
runtimeClasses := runtimes.Keys()
require.ElementsMatch(t, tc.expectedRuntimes, runtimeClasses, "%d: %v", i, tc)
for i, r := range tc.expectedRuntimes {
runtimeConfig := runtimes.Get(r)
expected, err := runtimeTomlConfigV2(expectedBinaries[i])
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(runtimeConfig)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, r, tc)
}
}
}
func TestUpdateV2ConfigWithRuncPresent(t *testing.T) {
const runcBinary = "/runc-binary"
const runtimeDir = "/test/runtime/dir"
const expectedVersion = int64(2)
expectedBinaries := []string{
runcBinary,
"/test/runtime/dir/nvidia-container-runtime",
"/test/runtime/dir/nvidia-container-runtime-experimental",
}
testCases := []struct {
runtimeClass string
expectedRuntimes []string
}{
{
runtimeClass: "nvidia",
expectedRuntimes: []string{"runc", "nvidia", "nvidia-experimental"},
},
{
runtimeClass: "NAME",
expectedRuntimes: []string{"runc", "NAME", "nvidia-experimental"},
},
{
runtimeClass: "nvidia-experimental",
expectedRuntimes: []string{"runc", "nvidia", "nvidia-experimental"},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(runcConfigMapV2(runcBinary))
require.NoError(t, err, "%d: %v", i, tc)
err = UpdateV2Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
version, ok := config.Get("version").(int64)
require.True(t, ok)
require.EqualValues(t, expectedVersion, version)
runtimes, ok := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes"}).(*toml.Tree)
require.True(t, ok, "%d: %v", i, tc)
runtimeClasses := runtimes.Keys()
require.ElementsMatch(t, tc.expectedRuntimes, runtimeClasses, "%d: %v", i, tc)
for i, r := range tc.expectedRuntimes {
runtimeConfig := runtimes.Get(r)
expected, err := toml.TreeFromMap(runcRuntimeConfigMapV2(expectedBinaries[i]))
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(runtimeConfig)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, r, tc)
}
}
}
func TestRevertV2Config(t *testing.T) {
testCases := []struct {
config map[string]interface{}
expected map[string]interface{}
}{
{},
{
config: map[string]interface{}{
"version": int64(2),
},
},
{
config: map[string]interface{}{
"version": int64(2),
"plugins": map[string]interface{}{
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime"),
"nvidia-experimental": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime-experimental"),
},
},
},
},
},
},
{
config: map[string]interface{}{
"version": int64(2),
"plugins": map[string]interface{}{
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime"),
"nvidia-experimental": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime-experimental"),
},
"default_runtime_name": "nvidia",
},
},
},
},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: "nvidia",
}
config, err := toml.TreeFromMap(tc.config)
require.NoError(t, err, "%d: %v", i, tc)
expected, err := toml.TreeFromMap(tc.expected)
require.NoError(t, err, "%d: %v", i, tc)
err = RevertV2Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(config)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v", i, tc)
}
}
func runtimeTomlConfigV2(binary string) (*toml.Tree, error) {
return toml.TreeFromMap(runtimeMapV2(binary))
}
func runtimeMapV2(binary string) map[string]interface{} {
return map[string]interface{}{
"runtime_type": runtimeType,
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"options": map[string]interface{}{
"BinaryName": binary,
},
}
}
func runcConfigMapV2(binary string) map[string]interface{} {
return map[string]interface{}{
"plugins": map[string]interface{}{
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": runcRuntimeConfigMapV2(binary),
},
},
},
},
}
}
func runcRuntimeConfigMapV2(binary string) map[string]interface{} {
return map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": binary,
},
}
}


@@ -0,0 +1,587 @@
/**
# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"fmt"
"net"
"os"
"os/exec"
"path/filepath"
"syscall"
"time"
"github.com/containerd/containerd/plugin"
toml "github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
)
const (
restartModeSignal = "signal"
restartModeSystemd = "systemd"
restartModeNone = "NONE"
nvidiaRuntimeName = "nvidia"
nvidiaRuntimeBinary = "nvidia-container-runtime"
nvidiaExperimentalRuntimeName = "nvidia-experimental"
nvidiaExperimentalRuntimeBinary = "nvidia-container-runtime-experimental"
defaultConfig = "/etc/containerd/config.toml"
defaultSocket = "/run/containerd/containerd.sock"
defaultRuntimeClass = "nvidia"
defaultRuntimeType = plugin.RuntimeRuncV2
defaultSetAsDefault = true
defaultRestartMode = restartModeSignal
defaultHostRootMount = "/host"
reloadBackoff = 5 * time.Second
maxReloadAttempts = 6
socketMessageToGetPID = ""
)
// nvidiaRuntimeBinaries defines a map of runtime names to binary names
var nvidiaRuntimeBinaries = map[string]string{
nvidiaRuntimeName: nvidiaRuntimeBinary,
nvidiaExperimentalRuntimeName: nvidiaExperimentalRuntimeBinary,
}
// options stores the configuration from the command line or environment variables
type options struct {
config string
socket string
runtimeClass string
runtimeType string
setAsDefault bool
restartMode string
hostRootMount string
runtimeDir string
useLegacyConfig bool
}
func main() {
options := options{}
// Create the top-level CLI
c := cli.NewApp()
c.Name = "containerd"
c.Usage = "Update a containerd config with the nvidia-container-runtime"
c.Version = "0.1.0"
// Create the 'setup' subcommand
setup := cli.Command{}
setup.Name = "setup"
setup.Usage = "Trigger a containerd config to be updated"
setup.ArgsUsage = "<runtime_dirname>"
setup.Action = func(c *cli.Context) error {
return Setup(c, &options)
}
// Create the 'cleanup' subcommand
cleanup := cli.Command{}
cleanup.Name = "cleanup"
cleanup.Usage = "Trigger any updates made to a containerd config to be undone"
cleanup.ArgsUsage = "<runtime_dirname>"
cleanup.Action = func(c *cli.Context) error {
return Cleanup(c, &options)
}
// Register the subcommands with the top-level CLI
c.Commands = []*cli.Command{
&setup,
&cleanup,
}
// Setup common flags across both subcommands. All subcommands get the same
// set of flags even if they don't use some of them. This is so that we
// only require the user to specify one set of flags for both 'setup'
// and 'cleanup' to simplify things.
commonFlags := []cli.Flag{
&cli.StringFlag{
Name: "config",
Aliases: []string{"c"},
Usage: "Path to the containerd config file",
Value: defaultConfig,
Destination: &options.config,
EnvVars: []string{"CONTAINERD_CONFIG"},
},
&cli.StringFlag{
Name: "socket",
Aliases: []string{"s"},
Usage: "Path to the containerd socket file",
Value: defaultSocket,
Destination: &options.socket,
EnvVars: []string{"CONTAINERD_SOCKET"},
},
&cli.StringFlag{
Name: "runtime-class",
Aliases: []string{"r"},
Usage: "The name of the runtime class to set for the nvidia-container-runtime",
Value: defaultRuntimeClass,
Destination: &options.runtimeClass,
EnvVars: []string{"CONTAINERD_RUNTIME_CLASS"},
},
&cli.StringFlag{
Name: "runtime-type",
Usage: "The runtime_type to use for the configured runtime classes",
Value: defaultRuntimeType,
Destination: &options.runtimeType,
EnvVars: []string{"CONTAINERD_RUNTIME_TYPE"},
},
// The flags below are only used by the 'setup' command.
&cli.BoolFlag{
Name: "set-as-default",
Aliases: []string{"d"},
Usage: "Set nvidia-container-runtime as the default runtime",
Value: defaultSetAsDefault,
Destination: &options.setAsDefault,
EnvVars: []string{"CONTAINERD_SET_AS_DEFAULT"},
Hidden: true,
},
&cli.StringFlag{
Name: "restart-mode",
Usage: "Specify how containerd should be restarted; [signal | systemd]",
Value: defaultRestartMode,
Destination: &options.restartMode,
EnvVars: []string{"CONTAINERD_RESTART_MODE"},
},
&cli.StringFlag{
Name: "host-root",
Usage: "Specify the path to the host root to be used when restarting containerd using systemd",
Value: defaultHostRootMount,
Destination: &options.hostRootMount,
EnvVars: []string{"HOST_ROOT_MOUNT"},
},
&cli.BoolFlag{
Name: "use-legacy-config",
Usage: "Specify whether a legacy (pre v1.3) config should be used",
Destination: &options.useLegacyConfig,
EnvVars: []string{"CONTAINERD_USE_LEGACY_CONFIG"},
},
}
// Update the subcommand flags with the common subcommand flags
setup.Flags = append([]cli.Flag{}, commonFlags...)
cleanup.Flags = append([]cli.Flag{}, commonFlags...)
// Run the top-level CLI
if err := c.Run(os.Args); err != nil {
log.Fatal(fmt.Errorf("Error: %v", err))
}
}
// Setup updates a containerd configuration to include the nvidia-container-runtime and reloads it
func Setup(c *cli.Context, o *options) error {
log.Infof("Starting 'setup' for %v", c.App.Name)
runtimeDir, err := ParseArgs(c)
if err != nil {
return fmt.Errorf("unable to parse args: %v", err)
}
o.runtimeDir = runtimeDir
cfg, err := LoadConfig(o.config)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
version, err := ParseVersion(cfg, o.useLegacyConfig)
if err != nil {
return fmt.Errorf("unable to parse version: %v", err)
}
err = UpdateConfig(cfg, o, version)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
err = FlushConfig(o.config, cfg)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
err = RestartContainerd(o)
if err != nil {
return fmt.Errorf("unable to restart containerd: %v", err)
}
log.Infof("Completed 'setup' for %v", c.App.Name)
return nil
}
// Cleanup reverts a containerd configuration to remove the nvidia-container-runtime and reloads it
func Cleanup(c *cli.Context, o *options) error {
log.Infof("Starting 'cleanup' for %v", c.App.Name)
_, err := ParseArgs(c)
if err != nil {
return fmt.Errorf("unable to parse args: %v", err)
}
cfg, err := LoadConfig(o.config)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
version, err := ParseVersion(cfg, o.useLegacyConfig)
if err != nil {
return fmt.Errorf("unable to parse version: %v", err)
}
err = RevertConfig(cfg, o, version)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
err = FlushConfig(o.config, cfg)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
err = RestartContainerd(o)
if err != nil {
return fmt.Errorf("unable to restart containerd: %v", err)
}
log.Infof("Completed 'cleanup' for %v", c.App.Name)
return nil
}
// ParseArgs parses the command line arguments to the CLI
func ParseArgs(c *cli.Context) (string, error) {
args := c.Args()
log.Infof("Parsing arguments: %v", args.Slice())
if args.Len() != 1 {
return "", fmt.Errorf("incorrect number of arguments")
}
runtimeDir := args.Get(0)
log.Infof("Successfully parsed arguments")
return runtimeDir, nil
}
// LoadConfig loads the containerd config from disk
func LoadConfig(config string) (*toml.Tree, error) {
log.Infof("Loading config: %v", config)
info, err := os.Stat(config)
if err == nil && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
configFile := config
if os.IsNotExist(err) {
configFile = "/dev/null"
log.Infof("Config file does not exist, creating new one")
}
cfg, err := toml.LoadFile(configFile)
if err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
return cfg, nil
}
// ParseVersion parses the version field out of the containerd config
func ParseVersion(config *toml.Tree, useLegacyConfig bool) (int, error) {
var defaultVersion int
if !useLegacyConfig {
defaultVersion = 2
} else {
defaultVersion = 1
}
var version int
switch v := config.Get("version").(type) {
case nil:
switch len(config.Keys()) {
case 0: // No config exists, or the config file is empty, use version inferred from containerd
version = defaultVersion
default: // A config file exists, has content, and no version is set
version = 1
}
case int64:
version = int(v)
default:
return -1, fmt.Errorf("unsupported type for version field: %v", v)
}
log.Infof("Config version: %v", version)
if version == 1 {
log.Warnf("Support for containerd config version 1 is deprecated")
}
return version, nil
}
// UpdateConfig updates the containerd config to include the nvidia-container-runtime
func UpdateConfig(config *toml.Tree, o *options, version int) error {
var err error
log.Infof("Updating config")
switch version {
case 1:
err = UpdateV1Config(config, o)
case 2:
err = UpdateV2Config(config, o)
default:
err = fmt.Errorf("unsupported containerd config version: %v", version)
}
if err != nil {
return err
}
log.Infof("Successfully updated config")
return nil
}
// RevertConfig reverts the containerd config to remove the nvidia-container-runtime
func RevertConfig(config *toml.Tree, o *options, version int) error {
var err error
log.Infof("Reverting config")
switch version {
case 1:
err = RevertV1Config(config, o)
case 2:
err = RevertV2Config(config, o)
default:
err = fmt.Errorf("unsupported containerd config version: %v", version)
}
if err != nil {
return err
}
log.Infof("Successfully reverted config")
return nil
}
// UpdateV1Config performs an update specific to v1 of the containerd config
func UpdateV1Config(config *toml.Tree, o *options) error {
c := newConfigV1(config)
return c.Update(o)
}
// RevertV1Config performs a revert specific to v1 of the containerd config
func RevertV1Config(config *toml.Tree, o *options) error {
c := newConfigV1(config)
return c.Revert(o)
}
// UpdateV2Config performs an update specific to v2 of the containerd config
func UpdateV2Config(config *toml.Tree, o *options) error {
c := newConfigV2(config)
return c.Update(o)
}
// RevertV2Config performs a revert specific to v2 of the containerd config
func RevertV2Config(config *toml.Tree, o *options) error {
c := newConfigV2(config)
return c.Revert(o)
}
// FlushConfig flushes the updated/reverted config out to disk
func FlushConfig(config string, cfg *toml.Tree) error {
log.Infof("Flushing config")
output, err := cfg.ToTomlString()
if err != nil {
return fmt.Errorf("unable to convert to TOML: %v", err)
}
switch len(output) {
case 0:
err := os.Remove(config)
if err != nil {
return fmt.Errorf("unable to remove empty file: %v", err)
}
log.Infof("Config empty, removing file")
default:
f, err := os.Create(config)
if err != nil {
return fmt.Errorf("unable to open '%v' for writing: %v", config, err)
}
defer f.Close()
_, err = f.WriteString(output)
if err != nil {
return fmt.Errorf("unable to write output: %v", err)
}
}
log.Infof("Successfully flushed config")
return nil
}
// RestartContainerd restarts containerd depending on the value of restartModeFlag
func RestartContainerd(o *options) error {
switch o.restartMode {
case restartModeNone:
log.Warnf("Skipping sending signal to containerd due to --restart-mode=%v", o.restartMode)
return nil
case restartModeSignal:
err := SignalContainerd(o)
if err != nil {
return fmt.Errorf("unable to signal containerd: %v", err)
}
case restartModeSystemd:
return RestartContainerdSystemd(o.hostRootMount)
default:
return fmt.Errorf("Invalid restart mode specified: %v", o.restartMode)
}
return nil
}
// SignalContainerd sends a SIGHUP signal to the containerd daemon
func SignalContainerd(o *options) error {
log.Infof("Sending SIGHUP signal to containerd")
// Wrap the logic to perform the SIGHUP in a function so we can retry it on failure
retriable := func() error {
conn, err := net.Dial("unix", o.socket)
if err != nil {
return fmt.Errorf("unable to dial: %v", err)
}
defer conn.Close()
sconn, err := conn.(*net.UnixConn).SyscallConn()
if err != nil {
return fmt.Errorf("unable to get syscall connection: %v", err)
}
err1 := sconn.Control(func(fd uintptr) {
err = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, syscall.SO_PASSCRED, 1)
})
if err1 != nil {
return fmt.Errorf("unable to issue call on socket fd: %v", err1)
}
if err != nil {
return fmt.Errorf("unable to SetsockoptInt on socket fd: %v", err)
}
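// Write an empty message to the socket; with SO_PASSCRED set, the kernel
// attaches the peer's credentials (including the containerd PID) to the reply.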
_, _, err = conn.(*net.UnixConn).WriteMsgUnix([]byte(socketMessageToGetPID), nil, nil)
if err != nil {
return fmt.Errorf("unable to WriteMsgUnix on socket fd: %v", err)
}
oob := make([]byte, 1024)
_, oobn, _, _, err := conn.(*net.UnixConn).ReadMsgUnix(nil, oob)
if err != nil {
return fmt.Errorf("unable to ReadMsgUnix on socket fd: %v", err)
}
oob = oob[:oobn]
scm, err := syscall.ParseSocketControlMessage(oob)
if err != nil {
return fmt.Errorf("unable to ParseSocketControlMessage from message received on socket fd: %v", err)
}
ucred, err := syscall.ParseUnixCredentials(&scm[0])
if err != nil {
return fmt.Errorf("unable to ParseUnixCredentials from message received on socket fd: %v", err)
}
err = syscall.Kill(int(ucred.Pid), syscall.SIGHUP)
if err != nil {
return fmt.Errorf("unable to send SIGHUP to 'containerd' process: %v", err)
}
return nil
}
// Try to send a SIGHUP up to maxReloadAttempts times
var err error
for i := 0; i < maxReloadAttempts; i++ {
err = retriable()
if err == nil {
break
}
if i == maxReloadAttempts-1 {
break
}
log.Warnf("Error signaling containerd, attempt %v/%v: %v", i+1, maxReloadAttempts, err)
time.Sleep(reloadBackoff)
}
if err != nil {
log.Warnf("Max retries reached %v/%v, aborting", maxReloadAttempts, maxReloadAttempts)
return err
}
log.Infof("Successfully signaled containerd")
return nil
}
// RestartContainerdSystemd restarts containerd using systemctl
func RestartContainerdSystemd(hostRootMount string) error {
log.Infof("Restarting containerd using systemd and host root mounted at %v", hostRootMount)
command := "chroot"
args := []string{hostRootMount, "systemctl", "restart", "containerd"}
cmd := exec.Command(command, args...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
err := cmd.Run()
if err != nil {
return fmt.Errorf("error restarting containerd using systemd: %v", err)
}
return nil
}
// getDefaultRuntime returns the default runtime for the configured options.
// If the configuration is invalid or the default runtime should not be set,
// the empty string is returned.
func (o options) getDefaultRuntime() string {
if o.setAsDefault {
if o.runtimeClass == nvidiaExperimentalRuntimeName {
return nvidiaExperimentalRuntimeName
}
if o.runtimeClass == "" {
return defaultRuntimeClass
}
return o.runtimeClass
}
return ""
}
// getRuntimeBinaries returns a map of runtime names to binary paths. This includes the
// renaming of the `nvidia` runtime as per the --runtime-class command line flag.
func (o options) getRuntimeBinaries() map[string]string {
runtimeBinaries := make(map[string]string)
for rt, bin := range nvidiaRuntimeBinaries {
runtime := rt
if o.runtimeClass != "" && o.runtimeClass != nvidiaExperimentalRuntimeName && runtime == defaultRuntimeClass {
runtime = o.runtimeClass
}
runtimeBinaries[runtime] = filepath.Join(o.runtimeDir, bin)
}
return runtimeBinaries
}


@@ -0,0 +1,106 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"testing"
"github.com/stretchr/testify/require"
)
func TestOptions(t *testing.T) {
testCases := []struct {
options options
expectedDefaultRuntime string
expectedRuntimeBinaries map[string]string
}{
{
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: true,
},
expectedDefaultRuntime: "nvidia",
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: true,
runtimeClass: "nvidia",
},
expectedDefaultRuntime: "nvidia",
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: true,
runtimeClass: "NAME",
},
expectedDefaultRuntime: "NAME",
expectedRuntimeBinaries: map[string]string{
"NAME": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: false,
runtimeClass: "NAME",
},
expectedRuntimeBinaries: map[string]string{
"NAME": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: true,
runtimeClass: "nvidia-experimental",
},
expectedDefaultRuntime: "nvidia-experimental",
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: false,
runtimeClass: "nvidia-experimental",
},
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
}
for i, tc := range testCases {
require.Equal(t, tc.expectedDefaultRuntime, tc.options.getDefaultRuntime(), "%d: %v", i, tc)
require.EqualValues(t, tc.expectedRuntimeBinaries, tc.options.getRuntimeBinaries(), "%d: %v", i, tc)
}
}

View File

@@ -0,0 +1,185 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"encoding/json"
"fmt"
"os"
"path/filepath"
hooks "github.com/containers/podman/v2/pkg/hooks/1.0.0"
rspec "github.com/opencontainers/runtime-spec/specs-go"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
)
const (
defaultHooksDir = "/usr/share/containers/oci/hooks.d"
defaultHookFilename = "oci-nvidia-hook.json"
)
var hooksDirFlag string
var hookFilenameFlag string
var toolkitDirArg string
func main() {
// Create the top-level CLI
c := cli.NewApp()
c.Name = "crio"
c.Usage = "Update cri-o hooks to include the NVIDIA runtime hook"
c.ArgsUsage = "<toolkit_dirname>"
c.Version = "0.1.0"
// Create the 'setup' subcommand
setup := cli.Command{}
setup.Name = "setup"
setup.Usage = "Create the cri-o hook required to run NVIDIA GPU containers"
setup.ArgsUsage = "<toolkit_dirname>"
setup.Action = Setup
setup.Before = ParseArgs
// Create the 'cleanup' subcommand
cleanup := cli.Command{}
cleanup.Name = "cleanup"
cleanup.Usage = "Remove the NVIDIA cri-o hook"
cleanup.Action = Cleanup
// Register the subcommands with the top-level CLI
c.Commands = []*cli.Command{
&setup,
&cleanup,
}
// Setup common flags across both subcommands. All subcommands get the same
// set of flags even if they don't use some of them. This is so that we
// only require the user to specify one set of flags for both 'setup'
// and 'cleanup' to simplify things.
commonFlags := []cli.Flag{
&cli.StringFlag{
Name: "hooks-dir",
Aliases: []string{"d"},
Usage: "path to the cri-o hooks directory",
Value: defaultHooksDir,
Destination: &hooksDirFlag,
EnvVars: []string{"CRIO_HOOKS_DIR"},
DefaultText: defaultHooksDir,
},
&cli.StringFlag{
Name: "hook-filename",
Aliases: []string{"f"},
Usage: "filename of the cri-o hook that will be created / removed in the hooks directory",
Value: defaultHookFilename,
Destination: &hookFilenameFlag,
EnvVars: []string{"CRIO_HOOK_FILENAME"},
DefaultText: defaultHookFilename,
},
}
// Update the subcommand flags with the common subcommand flags
setup.Flags = append([]cli.Flag{}, commonFlags...)
cleanup.Flags = append([]cli.Flag{}, commonFlags...)
// Run the top-level CLI
if err := c.Run(os.Args); err != nil {
log.Fatal(fmt.Errorf("error: %v", err))
}
}
// Setup installs the prestart hook required to launch GPU-enabled containers
func Setup(c *cli.Context) error {
log.Infof("Starting 'setup' for %v", c.App.Name)
err := os.MkdirAll(hooksDirFlag, 0755)
if err != nil {
return fmt.Errorf("error creating hooks directory %v: %v", hooksDirFlag, err)
}
hookPath := getHookPath(hooksDirFlag, hookFilenameFlag)
err = createHook(toolkitDirArg, hookPath)
if err != nil {
return fmt.Errorf("error creating hook: %v", err)
}
return nil
}
// Cleanup removes the specified prestart hook
func Cleanup(c *cli.Context) error {
log.Infof("Starting 'cleanup' for %v", c.App.Name)
hookPath := getHookPath(hooksDirFlag, hookFilenameFlag)
err := os.Remove(hookPath)
if err != nil {
return fmt.Errorf("error removing hook '%v': %v", hookPath, err)
}
return nil
}
// ParseArgs parses the command line arguments to the CLI
func ParseArgs(c *cli.Context) error {
args := c.Args()
log.Infof("Parsing arguments: %v", args.Slice())
if c.NArg() != 1 {
return fmt.Errorf("incorrect number of arguments")
}
toolkitDirArg = args.Get(0)
log.Infof("Successfully parsed arguments")
return nil
}
func createHook(toolkitDir string, hookPath string) error {
hook, err := os.Create(hookPath)
if err != nil {
return fmt.Errorf("error creating hook file '%v': %v", hookPath, err)
}
defer hook.Close()
encoder := json.NewEncoder(hook)
err = encoder.Encode(generateOciHook(toolkitDir))
if err != nil {
return fmt.Errorf("error writing hook file '%v': %v", hookPath, err)
}
return nil
}
func getHookPath(hooksDir string, hookFilename string) string {
return filepath.Join(hooksDir, hookFilename)
}
func generateOciHook(toolkitDir string) hooks.Hook {
hookPath := filepath.Join(toolkitDir, "nvidia-container-toolkit")
envPath := "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:" + toolkitDir
always := true
hook := hooks.Hook{
Version: "1.0.0",
Stages: []string{"prestart"},
Hook: rspec.Hook{
Path: hookPath,
Args: []string{"nvidia-container-toolkit", "prestart"},
Env: []string{envPath},
},
When: hooks.When{
Always: &always,
Commands: []string{".*"},
},
}
return hook
}
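For a toolkit directory of /usr/local/nvidia/toolkit (illustrative), generateOciHook produces a hook that serializes approximately as follows (exact field order depends on the podman hooks package's JSON tags):

	{
	    "version": "1.0.0",
	    "hook": {
	        "path": "/usr/local/nvidia/toolkit/nvidia-container-toolkit",
	        "args": ["nvidia-container-toolkit", "prestart"],
	        "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/toolkit"]
	    },
	    "when": {
	        "always": true,
	        "commands": [".*"]
	    },
	    "stages": ["prestart"]
	}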

View File

@@ -0,0 +1,462 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"bytes"
"encoding/json"
"fmt"
"io/ioutil"
"net"
"os"
"path/filepath"
"syscall"
"time"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
)
const (
nvidiaRuntimeName = "nvidia"
nvidiaRuntimeBinary = "nvidia-container-runtime"
nvidiaExperimentalRuntimeName = "nvidia-experimental"
nvidiaExperimentalRuntimeBinary = "nvidia-container-runtime-experimental"
defaultConfig = "/etc/docker/daemon.json"
defaultSocket = "/var/run/docker.sock"
defaultSetAsDefault = true
// defaultRuntimeName specifies the NVIDIA runtime to be used as the default runtime if setting the default runtime is enabled
defaultRuntimeName = nvidiaRuntimeName
reloadBackoff = 5 * time.Second
maxReloadAttempts = 6
defaultDockerRuntime = "runc"
socketMessageToGetPID = "GET /info HTTP/1.0\r\n\r\n"
)
// nvidiaRuntimeBinaries defines a map of runtime names to binary names
var nvidiaRuntimeBinaries = map[string]string{
nvidiaRuntimeName: nvidiaRuntimeBinary,
nvidiaExperimentalRuntimeName: nvidiaExperimentalRuntimeBinary,
}
// options stores the configuration from the command line or environment variables
type options struct {
config string
socket string
runtimeName string
setAsDefault bool
runtimeDir string
}
func main() {
options := options{}
// Create the top-level CLI
c := cli.NewApp()
c.Name = "docker"
c.Usage = "Update docker config with the nvidia runtime"
c.Version = "0.1.0"
// Create the 'setup' subcommand
setup := cli.Command{}
setup.Name = "setup"
setup.Usage = "Trigger docker config to be updated"
setup.ArgsUsage = "<runtime_dirname>"
setup.Action = func(c *cli.Context) error {
return Setup(c, &options)
}
// Create the 'cleanup' subcommand
cleanup := cli.Command{}
cleanup.Name = "cleanup"
cleanup.Usage = "Trigger any updates made to docker config to be undone"
cleanup.ArgsUsage = "<runtime_dirname>"
cleanup.Action = func(c *cli.Context) error {
return Cleanup(c, &options)
}
// Register the subcommands with the top-level CLI
c.Commands = []*cli.Command{
&setup,
&cleanup,
}
// Setup common flags across both subcommands. All subcommands get the same
// set of flags even if they don't use some of them. This is so that we
// only require the user to specify one set of flags for both 'setup'
// and 'cleanup' to simplify things.
commonFlags := []cli.Flag{
&cli.StringFlag{
Name: "config",
Aliases: []string{"c"},
Usage: "Path to docker config file",
Value: defaultConfig,
Destination: &options.config,
EnvVars: []string{"DOCKER_CONFIG"},
},
&cli.StringFlag{
Name: "socket",
Aliases: []string{"s"},
Usage: "Path to the docker socket file",
Value: defaultSocket,
Destination: &options.socket,
EnvVars: []string{"DOCKER_SOCKET"},
},
// The flags below are only used by the 'setup' command.
&cli.StringFlag{
Name: "runtime-name",
Aliases: []string{"r"},
Usage: "Specify the name of the `nvidia` runtime. If set-as-default is selected, the runtime is used as the default runtime.",
Value: defaultRuntimeName,
Destination: &options.runtimeName,
EnvVars: []string{"DOCKER_RUNTIME_NAME"},
},
&cli.BoolFlag{
Name: "set-as-default",
Aliases: []string{"d"},
Usage: "Set the `nvidia` runtime as the default runtime. If --runtime-name is specified as `nvidia-experimental` the experimental runtime is set as the default runtime instead",
Value: defaultSetAsDefault,
Destination: &options.setAsDefault,
EnvVars: []string{"DOCKER_SET_AS_DEFAULT"},
Hidden: true,
},
}
// Update the subcommand flags with the common subcommand flags
setup.Flags = append([]cli.Flag{}, commonFlags...)
cleanup.Flags = append([]cli.Flag{}, commonFlags...)
// Run the top-level CLI
if err := c.Run(os.Args); err != nil {
log.Errorf("Error running docker configuration: %v", err)
os.Exit(1)
}
}
// Setup updates docker configuration to include the nvidia runtime and reloads it
func Setup(c *cli.Context, o *options) error {
log.Infof("Starting 'setup' for %v", c.App.Name)
runtimeDir, err := ParseArgs(c)
if err != nil {
return fmt.Errorf("unable to parse args: %v", err)
}
o.runtimeDir = runtimeDir
cfg, err := LoadConfig(o.config)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
err = UpdateConfig(cfg, o)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
err = FlushConfig(cfg, o.config)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
err = SignalDocker(o.socket)
if err != nil {
return fmt.Errorf("unable to signal docker: %v", err)
}
log.Infof("Completed 'setup' for %v", c.App.Name)
return nil
}
// Cleanup reverts docker configuration to remove the nvidia runtime and reloads it
func Cleanup(c *cli.Context, o *options) error {
log.Infof("Starting 'cleanup' for %v", c.App.Name)
_, err := ParseArgs(c)
if err != nil {
return fmt.Errorf("unable to parse args: %v", err)
}
cfg, err := LoadConfig(o.config)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
err = RevertConfig(cfg)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
err = FlushConfig(cfg, o.config)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
err = SignalDocker(o.socket)
if err != nil {
return fmt.Errorf("unable to signal docker: %v", err)
}
log.Infof("Completed 'cleanup' for %v", c.App.Name)
return nil
}
// ParseArgs parses the command line arguments to the CLI
func ParseArgs(c *cli.Context) (string, error) {
args := c.Args()
log.Infof("Parsing arguments: %v", args.Slice())
if args.Len() != 1 {
return "", fmt.Errorf("incorrect number of arguments")
}
runtimeDir := args.Get(0)
log.Infof("Successfully parsed arguments")
return runtimeDir, nil
}
// LoadConfig loads the docker config from disk
func LoadConfig(config string) (map[string]interface{}, error) {
log.Infof("Loading config: %v", config)
info, err := os.Stat(config)
if err == nil && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
cfg := make(map[string]interface{})
if os.IsNotExist(err) {
log.Infof("Config file does not exist, creating new one")
return cfg, nil
}
readBytes, err := ioutil.ReadFile(config)
if err != nil {
return nil, fmt.Errorf("unable to read config: %v", err)
}
reader := bytes.NewReader(readBytes)
if err := json.NewDecoder(reader).Decode(&cfg); err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
return cfg, nil
}
// UpdateConfig updates the docker config to include the nvidia runtimes
func UpdateConfig(config map[string]interface{}, o *options) error {
defaultRuntime := o.getDefaultRuntime()
if defaultRuntime != "" {
config["default-runtime"] = defaultRuntime
}
runtimes := make(map[string]interface{})
if _, exists := config["runtimes"]; exists {
runtimes = config["runtimes"].(map[string]interface{})
}
for name, rt := range o.runtimes() {
runtimes[name] = rt
}
config["runtimes"] = runtimes
return nil
}
// RevertConfig reverts the docker config to remove the nvidia runtimes
func RevertConfig(config map[string]interface{}) error {
if _, exists := config["default-runtime"]; exists {
defaultRuntime := config["default-runtime"].(string)
if _, exists := nvidiaRuntimeBinaries[defaultRuntime]; exists {
config["default-runtime"] = defaultDockerRuntime
}
}
if _, exists := config["runtimes"]; exists {
runtimes := config["runtimes"].(map[string]interface{})
for name := range nvidiaRuntimeBinaries {
delete(runtimes, name)
}
if len(runtimes) == 0 {
delete(config, "runtimes")
}
}
return nil
}
// FlushConfig flushes the updated/reverted config out to disk
func FlushConfig(cfg map[string]interface{}, config string) error {
log.Infof("Flushing config")
output, err := json.MarshalIndent(cfg, "", " ")
if err != nil {
return fmt.Errorf("unable to convert to JSON: %v", err)
}
switch len(output) {
case 0:
err := os.Remove(config)
if err != nil {
return fmt.Errorf("unable to remove empty file: %v", err)
}
log.Infof("Config empty, removing file")
default:
f, err := os.Create(config)
if err != nil {
return fmt.Errorf("unable to open %v for writing: %v", config, err)
}
defer f.Close()
_, err = f.Write(output)
if err != nil {
return fmt.Errorf("unable to write output: %v", err)
}
}
log.Infof("Successfully flushed config")
return nil
}
// SignalDocker sends a SIGHUP signal to docker daemon
func SignalDocker(socket string) error {
log.Infof("Sending SIGHUP signal to docker")
// Wrap the logic to perform the SIGHUP in a function so we can retry it on failure
retriable := func() error {
conn, err := net.Dial("unix", socket)
if err != nil {
return fmt.Errorf("unable to dial: %v", err)
}
defer conn.Close()
sconn, err := conn.(*net.UnixConn).SyscallConn()
if err != nil {
return fmt.Errorf("unable to get syscall connection: %v", err)
}
err1 := sconn.Control(func(fd uintptr) {
err = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, syscall.SO_PASSCRED, 1)
})
if err1 != nil {
return fmt.Errorf("unable to issue call on socket fd: %v", err1)
}
if err != nil {
return fmt.Errorf("unable to SetsockoptInt on socket fd: %v", err)
}
_, _, err = conn.(*net.UnixConn).WriteMsgUnix([]byte(socketMessageToGetPID), nil, nil)
if err != nil {
return fmt.Errorf("unable to WriteMsgUnix on socket fd: %v", err)
}
oob := make([]byte, 1024)
_, oobn, _, _, err := conn.(*net.UnixConn).ReadMsgUnix(nil, oob)
if err != nil {
return fmt.Errorf("unable to ReadMsgUnix on socket fd: %v", err)
}
oob = oob[:oobn]
scm, err := syscall.ParseSocketControlMessage(oob)
if err != nil {
return fmt.Errorf("unable to ParseSocketControlMessage from message received on socket fd: %v", err)
}
ucred, err := syscall.ParseUnixCredentials(&scm[0])
if err != nil {
return fmt.Errorf("unable to ParseUnixCredentials from message received on socket fd: %v", err)
}
err = syscall.Kill(int(ucred.Pid), syscall.SIGHUP)
if err != nil {
return fmt.Errorf("unable to send SIGHUP to 'docker' process: %v", err)
}
return nil
}
// Try to send a SIGHUP up to maxReloadAttempts times
var err error
for i := 0; i < maxReloadAttempts; i++ {
err = retriable()
if err == nil {
break
}
if i == maxReloadAttempts-1 {
break
}
log.Warnf("Error signaling docker, attempt %v/%v: %v", i+1, maxReloadAttempts, err)
time.Sleep(reloadBackoff)
}
if err != nil {
log.Warnf("Max retries reached %v/%v, aborting", maxReloadAttempts, maxReloadAttempts)
return err
}
log.Infof("Successfully signaled docker")
return nil
}
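As an aside, the daemon PID obtained above via SCM_CREDENTIALS can also be read directly with the SO_PEERCRED socket option. A minimal sketch, assuming Linux and the golang.org/x/sys/unix package (already a dependency of this repository); the helper name peerPID is hypothetical:

	package main

	import (
		"fmt"
		"net"

		unix "golang.org/x/sys/unix"
	)

	// peerPID connects to a unix socket and returns the PID of the peer
	// process, using SO_PEERCRED instead of SCM_CREDENTIALS.
	func peerPID(socket string) (int, error) {
		conn, err := net.Dial("unix", socket)
		if err != nil {
			return 0, err
		}
		defer conn.Close()

		sconn, err := conn.(*net.UnixConn).SyscallConn()
		if err != nil {
			return 0, err
		}
		var ucred *unix.Ucred
		var serr error
		if err := sconn.Control(func(fd uintptr) {
			ucred, serr = unix.GetsockoptUcred(int(fd), unix.SOL_SOCKET, unix.SO_PEERCRED)
		}); err != nil {
			return 0, err
		}
		if serr != nil {
			return 0, serr
		}
		return int(ucred.Pid), nil
	}

	func main() {
		pid, err := peerPID("/var/run/docker.sock")
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println("docker daemon PID:", pid)
	}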
// getDefaultRuntime returns the default runtime for the configured options.
// If the configuration is invalid, or the default runtime should not be set,
// the empty string is returned.
func (o options) getDefaultRuntime() string {
if !o.setAsDefault {
return ""
}
return o.runtimeName
}
// runtimes returns the docker runtime definitions for the supported nvidia runtimes
// for the given options. This includes the path with the options runtimeDir applied
func (o options) runtimes() map[string]interface{} {
runtimes := make(map[string]interface{})
for r, bin := range o.getRuntimeBinaries() {
runtimes[r] = map[string]interface{}{
"path": bin,
"args": []string{},
}
}
return runtimes
}
// getRuntimeBinaries returns a map of runtime names to binary paths. This includes the
// renaming of the `nvidia` runtime as per the --runtime-name command line flag.
func (o options) getRuntimeBinaries() map[string]string {
runtimeBinaries := make(map[string]string)
for rt, bin := range nvidiaRuntimeBinaries {
runtime := rt
if o.runtimeName != "" && o.runtimeName != nvidiaExperimentalRuntimeName && runtime == defaultRuntimeName {
runtime = o.runtimeName
}
runtimeBinaries[runtime] = filepath.Join(o.runtimeDir, bin)
}
return runtimeBinaries
}

View File

@@ -0,0 +1,423 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"encoding/json"
"testing"
"github.com/stretchr/testify/require"
)
func TestUpdateConfigDefaultRuntime(t *testing.T) {
const runtimeDir = "/test/runtime/dir"
testCases := []struct {
setAsDefault bool
runtimeName string
expectedDefaultRuntimeName interface{}
}{
{},
{
setAsDefault: false,
expectedDefaultRuntimeName: nil,
},
{
setAsDefault: true,
runtimeName: "NAME",
expectedDefaultRuntimeName: "NAME",
},
{
setAsDefault: true,
runtimeName: "nvidia-experimental",
expectedDefaultRuntimeName: "nvidia-experimental",
},
{
setAsDefault: true,
runtimeName: "nvidia",
expectedDefaultRuntimeName: "nvidia",
},
}
for i, tc := range testCases {
o := &options{
setAsDefault: tc.setAsDefault,
runtimeName: tc.runtimeName,
runtimeDir: runtimeDir,
}
config := map[string]interface{}{}
err := UpdateConfig(config, o)
require.NoError(t, err, "%d: %v", i, tc)
defaultRuntimeName := config["default-runtime"]
require.EqualValues(t, tc.expectedDefaultRuntimeName, defaultRuntimeName, "%d: %v", i, tc)
}
}
func TestUpdateConfig(t *testing.T) {
const runtimeDir = "/test/runtime/dir"
testCases := []struct {
config map[string]interface{}
setAsDefault bool
runtimeName string
expectedConfig map[string]interface{}
}{
{
config: map[string]interface{}{},
setAsDefault: false,
expectedConfig: map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{},
setAsDefault: false,
runtimeName: "NAME",
expectedConfig: map[string]interface{}{
"runtimes": map[string]interface{}{
"NAME": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{},
setAsDefault: false,
runtimeName: "nvidia-experimental",
expectedConfig: map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "nvidia-container-runtime",
"args": []string{},
},
},
},
setAsDefault: false,
expectedConfig: map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{
"runtimes": map[string]interface{}{
"not-nvidia": map[string]interface{}{
"path": "some-other-path",
"args": []string{},
},
},
},
expectedConfig: map[string]interface{}{
"runtimes": map[string]interface{}{
"not-nvidia": map[string]interface{}{
"path": "some-other-path",
"args": []string{},
},
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{
"default-runtime": "runc",
},
setAsDefault: true,
runtimeName: "nvidia",
expectedConfig: map[string]interface{}{
"default-runtime": "nvidia",
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{
"default-runtime": "runc",
},
setAsDefault: true,
runtimeName: "nvidia-experimental",
expectedConfig: map[string]interface{}{
"default-runtime": "nvidia-experimental",
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{
"exec-opts": []string{"native.cgroupdriver=systemd"},
"log-driver": "json-file",
"log-opts": map[string]string{
"max-size": "100m",
},
"storage-driver": "overlay2",
},
expectedConfig: map[string]interface{}{
"exec-opts": []string{"native.cgroupdriver=systemd"},
"log-driver": "json-file",
"log-opts": map[string]string{
"max-size": "100m",
},
"storage-driver": "overlay2",
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"args": []string{},
},
},
},
},
}
for i, tc := range testCases {
options := &options{
setAsDefault: tc.setAsDefault,
runtimeName: tc.runtimeName,
runtimeDir: runtimeDir,
}
err := UpdateConfig(tc.config, options)
require.NoError(t, err, "%d: %v", i, tc)
configContent, err := json.MarshalIndent(tc.config, "", " ")
require.NoError(t, err)
expectedContent, err := json.MarshalIndent(tc.expectedConfig, "", " ")
require.NoError(t, err)
require.EqualValues(t, string(expectedContent), string(configContent), "%d: %v", i, tc)
}
}
func TestRevertConfig(t *testing.T) {
testCases := []struct {
config map[string]interface{}
expectedConfig map[string]interface{}
}{
{
config: map[string]interface{}{},
expectedConfig: map[string]interface{}{},
},
{
config: map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
},
},
expectedConfig: map[string]interface{}{},
},
{
config: map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
},
},
expectedConfig: map[string]interface{}{},
},
{
config: map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"args": []string{},
},
},
},
expectedConfig: map[string]interface{}{},
},
{
config: map[string]interface{}{
"default-runtime": "nvidia",
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
},
},
expectedConfig: map[string]interface{}{
"default-runtime": "runc",
},
},
{
config: map[string]interface{}{
"default-runtime": "not-nvidia",
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
},
},
expectedConfig: map[string]interface{}{
"default-runtime": "not-nvidia",
},
},
{
config: map[string]interface{}{
"exec-opts": []string{"native.cgroupdriver=systemd"},
"log-driver": "json-file",
"log-opts": map[string]string{
"max-size": "100m",
},
"storage-driver": "overlay2",
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
},
},
expectedConfig: map[string]interface{}{
"exec-opts": []string{"native.cgroupdriver=systemd"},
"log-driver": "json-file",
"log-opts": map[string]string{
"max-size": "100m",
},
"storage-driver": "overlay2",
},
},
}
for i, tc := range testCases {
err := RevertConfig(tc.config)
require.NoError(t, err, "%d: %v", i, tc)
configContent, err := json.MarshalIndent(tc.config, "", " ")
require.NoError(t, err)
expectedContent, err := json.MarshalIndent(tc.expectedConfig, "", " ")
require.NoError(t, err)
require.EqualValues(t, string(expectedContent), string(configContent), "%d: %v", i, tc)
}
}
func TestFlagsDefaultRuntime(t *testing.T) {
testCases := []struct {
setAsDefault bool
runtimeName string
expected string
}{
{
expected: "",
},
{
runtimeName: "not-bool",
expected: "",
},
{
setAsDefault: false,
runtimeName: "nvidia",
expected: "",
},
{
setAsDefault: true,
runtimeName: "nvidia",
expected: "nvidia",
},
{
setAsDefault: true,
runtimeName: "nvidia-experimental",
expected: "nvidia-experimental",
},
}
for i, tc := range testCases {
f := options{
setAsDefault: tc.setAsDefault,
runtimeName: tc.runtimeName,
}
require.Equal(t, tc.expected, f.getDefaultRuntime(), "%d: %v", i, tc)
}
}

View File

@@ -0,0 +1,290 @@
package main
import (
"fmt"
"os"
"os/exec"
"os/signal"
"path/filepath"
"strings"
"syscall"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
unix "golang.org/x/sys/unix"
)
const (
runDir = "/run/nvidia"
pidFile = runDir + "/toolkit.pid"
toolkitCommand = "toolkit"
toolkitSubDir = "toolkit"
defaultToolkitArgs = ""
defaultRuntime = "docker"
defaultRuntimeArgs = ""
)
var availableRuntimes = map[string]struct{}{"docker": {}, "crio": {}, "containerd": {}}
var waitingForSignal = make(chan bool, 1)
var signalReceived = make(chan bool, 1)
var destinationArg string
var noDaemonFlag bool
var toolkitArgsFlag string
var runtimeFlag string
var runtimeArgsFlag string
// Version defines the CLI version. This is set at build time via ldflags
var Version = "development"
func main() {
// Create the top-level CLI
c := cli.NewApp()
c.Name = "nvidia-toolkit"
c.Usage = "Install the nvidia-container-toolkit for use by a given runtime"
c.UsageText = "DESTINATION [-n | --no-daemon] [-t | --toolkit-args] [-r | --runtime] [-u | --runtime-args]"
c.Description = "DESTINATION points to the host path underneath which the nvidia-container-toolkit should be installed.\nIt will be installed at ${DESTINATION}/toolkit"
c.Version = Version
c.Action = Run
// Setup flags for the CLI
c.Flags = []cli.Flag{
&cli.BoolFlag{
Name: "no-daemon",
Aliases: []string{"n"},
Usage: "terminate immediatly after setting up the runtime. Note that no cleanup will be performed",
Destination: &noDaemonFlag,
EnvVars: []string{"NO_DAEMON"},
},
&cli.StringFlag{
Name: "toolkit-args",
Aliases: []string{"t"},
Usage: "arguments to pass to the underlying 'toolkit' command",
Value: defaultToolkitArgs,
Destination: &toolkitArgsFlag,
EnvVars: []string{"TOOLKIT_ARGS"},
},
&cli.StringFlag{
Name: "runtime",
Aliases: []string{"r"},
Usage: "the runtime to setup on this node. One of {'docker', 'crio', 'containerd'}",
Value: defaultRuntime,
Destination: &runtimeFlag,
EnvVars: []string{"RUNTIME"},
},
&cli.StringFlag{
Name: "runtime-args",
Aliases: []string{"u"},
Usage: "arguments to pass to 'docker', 'crio', or 'containerd' setup command",
Value: defaultRuntimeArgs,
Destination: &runtimeArgsFlag,
EnvVars: []string{"RUNTIME_ARGS"},
},
}
// Run the CLI
log.Infof("Starting %v", c.Name)
remainingArgs, err := ParseArgs(os.Args)
if err != nil {
log.Errorf("Error: unable to parse arguments: %v", err)
os.Exit(1)
}
if err := c.Run(remainingArgs); err != nil {
log.Errorf("error running nvidia-toolkit: %v", err)
os.Exit(1)
}
log.Infof("Completed %v", c.Name)
}
// Run runs the core logic of the CLI
func Run(c *cli.Context) error {
err := verifyFlags()
if err != nil {
return fmt.Errorf("unable to verify flags: %v", err)
}
err = initialize()
if err != nil {
return fmt.Errorf("unable to initialize: %v", err)
}
defer shutdown()
err = installToolkit()
if err != nil {
return fmt.Errorf("unable to install toolkit: %v", err)
}
err = setupRuntime()
if err != nil {
return fmt.Errorf("unable to setup runtime: %v", err)
}
if !noDaemonFlag {
err = waitForSignal()
if err != nil {
return fmt.Errorf("unable to wait for signal: %v", err)
}
err = cleanupRuntime()
if err != nil {
return fmt.Errorf("unable to cleanup runtime: %v", err)
}
}
return nil
}
// ParseArgs parses the command line arguments and returns the remaining arguments
func ParseArgs(args []string) ([]string, error) {
log.Infof("Parsing arguments")
numPositionalArgs := 2 // Includes command itself
if len(args) < numPositionalArgs {
return nil, fmt.Errorf("missing arguments")
}
for _, arg := range args {
if arg == "--help" || arg == "-h" {
return []string{args[0], arg}, nil
}
if arg == "--version" || arg == "-v" {
return []string{args[0], arg}, nil
}
}
for _, arg := range args[:numPositionalArgs] {
if strings.HasPrefix(arg, "-") {
return nil, fmt.Errorf("unexpected flag where argument should be")
}
}
for _, arg := range args[numPositionalArgs:] {
if !strings.HasPrefix(arg, "-") {
return nil, fmt.Errorf("unexpected argument where flag should be")
}
}
destinationArg = args[1]
return append([]string{args[0]}, args[numPositionalArgs:]...), nil
}
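To make the positional/flag split concrete, a hypothetical invocation:

	// nvidia-toolkit /usr/local/nvidia --no-daemon --runtime containerd
	//   destinationArg            -> "/usr/local/nvidia"
	//   returned (remaining) args -> ["nvidia-toolkit", "--no-daemon", "--runtime", "containerd"]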
func verifyFlags() error {
log.Infof("Verifying Flags")
if _, exists := availableRuntimes[runtimeFlag]; !exists {
return fmt.Errorf("unknown runtime: %v", runtimeFlag)
}
return nil
}
func initialize() error {
log.Infof("Initializing")
f, err := os.Create(pidFile)
if err != nil {
return fmt.Errorf("unable to create pidfile: %v", err)
}
err = unix.Flock(int(f.Fd()), unix.LOCK_EX|unix.LOCK_NB)
if err != nil {
log.Warnf("Unable to get exclusive lock on '%v'", pidFile)
log.Warnf("This normally means an instance of the NVIDIA toolkit Container is already running, aborting")
return fmt.Errorf("unable to get flock on pidfile: %v", err)
}
_, err = f.WriteString(fmt.Sprintf("%v\n", os.Getpid()))
if err != nil {
return fmt.Errorf("unable to write PID to pidfile: %v", err)
}
sigs := make(chan os.Signal, 1)
signal.Notify(sigs, syscall.SIGHUP, syscall.SIGINT, syscall.SIGQUIT, syscall.SIGPIPE, syscall.SIGTERM)
go func() {
<-sigs
select {
case <-waitingForSignal:
signalReceived <- true
default:
log.Infof("Signal received, exiting early")
shutdown()
os.Exit(0)
}
}()
return nil
}
func installToolkit() error {
toolkitDir := filepath.Join(destinationArg, toolkitSubDir)
log.Infof("Installing toolkit")
cmdline := fmt.Sprintf("%v install %v %v\n", toolkitCommand, toolkitArgsFlag, toolkitDir)
cmd := exec.Command("sh", "-c", cmdline)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
err := cmd.Run()
if err != nil {
return fmt.Errorf("error running %v command: %v", toolkitCommand, err)
}
return nil
}
func setupRuntime() error {
toolkitDir := filepath.Join(destinationArg, toolkitSubDir)
log.Infof("Setting up runtime")
cmdline := fmt.Sprintf("%v setup %v %v\n", runtimeFlag, runtimeArgsFlag, toolkitDir)
cmd := exec.Command("sh", "-c", cmdline)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
err := cmd.Run()
if err != nil {
return fmt.Errorf("error running %v command: %v", runtimeFlag, err)
}
return nil
}
func waitForSignal() error {
log.Infof("Waiting for signal")
waitingForSignal <- true
<-signalReceived
return nil
}
func cleanupRuntime() error {
toolkitDir := filepath.Join(destinationArg, toolkitSubDir)
log.Infof("Cleaning up Runtime")
cmdline := fmt.Sprintf("%v cleanup %v %v\n", runtimeFlag, runtimeArgsFlag, toolkitDir)
cmd := exec.Command("sh", "-c", cmdline)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
err := cmd.Run()
if err != nil {
return fmt.Errorf("error running %v command: %v", runtimeFlag, err)
}
return nil
}
func shutdown() {
log.Infof("Shutting Down")
err := os.Remove(pidFile)
if err != nil {
log.Warnf("Unable to remove pidfile: %v", err)
}
}

View File

@@ -0,0 +1,153 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"fmt"
"io"
"os"
"path/filepath"
"sort"
"strings"
log "github.com/sirupsen/logrus"
)
type executableTarget struct {
dotfileName string
wrapperName string
}
type executable struct {
source string
target executableTarget
env map[string]string
preLines []string
argLines []string
}
// install installs an executable component of the NVIDIA container toolkit. The source executable
// is copied to a `.real` file and a wrapper is created to set up the environment as required.
func (e executable) install(destFolder string) (string, error) {
log.Infof("Installing executable '%v' to %v", e.source, destFolder)
dotfileName := e.dotfileName()
installedDotfileName, err := installFileToFolderWithName(destFolder, dotfileName, e.source)
if err != nil {
return "", fmt.Errorf("error installing file '%v' as '%v': %v", e.source, dotfileName, err)
}
log.Infof("Installed '%v'", installedDotfileName)
wrapperFilename, err := e.installWrapper(destFolder, installedDotfileName)
if err != nil {
return "", fmt.Errorf("error wrapping '%v': %v", installedDotfileName, err)
}
log.Infof("Installed wrapper '%v'", wrapperFilename)
return wrapperFilename, nil
}
func (e executable) dotfileName() string {
return e.target.dotfileName
}
func (e executable) wrapperName() string {
return e.target.wrapperName
}
func (e executable) installWrapper(destFolder string, dotfileName string) (string, error) {
wrapperPath := filepath.Join(destFolder, e.wrapperName())
wrapper, err := os.Create(wrapperPath)
if err != nil {
return "", fmt.Errorf("error creating executable wrapper: %v", err)
}
defer wrapper.Close()
err = e.writeWrapperTo(wrapper, destFolder, dotfileName)
if err != nil {
return "", fmt.Errorf("error writing wrapper contents: %v", err)
}
err = ensureExecutable(wrapperPath)
if err != nil {
return "", fmt.Errorf("error making wrapper executable: %v", err)
}
return wrapperPath, nil
}
func (e executable) writeWrapperTo(wrapper io.Writer, destFolder string, dotfileName string) error {
r := newReplacements(destDirPattern, destFolder)
// Add the shebang
fmt.Fprintln(wrapper, "#! /bin/sh")
// Add the preceding lines if any
for _, line := range e.preLines {
fmt.Fprintf(wrapper, "%s\n", r.apply(line))
}
// Update the path to include the destination folder
var env map[string]string
if e.env == nil {
env = make(map[string]string)
} else {
env = e.env
}
path, specified := env["PATH"]
if !specified {
path = "$PATH"
}
env["PATH"] = strings.Join([]string{destFolder, path}, ":")
var sortedEnvvars []string
for e := range env {
sortedEnvvars = append(sortedEnvvars, e)
}
sort.Strings(sortedEnvvars)
for _, e := range sortedEnvvars {
v := env[e]
fmt.Fprintf(wrapper, "%s=%s \\\n", e, r.apply(v))
}
// Add the call to the target executable
fmt.Fprintf(wrapper, "%s \\\n", dotfileName)
// Insert additional lines in the `arg` list
for _, line := range e.argLines {
fmt.Fprintf(wrapper, "\t%s \\\n", r.apply(line))
}
// Add the script arguments "$@"
fmt.Fprintln(wrapper, "\t\"$@\"")
return nil
}
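For an executable with no extra environment, preLines, or argLines, the generated wrapper is simply the following (the /dest/folder and source.real values match the tests further below):

	#! /bin/sh
	PATH=/dest/folder:$PATH \
	source.real \
		"$@"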
// ensureExecutable is equivalent to running chmod +x on the specified file
func ensureExecutable(path string) error {
info, err := os.Stat(path)
if err != nil {
return fmt.Errorf("error getting file info for '%v': %v", path, err)
}
executableMode := info.Mode() | 0111
err = os.Chmod(path, executableMode)
if err != nil {
return fmt.Errorf("error setting executable mode for '%v': %v", path, err)
}
return nil
}

View File

@@ -0,0 +1,152 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"bytes"
"os"
"path/filepath"
"strings"
"testing"
"github.com/stretchr/testify/require"
)
func TestWrapper(t *testing.T) {
const shebang = "#! /bin/sh"
const destFolder = "/dest/folder"
const dotfileName = "source.real"
testCases := []struct {
e executable
expectedLines []string
}{
{
e: executable{},
expectedLines: []string{
shebang,
"PATH=/dest/folder:$PATH \\",
"source.real \\",
"\t\"$@\"",
"",
},
},
{
e: executable{
env: map[string]string{
"PATH": "some-path",
},
},
expectedLines: []string{
shebang,
"PATH=/dest/folder:some-path \\",
"source.real \\",
"\t\"$@\"",
"",
},
},
{
e: executable{
preLines: []string{
"preline1",
"preline2",
},
},
expectedLines: []string{
shebang,
"preline1",
"preline2",
"PATH=/dest/folder:$PATH \\",
"source.real \\",
"\t\"$@\"",
"",
},
},
{
e: executable{
argLines: []string{
"argline1",
"argline2",
},
},
expectedLines: []string{
shebang,
"PATH=/dest/folder:$PATH \\",
"source.real \\",
"\targline1 \\",
"\targline2 \\",
"\t\"$@\"",
"",
},
},
}
for i, tc := range testCases {
buf := &bytes.Buffer{}
err := tc.e.writeWrapperTo(buf, destFolder, dotfileName)
require.NoError(t, err)
expectedContents := strings.Join(tc.expectedLines, "\n")
require.Equal(t, expectedContents, buf.String(), "%v: %v", i, tc)
}
}
func TestInstallExecutable(t *testing.T) {
inputFolder, err := os.MkdirTemp("", "")
require.NoError(t, err)
defer os.RemoveAll(inputFolder)
// Create the source file
source := filepath.Join(inputFolder, "input")
sourceFile, err := os.Create(source)
require.NoError(t, err)
base := filepath.Base(source)
require.NoError(t, sourceFile.Close())
e := executable{
source: source,
target: executableTarget{
dotfileName: "input.real",
wrapperName: "input",
},
}
destFolder, err := os.MkdirTemp("", "output-*")
require.NoError(t, err)
defer os.RemoveAll(destFolder)
installed, err := e.install(destFolder)
require.NoError(t, err)
require.Equal(t, filepath.Join(destFolder, base), installed)
// Now check the post conditions:
sourceInfo, err := os.Stat(source)
require.NoError(t, err)
destInfo, err := os.Stat(filepath.Join(destFolder, base+".real"))
require.NoError(t, err)
require.Equal(t, sourceInfo.Size(), destInfo.Size())
require.Equal(t, sourceInfo.Mode(), destInfo.Mode())
wrapperInfo, err := os.Stat(installed)
require.NoError(t, err)
require.NotEqual(t, 0, wrapperInfo.Mode()&0111)
}

View File

@@ -0,0 +1,45 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import "strings"
const (
destDirPattern = "@destDir@"
)
type replacements map[string]string
func newReplacements(rules ...string) replacements {
r := make(replacements)
for i := 0; i < len(rules)-1; i += 2 {
oldStr := rules[i]
newStr := rules[i+1]
r[oldStr] = newStr
}
return r
}
func (r replacements) apply(input string) string {
output := input
for oldStr, newStr := range r {
output = strings.ReplaceAll(output, oldStr, newStr)
}
return output
}
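A short usage sketch (the toolkit path is illustrative):

	r := newReplacements(destDirPattern, "/usr/local/nvidia/toolkit")
	s := r.apply("XDG_CONFIG_HOME=@destDir@/.config")
	// s == "XDG_CONFIG_HOME=/usr/local/nvidia/toolkit/.config"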

View File

@@ -0,0 +1,132 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"fmt"
"path/filepath"
"strings"
log "github.com/sirupsen/logrus"
)
const (
nvidiaContainerRuntimeSource = "/usr/bin/nvidia-container-runtime"
nvidiaContainerRuntimeTarget = "nvidia-container-runtime.real"
nvidiaContainerRuntimeWrapper = "nvidia-container-runtime"
nvidiaExperimentalContainerRuntimeSource = "nvidia-container-runtime.experimental"
nvidiaExperimentalContainerRuntimeTarget = nvidiaExperimentalContainerRuntimeSource
nvidiaExperimentalContainerRuntimeWrapper = "nvidia-container-runtime-experimental"
)
// installContainerRuntimes sets up the NVIDIA container runtimes, copying the executables
// and implementing the required wrapper
func installContainerRuntimes(toolkitDir string, driverRoot string) error {
r := newNvidiaContainerRuntimeInstaller()
_, err := r.install(toolkitDir)
if err != nil {
return fmt.Errorf("error installing NVIDIA container runtime: %v", err)
}
// Install the experimental runtime and treat failures as non-fatal.
err = installExperimentalRuntime(toolkitDir, driverRoot)
if err != nil {
log.Warnf("Could not install experimental runtime: %v", err)
}
return nil
}
// installExperimentalRuntime ensures that the experimental NVIDIA Container runtime is installed
func installExperimentalRuntime(toolkitDir string, driverRoot string) error {
libraryRoot, err := findLibraryRoot(driverRoot)
if err != nil {
log.Warnf("Error finding library path for root %v: %v", driverRoot, err)
}
log.Infof("Using library root %v", libraryRoot)
e := newNvidiaContainerRuntimeExperimentalInstaller(libraryRoot)
_, err = e.install(toolkitDir)
if err != nil {
return fmt.Errorf("error installing experimental NVIDIA Container Runtime: %v", err)
}
return nil
}
func newNvidiaContainerRuntimeInstaller() *executable {
target := executableTarget{
dotfileName: nvidiaContainerRuntimeTarget,
wrapperName: nvidiaContainerRuntimeWrapper,
}
return newRuntimeInstaller(nvidiaContainerRuntimeSource, target, nil)
}
func newNvidiaContainerRuntimeExperimentalInstaller(libraryRoot string) *executable {
target := executableTarget{
dotfileName: nvidiaExperimentalContainerRuntimeTarget,
wrapperName: nvidiaExperimentalContainerRuntimeWrapper,
}
env := make(map[string]string)
if libraryRoot != "" {
env["LD_LIBRARY_PATH"] = strings.Join([]string{libraryRoot, "$LD_LIBRARY_PATH"}, ":")
}
return newRuntimeInstaller(nvidiaExperimentalContainerRuntimeSource, target, env)
}
func newRuntimeInstaller(source string, target executableTarget, env map[string]string) *executable {
preLines := []string{
"",
"cat /proc/modules | grep -e \"^nvidia \" >/dev/null 2>&1",
"if [ \"${?}\" != \"0\" ]; then",
" echo \"nvidia driver modules are not yet loaded, invoking runc directly\"",
" exec runc \"$@\"",
"fi",
"",
}
runtimeEnv := make(map[string]string)
runtimeEnv["XDG_CONFIG_HOME"] = filepath.Join(destDirPattern, ".config")
for k, v := range env {
runtimeEnv[k] = v
}
r := executable{
source: source,
target: target,
env: runtimeEnv,
preLines: preLines,
}
return &r
}
func findLibraryRoot(root string) (string, error) {
libnvidiamlPath, err := findManagementLibrary(root)
if err != nil {
return "", fmt.Errorf("error locating NVIDIA management library: %v", err)
}
return filepath.Dir(libnvidiamlPath), nil
}
func findManagementLibrary(root string) (string, error) {
return findLibrary(root, "libnvidia-ml.so")
}

View File

@@ -0,0 +1,90 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"bytes"
"strings"
"testing"
"github.com/stretchr/testify/require"
)
func TestNvidiaContainerRuntimeInstallerWrapper(t *testing.T) {
r := newNvidiaContainerRuntimeInstaller()
const shebang = "#! /bin/sh"
const destFolder = "/dest/folder"
const dotfileName = "source.real"
buf := &bytes.Buffer{}
err := r.writeWrapperTo(buf, destFolder, dotfileName)
require.NoError(t, err)
expectedLines := []string{
shebang,
"",
"cat /proc/modules | grep -e \"^nvidia \" >/dev/null 2>&1",
"if [ \"${?}\" != \"0\" ]; then",
" echo \"nvidia driver modules are not yet loaded, invoking runc directly\"",
" exec runc \"$@\"",
"fi",
"",
"PATH=/dest/folder:$PATH \\",
"XDG_CONFIG_HOME=/dest/folder/.config \\",
"source.real \\",
"\t\"$@\"",
"",
}
expectedContents := strings.Join(expectedLines, "\n")
require.Equal(t, expectedContents, buf.String())
}
func TestExperimentalContainerRuntimeInstallerWrapper(t *testing.T) {
r := newNvidiaContainerRuntimeExperimentalInstaller("/some/root/usr/lib64")
const shebang = "#! /bin/sh"
const destFolder = "/dest/folder"
const dotfileName = "source.real"
buf := &bytes.Buffer{}
err := r.writeWrapperTo(buf, destFolder, dotfileName)
require.NoError(t, err)
expectedLines := []string{
shebang,
"",
"cat /proc/modules | grep -e \"^nvidia \" >/dev/null 2>&1",
"if [ \"${?}\" != \"0\" ]; then",
" echo \"nvidia driver modules are not yet loaded, invoking runc directly\"",
" exec runc \"$@\"",
"fi",
"",
"LD_LIBRARY_PATH=/some/root/usr/lib64:$LD_LIBRARY_PATH \\",
"PATH=/dest/folder:$PATH \\",
"XDG_CONFIG_HOME=/dest/folder/.config \\",
"source.real \\",
"\t\"$@\"",
"",
}
expectedContents := strings.Join(expectedLines, "\n")
require.Equal(t, expectedContents, buf.String())
}

View File

@@ -0,0 +1,449 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"fmt"
"io"
"os"
"path/filepath"
"strings"
toml "github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
const (
// DefaultNvidiaDriverRoot specifies the default NVIDIA driver run directory
DefaultNvidiaDriverRoot = "/run/nvidia/driver"
nvidiaContainerCliSource = "/usr/bin/nvidia-container-cli"
nvidiaContainerRuntimeHookSource = "/usr/bin/nvidia-container-toolkit"
nvidiaContainerToolkitConfigSource = "/etc/nvidia-container-runtime/config.toml"
configFilename = "config.toml"
)
var toolkitDirArg string
var nvidiaDriverRootFlag string
var nvidiaContainerRuntimeDebugFlag string
var nvidiaContainerRuntimeLogLevelFlag string
var nvidiaContainerCLIDebugFlag string
func main() {
// Create the top-level CLI
c := cli.NewApp()
c.Name = "toolkit"
c.Usage = "Manage the NVIDIA container toolkit"
c.Version = "0.1.0"
// Create the 'install' subcommand
install := cli.Command{}
install.Name = "install"
install.Usage = "Install the components of the NVIDIA container toolkit"
install.ArgsUsage = "<toolkit_directory>"
install.Before = parseArgs
install.Action = Install
// Create the 'delete' command
delete := cli.Command{}
delete.Name = "delete"
delete.Usage = "Delete the NVIDIA container toolkit"
delete.ArgsUsage = "<toolkit_directory>"
delete.Before = parseArgs
delete.Action = Delete
// Register the subcommand with the top-level CLI
c.Commands = []*cli.Command{
&install,
&delete,
}
flags := []cli.Flag{
&cli.StringFlag{
Name: "nvidia-driver-root",
Value: DefaultNvidiaDriverRoot,
Destination: &nvidiaDriverRootFlag,
EnvVars: []string{"NVIDIA_DRIVER_ROOT"},
},
&cli.StringFlag{
Name: "nvidia-container-runtime-debug",
Usage: "Specify the location of the debug log file for the NVIDIA Container Runtime",
Destination: &nvidiaContainerRuntimeDebugFlag,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_DEBUG"},
},
&cli.StringFlag{
Name: "nvidia-container-runtime-debug-log-level",
Destination: &nvidiaContainerRuntimeLogLevelFlag,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_LOG_LEVEL"},
},
&cli.StringFlag{
Name: "nvidia-container-cli-debug",
Usage: "Specify the location of the debug log file for the NVIDIA Container CLI",
Destination: &nvidiaContainerCLIDebugFlag,
EnvVars: []string{"NVIDIA_CONTAINER_CLI_DEBUG"},
},
}
// Update the subcommand flags with the common subcommand flags
install.Flags = append([]cli.Flag{}, flags...)
// Run the top-level CLI
if err := c.Run(os.Args); err != nil {
log.Fatal(fmt.Errorf("error: %v", err))
}
}
// parseArgs parses the command line arguments to the CLI
func parseArgs(c *cli.Context) error {
args := c.Args()
log.Infof("Parsing arguments: %v", args.Slice())
if c.NArg() != 1 {
return fmt.Errorf("incorrect number of arguments")
}
toolkitDirArg = args.Get(0)
log.Infof("Successfully parsed arguments")
return nil
}
// Delete removes the NVIDIA container toolkit
func Delete(cli *cli.Context) error {
log.Infof("Deleting NVIDIA container toolkit from '%v'", toolkitDirArg)
err := os.RemoveAll(toolkitDirArg)
if err != nil {
return fmt.Errorf("error deleting toolkit directory: %v", err)
}
return nil
}
// Install installs the components of the NVIDIA container toolkit.
// Any existing installation is removed.
func Install(cli *cli.Context) error {
log.Infof("Installing NVIDIA container toolkit to '%v'", toolkitDirArg)
log.Infof("Removing existing NVIDIA container toolkit installation")
err := os.RemoveAll(toolkitDirArg)
if err != nil {
return fmt.Errorf("error removing toolkit directory: %v", err)
}
toolkitConfigDir := filepath.Join(toolkitDirArg, ".config", "nvidia-container-runtime")
toolkitConfigPath := filepath.Join(toolkitConfigDir, configFilename)
err = createDirectories(toolkitDirArg, toolkitConfigDir)
if err != nil {
return fmt.Errorf("could not create required directories: %v", err)
}
err = installContainerLibrary(toolkitDirArg)
if err != nil {
return fmt.Errorf("error installing NVIDIA container library: %v", err)
}
err = installContainerRuntimes(toolkitDirArg, nvidiaDriverRootFlag)
if err != nil {
return fmt.Errorf("error installing NVIDIA container runtime: %v", err)
}
nvidiaContainerCliExecutable, err := installContainerCLI(toolkitDirArg)
if err != nil {
return fmt.Errorf("error installing NVIDIA container CLI: %v", err)
}
_, err = installRuntimeHook(toolkitDirArg, toolkitConfigPath)
if err != nil {
return fmt.Errorf("error installing NVIDIA container runtime hook: %v", err)
}
err = installToolkitConfig(toolkitConfigPath, nvidiaDriverRootFlag, nvidiaContainerCliExecutable)
if err != nil {
return fmt.Errorf("error installing NVIDIA container toolkit config: %v", err)
}
return nil
}
// installContainerLibrary locates and installs the libnvidia-container.so.1 library.
// A predefined set of library candidates is considered, with the first one
// resulting in success being installed to the toolkit folder. The install process
// resolves the symlink for the library and copies the versioned library itself.
func installContainerLibrary(toolkitDir string) error {
log.Infof("Installing NVIDIA container library to '%v'", toolkitDir)
const libName = "libnvidia-container.so.1"
libraryPath, err := findLibrary("", libName)
if err != nil {
return fmt.Errorf("error locating NVIDIA container library: %v", err)
}
installedLibPath, err := installFileToFolder(toolkitDir, libraryPath)
if err != nil {
return fmt.Errorf("error installing %v to %v: %v", libraryPath, toolkitDir, err)
}
log.Infof("Installed '%v' to '%v'", libraryPath, installedLibPath)
if filepath.Base(installedLibPath) == libName {
return nil
}
err = installSymlink(toolkitDir, libName, installedLibPath)
if err != nil {
return fmt.Errorf("error installing symlink for NVIDIA container library: %v", err)
}
return nil
}
// installToolkitConfig installs the config file for the NVIDIA container toolkit ensuring
// that the settings are updated to match the desired install and nvidia driver directories.
func installToolkitConfig(toolkitConfigPath string, nvidiaDriverDir string, nvidiaContainerCliExecutablePath string) error {
log.Infof("Installing NVIDIA container toolkit config '%v'", toolkitConfigPath)
config, err := toml.LoadFile(nvidiaContainerToolkitConfigSource)
if err != nil {
return fmt.Errorf("could not open source config file: %v", err)
}
targetConfig, err := os.Create(toolkitConfigPath)
if err != nil {
return fmt.Errorf("could not create target config file: %v", err)
}
defer targetConfig.Close()
nvidiaContainerCliKey := func(p string) []string {
return []string{"nvidia-container-cli", p}
}
// Read the ldconfig path from the config as this may differ per platform
// On Ubuntu-based systems this ends in `.real`
ldconfigPath := fmt.Sprintf("%s", config.GetPath(nvidiaContainerCliKey("ldconfig")))
// Use the driver run root as the root:
driverLdconfigPath := "@" + filepath.Join(nvidiaDriverDir, strings.TrimPrefix(ldconfigPath, "@/"))
config.SetPath(nvidiaContainerCliKey("root"), nvidiaDriverDir)
config.SetPath(nvidiaContainerCliKey("path"), nvidiaContainerCliExecutablePath)
config.SetPath(nvidiaContainerCliKey("ldconfig"), driverLdconfigPath)
// Set the debug options if selected
debugOptions := map[string]string{
"nvidia-container-runtime.debug": nvidiaContainerRuntimeDebugFlag,
"nvidia-container-runtime.log-level": nvidiaContainerRuntimeLogLevelFlag,
"nvidia-container-cli.debug": nvidiaContainerCLIDebugFlag,
}
for key, value := range debugOptions {
if value == "" {
continue
}
if config.Get(key) != nil {
continue
}
config.Set(key, value)
}
_, err = config.WriteTo(targetConfig)
if err != nil {
return fmt.Errorf("error writing config: %v", err)
}
return nil
}
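Concretely, if the packaged config contains an Ubuntu-style entry such as ldconfig = "@/sbin/ldconfig.real" (illustrative), the installed config with the default driver root would contain roughly:

	# rewritten [nvidia-container-cli] entries in the installed config.toml
	root = "/run/nvidia/driver"
	ldconfig = "@/run/nvidia/driver/sbin/ldconfig.real"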

// installContainerCLI sets up the NVIDIA container CLI executable, copying the executable
// and implementing the required wrapper.
func installContainerCLI(toolkitDir string) (string, error) {
	log.Infof("Installing NVIDIA container CLI from '%v'", nvidiaContainerCliSource)

	env := map[string]string{
		"LD_LIBRARY_PATH": toolkitDir,
	}

	e := executable{
		source: nvidiaContainerCliSource,
		target: executableTarget{
			dotfileName: "nvidia-container-cli.real",
			wrapperName: "nvidia-container-cli",
		},
		env: env,
	}

	installedPath, err := e.install(toolkitDir)
	if err != nil {
		return "", fmt.Errorf("error installing NVIDIA container CLI: %v", err)
	}
	return installedPath, nil
}
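
The `executable` helper used here is defined elsewhere in this change, so its exact behavior is not shown. As a rough, assumed mental model only: the binary is installed under the `.real` dotfile name, and a small shell wrapper under the public name exports the configured environment before exec'ing it. A hypothetical sketch of that pattern, not the helper's actual implementation:

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeWrapper is a hypothetical illustration: the real binary lives at
// <dir>/<name>.real, and the generated wrapper at <dir>/<name> exports the
// given environment before exec'ing it.
func writeWrapper(dir, name string, env map[string]string) error {
	script := "#!/bin/sh\n"
	for k, v := range env {
		script += fmt.Sprintf("export %s=%s\n", k, v)
	}
	script += fmt.Sprintf("exec %q \"$@\"\n", filepath.Join(dir, name+".real"))
	return os.WriteFile(filepath.Join(dir, name), []byte(script), 0755)
}

func main() {
	dir, _ := os.MkdirTemp("", "toolkit")
	defer os.RemoveAll(dir)

	err := writeWrapper(dir, "nvidia-container-cli",
		map[string]string{"LD_LIBRARY_PATH": dir})
	if err != nil {
		fmt.Println(err)
	}
}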

// installRuntimeHook sets up the NVIDIA runtime hook, copying the executable
// and implementing the required wrapper.
func installRuntimeHook(toolkitDir string, configFilePath string) (string, error) {
	log.Infof("Installing NVIDIA container runtime hook from '%v'", nvidiaContainerRuntimeHookSource)

	argLines := []string{
		fmt.Sprintf("-config \"%s\"", configFilePath),
	}

	e := executable{
		source: nvidiaContainerRuntimeHookSource,
		target: executableTarget{
			dotfileName: "nvidia-container-toolkit.real",
			wrapperName: "nvidia-container-toolkit",
		},
		argLines: argLines,
	}

	installedPath, err := e.install(toolkitDir)
	if err != nil {
		return "", fmt.Errorf("error installing NVIDIA container runtime hook: %v", err)
	}

	err = installSymlink(toolkitDir, "nvidia-container-runtime-hook", installedPath)
	if err != nil {
		return "", fmt.Errorf("error installing symlink to NVIDIA container runtime hook: %v", err)
	}

	return installedPath, nil
}

// installSymlink creates a symlink in the toolkit directory that points to the specified target.
// Note: the target is assumed to be local to the toolkit directory.
func installSymlink(toolkitDir string, link string, target string) error {
	symlinkPath := filepath.Join(toolkitDir, link)
	targetPath := filepath.Base(target)

	log.Infof("Creating symlink '%v' -> '%v'", symlinkPath, targetPath)
	err := os.Symlink(targetPath, symlinkPath)
	if err != nil {
		return fmt.Errorf("error creating symlink '%v' => '%v': %v", symlinkPath, targetPath, err)
	}
	return nil
}
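
Using only the base name makes the link relative: it resolves within whatever directory it ends up in, so the toolkit directory can be mounted at a different path without breaking the link. A small runnable sketch of the same convention; the file names are hypothetical:

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	dir, _ := os.MkdirTemp("", "toolkit")
	defer os.RemoveAll(dir)

	// A versioned library, standing in for the real installed file.
	target := filepath.Join(dir, "libnvidia-container.so.1.7.0")
	os.WriteFile(target, nil, 0644)

	// Link by base name only, as installSymlink above does.
	link := filepath.Join(dir, "libnvidia-container.so.1")
	if err := os.Symlink(filepath.Base(target), link); err != nil {
		fmt.Println(err)
		return
	}

	resolved, _ := filepath.EvalSymlinks(link)
	fmt.Println(resolved) // the versioned library inside dir
}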

// installFileToFolder copies a source file to a destination folder.
// The directory of the input file is ignored, e.g.
// installFileToFolder("/output/path", "/some/path/file.txt")
// results in a file "/output/path/file.txt" being generated.
func installFileToFolder(destFolder string, src string) (string, error) {
	name := filepath.Base(src)
	return installFileToFolderWithName(destFolder, name, src)
}

// installFileToFolderWithName copies src to destFolder/name,
// i.e. `cp src destFolder/name`.
func installFileToFolderWithName(destFolder string, name, src string) (string, error) {
	dest := filepath.Join(destFolder, name)
	err := installFile(dest, src)
	if err != nil {
		return "", fmt.Errorf("error copying '%v' to '%v': %v", src, dest, err)
	}
	return dest, nil
}

// installFile copies a file from src to dest and maintains
// file modes.
func installFile(dest string, src string) error {
	log.Infof("Installing '%v' to '%v'", src, dest)

	source, err := os.Open(src)
	if err != nil {
		return fmt.Errorf("error opening source: %v", err)
	}
	defer source.Close()

	destination, err := os.Create(dest)
	if err != nil {
		return fmt.Errorf("error creating destination: %v", err)
	}
	defer destination.Close()

	_, err = io.Copy(destination, source)
	if err != nil {
		return fmt.Errorf("error copying file: %v", err)
	}

	err = applyModeFromSource(dest, src)
	if err != nil {
		return fmt.Errorf("error setting destination file mode: %v", err)
	}
	return nil
}

// applyModeFromSource sets the file mode of a destination file
// to match that of a specified source file.
func applyModeFromSource(dest string, src string) error {
	sourceInfo, err := os.Stat(src)
	if err != nil {
		return fmt.Errorf("error getting file info for '%v': %v", src, err)
	}
	err = os.Chmod(dest, sourceInfo.Mode())
	if err != nil {
		return fmt.Errorf("error setting mode for '%v': %v", dest, err)
	}
	return nil
}
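
Preserving the source mode matters because the copied files include executables, and `os.Create` produces a file with mode 0666 before the umask, so the execute bit would otherwise be lost. A self-contained demonstration using temporary, hypothetical paths:

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	dir, _ := os.MkdirTemp("", "mode")
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "src-binary")
	os.WriteFile(src, []byte("#!/bin/sh\n"), 0755) // executable source

	dest := filepath.Join(dir, "dest-binary")
	data, _ := os.ReadFile(src)
	os.WriteFile(dest, data, 0644) // the plain copy lands without the execute bit

	// Mirror the source mode, as applyModeFromSource above does.
	info, _ := os.Stat(src)
	os.Chmod(dest, info.Mode())

	destInfo, _ := os.Stat(dest)
	fmt.Println(destInfo.Mode()) // -rwxr-xr-x, matching the source
}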

// findLibrary searches a set of candidate directories in the specified root for
// a given library name.
func findLibrary(root string, libName string) (string, error) {
	log.Infof("Finding library %v (root=%v)", libName, root)

	candidateDirs := []string{
		"/usr/lib64",
		"/usr/lib/x86_64-linux-gnu",
	}
	for _, d := range candidateDirs {
		l := filepath.Join(root, d, libName)
		log.Infof("Checking library candidate '%v'", l)

		libraryCandidate, err := resolveLink(l)
		if err != nil {
			log.Infof("Skipping library candidate '%v': %v", l, err)
			continue
		}

		return libraryCandidate, nil
	}

	return "", fmt.Errorf("error locating library '%v'", libName)
}

// resolveLink finds the target of a symlink, or returns the file itself in
// the case of a regular file.
// This is equivalent to running `readlink -f ${l}`.
func resolveLink(l string) (string, error) {
	resolved, err := filepath.EvalSymlinks(l)
	if err != nil {
		return "", fmt.Errorf("error resolving link '%v': %v", l, err)
	}
	if l != resolved {
		log.Infof("Resolved link: '%v' => '%v'", l, resolved)
	}
	return resolved, nil
}
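
`filepath.EvalSymlinks` follows the entire chain of links and returns the final, real path, which is what makes it a stand-in for `readlink -f`. A quick runnable check with hypothetical names:

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	dir, _ := os.MkdirTemp("", "lib")
	defer os.RemoveAll(dir)

	// A versioned library with an unversioned symlink pointing at it.
	real := filepath.Join(dir, "libfoo.so.1.2.3")
	os.WriteFile(real, nil, 0644)
	os.Symlink("libfoo.so.1.2.3", filepath.Join(dir, "libfoo.so.1"))

	// Follows the chain and returns the final path, like `readlink -f`.
	resolved, err := filepath.EvalSymlinks(filepath.Join(dir, "libfoo.so.1"))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(resolved) // .../libfoo.so.1.2.3
}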

// createDirectories creates each of the specified directories, including
// any missing parents.
func createDirectories(dir ...string) error {
	for _, d := range dir {
		log.Infof("Creating directory '%v'", d)
		err := os.MkdirAll(d, 0755)
		if err != nil {
			return fmt.Errorf("error creating directory: %v", err)
		}
	}
	return nil
}

5
vendor/github.com/BurntSushi/toml/.gitignore generated vendored Normal file

@@ -0,0 +1,5 @@
TAGS
tags
.*.swp
tomlcheck/tomlcheck
toml.test

15
vendor/github.com/BurntSushi/toml/.travis.yml generated vendored Normal file

@@ -0,0 +1,15 @@
language: go
go:
  - 1.1
  - 1.2
  - 1.3
  - 1.4
  - 1.5
  - 1.6
  - tip
install:
  - go install ./...
  - go get github.com/BurntSushi/toml-test
script:
  - export PATH="$PATH:$HOME/gopath/bin"
  - make test

3
vendor/github.com/BurntSushi/toml/COMPATIBLE generated vendored Normal file

@@ -0,0 +1,3 @@
Compatible with TOML version
[v0.4.0](https://github.com/toml-lang/toml/blob/v0.4.0/versions/en/toml-v0.4.0.md)

21
vendor/github.com/BurntSushi/toml/COPYING generated vendored Normal file

@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2013 TOML authors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

19
vendor/github.com/BurntSushi/toml/Makefile generated vendored Normal file

@@ -0,0 +1,19 @@
install:
	go install ./...

test: install
	go test -v
	toml-test toml-test-decoder
	toml-test -encoder toml-test-encoder

fmt:
	gofmt -w *.go */*.go
	colcheck *.go */*.go

tags:
	find ./ -name '*.go' -print0 | xargs -0 gotags > TAGS

push:
	git push origin master
	git push github master

Some files were not shown because too many files have changed in this diff.