Compare commits

...

976 Commits

Author SHA1 Message Date
Evan Lezar
dcff3118d9 Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-container

See merge request nvidia/container-toolkit/container-toolkit!340
2023-03-14 13:54:11 +00:00
Evan Lezar
731168ec8d Update changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-14 15:05:36 +02:00
Evan Lezar
7b4435a0f8 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-14 15:05:29 +02:00
Evan Lezar
738af29724 Merge branch 'explicit-cdi-enabled-flag' into 'main'
Add --cdi-enabled option to control generating CDI spec

See merge request nvidia/container-toolkit/container-toolkit!339
2023-03-14 07:00:30 +00:00
Evan Lezar
08ef242afb Add --cdi-enabled option to control generating CDI spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-13 18:19:00 +02:00
Evan Lezar
92ea8be309 Merge branch 'fix-privileged-check-cdi-mode' into 'main'
Return empty list of devices for unprivileged containers when...

See merge request nvidia/container-toolkit/container-toolkit!337
2023-03-13 07:36:25 +00:00
Christopher Desiniotis
48414e97bb Return empty list of devices for unprivileged containers when 'accept-nvidia-visible-devices-envvar-unprivileged=false'
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-03-10 13:11:29 -08:00
Evan Lezar
77a2975524 Merge branch 'fix-kitmaker' into 'main'
Use component name as folder name

See merge request nvidia/container-toolkit/container-toolkit!336
2023-03-10 13:57:24 +00:00
Evan Lezar
ce9477966d Use component name as folder name
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-10 15:51:36 +02:00
Evan Lezar
fe02351c3a Merge branch 'bump-cuda-version' into 'main'
Bump CUDA base image version to 12.1.0

See merge request nvidia/container-toolkit/container-toolkit!335
2023-03-10 10:23:30 +00:00
Evan Lezar
9c2018a0dc Bump CUDA base image version to 12.1.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-10 11:31:23 +02:00
Evan Lezar
33e5b34fa1 Merge branch 'CNT-3999/legacy-cli-doesnt-work-in-cdi-mode' into 'main'
Add nvidia-container-runtime-hook.skip-mode-detection option to config

See merge request nvidia/container-toolkit/container-toolkit!330
2023-03-09 19:18:16 +00:00
Evan Lezar
ccf73f2505 Set skip-mode-detection in the toolkit-container by default
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 20:16:10 +02:00
Evan Lezar
3a11f6ee0a Add nvidia-container-runtime-hook.skip-mode-detection option to config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 20:15:40 +02:00
Evan Lezar
8f694bbfb7 Merge branch 'set-nvidia-ctk-path' into 'main'
Set nvidia-ctk.path config option based on installed path

See merge request nvidia/container-toolkit/container-toolkit!334
2023-03-09 16:44:13 +00:00
Evan Lezar
4c2eff4865 Merge branch 'CNT-3998/cdi-accept-visible-devices-when-privileged' into 'main'
Honor accept-nvidia-visible-devices-envvar-when-unprivileged setting in CDI mode

See merge request nvidia/container-toolkit/container-toolkit!331
2023-03-09 15:59:08 +00:00
Evan Lezar
1fbdc17c40 Set nvidia-ctk.path config option based on installed path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 17:53:08 +02:00
Evan Lezar
965d62f326 Merge branch 'fix-containerd-integration-tests' into 'main'
Fix integration tests failing due to CDI spec generation

See merge request nvidia/container-toolkit/container-toolkit!333
2023-03-09 14:41:52 +00:00
Evan Lezar
25ea7fa98e Remove whitespace in Makefile
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 15:32:07 +02:00
Evan Lezar
5ee040ba95 Disable CDI spec generation for integration tests 2023-03-09 15:32:07 +02:00
Evan Lezar
eb2aec9da8 Allow CDI options to be set by envvars
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 12:25:05 +02:00
Evan Lezar
973e7bda5e Check accept-nvidia-visible-devices-envvar-when-unprivileged option for CDI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
154cd4ecf3 Add to config struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
936fad1d04 Move check for privileged images to config/image/ package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
86dd046c7c Merge branch 'CNT-3928/allow-cdi-container-annotations' into 'main'
Add cdi.k8s.io annotations to runtimes configured in containerd

See merge request nvidia/container-toolkit/container-toolkit!315
2023-03-09 07:52:37 +00:00
Evan Lezar
510fb248fe Add cdi.k8s.io annotations to containerd config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-08 07:23:27 +02:00
Evan Lezar
c7384c6aee Merge branch 'fix-comment' into 'main'
Fix comment

See merge request nvidia/container-toolkit/container-toolkit!329
2023-03-08 05:15:38 +00:00
Evan Lezar
1c3c9143f8 Fix comment
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-08 07:15:05 +02:00
Evan Lezar
1c696b1e39 Merge branch 'CNT-3894/configure-mode-specific-runtimes' into 'main'
Configure .cdi and .legacy executables in Toolkit Container

See merge request nvidia/container-toolkit/container-toolkit!308
2023-03-08 05:12:50 +00:00
Evan Lezar
a2adbc1133 Merge branch 'CNT-3898/improve-cdi-annotations' into 'main'
Improve handling of environment variable devices in CDI mode

See merge request nvidia/container-toolkit/container-toolkit!321
2023-03-08 04:37:41 +00:00
Evan Lezar
36576708f0 Merge branch 'CNT-3896/gds-mofed-devices' into 'main'
Add GDS and MOFED support to the NVCDI API

See merge request nvidia/container-toolkit/container-toolkit!323
2023-03-08 04:36:55 +00:00
Evan Lezar
cc7a6f166b Handle case were runtime name is set to predefined name
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:56 +02:00
Evan Lezar
62d88e7c95 Add cdi and legacy mode runtimes
This change adds .cdi and .legacy mode-specific runtimes the list of
runtimes supported by the operator. These are also installed as
part of the NVIDIA Container Toolkit.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:55 +02:00
Evan Lezar
dca8e3123f Migrate containerd config to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:55 +02:00
Evan Lezar
3bac4fad09 Migrate cri-o config update to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
9fff19da23 Migrate docker config to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
e5bb4d2718 Move runtime config code from config to config/engine
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
5bfb51f801 Add API for interacting with runtime engine configs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:53 +02:00
Evan Lezar
ece5b29d97 Add tools/container/operator package to handle runtime naming
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:53 +02:00
Evan Lezar
ec8a92c17f Use nvidia-container-runtime.experimental as wrapper
This change switches to using nvidia-container-runtime.experimental as the
wrapper name over nvidia-container-runtime-experimental. This is consistent
with upcoming mode-specific binaries.

The wrapper is created at nvidia-container-runtime.experimental.real.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:53 +02:00
Evan Lezar
868393b7ed Add mofed mode to nvcdi API
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 18:47:52 +02:00
Evan Lezar
ebe18fbb7f Add gds mode to nvcdi API
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 18:47:52 +02:00
Evan Lezar
9435343541 Merge branch 'fix-kitmaker' into 'main'
Include = when extracting manifest information

See merge request nvidia/container-toolkit/container-toolkit!327
2023-03-07 14:44:27 +00:00
Evan Lezar
1cd20afe4f Include = when extracting manifest information
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:43:49 +02:00
Evan Lezar
1e6fe40c76 Allow nvidia-container-runtime.modes.cdi.default-kind to be set
This change allows the nvidia-container-runtime.modes.cdi.default-kind
to be set in the toolkit-container.

The NVIDIA_CONTAINER_RUNTIME_MODES_CDI_DEFAULT_KIND envvar is used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:19:38 +02:00
Evan Lezar
6d220ed9a2 Rework selection of devices in CDI mode
The following changes are made:
* The default-cdi-kind config option is used to convert an envvar entry to a fully-qualified device name
* If annotation devices exist, these are used instead of the envvar devices.
* The `all` device is no longer treated as a special case and MUST exist in the CDI spec.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:18:53 +02:00
Evan Lezar
f00439c93e Add nvidia-container-runtime.modes.csv.default-kind config option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:18:53 +02:00
Evan Lezar
c59696e30e Merge branch 'fix-kitmaker' into 'main'
Log source file

See merge request nvidia/container-toolkit/container-toolkit!326
2023-03-06 16:26:43 +00:00
Evan Lezar
89c18c73cd Add source and log curl command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 18:26:05 +02:00
Evan Lezar
cb5006c73f Merge branch 'CNT-3897/generate-management-container-spec' into 'main'
Generate CDI specs for management containers

See merge request nvidia/container-toolkit/container-toolkit!314
2023-03-06 16:23:13 +00:00
Evan Lezar
547b71f222 Merge branch 'change-discovery-mode' into 'main'
Rename --discovery-mode to --mode

See merge request nvidia/container-toolkit/container-toolkit!318
2023-03-06 16:21:22 +00:00
Evan Lezar
ae84bfb055 Log source file
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 18:11:12 +02:00
Evan Lezar
9b303d5b89 Merge branch 'fix-changelist' into 'main'
Strip on tilde for kitmaker version

See merge request nvidia/container-toolkit/container-toolkit!325
2023-03-06 14:10:54 +00:00
Evan Lezar
d944f934d7 Strip on tilde for kitmaker version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 16:10:25 +02:00
Evan Lezar
c37209cd09 Merge branch 'fix-changelist' into 'main'
Fix blank changelist in kitmaker properties

See merge request nvidia/container-toolkit/container-toolkit!324
2023-03-06 13:51:19 +00:00
Evan Lezar
863b569a61 Fix blank changelist in kitmaker properties
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 15:50:38 +02:00
Evan Lezar
f36c514f1f Merge branch 'update-kitmaker-folders' into 'main'
Update kitmaker target folder

See merge request nvidia/container-toolkit/container-toolkit!313
2023-03-06 11:16:49 +00:00
Evan Lezar
3ab28c7fa4 Merge branch 'fix-rule-for-release' into 'main'
Run full build on release- branches

See merge request nvidia/container-toolkit/container-toolkit!320
2023-03-06 10:56:58 +00:00
Evan Lezar
c03258325b Run full build on release- branches
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 12:54:27 +02:00
Evan Lezar
20d3bb189b Rename --discovery-mode to --mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 11:00:22 +02:00
Evan Lezar
90acec60bb Skip CDI spec generation in integration tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:57:40 +02:00
Evan Lezar
0565888c03 Generate CDI spec in toolkit container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:57:40 +02:00
Evan Lezar
f7e817cff6 Support management mode in nvidia-ctk cdi generate
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:53:43 +02:00
Evan Lezar
29cbbe83f9 Add management mode to CDI spec generation API
These changes add support for generating a management spec to the nvcdi API.
A management spec consists of a single CDI device (`all`) which includes all expected
NVIDIA device nodes, driver libraries, binaries, and IPC sockets.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:53:43 +02:00
Evan Lezar
64b16acb1f Also install nvidia-ctk in toolkit-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:53:43 +02:00
Evan Lezar
19c20bb422 Merge branch 'CNT-3931/add-spec-validation' into 'main'
Add nvcdi.spec for writing and validating CDI specifications

See merge request nvidia/container-toolkit/container-toolkit!306
2023-03-06 08:52:56 +00:00
Evan Lezar
28b10d2ee0 Merge branch 'fix-toolkit-ctr-envvars' into 'main'
Fix handling of envvars in toolkit container which modify the NVIDIA Container Runtime config

See merge request nvidia/container-toolkit/container-toolkit!317
2023-03-06 07:36:03 +00:00
Christopher Desiniotis
1f5123f72a Fix handling of envvars in toolkit container which modify the NVIDIA Container Runtime config
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-03-05 20:14:04 -08:00
Evan Lezar
ac5b6d097b Use kitmaker folder for releases
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 17:27:07 +02:00
Evan Lezar
a7bf9ddf28 Update kitmaker folder structure
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 17:27:07 +02:00
Evan Lezar
e27479e170 Add GIT_COMMIT_SHORT to packaging image manifest
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 17:27:07 +02:00
Evan Lezar
fa28e738c6 Merge branch 'fix-internal-scans' into 'main'
Fix internal scans

See merge request nvidia/container-toolkit/container-toolkit!316
2023-03-01 15:26:27 +00:00
Evan Lezar
898c5555f6 Fix internal scans
This fixes the internal scans due to the removed ubuntu18.04 images.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 17:25:28 +02:00
Evan Lezar
314059fcf0 Move path manipulation to spec.Save
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:49:04 +02:00
Evan Lezar
221781bd0b Use full path for output spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
9f5e141437 Expose vendor and class as options
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
8be6de177f Move formatJSON and formatYAML to nvcdi/spec package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
890a519121 Use nvcdi.spec package to write and validate spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
89321edae6 Add top-level GetSpec function to nvcdi API
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
6d6cd56196 Return nvcdi.spec.Interface from GetSpec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 12:45:30 +02:00
Evan Lezar
2e95e04359 Add nvcdi.spec package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 12:45:30 +02:00
Evan Lezar
accba4ead5 Merge branch 'CNT-3965/clean-up-by-path-symlinks' into 'main'
Improve handling of /dev/dri devices and nested device paths

See merge request nvidia/container-toolkit/container-toolkit!307
2023-03-01 10:25:48 +00:00
Christopher Desiniotis
1e9b7883cf Merge branch 'CNT-3937/add-target-driver-root' into 'main'
Add a driver root transformer to nvcdi

See merge request nvidia/container-toolkit/container-toolkit!300
2023-02-28 18:04:29 +00:00
Christopher Desiniotis
87e406eee6 Update root transformer tests to ensure container path is not modified
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-02-28 09:00:05 -08:00
Christopher Desiniotis
45ed3b0412 Handle hook arguments for creation of symlinks
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-02-28 09:00:02 -08:00
Christopher Desiniotis
0516fc96ca Add Transform interface and initial implemention for a root transform
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-02-28 08:56:13 -08:00
Evan Lezar
e7a435fd5b Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-container

See merge request nvidia/container-toolkit/container-toolkit!312
2023-02-27 13:41:26 +00:00
Evan Lezar
7a249d7771 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 15:41:02 +02:00
Evan Lezar
7986ff9cee Merge branch 'CNT-3963/deduplicate-wsl-driverstore-paths' into 'main'
Deduplicate WSL driverstore paths

See merge request nvidia/container-toolkit/container-toolkit!304
2023-02-27 13:27:31 +00:00
Evan Lezar
b74c13d75f Merge branch 'fix-rpm-postun-scriptlet' into 'main'
nvidia-container-toolkit.spec: fix syntax error in postun scriptlet

See merge request nvidia/container-toolkit/container-toolkit!309
2023-02-27 12:36:49 +00:00
Evan Lezar
de8eeb87f4 Merge branch 'remove-outdated-platforms' into 'main'
Remove outdated platforms from CI

See merge request nvidia/container-toolkit/container-toolkit!310
2023-02-27 11:48:33 +00:00
Evan Lezar
36c4174de3 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 13:45:44 +02:00
Evan Lezar
3497936cdf Remove ubuntu18.04 toolkit-container image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:55:17 +02:00
Evan Lezar
81abc92743 Remove fedora35 from 'all' targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:31:38 +02:00
Evan Lezar
1ef8dc3137 Remove centos7-ppc64le from CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:30:29 +02:00
Evan Lezar
9a5c1bbe48 Remove ubuntu16.04 packages from CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:29:35 +02:00
Evan Lezar
30dff61376 Remove debian9 packages from CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:28:46 +02:00
Claudius Volz
de1bb68d19 nvidia-container-toolkit.spec: fix syntax error in postun scriptlet
Signed-off-by: Claudius Volz <c.volz@gmx.de>
2023-02-27 00:45:21 +01:00
Evan Lezar
06d8bb5019 Merge branch 'CNT-3965/dont-fail-chmod-hook' into 'main'
Skip paths with errors in chmod hook

See merge request nvidia/container-toolkit/container-toolkit!303
2023-02-22 15:20:26 +00:00
Evan Lezar
b4dc1f338d Generate nested device folder permission hooks per device
This change generates device folder permission hooks per device instead of
at a spec level. This ensures that the hook is not injected for a device that
does not have any nested device nodes.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-22 17:16:23 +02:00
Evan Lezar
181128fe73 Only include by-path-symlinks for injected device nodes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-22 16:53:04 +02:00
Evan Lezar
252838e696 Merge branch 'bump-version-v1.13.0-rc.2' into 'main'
Bump version to v1.13.0-rc.2

See merge request nvidia/container-toolkit/container-toolkit!305
2023-02-21 13:11:00 +00:00
Evan Lezar
49f171a8b1 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 14:27:02 +02:00
Evan Lezar
3d12803ab3 Bump version to v1.13.0-rc.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 14:24:37 +02:00
Evan Lezar
a168091bfb Add v1.13.0-rc.1 Changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 14:23:52 +02:00
Evan Lezar
35fc57291f Deduplicate WSL driverstore paths
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 11:48:56 +02:00
Evan Lezar
2542224d7b Skip paths with errors in chmod hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 11:47:11 +02:00
Evan Lezar
882fbb3209 Merge branch 'add-cdi-auto-mode' into 'main'
Add constants for CDI mode to nvcdi API

See merge request nvidia/container-toolkit/container-toolkit!302
2023-02-20 14:41:07 +00:00
Evan Lezar
2680c45811 Add mode constants to nvcdi
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 16:33:51 +02:00
Evan Lezar
b76808dbd5 Add tests for CDI mode resolution
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 16:33:33 +02:00
Evan Lezar
ba50b50a15 Merge branch 'add-cdi-auto-mode' into 'main'
Add auto mode to CDI spec generation

See merge request nvidia/container-toolkit/container-toolkit!292
2023-02-20 14:30:33 +00:00
Evan Lezar
f6d3f8d471 Merge branch 'CNT-3895/add-runtime-mode-config' into 'main'
Add nvidia-container-runtime.mode config option

See merge request nvidia/container-toolkit/container-toolkit!299
2023-02-20 12:51:18 +00:00
Evan Lezar
d9859d66bf Update go vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 14:49:58 +02:00
Evan Lezar
4ccb0b9a53 Add and resolve auto discovery mode for cdi generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 14:49:58 +02:00
Evan Lezar
f36c775d50 Merge branch 'wsl2-wip' into 'main'
Add CDI Spec generation on WSL2

See merge request nvidia/container-toolkit/container-toolkit!289
2023-02-20 09:36:41 +00:00
Evan Lezar
b21dc929ef Add WSL2 discovery and spec generation
These changes add a wsl discovery mode to the nvidia-ctk cdi generate command.

If wsl mode is enabled, the driver store for the available devices is used as
the source for discovered entities.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
d226925fe7 Construct nvml-based CDI lib based on mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
20d6e9af04 Add --discovery-mode to nvidia-ctk cdi generate command
This change adds --discovery-mode flag to the nvidia-ctk cdi generate
command and plumbs this through to the CDI API.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
5103adab89 Add mode option to nvcdi API
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
7eb435eb73 Add basic dxcore bindings
This change copies dxcore.h and dxcore.c from libnvidia-container to
allow for the driver store path to be queried. Modifications are made
to dxcore to remove the code associated with checking the components
in the driver store path.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
5d011c1333 Add Discoverer to create a single symlink
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
6adb792d57 Merge branch 'fix-nvidia-ctk-path' into 'main'
Ensure that generate uses a consistent nvidia-ctk path

See merge request nvidia/container-toolkit/container-toolkit!301
2023-02-20 08:29:44 +00:00
Evan Lezar
a844749791 Ensure that generate uses a consistent nvidia-ctk path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:28:45 +02:00
Evan Lezar
dd0d43e726 Add nvidia-container-runtime.mode config option
This change allows the nvidia-container-runtime.mode option to be set
by the toolkit container.

This is controlled by the --nvidia-container-runtime-mode command line
argument and the NVIDIA_CONTAINER_RUNTIME_MODE envvar.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-17 18:04:49 +02:00
Evan Lezar
25811471fa Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-container

See merge request nvidia/container-toolkit/container-toolkit!298
2023-02-17 08:46:56 +00:00
Evan Lezar
569bc1a889 Update Changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-17 10:46:21 +02:00
Evan Lezar
b1756b410a Merge branch 'fix-logging' into 'main'
Fix nvidia-container-runtime logging

See merge request nvidia/container-toolkit/container-toolkit!296
2023-02-16 15:17:24 +00:00
Evan Lezar
7789ac6331 Fix logger.Update and Reset
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-16 15:22:56 +01:00
Evan Lezar
7a3aabbbda Add logger test
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-16 15:22:56 +01:00
Evan Lezar
e486095603 Merge branch 'fix-nvidia-ctk-path' into 'main'
Fix issue with blank nvidia-ctk path

See merge request nvidia/container-toolkit/container-toolkit!297
2023-02-16 13:58:43 +00:00
Evan Lezar
bf6babe07e Fix issue with blank nvidia-ctk path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-16 14:18:07 +01:00
Kevin Klues
d5a4d89682 Merge branch 'support-multimple-firmware-files' into 'main'
Add globbing for mounting multiple GSP firmware files

See merge request nvidia/container-toolkit/container-toolkit!295
2023-02-16 13:09:47 +00:00
Kevin Klues
5710b9e7e8 Add globbing for mounting multiple GSP firmware files
Newer drivers have split the GSP firmware into multiple files so a simple match
against gsp.bin in the firmware directory is no longer possible. This patch
adds globbing capabilitis to match any GSP firmware files of the form gsp*.bin
and mount them all into the container.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-02-16 11:53:36 +00:00
Evan Lezar
b4ab95f00c Merge branch 'fix-nvcdi-constructor' into 'main'
fix: apply options when constructing an instance of the nvcdi library

See merge request nvidia/container-toolkit/container-toolkit!294
2023-02-15 08:13:19 +00:00
Christopher Desiniotis
a52c9f0ac6 fix: apply options when constructing an instance of the nvcdi library
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-02-14 16:32:40 -08:00
Evan Lezar
b6bab4d3fd Merge branch 'expose-generate-spec' into 'main'
Implement basic CDI spec generation API

See merge request nvidia/container-toolkit/container-toolkit!257
2023-02-14 19:36:31 +00:00
Evan Lezar
5b110fba2d Add nvcdi package with basic CDI generation API
This change adds an nvcdi package that exposes a basic API for
CDI spec generation. This is used from the nvidia-ctk cdi generate
command and can be consumed by DRA implementations and the device plugin.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-14 19:52:31 +01:00
Evan Lezar
179133c8ad Merge branch 'fix-ubi8' into 'main'
Fix package version in ubi8 container builds

See merge request nvidia/container-toolkit/container-toolkit!293
2023-02-14 10:31:21 +00:00
Evan Lezar
365b6c7bc2 Fix package version in ubi8 container builds
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-14 10:50:38 +01:00
Evan Lezar
dc4887cd44 Merge branch 'cdi-executable' into 'main'
Add nvidia-container-runtime.{{MODE}} executable that overrides runtime mode

See merge request nvidia/container-toolkit/container-toolkit!288
2023-02-14 08:01:41 +00:00
Evan Lezar
c4836a576f Also skip nvidia-container-toolit-operator-extensions in release scripts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:10:01 +01:00
Evan Lezar
98afe0d27a Generate nvidia-container-toolkit-operator-extensions package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:46 +01:00
Evan Lezar
fdc759f7c2 Add nvidia-container-runtime.legacy executable
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:46 +01:00
Evan Lezar
43448bac11 Add nvidia-container-runtime.cdi executable
This change adds an nvidia-container-runtime.cdi executable that
overrides the runtime mode from the config to "cdi".

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:46 +01:00
Evan Lezar
456d2864a6 Log config in JSON if possible
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:46 +01:00
Evan Lezar
406a5ec76f Implement runtime package for creating runtime CLI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:46 +01:00
Evan Lezar
f71c419cfb Move modifying OCI runtime wrapper to oci package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:46 +01:00
Evan Lezar
babb73295f Update gitignore
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:45 +01:00
Evan Lezar
f3ec5fd329 Merge branch 'packaging-verisons' into 'main'
Align release candidate RPM version with Debian version

See merge request nvidia/container-toolkit/container-toolkit!291
2023-02-13 14:53:01 +00:00
Evan Lezar
5aca0d147d Use - as version-tag separator for libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 15:10:08 +01:00
Evan Lezar
f2b19b6ae9 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 14:39:14 +01:00
Evan Lezar
7cb9ed66be Align release candidate RPM version with Debian version
The version for RPM release candidates has the form `1.13.0-0.1.rc.1-1` whereas debian packages have the form `1.13.0~rc.1-1`.

Note that since the `~` is handled in [the same way](https://docs.fedoraproject.org/en-US/packaging-guidelines/Versioning/#_handling_non_sorting_versions_with_tilde_dot_and_caret) as for Debian packages, there does not seem to be a specific reason for this and dealing with multiple version strings in our entire pipeline adds complexity.

This change aligns the package versioning for rpm packages with Debian packages.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 14:31:23 +01:00
Evan Lezar
d578f4598a Remove fedora35 pipeline targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 14:31:23 +01:00
Evan Lezar
d30e6c23ab Merge branch 'update-ldflags' into 'main'
Update ldflags for cgo

See merge request nvidia/container-toolkit/container-toolkit!290
2023-02-10 14:17:53 +00:00
Evan Lezar
1c05f2fb9a Merge branch 'add-options-to-mounts' into 'main'
Add Options to mounts to refactor IPC CDI spec generation

See merge request nvidia/container-toolkit/container-toolkit!287
2023-02-10 08:04:24 +00:00
Evan Lezar
1407ace94a Update ldflags for cgo
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 21:54:49 +01:00
Evan Lezar
97008f2db6 Move IPC discoverer into DriverDiscoverer
This simplifies the construction of the required common edits
when constructing a CDI specification.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
076eed7eb4 Update ipcMount to add noexec option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
33c7b056ea Add ipcMounts type
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
3b8c40c3e6 Move IPC discoverer to internal/discover package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
3f70521a63 Add Options to discover.Mount
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
21f5895b5a Merge branch 'bump-version-1.13.0-rc.1' into 'main'
Bump version to 1.13.0-rc.1

See merge request nvidia/container-toolkit/container-toolkit!286
2023-02-07 11:38:18 +00:00
Evan Lezar
738a2e7343 Bump version to 1.13.0-rc.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-07 11:55:47 +01:00
Evan Lezar
62bd015475 Merge branch 'bump-version-v1.12.0' into 'main'
Bump version to v1.12.0

See merge request nvidia/container-toolkit/container-toolkit!285
2023-02-03 14:07:30 +00:00
Evan Lezar
ac5c62c116 Bump CUDA base images to 12.0.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-03 14:25:42 +01:00
Evan Lezar
80fe1065ad Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-03 14:23:49 +01:00
Evan Lezar
fea195cc8d Bump version to v1.12.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-03 13:58:26 +01:00
Evan Lezar
9ef314e1e3 Merge branch 'rename-root-flag' into 'main'
Rename root to driverRoot for CDI generation

See merge request nvidia/container-toolkit/container-toolkit!284
2023-02-02 16:33:24 +00:00
Evan Lezar
95f859118b Update changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 15:58:00 +01:00
Evan Lezar
daceac9117 Rename discover.Config.Root to discover.Config.DriverRoot
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 15:57:15 +01:00
Evan Lezar
cfa2647260 Rename root to driverRoot for CDI generation
This makes the intent of the command line argument clearer since this
relates specifically to the root where the NVIDIA driver is installed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 15:42:04 +01:00
Evan Lezar
03cdf3b5d7 Merge branch 'bump-version-v1.12.0-rc.6' into 'main'
Bump version to v1.12.0-rc.6

See merge request nvidia/container-toolkit/container-toolkit!283
2023-02-02 13:50:25 +00:00
Evan Lezar
f8f415a605 Ensure container-archive name is unique
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 12:03:34 +01:00
Evan Lezar
fe117d3916 Udpate libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 12:02:57 +01:00
Evan Lezar
069536d598 Bump version to v1.12.0-rc.6
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 12:00:08 +01:00
Evan Lezar
5f53ca0af5 Add missing v1.12.0-rc.5 changelog entry
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 11:58:44 +01:00
Evan Lezar
9a06768863 Merge branch 'fix-nvidia-ctk-path' into 'main'
Only use configured nvidia-ctk path if it is a full path

See merge request nvidia/container-toolkit/container-toolkit!281
2023-02-01 11:42:09 +00:00
Evan Lezar
0c8379f681 Fix nvidia-ctk path for update ldcache hook
This change ensures that the update-ldcache hook is created in a manner
consistent with other nvidia-ctk hooks ensuring that a full path is
used.

Without this change the update-ldcache hook on Tegra-based sytems had an
invalid path.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 12:00:23 +01:00
Evan Lezar
92dc0506fe Add hook path to logger output
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 12:00:23 +01:00
Evan Lezar
7045a223d2 Only use configured nvidia-ctk path if it is a full path
If this is not done, the default config which sets the nvidia-ctk.path
option as "nvidia-ctk" will result in an invalid OCI spec if a hook is
injected. This change ensures that the path used is always an absolute
path as required by the hook spec.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 12:00:23 +01:00
Evan Lezar
763e4936cd Merge branch 'fix-kitmaker' into 'main'
Add additional build args to manifest

See merge request nvidia/container-toolkit/container-toolkit!279
2023-02-01 09:48:08 +00:00
Evan Lezar
f0c7491029 Use 'main' as branch component in kitmaker archive
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 04:54:59 +01:00
Evan Lezar
ba5c4b2831 Use package version as version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 04:54:59 +01:00
Evan Lezar
9c73438682 Add additional build args to manifest
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 04:54:59 +01:00
Evan Lezar
37f7337d2b Merge branch 'bump-version-1.12.0-rc.5' into 'main'
Bump version to 1.12.0-rc.5

See merge request nvidia/container-toolkit/container-toolkit!280
2023-02-01 03:18:26 +00:00
Evan Lezar
98285c27ab Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 04:17:42 +01:00
Evan Lezar
5750881cea Bump version to 1.12.0-rc.5
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 04:08:57 +01:00
Evan Lezar
95ca1c2e50 Merge branch 'fix-git-branch' into 'main'
Fix git branch shell command

See merge request nvidia/container-toolkit/container-toolkit!277
2023-01-31 14:08:35 +00:00
Evan Lezar
e4031ced39 Fix git branch shell command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-31 15:08:12 +01:00
Evan Lezar
7f6d21c53b Merge branch 'fix-kitmaker' into 'main'
Fix GIT_BRANCH command

See merge request nvidia/container-toolkit/container-toolkit!276
2023-01-31 13:16:10 +00:00
Evan Lezar
846ac347fe Fix GIT_BRANCH command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-31 14:15:48 +01:00
Evan Lezar
50afd443fc Merge branch 'fix-libraries-cdi' into 'main'
Fix relative link resolution for ldcache

See merge request nvidia/container-toolkit/container-toolkit!275
2023-01-31 13:07:35 +00:00
Evan Lezar
14bcebd8b7 Fix relative link resolution for ldcache
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-31 13:51:48 +01:00
Evan Lezar
d091d3c7f4 Merge branch 'fix-kitmaker' into 'main'
Ensure git is available in image build step

See merge request nvidia/container-toolkit/container-toolkit!274
2023-01-31 11:20:40 +00:00
Evan Lezar
eb0ef8ab31 Ensure git is available in image build step
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-31 12:20:11 +01:00
Evan Lezar
9c5c12a1bc Merge branch 'kitmaker-update' into 'main'
Update the process for publishing packages to kitmaker

See merge request nvidia/container-toolkit/container-toolkit!271
2023-01-30 18:37:42 +00:00
Evan Lezar
8b197b27ed Rework the upload of archives to kitmaker.
This change simplifies how Kitmaker archives are constructed.

Currently only centos8 and ubuntu18.04 packages are included.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-30 14:52:16 +01:00
Evan Lezar
8c57e55b59 Add additional information to the manifest
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-30 14:52:16 +01:00
Evan Lezar
6d1639a513 Merge branch 'set-default-to-index' into 'main'
Use device index as CDI device names by default

See merge request nvidia/container-toolkit/container-toolkit!273
2023-01-30 13:22:36 +00:00
Evan Lezar
5e6f72e8f4 Update changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-30 13:40:25 +01:00
Evan Lezar
707e3479f8 Fix lint errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-30 13:39:57 +01:00
Evan Lezar
201232dae3 Add logging of minimum CDI version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-30 13:39:08 +01:00
Evan Lezar
f768bb5783 Use device index as CDI device names by default
This change uses the `index` mode for the --device-name-strategy when
generating CDI specifications by default. This generates device names
such as nvidia.com/gpu=0 or nvidia.com/gpu=1:0 by default.

Note that this requires a CDI spec version of 0.5.0 and for consumers
(e.g. podman) that are only compatible with older versions one of the
other stragegies (`type-index` or `uuid`) should be used instead to
generate a v0.3.0 or v0.4.0 specification.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-30 13:36:17 +01:00
Evan Lezar
f0de3ccd9c Merge branch 'CNT-3718/allow-device-name-to-be-controlled' into 'main'
Add --device-name-strategy flag for CDI spec generation

See merge request nvidia/container-toolkit/container-toolkit!269
2023-01-30 12:28:38 +00:00
Evan Lezar
09e8d4c4f3 Merge branch 'move-dev-char-creation' into 'main'
Move `create-dev-char-symlinks` from `nvidia-ctk hook` to `nvidia-ctk system`

See merge request nvidia/container-toolkit/container-toolkit!272
2023-01-30 12:10:17 +00:00
Evan Lezar
8188400c97 Move create-dev-char-symlinks subcommand from hook to system
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-27 12:12:54 +01:00
Evan Lezar
962d38e9dd Add nvidia-ctk system subcommand
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-27 12:12:54 +01:00
Kevin Klues
9fc2c59122 Merge branch 'CNT-3845/add-dev-char-symlink' into 'main'
Add create-dev-char-symlinks hook

See merge request nvidia/container-toolkit/container-toolkit!267
2023-01-27 10:11:29 +00:00
Evan Lezar
540f4349f5 Update vendoring for nvpci
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 13:43:43 +01:00
Evan Lezar
1d7e419008 Add --create-all mode to creation of dev/char symlinks
This change adds a --create-all mode to the create-dev-char-symlinks hook.
This mode creates all POSSIBLE symlinks to device nodes for regular and cap
devices. With the number of GPUs inferred from the PCI device information.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 13:43:43 +01:00
Evan Lezar
95394e0fc8 Add internal/info/proc/devices package to read device majors
This change adds basic functionality to process the /proc/devices
file to extract device majors.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 13:43:43 +01:00
Evan Lezar
f9330a4c2c Add --watch option to create-dev-char-symlinks
This change adds a --watch option to the create-dev-char-symlinks hook. This
installs an fsnotify watcher that creates symlinks for ADDED device nodes under
/dev/char.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 13:43:43 +01:00
Evan Lezar
be0e4667a5 Add create-dev-char-symlinks hook
This change adds an nvidia-ctk hook create-dev-char-symlinks
subcommand that creates symlinks to device nodes (as required by
systemd) under /dev/char.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 13:43:43 +01:00
Evan Lezar
408eeae70f Allow locator to be marked as optional
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 10:38:11 +01:00
Evan Lezar
27c82c19ea Merge branch 'bump-version' into 'main'
Bump version to 1.12.0-rc.4

See merge request nvidia/container-toolkit/container-toolkit!270
2023-01-25 09:37:41 +00:00
Evan Lezar
937f3d0d78 Add changelog for v1.12.0-rc.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 10:37:24 +01:00
Evan Lezar
bc3cc71f90 Update libnvidia-container to 1.12.0-rc.4
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 10:23:54 +01:00
Evan Lezar
ad4531db1e Bump version to 1.12.0-rc.4
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 10:23:02 +01:00
Evan Lezar
e5d8d10d4f Merge branch 'CNT-3854/discover-first' into 'main'
Limit number of candidates for executables

See merge request nvidia/container-toolkit/container-toolkit!268
2023-01-23 17:43:27 +00:00
Evan Lezar
89bf81a9db Add --device-name-strategy flag for CDI spec generation
This change adds a --device-name-strategy flag for generating a CDI
specificaion. This allows a CDI spec to be generated with the following
names used for device:

* type-index: gpu0 and mig0:1
* index: 0 and 0:1
* uuid: GPU and MIG UUIDs

Note that the use of 'index' generates a v0.5.0 CDI specification since
this relaxes the restriction on the device names.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-20 16:17:32 +01:00
Evan Lezar
6237477ba3 Limit number of candidates for executables
This change ensures that the first match of an executable in the path
is retured instead of a list of candidates. This prevents a CDI spec,
for example, from containing multiple entries for a single executable
(e.g. nvidia-smi).

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-20 15:10:24 +01:00
Evan Lezar
6706024687 Merge branch 'fix-missing-nvidia-container-runtime-hook-1682' into 'main'
Avoid missing nvidia-container-runtime-hook during rpm update from <=1.10.0

See merge request nvidia/container-toolkit/container-toolkit!263
2023-01-19 16:32:58 +00:00
Evan Lezar
7649126248 Remove rpm-state directory instead of just single file. 2023-01-19 14:50:35 +00:00
Evan Lezar
104dca867f Merge branch 'fix-ldcache-list' into 'main'
Fix and refactor code related to reading LDCache

See merge request nvidia/container-toolkit/container-toolkit!266
2023-01-19 13:52:55 +00:00
Evan Lezar
881b1c0e08 introduce resolveSelected helper
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 14:10:55 +01:00
Evan Lezar
3537d76726 Further refactoring of ldcache code
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 14:10:36 +01:00
Evan Lezar
ccd1961c60 Ensure root is included in absolute ldcache paths
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 14:09:43 +01:00
Evan Lezar
f350f0c0bb Refactor resolving of links in ldcache
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 14:09:41 +01:00
Evan Lezar
80672d33af Continue instead of break on error when listing libraries
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 13:54:24 +01:00
Evan Lezar
7a1cfb48b9 Merge branch 'update-cdi' into 'main'
Determine the minumum required spec version

See merge request nvidia/container-toolkit/container-toolkit!265
2023-01-19 11:57:19 +00:00
Evan Lezar
ae3b213b0e Merge branch 'fix-cdi-library-container-path' into 'main'
Reuse mount discovery for driver libraries

See merge request nvidia/container-toolkit/container-toolkit!262
2023-01-19 11:52:21 +00:00
Evan Lezar
eaf9bdaeb4 Determine the minumum required spec version
This change uses functionality from the CDI package to determine
the minimum required CDI spec version. This allows for a spec with
the widest compatibility to be specified.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 12:14:00 +01:00
Evan Lezar
bc4bfb94a2 Update CDI package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 12:12:54 +01:00
Evan Lezar
a77331f8f0 Reuse mount discovery for driver libraries
This change implements the discovery of versioned driver libaries
by reusing the mounts and update ldcache discoverers use for, for example,
CVS file discovery. This allows the container paths to be correctly generated
without requiring specific manipulation.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 12:11:13 +01:00
Evan Lezar
94b7add334 Merge branch 'bugfix-nvidia-ctk' into 'main'
Fix nvidia-ctk path in spec generation

See merge request nvidia/container-toolkit/container-toolkit!264
2023-01-19 11:10:38 +00:00
Evan Lezar
9c9e6cd324 Fix nvidia-ctk path in spec generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 12:03:07 +01:00
Evan Lezar
f50efca73f Merge branch 'specify-nvidia-ctk-path' into 'main'
Make handling of nvidia-ctk path consistent

See merge request nvidia/container-toolkit/container-toolkit!261
2023-01-19 10:20:25 +00:00
Evan Lezar
19cfb2774d Use common code to construct nvidia-ctk hooks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 10:37:10 +01:00
Evan Lezar
27347c98d9 Consolidate code to find nvidia-ctk
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 10:31:42 +01:00
Evan Lezar
ebbc47702d Remove 'Executable' from private struct member names
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-18 17:02:42 +01:00
Evan Lezar
09d42f0ad9 Remove 'Executable' from config struct member
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-18 17:02:42 +01:00
Evan Lezar
35df24d63a Make handling of nvidia-ctk path consistent
This change adds an --nvidia-ctk-path to the nvidia-ctk cdi generate
command. This ensures that the executable path for the generated
hooks can be specified consistently.

Since the NVIDIA Container Runtime already allows for the executable
path to be specified in the config the utility code to update the
LDCache and create other nvidia-ctk hooks are also updated.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-18 17:02:42 +01:00
Claudius Volz
f93b6a13f4 Preserve a temporary copy of nvidia-container-runtime-hook during rpm update,
to avoid being destroyed by when updating from <=1.10.0

Signed-off-by: Claudius Volz <c.volz@gmx.de>
2023-01-15 01:22:37 +01:00
Evan Lezar
50d7fb8f41 Merge branch 'missing-dra-devices' into 'main'
Ensure existence of DRM devices nodes is checked

See merge request nvidia/container-toolkit/container-toolkit!260
2022-12-13 12:42:04 +00:00
Evan Lezar
311e7a1feb Ensure existence of DRM devices nodes is checked
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-12 14:48:54 +01:00
Evan Lezar
14e587d55f Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-container to v1.12.0-rc.3

See merge request nvidia/container-toolkit/container-toolkit!259
2022-12-09 09:38:27 +00:00
Evan Lezar
66ec967de2 Update libnvidia-container to 1.12.0-rc.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-09 10:20:36 +01:00
Evan Lezar
252693aeac Use SHA for ineffassign
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-09 09:58:49 +01:00
Evan Lezar
079b47ed94 Use sha instead of latest for golint
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-09 09:51:22 +01:00
Evan Lezar
d2952b07aa Merge branch 'fix-from-discover' into 'main'
Ensure that an empty discoverer returns valid edits

See merge request nvidia/container-toolkit/container-toolkit!258
2022-12-09 08:37:26 +00:00
Evan Lezar
41f1b93422 Use NewContainerEdits utility function for CDI generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-07 11:09:19 +01:00
Evan Lezar
3140810c95 Add NewContainerEdits utility function
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-07 11:03:45 +01:00
Evan Lezar
046d761f4c Ensure that an empty discoverer returns valid edits
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-06 14:01:35 +01:00
Evan Lezar
0a2083df72 Merge branch 'CNT-3707/add-root-flag' into 'main'
Add --root flag to nvidia-ctk cdi generate command

See merge request nvidia/container-toolkit/container-toolkit!256
2022-12-02 15:54:27 +00:00
Evan Lezar
80c810bf9e Add --root flag to CDI generate command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 16:13:53 +01:00
Evan Lezar
82ba424212 Simplify device folder permission hook
This simplifies the device folder permission hook to only handle
/dev/dri and /dev/nvidia-caps folders.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 16:13:53 +01:00
Evan Lezar
c131b99cb3 Merge branch 'CNT-3613/add-firmware-cdi' into 'main'
Include GSP firmware path in CDI specification

See merge request nvidia/container-toolkit/container-toolkit!254
2022-12-02 14:51:53 +00:00
Evan Lezar
64a85fb832 Include GSP firmware path in CDI specification
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 14:35:22 +01:00
Evan Lezar
ebf1772068 Merge branch 'CNT-3580/inject-egl-wayland' into 'main'
Add egl_external_platform.d/10_nvidia_wayland.json to graphics mounts

See merge request nvidia/container-toolkit/container-toolkit!252
2022-12-02 13:05:18 +00:00
Evan Lezar
8604c255c4 Use Options to set FileLocator options
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:57:33 +01:00
Evan Lezar
bea8321205 Use prefix search for locating graphics files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
db962c4bf2 Use getSearchPrefixes for all locators
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
d1a3de7671 Add test for device locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
8da7e74408 Add tests for executable locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
55eb898186 Add support for specifying multiple prefixes
This change allows the file Locator to be instantiated with multiple
search prefixes.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
a7fc29d4bd Add tests for file locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
fdb3e51294 Add egl_external_platform.d/10_nvidia_wayland.json to graphics mounts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
0582180cab Merge branch 'rework-cdi-generation' into 'main'
Rework CDI spec generation to use discoverers

See merge request nvidia/container-toolkit/container-toolkit!248
2022-12-02 11:32:29 +00:00
Evan Lezar
46667b5a8c Remove unused code
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 11:49:37 +01:00
Evan Lezar
e4e1de82ec Refactor nvidia-ctk cdi generate command
This change refactors the generation of CDI specifications
to use discoverers and generate the CDI specifications from these
discoverers. This allows for better reuse.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 11:49:37 +01:00
Evan Lezar
d51c8fcfa7 Add utility function to generatee nvidia-ctk OCI hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
9b33c34a57 Allow graphics mount discoverer to be instantiated independently
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
0b6cd7e90e Add FromDiscoverer function to generate container edits
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
029a04c37d Use blank device hostPath if same as Path
The HostPath field was added in the v0.5.0 CDI specification.
The cdi package uses strict unmarshalling when loading specs
from file causing failures for unexpected fields.

Since the behaviour for HostPath == "" and HostPath == Path are
equivalent, we clear HostPath if it is equal to Path to ensure
compatibility with the widest range of specs.

This allows, for example, a v0.4.0 spec to be generated as required.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
60c1df4e9c Remove unneeded workaround for CDI edit generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
3e35312537 Merge branch 'fix-json-mode' into 'main'
Remove unused jsonMode and fix output

See merge request nvidia/container-toolkit/container-toolkit!255
2022-12-01 16:01:33 +00:00
Evan Lezar
932b39fd08 Remove unused jsonMode and fix output
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-01 16:21:50 +01:00
Evan Lezar
78cafe45d4 Merge branch 'create-cdi-output-folder' into 'main'
Ensure output folder exists for CDI spec

See merge request nvidia/container-toolkit/container-toolkit!250
2022-12-01 12:44:49 +00:00
Evan Lezar
584e792a5a Ensure output folder exists for CDI spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-30 19:40:58 +01:00
Evan Lezar
f0bcfa0415 Merge branch 'add-format-flag' into 'main'
Switch to string-based flag for CDI output format

See merge request nvidia/container-toolkit/container-toolkit!247
2022-11-29 16:47:40 +00:00
Evan Lezar
d45ec7bd28 Switch to string-based flag for CDI output format
This change replaces the `--json` flag of the nvidia-ctk cdi generate
command with a --format flag that accepts a string format of either
json or yaml.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-29 16:56:26 +01:00
Evan Lezar
153f2f6300 Merge branch 'fix-by-path-missing' into 'main'
Skip missing by-path symlinks instead of failing

See merge request nvidia/container-toolkit/container-toolkit!249
2022-11-25 12:11:56 +00:00
Evan Lezar
9df3975740 Merge branch 'bump-version-1.12.0-rc.3' into 'main'
Bump version to 1.12.0-rc.3

See merge request nvidia/container-toolkit/container-toolkit!246
2022-11-23 21:22:46 +00:00
Evan Lezar
5575b391ff Skip missing by-path symlinks instead of failing
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-23 22:21:58 +01:00
Evan Lezar
9faf11ddf3 Fix error message
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-23 22:21:58 +01:00
Evan Lezar
d3ed27722e Bump version to 1.12.0-rc.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-23 21:26:34 +01:00
Evan Lezar
07a3f3040a Merge branch 'fix-release-scripts' into 'main'
Fix array arguments for release scripts

See merge request nvidia/container-toolkit/container-toolkit!245
2022-11-22 13:00:42 +00:00
Evan Lezar
749ab2a746 Fix array arguments for release scripts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 21:19:29 +01:00
Evan Lezar
217a135eb1 Merge branch 'fix-kitmaker' into 'main'
Fix kitmaker release for opensuse-leap

See merge request nvidia/container-toolkit/container-toolkit!244
2022-11-21 15:21:46 +00:00
Evan Lezar
22e65b320b Fix kitmaker release for opensuse-leap
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 16:19:13 +01:00
Evan Lezar
53bb940b30 Merge branch 'bump-golang-version' into 'main'
Update go version to 1.18

See merge request nvidia/container-toolkit/container-toolkit!243
2022-11-21 09:26:32 +00:00
Evan Lezar
1c1ad8098a Update go version to 1.18
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 09:35:03 +01:00
Evan Lezar
203db4390c Merge branch 'add-graphics-edits-to-CDI-spec' into 'main'
Include graphics devices in generated CDI specification

See merge request nvidia/container-toolkit/container-toolkit!242
2022-11-20 21:46:00 +00:00
Evan Lezar
b6d9c2c1ad Add graphics devices and libraries to CDI specification
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-14 13:55:56 +01:00
Evan Lezar
429ef4d4e9 Make NewVisibleDevices public
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-14 12:19:59 +01:00
Evan Lezar
25759ca933 Merge branch 'fix-kitmaker-scripts' into 'main'
Fix scripts and pipeline for artifactory release

See merge request nvidia/container-toolkit/container-toolkit!241
2022-11-11 12:28:35 +00:00
Evan Lezar
74abea07e2 Add top-level variable to set kitmaker folder
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 16:06:12 +01:00
Evan Lezar
7955bb1a84 Use short sha for kitmaker version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 16:06:12 +01:00
Evan Lezar
75b11eb80a Use VERSION in kitmaker archive name
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 16:06:12 +01:00
Evan Lezar
c958817eef Log applied properties.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 16:06:12 +01:00
Evan Lezar
80f8c2a418 Correct artifactory upload URL
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 16:06:12 +01:00
Evan Lezar
08640a6f64 Ensure CURL is set for kitmaker upload
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 16:06:12 +01:00
Evan Lezar
9db31f7506 Fix number of arguments for kitmaker release script
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 16:06:12 +01:00
Evan Lezar
7fd40632fe Update regctl version
The regctl image copy-file command was added in v0.4.5.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 14:43:19 +01:00
Evan Lezar
6ef19d2925 Remove call to non-existant script
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 14:42:56 +01:00
Evan Lezar
83ce83239b Correct extract package image argument
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 14:33:48 +01:00
Evan Lezar
30fb486e44 Add basic logging
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-10 14:27:58 +01:00
Evan Lezar
0022661565 Merge branch 'add-cdi-readme' into 'main'
Add README for generating CDI specifications

See merge request nvidia/container-toolkit/container-toolkit!239
2022-11-09 13:29:36 +00:00
Jon Mayo
28e882f26f Merge branch 'cnt-2210' into 'main'
[ci] push package releases to artifactory

See merge request nvidia/container-toolkit/container-toolkit!231
2022-11-08 16:45:34 +00:00
Jon Mayo
71fbe7a812 [ci] push package releases to artifactory 2022-11-08 16:45:34 +00:00
Evan Lezar
ce3d94af1a Add README for generating CDI specifications
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-08 15:15:27 +01:00
Evan Lezar
0bc09665a8 Merge branch 'CNT-1380/add-crio-config' into 'main'
Add support for updating crio config

See merge request nvidia/container-toolkit/container-toolkit!176
2022-11-07 10:54:34 +00:00
Evan Lezar
205ba098e9 Merge branch 'multiple-docker-swarm' into 'main'
Consider all Swarm resource envvars

See merge request nvidia/container-toolkit/container-toolkit!222
2022-11-07 10:43:49 +00:00
Evan Lezar
877832da69 Consider all Swarm resource envvars
This change extends the support for multiple envvars when
specifying swarm resources to consider ALL of the specified
environment variables instead of the first match.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-04 10:01:28 +01:00
Evan Lezar
b7ba96a72e Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-container

See merge request nvidia/container-toolkit/container-toolkit!237
2022-11-03 13:38:50 +00:00
Evan Lezar
93c59f2d9c Skip nvidia-container-runtime and nvidia-docker builds for release candidates
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-03 14:38:11 +01:00
Evan Lezar
5a56b658ba Update changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-03 14:30:13 +01:00
Evan Lezar
99889671b5 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-03 14:25:31 +01:00
Evan Lezar
a2fb017208 Merge branch 'rework-cdi-cli' into 'main'
Rename nvidia-ctk info generate-cdi command

See merge request nvidia/container-toolkit/container-toolkit!236
2022-11-03 09:31:26 +00:00
Evan Lezar
f7021d84b5 Merge branch 'add-dev-dri' into 'main'
Inject DRM device nodes into containers when Graphics or Display capabilities are requested

See merge request nvidia/container-toolkit/container-toolkit!235
2022-11-03 09:31:03 +00:00
Evan Lezar
c793fc27d8 Output YAML separator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 15:03:18 +01:00
Evan Lezar
3d2328bdfd Rename nvidia-ctk info generate-cdi command
This change renames the nvidia-ctk info generate-cdi command as

nvidia-ctk cdi generate

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:56:26 +01:00
Evan Lezar
76b69f45de Add discovery of DRM devices
This change adds the discovery of DRM devices associated with requested
devices. This means that the /dev/dri/card* and /dev/dri/renderD*
devices associated with each requested NVIDIA GPU are injected into
the container and that the /dev/dri/by-path symlinks associated with
these devices are created in the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:49:08 +01:00
Evan Lezar
73e65edaa9 Also trigger graphics modifier for display capability
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:42:51 +01:00
Evan Lezar
cd7ee5a435 Add test for graphics modifier
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:42:51 +01:00
Evan Lezar
eac4faddc6 Use :: as link separator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:42:51 +01:00
Evan Lezar
bc8a73dde4 Add a Filter interface to the discover package
This change adds support for filtering entities by specifying a filter.
This can be used, for example, to check whether a mount or device
has a particular property and removing it from the set of discovered
entities if it does not.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:42:48 +01:00
Evan Lezar
624b9d8ee6 Add internal drm package for determining DRM devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:39:53 +01:00
Evan Lezar
9d6e2ff1b0 Add internal proc package for processing GPU information files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:39:53 +01:00
Evan Lezar
aca0c7bc5a Add Devices abstraction to CUDA image
This change adds a Devices abstraction to the CUDA image utilities. This
allows for checking whether a devices is selected, for example.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:39:53 +01:00
Evan Lezar
db47b58275 Add utilities for driver capabilities to image packages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:35:42 +01:00
Evan Lezar
59bf7607ce Merge branch 'ipc-rw' into 'main'
Mount IPC sockets with noexec flag

See merge request nvidia/container-toolkit/container-toolkit!234
2022-11-02 12:15:47 +00:00
Evan Lezar
61ff3fbd7b Merge branch 'chmod-hook' into 'main'
Add nvidia-ctk hook chmod command to set permissions and ensure permissions of `/dev/nvidia-caps` is set

See merge request nvidia/container-toolkit/container-toolkit!232
2022-11-02 12:15:23 +00:00
Evan Lezar
523fc57ab4 Use an Executable Locator to lookup chmod
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-26 16:24:11 +02:00
Evan Lezar
ae18c5d847 Include chmod hook for device subfolders in CDI spec generation
This change generates one or more createContainer hooks for ensuring
that subfolders in /dev have the required permissions in the container.
As an example, a user requires read permissions to the /dev/nvidia-caps
in addition to including the specific caps devices under this folder.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-26 16:08:13 +02:00
Evan Lezar
4abdc2f35d Add nvidia-ctk hook chmod command to set permissions
This change adds an nvidia-ctk hook chmod command that can be used
to update the permissions for paths in the container.

This prepends the container root to the paths to allow these to be
updated by runtime executables.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-26 16:01:52 +02:00
Evan Lezar
f8748bfa9a Mount IPC sockets with noexec flag
This change ensures that the CDI spec mounts the ipc sockets with the
noexec flag to allow these to function in rootless mode with podman.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-21 16:44:02 +02:00
Evan Lezar
5fb0ae2c2d Merge branch 'fix-mig-caps-paths' into 'main'
Correct construction of MIG Caps

See merge request nvidia/container-toolkit/container-toolkit!230
2022-10-17 11:41:18 +00:00
Evan Lezar
899fc72014 Correct constructin of MIG Caps
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-13 14:06:30 +02:00
Evan Lezar
1267c1d9a2 Refactor docker config update
This change updates the docker config update for simplicitly.
This also allows for the API to match the crio update code.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-11 11:42:38 +02:00
Evan Lezar
9a697e340b Add support for updating crio configs
This adds support for updating crio configs (instead of installing hooks)
and adds crio support to the nvidia-ctk runtime configure command.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-11 11:42:38 +02:00
Evan Lezar
abe8ca71e0 Use struct to store cri-o command line flags
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-11 11:35:56 +02:00
Evan Lezar
9bbf7dcf96 Merge branch 'fix-hook-removal' into 'main'
Improve locating NVIDIA Container Runtime Hook

See merge request nvidia/container-toolkit/container-toolkit!215
2022-10-11 09:32:08 +00:00
Evan Lezar
ec1222b58b Merge branch 'bump-1.12.0-rc.2' into 'main'
Bump version to 1.12.0-rc.2

See merge request nvidia/container-toolkit/container-toolkit!229
2022-10-11 09:27:16 +00:00
Evan Lezar
229b46e0ca Bump version to 1.12.0-rc.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-10 17:11:53 +02:00
Evan Lezar
b6a68c4add Merge branch 'overwrite-rule' into 'main'
Reorder extends for internal pipelines

See merge request nvidia/container-toolkit/container-toolkit!228
2022-10-10 12:58:34 +00:00
Evan Lezar
e588bfac7d Reorder extends for internal pipelines
This change updates the ordering of internal pipeline dependencies to
ensure that the correct rules are applied.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-10 14:58:09 +02:00
Evan Lezar
224020533e Merge branch 'fix-internal-ci' into 'main'
Fix internal CI rules

See merge request nvidia/container-toolkit/container-toolkit!227
2022-10-10 11:43:32 +00:00
Evan Lezar
3736bb3aca Fix internal CI rules
This change updates the internal CI rules for the optimizations
to skip non-critical images on MRs.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-10 13:43:01 +02:00
Evan Lezar
1e72f92b74 Merge branch 'update-changelog' into 'main'
Update changelog for v1.12.0-rc.1

See merge request nvidia/container-toolkit/container-toolkit!226
2022-10-10 10:12:46 +00:00
Evan Lezar
896f5b2e9f Update changelog for v1.12.0-rc.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-10 12:12:14 +02:00
Evan Lezar
c068d4048f Merge branch 'update-cdi-spec-generation' into 'main'
Update CDI spec generation

See merge request nvidia/container-toolkit/container-toolkit!225
2022-10-10 10:07:19 +00:00
Evan Lezar
8796cd76b0 Merge branch 'streamline-cicd' into 'main'
Add rules to skip distributions when not on main

See merge request nvidia/container-toolkit/container-toolkit!224
2022-10-10 08:34:00 +00:00
Evan Lezar
1597ede2af Add all device
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-10 10:19:08 +02:00
Evan Lezar
3dd8020695 Include meta devices in generated CDI spec
This change includes meta devices (e.g. /dev/nvidiactl) in the
generated CDI spec. Missing device nodes are ignored.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-07 16:23:37 +02:00
Evan Lezar
dfa041991f Generate v0.4.0 CDI spec
This change generates a v0.4.0 CDI spec instead of a v0.5.0 spec.
This allows older versions of podman, for example, to be used.

This requires that the device names do not start on a numeric character
and that the HostPath for a device is unspecified.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-07 16:10:47 +02:00
Evan Lezar
568896742b Remove ubuntu 20.04 tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-07 15:49:23 +02:00
Evan Lezar
f52973217f Add rules to skip distributions when not on main
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-07 15:46:26 +02:00
Evan Lezar
efd29f1cec Merge branch 'update-cuda-base-image' into 'main'
Update CUDA base image to 11.8.0

See merge request nvidia/container-toolkit/container-toolkit!223
2022-10-07 12:32:25 +00:00
Evan Lezar
4b02670049 Use 40 digit sha for version string
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-07 14:31:49 +02:00
Evan Lezar
8550874686 Update CUDA base image to 11.8.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-07 14:31:10 +02:00
Evan Lezar
38513d5a53 Merge branch 'multiple-docker-swarm' into 'main'
Add support for multiple swarm resource envvars

See merge request nvidia/container-toolkit/container-toolkit!220
2022-10-04 13:03:27 +00:00
Evan Lezar
a35236a8f6 Correct test cases for NVIDIA_VISIBLE_DEVICES=void
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-04 14:14:44 +02:00
Evan Lezar
0c2e72b7c1 Update gitignore
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-04 14:11:10 +02:00
Evan Lezar
f0bdfbebe4 Add support for multiple swarm resource envvars
This change allows the swarm-resource config option to specify a
comma-separated list of environment variables instead of a single
environment variable.

The first environment variable matched is considered and other
environment variables are ignored.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-04 14:11:10 +02:00
Evan Lezar
a4fa61d05d Merge branch 'cdi-tooling' into 'main'
Add nvidia-ctk info generate-cdi command to generate CDI specification

See merge request nvidia/container-toolkit/container-toolkit!217
2022-10-04 12:10:07 +00:00
Evan Lezar
6e23a635c6 Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-contianer submodule

See merge request nvidia/container-toolkit/container-toolkit!218
2022-09-29 10:48:15 +00:00
Evan Lezar
4dedac6a24 Use base filename as first hook argument
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:14:12 +02:00
Evan Lezar
8c1b9b33c1 Use common code to construct ldconfig hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:12:42 +02:00
Evan Lezar
d37c17857e Add nvidia-ctk info generate-cdi command
This change adds functionality to generate CDI specifications
for all devices detected on the system. A specification containing
all GPUs and MIG devices is generated. All libraries on the host
ldcache that have an NVIDIA Driver Version suffix are included as
are the required binaries and IPC sockets.

A hook (based on the nvidia-ctk hook subcommand) to update the ldcache
in the container for the libraries being injected is also added to the
CDI specificiation.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:11:42 +02:00
Evan Lezar
a0065456d0 Add internal/nvcaps package
This change adds an internal nvcaps pacakge.

This package will be migrated to go-nvlib.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:11:42 +02:00
Evan Lezar
a34a571d2e Update CDI dependency to v0.5.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:11:41 +02:00
Evan Lezar
bb4cfece61 Update go module version to 1.17
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:11:41 +02:00
Evan Lezar
b16d263ee7 Add tests for ldcache hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:11:40 +02:00
Evan Lezar
027395bb8a Update libnvidia-contianer submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 11:26:18 +02:00
Evan Lezar
3ecd790206 Merge branch 'opengl-poc' into 'main'
Add support for injecting vulkan configs and libraries

See merge request nvidia/container-toolkit/container-toolkit!196
2022-09-29 09:23:54 +00:00
Evan Lezar
52bb9e186b Add vulkan support through OCI spec modification
This change allows the NVIDIA Container Runtime to inject vulkan
loaders and libraries by modifying the OCI runtime specification.

This allows vulkan applications to run in containers without
additional modifications.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 16:51:52 +02:00
Evan Lezar
68b6d1cab1 Add a locator for libraries
This change adds a Locator that can be used to locate libraries.
If library names are specified, the ldcache is searched otherwise
symlinks are resolved.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 16:43:21 +02:00
Evan Lezar
bdb67b4fba Add package for locating libraries in LDCache
This change adds a package that reads an ldcache and allows for libraries
to be searched by prefix.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 16:43:21 +02:00
Evan Lezar
d0c39a11d5 Merge branch 'update-go-nvlib' into 'main'
Use go-nvlib nvlib/info package

See merge request nvidia/container-toolkit/container-toolkit!216
2022-09-28 12:28:43 +00:00
Evan Lezar
9de6361938 Update vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 13:40:18 +02:00
Evan Lezar
fb016dca86 Use go-nvlib nvlib/info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 13:40:18 +02:00
Evan Lezar
8beb7b4231 Only remove nvidia-container-toolkit if it is a symlink
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-19 15:31:10 +02:00
Evan Lezar
2b08a79206 Ensure that errors are logged
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-19 15:29:29 +02:00
Evan Lezar
5885fead8f Improve locating NVIDIA Container Runtime Hook
This change ensures that a more concrete error is provided by the NVIDIA
Container Runtime if the NVIDIA Container Runtime hook cannot be
located.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-19 15:29:29 +02:00
Evan Lezar
a9fb7a4a88 Merge branch 'remove-positional-arguments' into 'main'
Allow install root to be set as positional argument OR flag

See merge request nvidia/container-toolkit/container-toolkit!212
2022-09-16 09:36:17 +00:00
Evan Lezar
b5dbcaeaf9 Merge branch 'bump-post-release' into 'main'
Bump versions post release

See merge request nvidia/container-toolkit/container-toolkit!214
2022-09-14 15:12:09 +00:00
Evan Lezar
80a46d4a5c Bump version to 1.12.0-rc.1
This bumps the package versions to:

* nvidia-container-toolkit 1.12.0-rc.1
* nvidia-container-runtime 3.12.0-rc.1
* nvidia-docker2 2.12.0-rc.1

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-14 15:42:13 +02:00
Evan Lezar
febce822d5 Fix fedora35 test container repo URL
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-14 14:17:46 +02:00
Evan Lezar
e8099a713c Ensure that existing packages are not re-released
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-14 14:17:25 +02:00
Evan Lezar
d9de4a09b8 Merge branch 'bump-version-1.11.0' into 'main'
Bump version to 1.11.0

See merge request nvidia/container-toolkit/container-toolkit!213
2022-09-06 09:12:10 +00:00
Evan Lezar
2dbcda2619 Ensure that base package is built for debian
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-05 17:04:49 +02:00
Evan Lezar
691b93ffb0 Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-05 16:33:42 +02:00
Evan Lezar
cb0c94cd40 Bump version to v1.11.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-05 15:57:57 +02:00
Evan Lezar
3168718563 Update git commit command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-05 15:57:29 +02:00
Evan Lezar
dc8972a26a Allow install root to be set as flag
This change allows the destination / root to be set as the
first positional argument OR as a command line flag. This
allows for the GPU Operator to transition to a case where
on the flag / envvar is used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-26 16:06:48 +02:00
Evan Lezar
0a2d8f4d22 Move destinationArg to options struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-26 15:04:07 +02:00
Evan Lezar
8d623967ed Move runtime flags to struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-26 14:59:57 +02:00
Evan Lezar
503ed96275 Merge branch 'fix-release-tooling' into 'main'
Ensure CLI versions are set correctly for RPM packages

See merge request nvidia/container-toolkit/container-toolkit!211
2022-08-24 10:45:38 +00:00
Evan Lezar
d8ba84d427 Add release tests for fedora35
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-24 11:57:20 +02:00
Evan Lezar
8e8c41a3bc Clean up repo test scripts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-24 11:57:20 +02:00
Evan Lezar
e34fe17b45 Add fedora35 to release and signing scripts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-24 11:57:20 +02:00
Evan Lezar
c5b0278c58 Ensure CLI versions are set correctly for RPM packages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-24 11:57:20 +02:00
Evan Lezar
8daa257b35 Merge branch 'update-changelog' into 'main'
Add changelog for 1.11.0-rc.3

See merge request nvidia/container-toolkit/container-toolkit!210
2022-08-24 09:01:39 +00:00
Evan Lezar
6329174cfc Add changelog for 1.11.0-rc.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-24 10:08:23 +02:00
Evan Lezar
1ec41c1bf1 Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-container

See merge request nvidia/container-toolkit/container-toolkit!209
2022-08-23 16:52:09 +00:00
Evan Lezar
581a76de38 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-23 17:29:01 +02:00
Evan Lezar
5d52ca8909 Merge branch 'add-fedora35' into 'main'
Add fedora35 package targets

See merge request nvidia/container-toolkit/container-toolkit!205
2022-08-23 13:04:45 +00:00
Evan Lezar
ad7151d394 Update CUDA base image to 11.7.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-23 14:18:49 +02:00
Evan Lezar
3269a7b0e7 Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-23 14:18:49 +02:00
Evan Lezar
6a155cc606 Increase package build timeout to 3 hours for slow aarch64 builds
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-23 14:18:49 +02:00
Evan Lezar
a5bbf613e8 Use single config file for centos, al2, and fedora
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-23 14:18:49 +02:00
Evan Lezar
22427c1359 Add fedora35 CI targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-23 14:18:49 +02:00
Evan Lezar
f17121fd6c Add fedora targets to release scripts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-23 14:18:49 +02:00
Evan Lezar
256e37eb3f Add fedora35 package targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-23 14:18:49 +02:00
Evan Lezar
bdfd123b9d Switch to single docker file yum-based rpm builds
This reuses the docker file for yum-based rpm distros (centos, amazonlinux)
instead of maintaining two files with the same contents.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-23 14:18:49 +02:00
Jon Mayo
3f7dce202a Merge branch 'remove-podman' into 'main'
Specify hook structure instead of importing Podman

See merge request nvidia/container-toolkit/container-toolkit!208
2022-08-22 15:25:40 +00:00
Evan Lezar
a6d21abe14 Merge branch 'add-package-with-no-libnvidia-container' into 'main'
Split nvidia-container-toolkit package

See merge request nvidia/container-toolkit/container-toolkit!195
2022-08-22 09:08:33 +00:00
Evan Lezar
d0f1fe2273 Use new packages in toolkit image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-19 12:38:17 +02:00
Evan Lezar
8de9593209 Split nvidia-container-toolkit package
This change splits the nvidia-container-toolkit package into the top-level package and
an nvidia-container-toolkit-base package.
The nvidia-container-toolkit-base package allows the NVIDIA Container Runtime and
NVIDIA Container Toolkit CLI to be installed on systems without requiring that the
NVIDIA Container Runtine Hook and the transitive dependencies included in the NVIDIA
Container Library and NVIDIA Container CLI also be installed.

This allows the runtime to be used on systems where the CSV or CDI mode of the runtime
is used exclusively.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-19 12:38:17 +02:00
Evan Lezar
64b2b50470 Fix centos8 test image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-19 12:36:52 +02:00
Evan Lezar
4dc1451c49 Fix indentation in makefile
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-19 12:36:52 +02:00
Evan Lezar
211081ff25 Update vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-19 10:28:00 +02:00
Evan Lezar
c1c1d5cf8e Specify hook structure instead of importing Podman
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-19 10:26:34 +02:00
Evan Lezar
e91ffef258 Merge branch 'fix-runtime-hook-rename' into 'main'
Fix cleanup of nvidia-container-toolkit link

See merge request nvidia/container-toolkit/container-toolkit!207
2022-08-18 12:51:51 +00:00
Evan Lezar
47c8aa3790 Fix cleanup of nvidia-container-toolkit link
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-18 14:06:08 +02:00
Evan Lezar
33b4e7fb0a Merge branch 'fix-containerd-tests' into 'main'
Fix image in containerd tests

See merge request nvidia/container-toolkit/container-toolkit!206
2022-08-12 13:46:24 +00:00
Evan Lezar
936da0295b Use proper cuda image for containerd tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-12 14:23:24 +02:00
Evan Lezar
c2205c14fb Update subcomponents
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-12 14:22:40 +02:00
Evan Lezar
56935f5743 Merge branch 'fix-mounts' into 'main'
Fix setting of toolkit config option in toolkit container

See merge request nvidia/container-toolkit/container-toolkit!204
2022-08-09 15:46:15 +00:00
Evan Lezar
1b3bae790c Update image used for containerd tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-09 16:55:51 +02:00
Evan Lezar
47559a8c87 Output applied config to toolkit container stdout
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-09 15:18:59 +02:00
Evan Lezar
86412ea821 Ensure that toolkit-container sets correct default value
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-09 15:18:52 +02:00
Evan Lezar
b8aa844171 Fix setting of toolkit config option in toolkit container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-09 15:18:52 +02:00
Evan Lezar
f9464c5cf9 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-09 15:18:52 +02:00
Evan Lezar
9df75e1fa3 Merge branch 'add-tegra-files-as-mounts' into 'main'
Add modifier to inject Tegra platform files

See merge request nvidia/container-toolkit/container-toolkit!203
2022-08-09 11:43:04 +00:00
Evan Lezar
0218e2ebf7 Update vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-08 17:12:47 +02:00
Evan Lezar
a9dc6550d5 Use nvinfo package from go-nvlib
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-08 17:11:42 +02:00
Evan Lezar
ffd6ec3c54 Add modifier to inject Tegra platform files
This change adds a modifier to that injects the tegra platform files
* /etc/nv_tegra_release
* /sys/devices/soc0/family

allowing these files to be used for platform detection in a containerized
context such as the GPU device plugin.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-08 16:04:20 +02:00
Evan Lezar
de3e0df96c Merge branch 'bump-version-1.11.0-rc.3' into 'main'
Bump version to 1.11.0-rc.3

See merge request nvidia/container-toolkit/container-toolkit!202
2022-08-08 13:45:59 +00:00
Evan Lezar
e5dadf34d9 Bump version to 1.11.0-rc.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-08 14:56:01 +02:00
Evan Lezar
52145f2d73 Merge branch 'fix-libnvidia-container-tag' into 'main'
Fix setting of LIBNVIDIA_CONTAINER_TAG

See merge request nvidia/container-toolkit/container-toolkit!201
2022-07-27 11:31:06 +00:00
Evan Lezar
90df3caf62 Fix setting of LIBNVIDIA_CONTAINER_TAG
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-27 13:30:31 +02:00
Evan Lezar
50db66a925 Merge branch 'release-1.11.0-rc.2' into 'main'
Add CHANGELOG entry for 1.11.0-rc.2

See merge request nvidia/container-toolkit/container-toolkit!200
2022-07-27 10:53:26 +00:00
Evan Lezar
8587fa05bd Add CHANGELOG entry for 1.11.0-rc.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-27 12:06:09 +02:00
Evan Lezar
8129dade3c Merge branch 'set-mount-devices' into 'main'
Allow accept-nvidia-visible-devices-* to be set by toolkit contianer

See merge request nvidia/container-toolkit/container-toolkit!198
2022-07-27 09:58:25 +00:00
Evan Lezar
3610fe7c33 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-27 11:12:57 +02:00
Evan Lezar
90518e0ce5 Allow accept-visible-devices config options to be set
This change allows the
* accept-nvidia-visible-devices-envvar-when-unprivileged
* accept-nvidia-visible-devices-as-volume-mounts

options to be set in the toolkit-container. These are controlled
by command line flags or the following environment variables:

* ACCEPT_NVIDIA_VISIBLE_DEVICES_ENVVAR_WHEN_UNPRIVILEGED
* ACCEPT_NVIDIA_VISIBLE_DEVICES_AS_VOLUME_MOUNTS

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-27 10:57:43 +02:00
Evan Lezar
9c060f06ba Remove unused TOOLKIT_ARGS / --toolkit-args
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-27 10:50:18 +02:00
Evan Lezar
e848aa7813 Set toolkit root as flag
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-27 10:50:06 +02:00
Evan Lezar
feedc912e4 Rename toolkitDir toolkitRoot
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-27 10:50:05 +02:00
Evan Lezar
ab3f05cf62 Move global toolkitDir to options struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-27 10:41:46 +02:00
Evan Lezar
35982e51bf Move toolkit options to struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-27 10:40:19 +02:00
Evan Lezar
94e650c518 Merge branch 'bump-version' into 'main'
bump version to 1.11.0-rc.2

See merge request nvidia/container-toolkit/container-toolkit!197
2022-07-26 17:57:23 +00:00
Evan Lezar
d9edc18bf8 Bump version to 1.11.0-rc.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-25 09:51:20 +02:00
Evan Lezar
f4d01e0a05 Add changelog entries for 1.11.0-rc.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-25 09:51:01 +02:00
Evan Lezar
648cfaba51 Merge branch 'update-error-message' into 'main'
Make error message clearer

See merge request nvidia/container-toolkit/container-toolkit!194
2022-07-21 08:49:56 +00:00
Christopher Desiniotis
3a9de13f4e Apply 1 suggestion(s) to 1 file(s) 2022-07-21 08:03:39 +00:00
Evan Lezar
629a68937e Merge branch 'fix-relative-files' into 'main'
Fix adjusting relative paths for containerised devices and mounts.

See merge request nvidia/container-toolkit/container-toolkit!193
2022-07-20 11:40:28 +00:00
Evan Lezar
34e80abdea Add root to mounts type
This change adds a root member to the mounts type that is used to
perform most of the lookups for files and devices. This allows
for consistent handling of relative paths.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-18 14:37:02 +02:00
Evan Lezar
1161b21166 Make error message clearer
This change improves the error message when invoking the NVIDIA
Runtime Hook in non-legacy mode. This should guide users to specifying
the --runtime=nvidia flag when using docker.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-18 13:09:59 +02:00
Evan Lezar
bcdef81e30 Merge branch 'fix-ordering-of-csv-hooks' into 'main'
Fix ordering of create-symlink and update-ldcache hooks

See merge request nvidia/container-toolkit/container-toolkit!192
2022-07-18 10:59:41 +00:00
Evan Lezar
acc0afbb7a Remove Relative method from Locator
The Relative method added to the Locator interface was
not correctly implemented in the file type. The root was
never set when instantiating the object.

This change removes this method from the interface and the file
type, switching to a local implementation in the mounts type
instead.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-15 16:40:27 +02:00
Evan Lezar
7584044b3c Fix bug where ldcache may not contain symlinks
Since the creation of symlinks may include other libraries / folders
the ldcache should be updated AFTER the symlinks are created.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-15 12:18:40 +02:00
Evan Lezar
02c14e981c Add tests for identifying libraries
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-15 12:17:15 +02:00
Evan Lezar
37ee972f74 Merge branch 'CNT-2349/configure-docker' into 'main'
Add nvidia-ctk runtime configure command to update docker config

See merge request nvidia/container-toolkit/container-toolkit!166
2022-07-14 08:06:27 +00:00
Evan Lezar
3809407b6a Merge branch 'rename-to-nvidia-container-hook' into 'main'
Rename -toolkit executable to -runtime-hook

See merge request nvidia/container-toolkit/container-toolkit!189
2022-07-13 11:08:53 +00:00
Evan Lezar
f9547c447a Merge branch 'fix-cdi-refresh' into 'main'
Ensure that CDI registry is refreshed

See merge request nvidia/container-toolkit/container-toolkit!191
2022-07-13 09:38:45 +00:00
Evan Lezar
eb85d45137 Merge branch 'CNT-3297/cdi-config' into 'main'
Add runtime config option for CDI spec dirs

See merge request nvidia/container-toolkit/container-toolkit!190
2022-07-13 09:36:33 +00:00
Evan Lezar
9f0060f651 Add nvidia-ctk runtime configure command
This change adds a `runtime configure` command to the nvidia-ctk CLI. This
command is currently limited to configuring the docker config on the
system by modifying the daemon.json config file associated with docker.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-13 10:33:46 +02:00
Evan Lezar
0e6dc3f7ea Move docker config handling to internal package
In preparation for adding a command to the nvidia-ctk CLI to modify
the docker config, this change refactors load, update, and flush logic
from the toolkit container docker CLI to an internal package.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-13 10:30:01 +02:00
Evan Lezar
1b4944e1de Ensure that CDI registry is refreshed
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-12 14:07:21 +02:00
Evan Lezar
83743e3613 Add runtime config option for CDI spec dirs
This change adds an nvidia-container-runtime.modes.cdi.spec-dirs
config option that allows the default spec dirs to be overridden.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-11 15:39:48 +02:00
Evan Lezar
87afcc3ef4 Reuse check for existing hook
This change reuse the code that checks for the existing NVIDIA
Container Runtime hook to ensure that both nvidia-container-toolkit
and nvidia-container-runtime-hook are detected.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-08 12:20:19 +02:00
Evan Lezar
6ed3a4e1a6 Update package descriptions and URLs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-08 12:16:03 +02:00
Evan Lezar
8a56671d18 Update package definitions
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-08 12:16:03 +02:00
Evan Lezar
1d81db76a6 Update references to nvidia-container-runtime-hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-08 12:15:56 +02:00
Evan Lezar
f50aecb84e Rename -toolkit executable to -runtime-hook
This change renames the nvidia-container-toolkit executable
to nvidia-container-runtime-hook. Here nvidia-container-toolkit
is created as a symlink to nvidia-container-runtime-hook.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-08 12:09:11 +02:00
Evan Lezar
a4258277e1 Merge branch 'update-release-script' into 'main'
Update release tooling to allow for rc release that don't update all packages.

See merge request nvidia/container-toolkit/container-toolkit!188
2022-07-07 14:33:27 +00:00
Evan Lezar
18eb3c7c38 Skip packages that already exist
For rc releases we allow nvidia-container-toolkit versions
to not match libnvidia-container versions. This change ensures
that only changed packages are released.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-07 15:41:20 +02:00
Evan Lezar
a0e728b5c8 Use centos:stream8 image for signing
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-07 15:40:53 +02:00
Evan Lezar
df0176cca4 Merge branch 'support-host-device-paths' into 'main'
Support device nodes with a different root

See merge request nvidia/container-toolkit/container-toolkit!187
2022-07-07 11:35:10 +00:00
Evan Lezar
b68b3c543b Use device host path to determine properties
This mirrors what is done in cri-o and allows for devices nodes
from, for example, the driver container to be injected into a
container at /dev instead of <ROOT>/dev

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-07 12:03:23 +02:00
Evan Lezar
aea1a85bb4 Update vendored runc version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-07 11:30:01 +02:00
Evan Lezar
98e874e750 Merge branch 'add-cdi-mode' into 'main'
Add CDI mode to NVIDIA Container Runtime

See merge request nvidia/container-toolkit/container-toolkit!172
2022-07-07 08:09:38 +00:00
Evan Lezar
eef016c27d Merge branch 'refactor-csv-discovery' into 'main'
Refactor device discovery

See merge request nvidia/container-toolkit/container-toolkit!185
2022-07-07 08:07:43 +00:00
Evan Lezar
19f89ecafd Update cdi package and run go mod vendor
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 16:53:38 +02:00
Evan Lezar
8817dee66c Add support for specifying devices in annotations
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 16:53:36 +02:00
Evan Lezar
404e266222 Add cdi mode to NVIDIA Container Runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 16:53:05 +02:00
Evan Lezar
9b898c65fa Merge branch 'move-license-make-target' into 'main'
The licenses make target should not be a check target

See merge request nvidia/container-toolkit/container-toolkit!186
2022-07-06 13:14:43 +00:00
Evan Lezar
5c39cf4deb The licenses make target should not be a check target
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 14:24:11 +02:00
Evan Lezar
beff276a52 Add charDevices discoverer for devices
This change adds a charDevices discoverer and using this
for CSV, GDS, and MOFED discovery. Internally the discoverer
is a "mounts" discoverer with a charDevice locator.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 13:43:23 +02:00
Evan Lezar
55cb82c6c8 Create single discoverer per mount type for CSV
Instead of creating a set of discoverers per file, this change creates
a discoverer per type by first concatenating the mount specifications
from all files. This will allow all device nodes, for example, to
be treated as a single device.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 10:57:35 +02:00
Evan Lezar
88d1143827 Merge branch 'add-go-license' into 'main'
Add tooling to check go licenses

See merge request nvidia/container-toolkit/container-toolkit!183
2022-07-06 05:08:59 +00:00
Evan Lezar
d5162b1917 Add tooling to check go licenses
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-05 20:19:23 +02:00
Evan Lezar
ec078543a1 Merge branch 'rename-discover-merge' into 'main'
Rename discover.NewList to discover.Merge

See merge request nvidia/container-toolkit/container-toolkit!182
2022-07-05 09:37:03 +00:00
Evan Lezar
9191074666 Rename discover.NewList to discover.Merge
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-05 10:28:40 +02:00
Evan Lezar
89824849d3 Merge branch 'refactor-envvar-devices' into 'main'
Add DevicesFromEnvvars function to CUDA image abstraction

See merge request nvidia/container-toolkit/container-toolkit!178
2022-07-04 08:47:28 +00:00
Evan Lezar
877083f091 Merge branch 'CNT-3242/strip-root-from-container-mount' into 'main'
Strip root (e.g. driver root) from located mount paths in the container

See merge request nvidia/container-toolkit/container-toolkit!177
2022-07-04 08:45:38 +00:00
Evan Lezar
6467fcd0f5 Merge branch 'ensure-test-output-path-exists' into 'main'
Ensure test/output path exists

See merge request nvidia/container-toolkit/container-toolkit!180
2022-07-04 08:44:11 +00:00
Evan Lezar
fd135f1a8b Add Relative function to Locator interface
This adds a Relative function to the Locator interface and uses
this to determine the host and container paths for located files
(and devices). This ensures that the root (e.g. the nvidia driver
root) is stripped from the container path.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 16:23:50 +02:00
Evan Lezar
4e08ec2405 Use CUDA.DevicesFromEnvvar to check if modifications are required
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 16:14:36 +02:00
Evan Lezar
925c348565 Add DevicesFromEnvvars function to CUDA image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 16:12:13 +02:00
Kevin Klues
25fd1aaf7e Merge branch 'CNT-3084/include-cufile.json' into 'main'
Include cufile.json in GDS discovery

See merge request nvidia/container-toolkit/container-toolkit!175
2022-07-01 13:49:02 +00:00
Kevin Klues
91e645b91b Merge branch 'gds-poc' into 'main'
Add initial GDS and MOFED discovery

See merge request nvidia/container-toolkit/container-toolkit!163
2022-07-01 13:43:20 +00:00
Evan Lezar
a1c2f07b6e Add /etc/cufile.json to list of required mounts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:54:58 +02:00
Evan Lezar
7f7bec0668 Create GDS and MOFED modifiers
This change creates GDS and MOFED modifiers and adds them to the
modifer created for the selected runtime mode if the NVIDIA_GDS
and NVIDIA_MOFED envvars are set to "enabled", respectively.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:54:05 +02:00
Evan Lezar
cb34f7c6d1 Add discovery of GDS and MOFED devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:40:55 +02:00
Evan Lezar
7f47a61986 Allow globs in filenames for locators
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:30:33 +02:00
Evan Lezar
e8843c38f2 Move cmd/nvidia-container-runtime/modifier package to internal/modifier
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:28:40 +02:00
Evan Lezar
d66c00dd1d Use modifier list and discoverModifer
This change uses modifier compositioning and the discoverModifier to
refactor the existing CSV modifier.

This change adds a discoverModifier to the internal/modifier package and
refactors the CSV modifier to use this abstraction.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:25:19 +02:00
Evan Lezar
55ac8628c8 Add lists of modifiers to allow for modifier compositioning
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:25:18 +02:00
Evan Lezar
175f75b43f Ensure test/output path exists
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 10:07:37 +02:00
Evan Lezar
da3226745c Update vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-22 10:58:09 +02:00
Evan Lezar
b23e3ea13a Merge branch 'bump-1.11.0-rc.1' into 'main'
Bump version to 1.11.0-rc.1

See merge request nvidia/container-toolkit/container-toolkit!170
2022-06-22 07:52:19 +00:00
Evan Lezar
02f0ee08fc Update nvidia-docker and nvidia-container-runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-17 11:58:06 +02:00
Evan Lezar
4b0e79be50 Update nvidia-docker and nvidia-container-runtime branches to main
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-17 11:37:53 +02:00
Evan Lezar
8b729475e2 Allow any 1.* version of libnvidia-container package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-16 14:57:30 +02:00
Evan Lezar
a1319b1786 Switch to latest docker and docker dind in CI
This change prevents errors when downloading ubuntu repos on
amd64 architectures. The `stable` images were last pushed
2 years ago.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-16 13:44:14 +02:00
Evan Lezar
278fa43303 Allow libnvidia-container1 version to be specified directly
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-15 13:37:42 +02:00
Evan Lezar
d75f364b27 Update build scripts to set libnvidia-container version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-15 13:37:42 +02:00
Evan Lezar
52d5021b76 Bump version to 1.11.0-rc.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-15 13:37:40 +02:00
Kevin Klues
7cfd3bd510 Merge branch 'bump-v1.10.0' into 'main'
Bump version to v1.10.0

See merge request nvidia/container-toolkit/container-toolkit!169
2022-06-13 10:32:37 +00:00
Evan Lezar
05ca131858 Update libnvidia-container submodule to v1.10.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-13 11:40:18 +02:00
Evan Lezar
181ce8571d Bump version to v1.10.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-13 11:40:18 +02:00
Shiva Krishna Merla
2ab0c6abce Merge branch 'update_container_licenses' into 'main'
Update toolkit images to use NGC DL license

See merge request nvidia/container-toolkit/container-toolkit!164
2022-06-08 19:04:22 +00:00
Shiva Krishna Merla
50caf29b4e Update toolkit images to use NGC DL license 2022-06-08 19:04:21 +00:00
Evan Lezar
067f7af142 Merge branch 'update-nvidia-docker' into 'main'
Bump nvidia-docker version to 2.11.0

See merge request nvidia/container-toolkit/container-toolkit!167
2022-06-08 12:15:17 +00:00
Evan Lezar
d1449951bc Bump nvidia-docker version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-08 13:25:27 +02:00
Evan Lezar
a05af50b0f Merge branch 'bump-cuda-version' into 'main'
Bump CUDA base image version to 11.7.0

See merge request nvidia/container-toolkit/container-toolkit!162
2022-06-07 15:22:05 +00:00
Evan Lezar
950aff269b Merge branch 'bump-version-1.10.0-rc.4' into 'main'
Update NVIDIA Container Runtime readme and installed configs

See merge request nvidia/container-toolkit/container-toolkit!160
2022-06-07 15:05:48 +00:00
Evan Lezar
e033db559f Switch default container-toolkit image target to ubuntu20.04
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-07 11:32:20 +02:00
Evan Lezar
9a24a40fd2 Merge branch 'only-bump-version' into 'main'
Bump version to 1.10.0-rc.4

See merge request nvidia/container-toolkit/container-toolkit!165
2022-06-07 09:00:38 +00:00
Evan Lezar
df391e2144 Only generate amd64 images for ubuntu18.04
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-07 10:58:15 +02:00
Evan Lezar
9146b4d4b6 Remove build and release of centos8 container-toolkit images
Note that the centos8 packages are still built.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-07 10:58:15 +02:00
Evan Lezar
068d7e085b Use ubi8 base image for centos8
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-07 10:58:15 +02:00
Evan Lezar
79510a8290 Bump CUDA base image version to 11.7.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-07 10:58:15 +02:00
Evan Lezar
50240c93bd Update config files with options and defaults
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-03 13:10:24 +02:00
Evan Lezar
7ca0e5db60 Update NVIDIA Container Runtime readme
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-03 13:10:21 +02:00
Evan Lezar
c0e6765d46 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-01 15:29:25 +02:00
Evan Lezar
7739b0e8ea Bump version to 1.10.0-rc.4
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-01 14:46:12 +02:00
Evan Lezar
ab23fc52db Merge branch 'fix-binary-name' into 'main'
Use BinaryName for v1 containerd runtime config

See merge request nvidia/container-toolkit/container-toolkit!159
2022-05-30 07:53:42 +00:00
Evan Lezar
530d66b5c7 Also set default_runtime.options.BinaryName
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-27 16:21:52 +02:00
Evan Lezar
dad3e855b5 Also cleanup v1 default_runtime if BinaryName is set
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-27 16:18:57 +02:00
Evan Lezar
15cbd54d1c Also set Runtime file v1 containerd runtime config
This ensures that older versions of containerd that may be expecting
this over options.BinaryName should continue to work.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-26 06:26:06 +02:00
Evan Lezar
4cd719692e Use BinaryName for v1 containerd runtime config
This fixes a bug where the runtime path for v1 containerd configs
was specified in the options.Runtime setting (which is used
for the default runtime) instead of options.BinaryName.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-26 06:25:09 +02:00
Evan Lezar
b940294557 Merge branch 'CNT-2979/allow-empty-config' into 'main'
Return default config if config path is not found

See merge request nvidia/container-toolkit/container-toolkit!156
2022-05-25 12:20:51 +00:00
Evan Lezar
840cdec36d Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-25 13:23:21 +02:00
Evan Lezar
73a5b70a02 Return default config if config path is not found
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-25 13:22:45 +02:00
Evan Lezar
f0cae49892 Merge branch 'fix-jetpack-require' into 'main'
Ignore NVIDIA_REQUIRE_JETPACK* for image requirements

See merge request nvidia/container-toolkit/container-toolkit!158
2022-05-25 11:19:47 +00:00
Evan Lezar
e07c7f0fa2 Ignore NVIDIA_REQUIRE_JETPACK* for image requirements
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-24 09:53:37 +02:00
Evan Lezar
52ce97929c Merge branch 'fix-is-tegra-check' into 'main'
Fix bug in tegra detection

See merge request nvidia/container-toolkit/container-toolkit!157
2022-05-23 08:00:09 +00:00
Evan Lezar
084eae6e0d Fix bug in tegra detection
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-20 14:39:36 +02:00
Evan Lezar
f656b5c887 Merge branch 'fix-char-device' into 'main'
Fix assertCharDevice matching on all files

See merge request nvidia/container-toolkit/container-toolkit!155
2022-05-20 10:32:51 +00:00
Evan Lezar
55c1d7c256 Fix assertCharDevice matching on all files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-20 10:08:00 +02:00
Evan Lezar
0f2b20fffc Merge branch 'auto-generate-changelog' into 'main'
Use single  changelog.md file instead of separate package-specific changelogs

See merge request nvidia/container-toolkit/container-toolkit!154
2022-05-20 08:03:19 +00:00
Evan Lezar
bb69727148 Include git commit in changelog URL
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-18 16:02:14 +02:00
Evan Lezar
0b4f3aaf69 Merge branch 'bump-1.10.0-rc.3' into 'main'
Bump version to 1.10.0-rc.3

See merge request nvidia/container-toolkit/container-toolkit!153
2022-05-18 13:46:41 +00:00
Evan Lezar
e5125515f0 Automatically generate changelogs in docker builds
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-18 14:54:58 +02:00
Evan Lezar
033b2fd90d Add dummy entry for rpm changelog matching other components
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-18 14:54:58 +02:00
Evan Lezar
a0a00e38fd Format CHANGELOG.md as markdown
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-18 14:54:58 +02:00
Evan Lezar
77cf70b625 Move debian changelog to CHANGELOG.md
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-18 14:54:58 +02:00
Evan Lezar
8ab3d713bc Update libnvidia-container version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-18 14:53:29 +02:00
Evan Lezar
c58d81cec5 Bump version to 1.10.0-rc.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-18 13:38:54 +02:00
Evan Lezar
2a3b87157a Merge branch 'prep-release' into 'main'
Update changelog and libnvidia-container for release

See merge request nvidia/container-toolkit/container-toolkit!152
2022-05-13 11:52:58 +00:00
Evan Lezar
a68d1d914c Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-13 13:52:20 +02:00
Evan Lezar
f7ac8b8139 Update changelog for release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-13 13:44:43 +02:00
Evan Lezar
b2902cc04a Merge branch 'CNT-2967/add-version-string' into 'main'
Add --version support for the CLIs

See merge request nvidia/container-toolkit/container-toolkit!151
2022-05-13 11:02:34 +00:00
Evan Lezar
25710468dc Ensure that git commit is set in docker build
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-13 07:31:11 +02:00
Evan Lezar
4a19bf16a8 Set the version and gitCommit in the Makefile
This change ensures that the variables used to construct the
version strings for CMDs are set in the makefile.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-13 07:31:11 +02:00
Evan Lezar
c77e86137e Add version output to CLIs
This change adds version output to the nvidia-continer-runtime,
nvidia-container-toolkit, and nvidia-ctk CLIs. The same version
is used in all cases and includes a version string and a git
revision if set.

The construction of the version string mirrors what is done in runc.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-13 07:31:11 +02:00
Evan Lezar
60dacb76b6 Call logger.Reset() to ensure errors are captured
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 15:42:42 +02:00
Evan Lezar
19138a2110 Skip setting of log file for --version flag
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 14:34:37 +02:00
Evan Lezar
bdb43aa8f2 Merge branch 'CNT-2973/add-has-nvml' into 'main'
Include HasNVML check in ResolveAutoMode

See merge request nvidia/container-toolkit/container-toolkit!149
2022-05-12 11:23:47 +00:00
Evan Lezar
d62cce3c75 Merge branch 'CNT-2953/new-options' into 'main'
Update config options to control OCI Spec modification

See merge request nvidia/container-toolkit/container-toolkit!145
2022-05-12 11:22:42 +00:00
Evan Lezar
ff86ecb2a5 Include HasNVML check in ResolveAutoMode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:55:58 +02:00
Evan Lezar
ad9ec1efae Add HasNVML function to check if NVML is supported
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:55:13 +02:00
Evan Lezar
9db5f9c9e8 Remove unneeded legacy discovery
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:53:52 +02:00
Evan Lezar
4c49f75365 Remove --force flag from nvidia-container-runtime-hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:53:52 +02:00
Evan Lezar
e591f3f26b Replace experimental and discover-mode
These changes replace the nvidia-container-runtime config options
experimental and discover-mode with a single mode config option.

Note that mode is now a string with a default value of "auto"
and a mode value of "legacy" is equivalent to experimental == false.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:53:50 +02:00
Evan Lezar
e0ad82e467 Move ResolveAutoMode to info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00
Evan Lezar
3a1404f2f4 Move isTegraSystem to internal info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00
Evan Lezar
cf7bb91481 Update nvidia-container-runtime config options
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00
Evan Lezar
ba0e606df2 Use toml unmarshal to read runtime config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00
Evan Lezar
ae57a2fc93 Merge branch 'CNT-2875/create-specific-symlinks' into 'main'
Create specific symlinks for CSV mode

See merge request nvidia/container-toolkit/container-toolkit!150
2022-05-12 05:27:43 +00:00
Evan Lezar
1eb0e3c8b3 Merge branch 'fix-executable-locator' into 'main'
Fix location of executables in PATH

See merge request nvidia/container-toolkit/container-toolkit!148
2022-05-12 05:26:22 +00:00
Evan Lezar
a524c44161 Merge branch 'CNT-2926/runc-logging' into 'main'
Support runc logging command line options

See merge request nvidia/container-toolkit/container-toolkit!144
2022-05-12 05:24:25 +00:00
Evan Lezar
675fbace01 Add hook to create specific links
This change updates the create-symlinks hook to also create symlinks for
libcuda.so, libGLX_indirect.so.0, and libnvidia-opticalflow.so

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-11 16:36:49 +02:00
Evan Lezar
eac326c5ea Add --link option to nvidia-ctk hook create-symlinks command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-11 15:28:53 +02:00
Evan Lezar
b0f7a3809f Factor linkCreation into method
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-11 15:28:02 +02:00
Evan Lezar
126c004ee0 Improve symlink creation loop
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-11 15:17:15 +02:00
Evan Lezar
d2516cb5d5 Merge branch 'fix-container-root' into 'main'
Fix bug in update-ldcache hook when OCI spec contains a relative root

See merge request nvidia/container-toolkit/container-toolkit!147
2022-05-10 22:01:14 +00:00
Evan Lezar
4696d7ee69 Merge branch 'fix-hook-flags' into 'main'
Use singular instead of plural for hook arguments

See merge request nvidia/container-toolkit/container-toolkit!146
2022-05-10 22:00:51 +00:00
Evan Lezar
ef6f48e9f7 Use singular instead of plural for hook arguments
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 19:55:31 +02:00
Evan Lezar
088db09180 Use executable locator to find low-level runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 15:21:48 +02:00
Evan Lezar
b8ef6be6ea Use lookup.GetPath from runtime hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 14:53:26 +02:00
Evan Lezar
1d2e1bd403 Add lookup.GetPath and lookup.GetPaths functions
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 14:52:47 +02:00
Evan Lezar
55efdc8765 Use state.GetContainerRoot in nvidia-ctk hook subcommands
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 11:48:43 +02:00
Evan Lezar
395f6cecb2 Add GetContainerRoot to oci.State type
This change adds a GetContainerRoot to the oci.State type to
encapsulate the logic around determining the container root.
This Fixes a bug where relative roots (e.g. as generated by contianerd)
are not supported.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 11:48:43 +02:00
Evan Lezar
e9d929dc2f Support runc logging command line options
This change processes and supports runc logging command line arguments.
This allows for better integration into container engines such as
docker.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-09 19:32:38 +02:00
Evan Lezar
117f68fa6e Merge branch 'CNT-2924/low-level-runtime-config' into 'main'
Add nvidia-container-runtime.runtimes config option

See merge request nvidia/container-toolkit/container-toolkit!143
2022-05-09 17:31:02 +00:00
Evan Lezar
7574a0d7de Make output of bundle directory a debug message
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-09 09:38:16 +02:00
Evan Lezar
335de5a352 Switch to debug logging when locating runtimes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-09 09:38:16 +02:00
Evan Lezar
c76946cbcc Add nvidia-container-runtime.runtimes config option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-09 09:38:12 +02:00
Evan Lezar
e93bafa6d4 Merge branch 'CNT-2676/nvidia-require' into 'main'
Add support for checking requirements to CSV discovery

See merge request nvidia/container-toolkit/container-toolkit!141
2022-05-06 12:52:53 +00:00
Evan Lezar
785f120c31 Fix form -> from in comment
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-06 13:22:34 +02:00
Evan Lezar
9e46d41dbe Add debug logging when checking requirements
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 14:14:01 +02:00
Evan Lezar
70c4588197 Add compute capability of first device as arch property
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 14:11:30 +02:00
Evan Lezar
9f50ac95c4 Add CUDA ComputeCapability function
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 14:09:28 +02:00
Evan Lezar
75ce057878 Add debug log for command line arguments
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:47:39 +02:00
Evan Lezar
9d2363e12e Return low-level runtime if subcommand is not create
This also removes a test that invokes nvidia-container-runtime run --bundle
expecting an error. This test is no longer valid since this command line
is forwared to runc where the error should be detected.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
49f4bb3198 Check requirements before creating CSV discoverer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
583793b7ae Add processing for requirements and constraints
This change adds a Requirements abstraction that can be used to check
an images' NVIDIA_REQUIRE_* envvars against the host properties such
as CUDA version or architecture.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
5d7b3a4a96 Return raw spec from Spec.Load
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
a672713dba Add basic CUDA wrapper
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
50cf07e4cd Use CUDA image abstraction for runtime hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
8f0e1906c2 Add CUDA image abstraction
This change adds a CUDA image abstraction that encapsulates
the queries performed on a container image (e.g. envvars) to
check certain CUDA properties.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
2e319b5b08 Add gcc for Amazonlinux builds
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
f4d87e6912 Use go install to install go development tools
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
fd06c7a00b Bump golang version to 1.17.8
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
8fabeed3a4 Update go vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Kevin Klues
0c737bbdcc Merge branch 'fix-image-builds' into 'main'
Fix image building due to GPG key update

See merge request nvidia/container-toolkit/container-toolkit!142
2022-04-29 13:06:39 +00:00
Evan Lezar
38a4c9fa8f Fix image building due to GPG key update
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-29 14:13:33 +02:00
Evan Lezar
6e60b24828 Merge branch 'fix-version-parsing' into 'main'
Use semver package to parse CUDA version

See merge request nvidia/container-toolkit/container-toolkit!140
2022-04-25 12:35:26 +00:00
Evan Lezar
bdf997c761 Use semver package to parse CUDA version
This avoids the use of scanf on a user-provided string which is flagged
as a security vulnerability.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-22 14:57:52 +02:00
Evan Lezar
4ce932e7a7 Merge branch 'import-release-tooling' into 'main'
Add package release tooling

See merge request nvidia/container-toolkit/container-toolkit!102
2022-04-20 07:57:32 +00:00
Evan Lezar
4145cdf7f7 Merge branch 'bump-libnvidia-container-reference' into 'main'
Update libnvidia-container reference

See merge request nvidia/container-toolkit/container-toolkit!139
2022-04-19 14:40:53 +00:00
Evan Lezar
0b2be45ba2 Update libnvidia-container reference
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-19 15:46:14 +02:00
Evan Lezar
ce3cdb6fd9 Merge branch 'update-libnvidia-container-tracking-branch' into 'main'
Update libnvidia-container branch to main

See merge request nvidia/container-toolkit/container-toolkit!137
2022-04-19 13:44:01 +00:00
Jon Mayo
3ba18f89b0 Merge branch 'remove-dockerhub-release' into 'main'
Remove dockerhub publishing

See merge request nvidia/container-toolkit/container-toolkit!138
2022-04-14 00:13:45 +00:00
Jon Mayo
0de159e8b4 libnvidia-container: 'main' track branch 2022-04-13 16:20:51 -07:00
Jon Mayo
3fbffa0b48 Remove dockerhub publishing 2022-04-13 14:19:48 -07:00
Evan Lezar
75dfea1406 Merge branch 'dependabot/go_modules/github.com/containers/podman/v4-4.0.3' into 'main'
Bump github.com/containers/podman/v4 from 4.0.1 to 4.0.3

See merge request nvidia/container-toolkit/container-toolkit!134
2022-04-13 14:04:29 +00:00
Evan Lezar
c24bd4aa4e Update libnvidia-container branch to main
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-13 15:52:57 +02:00
dependabot[bot]
2b9dc5cbcf Bump github.com/containers/podman/v4 from 4.0.1 to 4.0.3
Bumps [github.com/containers/podman/v4](https://github.com/containers/podman) from 4.0.1 to 4.0.3.
- [Release notes](https://github.com/containers/podman/releases)
- [Changelog](https://github.com/containers/podman/blob/v4.0.3/RELEASE_NOTES.md)
- [Commits](https://github.com/containers/podman/compare/v4.0.1...v4.0.3)

---
updated-dependencies:
- dependency-name: github.com/containers/podman/v4
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-13 11:05:29 +00:00
Evan Lezar
234d05e57e Improve handling of git remotes for gh-pages packages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-13 12:14:51 +02:00
Evan Lezar
abb0b7be5d Add scripting to sign and publish packages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-13 12:14:51 +02:00
Evan Lezar
c09e5aca77 Add envvar for package versions
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-13 12:14:51 +02:00
Evan Lezar
6709da4cea Rename release.sh to build-packages.sh
The name release.sh is overloaded. This change renames the script to make the
intent clearer.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-13 12:14:51 +02:00
Evan Lezar
84f7daf108 Merge branch 'replace-master-with-main' into 'main'
Change master references to main

See merge request nvidia/container-toolkit/container-toolkit!135
2022-04-12 13:47:22 +00:00
Evan Lezar
ac49dc320c Change master references to main
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-12 14:52:38 +02:00
Evan Lezar
d304e06ffe Merge branch 'bump-version-1.10.0-rc.2' into 'main'
Bump version to v1.10.0-rc.2

See merge request nvidia/container-toolkit/container-toolkit!133
2022-04-12 10:33:38 +00:00
Evan Lezar
49756cb7ba Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-12 11:40:04 +02:00
Evan Lezar
8c7d919d9f Bump version to v1.10.0-rc.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-12 10:48:03 +02:00
Evan Lezar
d7f53dcf64 Merge branch 'add-experimental-config' into 'master'
Add commented experimental option to config files

See merge request nvidia/container-toolkit/container-toolkit!131
2022-04-11 11:48:25 +00:00
Evan Lezar
36ffd0983c Merge branch 'revert-skip-release' into 'master'
Revert changes to skip release of images

See merge request nvidia/container-toolkit/container-toolkit!132
2022-04-11 11:46:36 +00:00
Evan Lezar
be680c6633 Add commented experimental option to config files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-11 12:57:03 +02:00
Evan Lezar
e47aa2962a Revert "[ci] Skip external releases if associated OUT_REGISTRY value is empty."
This reverts commit c2f35badb0.
2022-04-11 12:53:42 +02:00
Evan Lezar
b5000c8107 Revert "[ci] echo skipped commands"
This reverts commit 3dab9da80e.
2022-04-11 12:53:22 +02:00
Evan Lezar
6d3bcb8723 Merge branch 'add-log-level-config' into 'master'
Add log-level config option for nvidia-container-runtime

See merge request nvidia/container-toolkit/container-toolkit!130
2022-04-11 07:32:41 +00:00
Evan Lezar
29e690f68a Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 18:04:43 +02:00
Evan Lezar
c224832a6d Add log-level config option for nvidia-container-runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 13:56:17 +02:00
Evan Lezar
5211960fc3 Merge branch 'detect-gpus-flag' into 'master'
Detect use of --gpus flag in experimental mode

See merge request nvidia/container-toolkit/container-toolkit!125
2022-04-08 11:18:11 +00:00
Evan Lezar
cfca18a5f8 Merge branch 'refactor-csv-mount-spec-discovery' into 'master'
Refactor CSV discovery to make char device discovery clearer

See merge request nvidia/container-toolkit/container-toolkit!129
2022-04-08 10:54:06 +00:00
Evan Lezar
43ee7f1cd2 Merge branch 'cleanup-default-executable-dir' into 'master'
Clean up NVIDIA Container Runtime Hook executable specification

See merge request nvidia/container-toolkit/container-toolkit!126
2022-04-08 10:29:25 +00:00
Evan Lezar
45160b88a4 Remove exsiting NVIDIA Container Runtime Hooks from the spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 12:03:22 +02:00
Evan Lezar
dab6f4b768 Specify --force flag when invoking nvidia-container-runtime-hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 12:03:22 +02:00
Evan Lezar
a9a4704273 Raise error if hook invoked in experimental mode without force flag
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 12:03:22 +02:00
Evan Lezar
2563c1b87c Export GetDefaultRuntimeConfig
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 12:03:22 +02:00
Evan Lezar
62f608a3fe Make order of discoverers deterministic
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 11:59:26 +02:00
Evan Lezar
2c1e356370 Refactor CSV discovery to make char device discovery clearer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 11:47:47 +02:00
Evan Lezar
7ec3cd0b5b Fix creation of CSV parser in create-symlinks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 11:39:18 +02:00
Evan Lezar
ab7f25500f Fix creation of CSV parser in create-symlinks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 11:36:48 +02:00
Evan Lezar
196d5c5461 Move NVIDIA Container Runtime Hook executable name to shared constant
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 11:29:27 +02:00
Evan Lezar
f07d110e85 Use DefaultExecutableDir to determine default paths
This change adds a DefaultExecutableDir = /usr/bin constant that is used
to construct default paths for executables instead of specifying these
explicitly.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 11:28:03 +02:00
Evan Lezar
1ebd48dea6 Merge branch 'add-symlink-hook' into 'master'
Add hook create-symlinks subcommand to create symlinks in container

See merge request nvidia/container-toolkit/container-toolkit!121
2022-04-08 09:14:07 +00:00
Evan Lezar
f7c74d35cc Merge branch 'add-hooks-cli' into 'master'
Add nvidia-ctk CLI with hook command and update-ldcache subcommand to update LD cache

See merge request nvidia/container-toolkit/container-toolkit!115
2022-04-08 09:13:39 +00:00
Evan Lezar
0de7491ce3 Merge branch 'check-for-nil-modifier' into 'master'
Return unmodified runtime if specModifier is nil

See merge request nvidia/container-toolkit/container-toolkit!127
2022-04-08 09:05:24 +00:00
Evan Lezar
1296a0ecf4 Merge branch 'fix-missing-close-on-csv' into 'master'
Add missing close when reading CSV file

See merge request nvidia/container-toolkit/container-toolkit!128
2022-04-08 08:33:23 +00:00
Evan Lezar
d1a38f10a5 Refactor CSV file parsing
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 08:11:10 +02:00
Evan Lezar
d8109dc49b Add missing close when reading CSV file
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 08:00:37 +02:00
Evan Lezar
67602b28f9 Return unmodified runtime if specModifier is nil
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 07:50:40 +02:00
Evan Lezar
907736b053 Inject symlinks hook for creating symlinks in a container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 20:25:55 +02:00
Evan Lezar
ecb4ef495a Add create-symlinks subcommand to create symlinks in container for specified CSV files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 20:25:55 +02:00
Evan Lezar
95797a8252 Move reading of container state for internal/oci package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 20:25:55 +02:00
Evan Lezar
c87ae586d4 FIX: Rename containerSpec flag to container-spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 20:25:19 +02:00
Evan Lezar
7c10762768 Include nvidia-ctk in deb and rpm packages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 20:25:19 +02:00
Evan Lezar
9c3c8e038a Add cache for mounts
This change adds a cache to the mounts type. This means that if called to get
a list of folders, for example, the result is reused instead of recalculated.
This also avoids duplicate logging.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 20:25:19 +02:00
Evan Lezar
d970d0a627 Add discovery for ldconfig hook that updates the LDCache
This change adds a discovered hook for updating the ldcache as a container-create
hook. The mounts from a discoverer are inspected to determine the folders that must
be added to the cache using the nvidia-ctk hook update-ldcache command.

This is added to the "csv" discovery mode for the experimental runtime.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 20:25:19 +02:00
Evan Lezar
740bd3fb9d Add nvidia-ctk config section
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 20:25:18 +02:00
Evan Lezar
1c892af215 Add hook command to nvidia-ctk with update-ldcache subcommand
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 16:38:01 +02:00
Evan Lezar
c945cc714d Add stub nvidia-ctk CLI
This change adds an nvidia-ctk CLI that is used as the basis for
utilities related to the NVIDIA Container Toolkit.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 16:32:25 +02:00
Evan Lezar
7914957105 Refactor hook creation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 16:32:24 +02:00
Evan Lezar
99baea9d51 Merge branch 'add-auto-discover-mode' into 'master'
Add auto discover mode and use this as the default

See merge request nvidia/container-toolkit/container-toolkit!124
2022-04-07 14:29:44 +00:00
Evan Lezar
516a658902 Merge branch 'add-jetson-csv-discovery' into 'master'
Add support for CSV mount specifications

See merge request nvidia/container-toolkit/container-toolkit!117
2022-04-07 14:25:51 +00:00
Evan Lezar
bb086d4b44 Add auto discover mode and use this as the default
This change adds an 'auto' discover mode that attempts to select the correct mode
for a given platform. This currently attempts to detect whether the platform is a
Tegra-based system in which case the 'csv' discover mode is used. The 'legacy'
discover mode is used as the fallback.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 15:37:03 +02:00
Evan Lezar
26d2873bb2 FIX: Rename DefaultRoot to DefaultMountSpecPath
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 14:11:52 +02:00
Evan Lezar
b7d130e151 FIX: Improve locator map construction
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 11:12:41 +02:00
Evan Lezar
8574879560 FIX: Update TODO for container path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 11:07:57 +02:00
Evan Lezar
5a416bc99c FIX: Use MountSpec* constants
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 11:01:57 +02:00
Evan Lezar
df7c064257 FIX: Remove unused NewFromCSV constructor
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 10:59:03 +02:00
Evan Lezar
2f2846116e Correct typo in constructor name
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 10:46:26 +02:00
Evan Lezar
6682bc90b4 Add support for NVIDIA_REQUIRE_JETPACK envvar
This change ensures that by default, the CSV discovery only considers the base CSV
files (l4t.csv, drivers.csv, devices.csv) and skips the rest unless the
NVIDIA_REQUIRE_JETPACK is set to "csv-mounts=all", in which case, all CSV files in the
specified folder are considered.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 10:46:26 +02:00
Evan Lezar
1c05a463bd Add csv discovery mode to experimental runtime
This change adds support for a "csv" discovery mode to the experimental runtime.
If this is set with experimental = true, a CSV-based discovery of devices and
mounts are used to define the modifications required to the OCI spec. The edits
are expressed as CDI ContainerEdits.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 10:45:19 +02:00
Evan Lezar
14f9e986c9 Add CSV-based discovery of device nodes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 10:44:14 +02:00
Evan Lezar
af0ef6fb66 Add CSV-based discovery of mounts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 10:44:14 +02:00
Evan Lezar
7c5504a1cf Add locators for symlinks and character devices
This change adds a symlink locator that follows symlinks and returns all
elements in the chain and a device locator that finds character devices.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 10:44:14 +02:00
Evan Lezar
8e85e96f38 Add code to process Jetpack CSV files
This change adds code to process Jetpack CSV mount specifications.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-07 10:44:14 +02:00
Evan Lezar
1561a67d55 Merge branch 'add-v2-runtime-stub' into 'master'
Add experimental mode to nvidia-container-runtime

See merge request nvidia/container-toolkit/container-toolkit!114
2022-04-06 17:41:54 +00:00
Evan Lezar
9ce690093d FIX: Make isNVIDIAContainerRuntimeHook mode idiomatic
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-06 17:18:06 +02:00
Evan Lezar
b8dd473343 FIX: Simplify hook remover
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-06 17:15:57 +02:00
Evan Lezar
96e8eb3dde FIX: Rename path locator as executable locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-06 15:24:48 +02:00
Evan Lezar
0054481e15 FIX: Rename CLIConfig to ContainerCLIConfig
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-06 15:21:57 +02:00
Evan Lezar
11aa1d2a7d FIX: Factor out specModifier construction into function
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-06 15:18:12 +02:00
Evan Lezar
e6730fd0f0 FIX: Don't log that hooks is being removed if it is not
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-06 15:13:32 +02:00
Evan Lezar
8db287af8b FIX: Fix typo in comment
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-06 14:46:27 +02:00
Jon Mayo
3dab9da80e [ci] echo skipped commands 2022-04-04 07:02:33 -07:00
Evan Lezar
282a2c145e Fix typo in variable name
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:26 +02:00
Evan Lezar
d0608844dc Add basic README for nvidia-container-runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:26 +02:00
Evan Lezar
a26d02890f Make error logging less verbose by default
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:26 +02:00
Evan Lezar
14fe35c3f4 Implement hook remover for existing nvidia-container-runtime-hooks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:26 +02:00
Evan Lezar
d12dbd1bef Read top-level config to propagate Root to experimental runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:25 +02:00
Evan Lezar
33d9c1dd57 Split loading config from reader and getting config from toml.Tree
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:25 +02:00
Evan Lezar
239b6d3739 Implement experimental modifier for NVIDIA Container Runtime
This change enables the experimental mode of the NVIDIA Container Runtime. If
enabled, the nvidia-container-runtime.discover-mode config option is
queried to determine how required OCI spec modifications should be defined.
If "legacy" is selected, the existing NVIDIA Container Runtime hooks is
discovered and injected into the OCI spec.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:25 +02:00
Evan Lezar
9dfe60b8b7 Add stable discoverer for nvidia-container-runtime hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:25 +02:00
Evan Lezar
390e5747ea Add lookup abstraction for locating executable files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:25 +02:00
Evan Lezar
7137f4b05b Move runtime config to internal package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:24 +02:00
Evan Lezar
9be6cca6db Don't skip internal packages for linting
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:24 +02:00
Evan Lezar
0c7eb93d62 Add experimental option to NVIDIA Container Runtime config
This change adds an experimental option to the NVIDIA Container Runtime config. To
simplify the extension of this experimental mode in future an error is raised if
this is enabled.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:24 +02:00
Evan Lezar
3bb539a5f7 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-04 14:16:22 +02:00
Jon Mayo
e39412ca44 Merge branch 'ci-release-ifonly' into 'master'
[ci] Skip external releases if associated OUT_REGISTRY value is empty.

See merge request nvidia/container-toolkit/container-toolkit!123
2022-03-31 20:29:13 +00:00
Jon Mayo
c2f35badb0 [ci] Skip external releases if associated OUT_REGISTRY value is empty.
Allows CI/CD environment variables to quickly disable any release job derived from the .release:external template

Template Usage: DRYRUN_RELEASE set to a value to echo docker and regctl commands in Makefile without running them (dry-run) SKIP_RELEASE set to a value to remove the job from the pipeline.

CI/CD Usage: NGC_SKIP_RELEASE set to disable external release to NGC. DOCKERHUB_SKIP_RELEASE set to disable external release to DH. NGC_DRYRUN_RELEASE set to dry-run external release to NGC. DOCKERHUB_DRYRUN_RELEASE set to dry-run external release to DH.
2022-03-31 20:29:13 +00:00
Evan Lezar
d0dfe27324 Merge branch 'refactor-stable-runtime' into 'master'
Refactor nvidia-container-runtime to prepare for experimental option

See merge request nvidia/container-toolkit/container-toolkit!119
2022-03-29 12:23:18 +00:00
Evan Lezar
c6dfc1027d Move modifier code for inserting nvidia-container-runtime-hook to separate package
This change moves the code defining the insertion of the nvidia-container-runtime
hook to a separate package. This allows for better distinction between the existing
and experimental modifications.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 11:05:42 +02:00
Evan Lezar
4177fddcc4 Import modifying runtime abstraction from experimental runtime
This change imports the modifying runtime abstraction from the
experimental branch. This encapsulates the checks for whether
modification is required, and forwards the loaded spec to
the specified modifier. This allows for the same code to be
reused when performing more complex modifications.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 11:05:42 +02:00
Evan Lezar
bf8c3bab72 Add test package with GetModuleRoot and PrependToPath function
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 11:05:41 +02:00
Evan Lezar
c5c2ffd68f Ensure that Exec error is also logged to file
This change removes unneeded logging and renames the return error value to rerr
to avoid it being aliased by local error values.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 11:05:41 +02:00
Evan Lezar
48d5a1cd1a Update go vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 11:05:41 +02:00
Evan Lezar
a7580e3872 Update podman hooks dependency
This is required to ensure that a newer version of
github.com/opencontainers/runtime-tools/generate is imported for use
with CDI.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 11:05:39 +02:00
Evan Lezar
4bf05325b5 Add .shell make target for non-Linux development
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 11:05:39 +02:00
Evan Lezar
ea7b8ab1f6 Add gcc for centos package builds including cgo
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 11:05:39 +02:00
Evan Lezar
c4bad9b36a Update gitignore
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 11:05:39 +02:00
Evan Lezar
3479e353c5 Merge branch 'centos8-stream' into 'master'
Switch to CentOS Stream 8 to build centos8 packages

See merge request nvidia/container-toolkit/container-toolkit!122
2022-03-29 09:03:48 +00:00
Evan Lezar
f50b4b2f91 Switch from centos:8 to centos:stream8 images to build centos8 packages
Due to the EOL of centos:8 we switch to centos:stream8 to build the centos8 and
rhel8 packages.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 08:07:06 +02:00
Evan Lezar
24ce09db0e Update git submodules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-29 08:07:06 +02:00
Evan Lezar
a904076cf0 Update libnvidia-container submodule to v1.10.0-rc.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-28 15:54:28 +02:00
Evan Lezar
24d3f854af Bump version to 1.10.0-rc.1
This change make the following version bumps:

* nvidia-container-toolkit to 1.10.0-rc.1
* nvidia-contianer-runtime to 3.10.0-rc.1
* nvidia-docker to 2.10.0-rc.1

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-24 16:56:27 +02:00
Evan Lezar
56ad97b8e5 Merge branch 'bump-1.9.0' into 'master'
Bump version to 1.9.0

See merge request nvidia/container-toolkit/container-toolkit!118
2022-03-18 13:36:30 +00:00
Evan Lezar
eb3be9d676 Use nvcr.io registry for Ubuntu CUDA base images
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-18 14:44:55 +02:00
Evan Lezar
4a3b532c29 Add CI definitions for building and publishing Ubuntu20.04 images
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-18 14:24:50 +02:00
Evan Lezar
cc68635c70 Upcate libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-18 12:34:02 +02:00
Evan Lezar
106279368a Bump version to 1.9.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-03-18 06:19:58 +02:00
Christopher Desiniotis
96772ccdcc Merge branch 'cve-libsasl' into 'master'
Update libsasl in both ubuntu/ubi toolkit images to address CVE-2022-24407

See merge request nvidia/container-toolkit/container-toolkit!116
2022-03-16 17:41:21 +00:00
Christopher Desiniotis
e2d1d379d5 Update libsasl in both ubuntu/ubi toolkit images to address CVE-2022-24407 2022-03-16 17:41:21 +00:00
Evan Lezar
cf74d14504 Merge branch 'update-libnvidia-container' into 'master'
Update libnvidia-container subcomponent

See merge request nvidia/container-toolkit/container-toolkit!112
2022-02-25 21:55:22 +00:00
Evan Lezar
aa3784d185 Update libnvidia-container subcomponent
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-25 21:58:19 +02:00
Evan Lezar
b0bb7b46e4 Merge branch 'CNT-2170/multi-arch' into 'master'
Use buildx and regctl to publish multi-arch images

See merge request nvidia/container-toolkit/container-toolkit!103
2022-02-23 07:08:56 +00:00
Evan Lezar
43ba5267c7 Merge branch 'add-docker-restart-mode-to-config' into 'master'
Add --restart-mode to docker config CLI

See merge request nvidia/container-toolkit/container-toolkit!106
2022-02-22 16:47:11 +00:00
Evan Lezar
5d4ecc24cb Use 'none' instead of 'NONE' to skip containerd restart
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 16:13:44 +02:00
Evan Lezar
d8ed16585a Add --restart-mode to docker config CLI
This change adds a --restart-mode option to the docker config CLI.
This mirrors the option added for containerd and allows 'none' to be
specified to disable the restart of docker. This is useful in
cases where the updated docker config should be reloaded out of
band.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 16:13:44 +02:00
Evan Lezar
a2060c74b3 Update component submodules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 16:13:44 +02:00
Evan Lezar
2e4ed47ac4 Fix pushing of short tag for devel images
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
93ca91ac3f Add multi-arch image scans
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
cc593087d2 Also search /usr/lib/aarch64-linux-gnu for libnvidia-container libs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
b05db2befe Enable multi-arch builds in CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
a0d2b22a54 Enable multi-arch builds
This change adds arm64/aarch64 images to supported distributions.
This is triggered if BUILD_MULTI_ARCH_IMAGE=true.

Note that for ubi8 images this means that we switch to using centos8
packages instead of centos7 since we do not build aarch64 packages
for the latter.

This also means that for centos7 we only build x86_64 images.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
e8d555f155 Allow buildx to be used for mulit-arch images
This change allows for docker buildx to be used to build container
images. This also allows multi-arch images being built.

In addition to using docker buildx to build images, regctl as a
replacement for the docker push command to release images. This
tool also supports regctl.

The selection of docker buildx (and regctl) is controlled by a
BUILD_MULTI_ARCH_IMAGES make variable. If this is 'true',
the build-% make targets for the toolkit container will be
run through buildx  and the equivalent push-% targets will trigger
a regctl command.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
ec7de9c4e8 Rename TARGETS make variable to DISTRIBUTIONS
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
74ddfe901a Specify docker platform args for build and run commands
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
a1ce176fc4 Ensure that Ubuntu20.04 images also build
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
980185db55 Remove unneeded build-all CI steps
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
ea4013fcd5 Fix centos8 builds
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
97762ce5f9 Update submodules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-22 10:19:20 +02:00
Evan Lezar
2adee1445b Merge branch 'fix-centos8' into 'master'
Fix centos8 builds

See merge request nvidia/container-toolkit/container-toolkit!111
2022-02-18 14:58:13 +00:00
Evan Lezar
38b49a7faa Remove unneeded build-all CI steps
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-18 16:13:38 +02:00
Evan Lezar
7b78a2a701 Update submodules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-18 16:10:50 +02:00
Evan Lezar
596d7e8108 Fix centos8 builds
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-18 16:10:50 +02:00
Evan Lezar
5925b7e977 Bump version to 1.9.0-rc.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-18 16:10:47 +02:00
Evan Lezar
9d64ab6fb7 Merge branch 'fix-release-tests' into 'master'
Update centos:8 mirrors for release tests

See merge request nvidia/container-toolkit/container-toolkit!110
2022-02-17 14:58:30 +00:00
Evan Lezar
2ea632a861 Update centos:8 mirrors for release tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-14 14:15:33 +01:00
Evan Lezar
2c0a66c08c Merge branch 'update-libnvidia-container' into 'master'
Update changelogs

See merge request nvidia/container-toolkit/container-toolkit!109
2022-02-14 11:52:36 +00:00
Evan Lezar
ce7076e231 Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-14 12:09:03 +01:00
Evan Lezar
b79c9b9bca Update changelogs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-14 10:12:45 +01:00
Evan Lezar
37a00041c4 Merge branch 'bump-1.8.1' into 'master'
Bump version to 1.8.1

See merge request nvidia/container-toolkit/container-toolkit!107
2022-02-10 08:43:20 +00:00
Evan Lezar
424b591535 Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-10 09:00:14 +01:00
Evan Lezar
99f6d45d71 Bump version to 1.8.1
This change make the following version bumps:

* nvidia-container-toolkit to 1.8.1
* nvidia-contianer-runtime to 3.8.1
* nvidia-docker to 2.9.1

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-10 08:59:17 +01:00
Evan Lezar
a85caf93ff Fix changelog entry in rpm spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-09 14:00:41 +01:00
Kevin Klues
87e715ce6b Merge branch 'bump-version-1.8.0' into 'master'
Bump version to 1.8.0

See merge request nvidia/container-toolkit/container-toolkit!105
2022-02-04 09:08:17 +00:00
Evan Lezar
96811666b4 Update component submodules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-04 09:24:29 +01:00
Evan Lezar
c76767d703 Bump version to 1.8.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-04 09:24:27 +01:00
Evan Lezar
588fdc82f7 Merge branch 'fix-centos8' into 'master'
Update centos8 repos

See merge request nvidia/container-toolkit/container-toolkit!104
2022-02-03 08:32:04 +00:00
Evan Lezar
5863be46ee Use 2h30m timeout for all packaging stages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-02 15:53:46 +01:00
Evan Lezar
f097af79ca Update centos8 mirrors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-02 13:43:31 +01:00
Evan Lezar
5c76493642 Update sub-modules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-02 13:43:31 +01:00
Evan Lezar
ad877fb811 Bump version to 1.8.0-rc.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-02-02 13:43:31 +01:00
Evan Lezar
4562cb559c Merge branch 'update-release' into 'master'
Add scripting to update component submodules

See merge request nvidia/container-toolkit/container-toolkit!97
2022-01-28 10:44:51 +00:00
Evan Lezar
72e17e8632 Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-01-27 18:59:08 +01:00
Evan Lezar
6898917f41 Update components before building release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-01-27 16:18:23 +01:00
Evan Lezar
53c130fb3c Merge branch 'remove-amazonlinux1' into 'master'
Remove building of Amazonlinux1 packages

See merge request nvidia/container-toolkit/container-toolkit!98
2022-01-24 12:31:35 +00:00
Evan Lezar
45bd3002da Merge branch 'CNT-2396/include-libnvidia-container-go' into 'master'
Copy libnivida-container-go to toolkit directory

See merge request nvidia/container-toolkit/container-toolkit!100
2022-01-21 15:48:53 +00:00
Evan Lezar
58042d78df Copy libnivida-container-go.so to toolkit directory
As of the NVIDIA Container Toolkit 1.8.0-rc.1 the libnvida-container*
packages also provide a libnvidia-container-go library. This must also
be installed in the toolkit container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-01-21 15:07:36 +01:00
Evan Lezar
aa52b12c09 Merge branch 'bump-version-1.8.0-rc.2' into 'master'
Bump version to 1.8.0-rc.2

See merge request nvidia/container-toolkit/container-toolkit!96
2022-01-20 18:13:54 +00:00
Evan Lezar
47bc4f90ba Remove support for amazonlinux1
This commit removes support for building amazonlinux1 packages.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-01-20 17:47:46 +01:00
Evan Lezar
41c1c2312a Add check for matching toolkit and lib versions to release script
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-01-20 17:44:00 +01:00
Evan Lezar
9d34134b3f Update git submodules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-01-20 17:43:44 +01:00
Evan Lezar
d931e861f3 Merge branch 'update-cuda-image' into 'master'
Update CUDA image version to 11.6.0

See merge request nvidia/container-toolkit/container-toolkit!99
2022-01-20 14:50:45 +00:00
Evan Lezar
b1c9b8bb49 Bump version to 1.8.0-rc.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-01-20 15:30:13 +01:00
Evan Lezar
50fbcebe31 Update CUDA image version to 11.6.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-01-20 15:07:32 +01:00
Kevin Klues
78f38455fd Merge branch 'fix-libnvidia-container-submodule' into 'master'
Update libnvidia-container submodule for WITH_NVCGO CI build fix

See merge request nvidia/container-toolkit/container-toolkit!92
2021-12-08 13:58:32 +00:00
Evan Lezar
f57e9b969c Update libnvidia-container submodule for WITH_NVCGO CI build fix
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-08 14:57:12 +01:00
Evan Lezar
a174aae7b5 Merge branch 'update-libnvidia-container' into 'master'
Update libnvidia-container submodule

See merge request nvidia/container-toolkit/container-toolkit!91
2021-12-08 12:33:51 +00:00
Evan Lezar
6890cb2ed8 Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-08 12:57:15 +01:00
Evan Lezar
13603e9794 Merge branch 'fix-centos7' into 'master'
Upgrade NSS for critical CVE in centos7 image

See merge request nvidia/container-toolkit/container-toolkit!90
2021-12-07 16:43:08 +00:00
Evan Lezar
afb260d82e Update nss on centos7 to address CVEs
This addresses https://access.redhat.com/security/cve/CVE-2021-43527

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-07 16:20:17 +01:00
Evan Lezar
f0311bfe17 Allow packages to be specified to address CVEs
This change allows the CVE_UPGRADES build arg to be set
to address CVEs in base images instead of requesting waivers.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-07 16:19:01 +01:00
Evan Lezar
050c29b157 Merge branch 'enable-image-release' into 'master'
Enable release of toolkit-container images

See merge request nvidia/container-toolkit/container-toolkit!89
2021-12-06 09:15:57 +00:00
Evan Lezar
de9afd4623 Merge branch 'bump-post-1.7.0' into 'master'
Bump version post 1.7.0 release

See merge request nvidia/container-toolkit/container-toolkit!88
2021-12-03 16:03:19 +00:00
Evan Lezar
b231d8f365 Merge branch 'fix-skip-scan' into 'master'
Simplify skipping of scans

See merge request nvidia/container-toolkit/container-toolkit!87
2021-12-03 16:03:11 +00:00
Evan Lezar
ee2b84b228 Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-03 16:19:31 +01:00
Evan Lezar
0c24fa83ae Bump version post 1.7.0 release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-03 16:19:29 +01:00
Evan Lezar
79660d1e55 Enable release of toolkit-container images
This change enables the release of toolkit-container images from this
repository instead of the container-config repository. This ensures
that these images are released along with the packages for the
NVIDIA Contianer Toolkit components.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-03 15:30:15 +01:00
Evan Lezar
39d2ff06fa Simplify skipping of scans
Scans are now only skipped if the SKIP_SCANS=yes.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-03 14:39:11 +01:00
Evan Lezar
0ac288e6dd Merge branch 'add-package-upload' into 'master'
Generate image containing packages for release

See merge request nvidia/container-toolkit/container-toolkit!82
2021-12-03 13:25:16 +00:00
Evan Lezar
b334f1977b Add delay and timeout to image pull job
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-03 12:19:57 +01:00
Evan Lezar
2d07385e81 Pull public staging images to scan and release
This change pulls images from public staging repositories to scan
and release. This ensures that the bits built and tested in public
CI (off the master branch, for example) match those scanned and
released. This also serves to reduce the load on our internal CI
runners as these don't have to store artifacts and build images.

Two CI variables: STAGING_REGISTRY and STAGING_VERSION are used
to control which image is pulled for release, with the latter
defaulting to the CI_COMMIT_SHORT_SHA.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-02 17:31:07 +01:00
Evan Lezar
fd5a1a72f0 Address review comments
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-01 18:45:30 +01:00
Evan Lezar
738d28dac5 Add script to pull packages from packaging image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-01 18:45:30 +01:00
Evan Lezar
e662e8197c Add placeholder for testing packaging image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-01 18:45:30 +01:00
Evan Lezar
2964f26533 Add packaging target to CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-01 18:45:30 +01:00
Evan Lezar
629d575fad Add packaging target that includes all release packages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-01 18:45:30 +01:00
Evan Lezar
7fb04878c7 Include all architecture packages in toolkit container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-12-01 18:45:30 +01:00
Evan Lezar
f10f533fb2 Merge branch 'bump-1.7.0' into 'master'
Bump version to 1.7.0

See merge request nvidia/container-toolkit/container-toolkit!85
2021-11-30 18:37:01 +00:00
Evan Lezar
9c2cdc2f81 Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-30 14:44:31 +01:00
Evan Lezar
5bbaf8af4b Bump version to 1.7.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-30 14:27:17 +01:00
Evan Lezar
c6ce5b5a29 Merge branch 'set-other-package-versions' into 'master'
Set nvidia-container-runtime and nvidia-docker versions

See merge request nvidia/container-toolkit/container-toolkit!84
2021-11-30 13:04:39 +00:00
Evan Lezar
b9e752e24e Update submodules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-30 13:35:48 +01:00
Evan Lezar
94849fa822 Bump golang version to 1.16.4
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-30 13:35:48 +01:00
Evan Lezar
b0d6948d94 Add versions.mk file to define versions
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-30 13:35:46 +01:00
Evan Lezar
995bd0d34a Merge branch 'add-multi-arch-package-tests' into 'master'
Allow testing of packages for non-native architectures

See merge request nvidia/container-toolkit/container-toolkit!80
2021-11-29 13:57:53 +00:00
Evan Lezar
27bb5cca0c Specify nvidia-container-runtime and nvidia-docker versions
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-29 14:21:54 +01:00
Evan Lezar
72d1d90ce9 Bump post 1.7.0-rc.1 release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-29 10:16:25 +01:00
Evan Lezar
6a1f7d0228 Don't rebuild packages for every local run
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-25 14:00:21 +01:00
Evan Lezar
094631329f Add basic multi-arch support to release tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-25 14:00:21 +01:00
Evan Lezar
6731f050da Rework init repo for centos8 release tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-25 14:00:21 +01:00
Evan Lezar
2ee6ec5d17 Merge branch 'update-libnvidia-container' into 'master'
Update libnvidia-container to latest for release

See merge request nvidia/container-toolkit/container-toolkit!83
2021-11-25 11:08:01 +00:00
Evan Lezar
1c25b349b1 Update libnvidia-container dependency for release
This includes support for filtering CLI flags based on libnvidia-container
version.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-25 11:38:56 +01:00
Evan Lezar
d87bdf9ab6 Update changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-25 11:37:51 +01:00
Evan Lezar
472c89d051 Merge branch 'remove-containerd-dependency' into 'master'
Remove containerd dependency

See merge request nvidia/container-toolkit/container-toolkit!81
2021-11-25 09:13:29 +00:00
Evan Lezar
3470f2ecb9 Merge branch 'add-supported-driver-capabilities' into 'master'
Add supported-driver-capabilities config option

See merge request nvidia/container-toolkit/container-toolkit!74
2021-11-24 15:43:30 +00:00
Evan Lezar
9c27e03c87 Merge branch 'post-1.6.0-release' into 'master'
Bump post 1.6.0 release

See merge request nvidia/container-toolkit/container-toolkit!79
2021-11-24 15:40:36 +00:00
Evan Lezar
09c6995ff9 Update vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-24 15:42:37 +01:00
Evan Lezar
e2ec381093 Specify containerd runtime type as string
This removes the need to import the containerd package and reduces
the dependency management overhead.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-24 15:42:37 +01:00
Evan Lezar
7a31ebadb1 Update submodules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-24 15:42:06 +01:00
Evan Lezar
7a34be62b2 Override LIB_TAGS for runtime and docker wrapper
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-24 10:59:10 +01:00
Evan Lezar
a4441b6545 Bump post 1.6.0 release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-24 10:54:06 +01:00
Evan Lezar
ab3ebe5e49 Add jetpack-specific config.toml
This chagne adds a jetpack-specific config.toml file which specifies
supported-driver-capabilities to remove the unsupported ngx capability.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-17 16:53:08 +01:00
Evan Lezar
ea0bf6fbf8 Specify config.toml file suffix as docker build arg
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-17 16:53:08 +01:00
Evan Lezar
0a2db7c70e Add nvidia-container-config option to overide drivercapabilities
This change adds support for a supported-driver-capabilities config
option in the config.toml file that allows the driver capabilities
associated with the NVIDIA_DRIVER_CAPABILITIES=all environment variable.
This can be used on platforms such as Jetson to remove unsupported
capabilities such as "ngx".

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-17 16:53:08 +01:00
Evan Lezar
92bb04f0fd Merge branch 'bump-version-1.6.0' into 'master'
Bump version for 1.6.0 release

See merge request nvidia/container-toolkit/container-toolkit!78
2021-11-17 11:17:39 +00:00
Evan Lezar
4d224a114a Update components versions for 1.6.0 release
* libnvidia-container v1.6.0: dd2c49d6699e4d8529fbeaa58ee91554977b652e
* nvidia-container-runtime v3.6.0: 38ff520daa33d3a3a733440957c6aa346757bd1f
* nvidia-docker v2.7.0: fd3233aa5f4ade28ac6bda616c2fa77a0ce89cd9

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-17 11:49:06 +01:00
Evan Lezar
2795e7d132 Bump version to 1.6.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-17 10:07:36 +01:00
Evan Lezar
58801d0c71 Fix logging to stderr instead of file logger
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-16 21:37:50 +01:00
Christopher Desiniotis
a13c785865 Merge branch 'pulse-ci' into 'master'
[ci] remove --pss flag from pulse scanning

See merge request nvidia/container-toolkit/container-toolkit!77
2021-11-16 18:55:29 +00:00
Christopher Desiniotis
b57b8661ca [ci] remove --pss flag from pulse scanning
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2021-11-16 08:38:43 -08:00
Evan Lezar
d2575abd3a Merge branch 'match-toolkit-tag' into 'master'
Ensure that package tags for components match

See merge request nvidia/container-toolkit/container-toolkit!76
2021-11-16 10:57:37 +00:00
Evan Lezar
bc1f6e05a0 Check for matching tags in release script
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-16 09:53:32 +01:00
Evan Lezar
5db5205647 Get tags for all components in get-component-versions script
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-16 09:49:55 +01:00
Evan Lezar
6a747f5dd3 Update submodules
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-16 09:49:13 +01:00
Kevin Klues
81f9caa9aa Merge branch 'bump-libnvidia-container' into 'master'
Update libnvidia-container to ff6ed3d5637f0537c4951a2757512108cc0ae147

See merge request nvidia/container-toolkit/container-toolkit!75
2021-11-15 15:58:20 +00:00
Evan Lezar
684b5e9237 Update libnvidia-container to ff6ed3d5637f0537c4951a2757512108cc0ae147
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-15 16:29:51 +01:00
Evan Lezar
7d4a8200eb Merge branch 'bump-version' into 'master'
Bump version post v1.6.0-rc.2 release

See merge request nvidia/container-toolkit/container-toolkit!73
2021-11-15 13:28:22 +00:00
Evan Lezar
060f670232 Update libnvidia-container submodule to 1.6.0-rc.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-15 13:06:42 +01:00
Evan Lezar
1b3e2d9423 Bump version post v1.6.0-rc.2 release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-15 13:03:24 +01:00
Evan Lezar
06cd37b892 Merge branch 'use-internal-oci' into 'master'
Import internal/oci package from experimental branch

See merge request nvidia/container-toolkit/container-toolkit!68
2021-11-12 10:03:46 +00:00
Evan Lezar
1d0fd7475c Merge branch 'packaging-fix' into 'master'
Update nvidia-docker submodule

See merge request nvidia/container-toolkit/container-toolkit!71
2021-11-05 14:09:47 +00:00
Evan Lezar
40032edc3b Update submodules for packaging
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-05 15:03:46 +01:00
Evan Lezar
f2d2991651 Update nvidia-docker submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-05 14:04:28 +01:00
Evan Lezar
3d5be45349 Merge branch 'prep-for-release' into 'master'
Specify toolkit version when building runtime and docker packages

See merge request nvidia/container-toolkit/container-toolkit!70
2021-11-05 12:27:54 +00:00
Evan Lezar
4d945e96f3 Auto update debian changelog and release date
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-05 12:06:29 +01:00
Evan Lezar
14c641377f Update nvidia-docker submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 17:14:36 +01:00
Evan Lezar
988e067091 Forward nvidia-container-toolkit versions to dependants
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 16:47:37 +01:00
Evan Lezar
98168ea16c Update libnvidia-container submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 16:40:09 +01:00
Evan Lezar
d6a2733557 Update nvidia-container-runtime submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 16:39:40 +01:00
Evan Lezar
ee6545fbab Merge branch 'update-oci-spec' into 'master'
Add basic test for preservation of OCI spec under modification

See merge request nvidia/container-toolkit/container-toolkit!69
2021-11-04 13:29:03 +00:00
Evan Lezar
e8cc95c53b Update imported OCI runtime spec
This change updates the imported OCI runtime spec to a3c33d663ebc which includes
the ability to override the return code for syscalls. This is used by docker for
the clone3 syscall, for example.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 13:42:46 +01:00
Evan Lezar
8afd89676f Add basic test for preservation of OCI spec under modification
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-11-04 13:42:46 +01:00
Evan Lezar
dd5c0a94ad Merge branch 'pulse-ci' into 'master'
[ci] use pulse instead of contamer for scans

See merge request nvidia/container-toolkit/container-toolkit!65
2021-11-02 18:50:42 +00:00
Christopher Desiniotis
93ecf3aeaf [ci] use pulse instead of contamer for scans
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2021-11-01 11:09:10 -07:00
Evan Lezar
ec8a6d978d Import cmd/nvidia-container-runtime from experimental branch
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-29 16:49:05 +02:00
Evan Lezar
d234077780 Remove unneeded files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-29 16:49:02 +02:00
Evan Lezar
b8acd7657a Import internal/oci package from experimental branch
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-29 16:44:31 +02:00
Evan Lezar
55328126c6 Merge branch 'fix-devel-release-version' into 'master'
Rename RELEASE_DEVEL_TAG for consistency

See merge request nvidia/container-toolkit/container-toolkit!64
2021-10-27 11:52:57 +00:00
Evan Lezar
c2b35da111 Rename RELEASE_DEVEL_TAG for consistency
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-27 13:25:12 +02:00
Evan Lezar
2c210ebe21 Merge branch 'CNT-1874/add-internal-ci' into 'master'
Add release CI for toolkit-container

See merge request nvidia/container-toolkit/container-toolkit!58
2021-10-27 08:23:05 +00:00
Evan Lezar
1f0064525c Merge branch 'empty-bundle' into 'master'
Bump version to 1.6.0-rc.2

See merge request nvidia/container-toolkit/container-toolkit!63
2021-10-26 12:57:43 +00:00
Evan Lezar
c301bde4f4 Remove rule for merge requests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-26 14:29:28 +02:00
Evan Lezar
5996379fcc Add changelog entry for config.json path changes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-26 14:29:28 +02:00
Evan Lezar
23bdcbc818 Bump version to 1.6.0-rc.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-26 14:29:26 +02:00
Evan Lezar
ee7206ef29 Merge branch 'master' into 'master'
skip error when bundleDir not exist

See merge request nvidia/container-toolkit/container-toolkit!62
2021-10-26 11:10:46 +00:00
wenjun gao
350c8893fb skip error when bundleDir not exist 2021-10-26 17:51:12 +08:00
Evan Lezar
5b1a6765c6 Remove rule for merge requests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-25 12:13:06 +02:00
Evan Lezar
cd1540300e Add internal CI definition for release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-25 12:13:06 +02:00
Evan Lezar
52f52d5376 Merge branch 'CNT-1874/build-toolkit-container' into 'master'
Build container-toolkit images from nvidia-container-toolkit repository

See merge request nvidia/container-toolkit/container-toolkit!52
2021-10-22 11:09:02 +00:00
Evan Lezar
c35444c76c Add CI to build toolkit-container image
This change adds CI definitions for building the toolkit-container
images. This modifies the existing CI and replaces the build-one
stage with multiple stages that do the following:
* peform the standard golang checks
* build the packages required by the images
* build the images for supported platforms
* releases the images (currently to the CI staging registry)

The build-all stage is included as a final step in the CI. This is
run after the release stage as the target platforms are not requried
from an imaging perspective. The build-all stage is only run on
MRs or tagged builds.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-22 11:57:55 +02:00
Evan Lezar
0b3bc13b32 Add dockerfile and makefile to build toolkit-container
This change adds platform-specific Dockerfiles and a Makefile
to build the toolkit-container images.

This image builds the container-config commands from the tools
directory and installs the components of the NVIDIA Container Toolkit
directly from the nvidia-container-toolkit and libnvidia-container*
packages in the dist directory.

This includes make targets for the centos7, centos8, ubuntu18.04,
and ubi8 container-toolkit images as well as the container tests
make targets implemented in the contianer-config repository.

Files adapted from:
383587f766

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-22 11:57:55 +02:00
Evan Lezar
f2c93363ab Copy container test scripts from container-config
Files copied from:

383587f766

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-22 11:57:55 +02:00
Evan Lezar
7d76243783 Copy cmd from container-config
This change copies the code from container-config/cmd to
tools/container. This allows the code to be built and
added to the container image without additional refactoring.

As the configuration utilities are incorporated into the cmds
of the nvidia-container-toolkit, the code will be moved from tools.

Files copied from:

383587f766

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-22 11:57:51 +02:00
Evan Lezar
7bf5c25831 Update go vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-21 12:07:42 +02:00
Evan Lezar
266b752b02 Merge branch 'al2-aarch64' into 'master'
Add aarch64 build for Amazon Linux 2

See merge request nvidia/container-toolkit/container-toolkit!55
2021-10-19 14:31:51 +00:00
Evan Lezar
7fc33d02b4 Update submodules
* libnvidia-container: @1fa138a694b3667fd89ac89dfdb26fcd06ab0bb9
* nvidia-container-runtime: @cd6aef41126b5409c2329b66803b278a697aaaf3
* nvidia-docker: @4613cdae34c3e106ef124c9b86e4cf998569bbd6

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-19 15:30:29 +02:00
Evan Lezar
9be9b89f9f Add aarch64 build for Amazon Linux 2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-18 12:03:22 +02:00
Evan Lezar
a036a83afa Merge branch 'improve-release-tests' into 'master'
Extend release testing toolking to allow for upgrade testing

See merge request nvidia/container-toolkit/container-toolkit!54
2021-10-14 17:23:49 +00:00
Evan Lezar
ee0b908613 Extend release testing toolking to allow for upgrade testing
This change allows for upgrade workflows to be tested in the
release test containers. To achieve this a script is added
to configure the test repositories leaving the defaults installed
initially.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-14 17:10:43 +02:00
Evan Lezar
28f6b7c02c Merge branch 'update-ci' into 'master'
Update CI to use build image directly

See merge request nvidia/container-toolkit/container-toolkit!53
2021-10-12 12:55:48 +00:00
Evan Lezar
f7e9d1ca45 Use build image directly in CI
This change uses the build image directly in CI instead of
using dind and invoking the docker-* make targets.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-12 13:53:11 +02:00
Evan Lezar
229f9c3730 Update DEVELOPMENT.md
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-12 13:53:06 +02:00
Evan Lezar
845701447c Merge branch 'build-from-repo' into 'master'
Add logic to build all package from nvidia-container-toolkit repo

See merge request nvidia/container-toolkit/container-toolkit!49
2021-10-07 14:30:13 +00:00
Evan Lezar
1ad98df39f Update submodules for packaging fixes
This change updates the git submodules for nvidia-docker and
nvidia-container-runtime to contain the package fixes and
code cleanup.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-10-07 15:28:52 +02:00
Evan Lezar
22a958fae7 Add docker-based tests for package installation workflows
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-24 14:58:17 +02:00
Evan Lezar
f10fa7b292 FIXUP: Update development
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 21:38:03 +02:00
Evan Lezar
fa7dc8cb31 Require at least a matching libnvidia-container-tools version
This change ensures that at least the same libnvidia-container-tools
version is required when installing nvidia-container-toolkit.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 21:34:09 +02:00
Evan Lezar
3fef6bb5ab Merge branch 'update-readme' into 'master'
Add README to nvidia-container-toolkit repository

See merge request nvidia/container-toolkit/container-toolkit!50
2021-09-23 19:07:22 +00:00
Evan Lezar
2dc85de5d4 Use consistent package revisions for all rpm-based packages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 15:36:27 +02:00
Evan Lezar
bb6f4745e9 Fix nvidia-container-runtime breaks / replaces dependency
The relationship between packages also considers the package revision
when determining validity. This means that 3.5.0-1 is considered
greater than 3.5.0. This changed adds the package revision to the
nvidia-container-runtime breaks / replaces relationship.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:56:55 +02:00
Evan Lezar
77740c2a80 Add release script for specific targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
f0fb4739ff Remove docker-all target from Makefile
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
5ee2150eaa Add basic version checks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
34e023361b Add nvidia-docker as a git submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
2ed7d86709 Add nvidia-container-runtime as a git submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
e729e74fe5 Add libnvidia-container as a git submodule
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-23 14:38:37 +02:00
Evan Lezar
b551d0f4f4 Apply edits for the NVIDIA container toolkit
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-22 16:49:58 +02:00
Evan Lezar
1d674783b0 Copy README.md from nvidia-docker
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-22 15:22:53 +02:00
Evan Lezar
cc9c3c0d28 Copy scripts from nvidia-container-toolkit-release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-22 13:42:32 +02:00
Evan Lezar
78f137a5ef Bump version for 1.6.0 development
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-22 13:42:32 +02:00
Evan Lezar
00258f14fb Merge branch 'include-nvidia-container-runtime' into 'master'
Include nvidia-container-runtime executable in nvidia-container-toolkit

See merge request nvidia/container-toolkit/container-toolkit!45
2021-09-20 14:54:06 +00:00
Evan Lezar
e828697f90 Update debian and rpm package definitions
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-20 14:54:54 +02:00
Evan Lezar
923344d376 Add PREFIX make variable to control command output
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-13 17:15:34 +02:00
Evan Lezar
35c6559013 Make all commands and copy executables
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-13 17:15:19 +02:00
Evan Lezar
eb67968911 Add cmds target to makefile to build all go commands
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-13 17:08:31 +02:00
Evan Lezar
6e1436cefb Update go vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-07 13:13:03 +02:00
Evan Lezar
10cd42273e Update package references
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-07 13:13:03 +02:00
Evan Lezar
b6a585c77d Copy code from nvidia-container-runtime
This change copies the cmd/nvidia-container-runtime, internal, and test
folders from github.com/NVIDIA/nvidia-container-runtime@8a63b4b34f3ce3b4167f0516aa3f7207ca280dfb

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-07 13:13:03 +02:00
Evan Lezar
58e707fed6 Bump version for post 1.5.1 development
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-09-07 13:12:25 +02:00
Evan Lezar
28ee3d5fd5 Merge branch 'revert-nvlink' into 'master'
Revert support for NVIDIA_FABRIC_DEVICES

See merge request nvidia/container-toolkit/container-toolkit!41
2021-08-20 08:39:49 +00:00
Evan Lezar
c2ac6db43b Revert "Add support for NVIDIA_FABRIC_DEVICES"
This reverts commit f828efcf64.
2021-08-18 15:17:59 +02:00
Evan Lezar
620bd806e8 Revert "Bump version to 1.6.0~rc.1"
This reverts commit 2001d66f9b.
2021-08-18 15:17:37 +02:00
Evan Lezar
afe0f8b61f Merge branch 'release-1.6.0-rc.1' into 'master'
Bump version to 1.6.0~rc.1

See merge request nvidia/container-toolkit/container-toolkit!40
2021-08-16 10:27:11 +00:00
Evan Lezar
2001d66f9b Bump version to 1.6.0~rc.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-08-13 14:55:28 +02:00
Evan Lezar
7626578b8e Merge branch 'nvlink' into 'master'
Add support for NVIDIA_FABRIC_DEVICES

See merge request nvidia/container-toolkit/container-toolkit!39
2021-08-12 14:04:10 +00:00
Evan Lezar
f828efcf64 Add support for NVIDIA_FABRIC_DEVICES
This change adds support for the NVIDIA_FABRIC_DEVICES envvar. The (non-empty)
value of this envvar is passed to the NVIDIA Container CLI using the --fabric-device
command line flag and allows for nvswitch and nvlink devices to be mounted
into the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-08-11 10:33:55 +02:00
Evan Lezar
faf0df66c7 Merge branch 'improve-ci' into 'master'
Improve CI for container toolkit

See merge request nvidia/container-toolkit/container-toolkit!38
2021-07-15 15:40:56 +00:00
Evan Lezar
1ef4b1a14a Use extends keyword for build-one and build-all
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-07-15 16:27:17 +02:00
Evan Lezar
3df0969349 Improve CI for container toolkit
This change improves the CI for the container toolkit. The go targets are
executed in a docker container which allows for reproducible behaviour on
local systems as well as CI. The Makefile is updated to facilitate this.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-07-15 16:27:15 +02:00
Evan Lezar
22fcd022f3 Merge branch 'upstream-bump-v1.5.1' into 'master'
Bump to version 1.5.1

See merge request nvidia/container-toolkit/container-toolkit!35
2021-06-14 18:35:45 +00:00
Evan Lezar
492905de38 Bump to version 1.5.1
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-14 17:16:44 +02:00
Evan Lezar
17e76cad4d Merge branch 'make-binary-goos-explicit' into 'master'
Explicitly set GOOS when building binary

See merge request nvidia/container-toolkit/container-toolkit!33
2021-06-14 10:12:24 +00:00
Evan Lezar
c728bf4b1e Explicitly set GOOS when building binary
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-10 10:40:06 +02:00
Evan Lezar
f05e4e81c5 Merge branch 'fix-go-install' into 'master'
Move pkg to cmd/nvidia-container-toolkit

See merge request nvidia/container-toolkit/container-toolkit!32
2021-06-08 15:21:06 +00:00
Evan Lezar
14cd7c1833 Add coverage step to CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-08 15:23:19 +02:00
Evan Lezar
f72b79cc2a Move pkg to cmd/nvidia-container-toolkit
This change moves the pkg folder to `cmd/nvidia-container-toolkit` to
better match go best practices. This allows, for example, for the
`cmd/nvidia-container-toolkit` to be go installed.

The only package included in `pkg` was `main`.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-08 15:20:59 +02:00
Evan Lezar
f25698e96e Run go mod vendor and go mod tidy
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-08 10:51:33 +02:00
Evan Lezar
a02f7f8f6f Merge branch 'CNT-1554/docker-swarm' into 'master'
Fix bug where docker swarm device selection is overriden by NVIDIA_VISIBLE_DEVICES

See merge request nvidia/container-toolkit/container-toolkit!31
2021-06-08 05:31:05 +00:00
Evan Lezar
2a92d6acb7 Fix bug where docker swarm device selection is overriden by NVIDIA_VISIBLE_DEVICES
This change fixes a bug where the value of NVIDIA_VISIBLE_DEVICES would be used to
select devices even if the `swarm-resource` config option is specified.

Note that this does not change the value of NVIDIA_VISIBLE_DEVICES in the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 14:10:08 +02:00
Evan Lezar
602eaf0e60 Use require package for tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 13:31:41 +02:00
Evan Lezar
b930487dc5 Add coverage to go tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 13:21:28 +02:00
Evan Lezar
9aac07fe64 Update vendoring
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-06-07 13:20:34 +02:00
Evan Lezar
825990ba41 Merge branch 'CNT-1334-publish-tags-to-artifactory' into 'master'
Add artifactory publish step

See merge request nvidia/container-toolkit/container-toolkit!30
2021-05-18 17:25:07 +00:00
Evan Lezar
03d9c1d698 Update to Golang 1.16.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-05-18 11:25:52 +02:00
Evan Lezar
de172674b1 Add artifactory publish step
This change simplifies the build process by only targetting ubuntu20.04-amd64
and adds logic to push tagged builds to artifactory.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-05-18 11:25:48 +02:00
Kevin Klues
b71a9ed153 Merge branch 'upstream-bump-v1.5.0' into 'master'
Bump version to 1.5.0

See merge request nvidia/container-toolkit/container-toolkit!29
2021-04-29 14:08:23 +00:00
Kevin Klues
dde7159e11 Bump version to 1.5.0
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-04-29 10:16:44 +00:00
Evan Lezar
46de426cc4 Merge branch 'CNT-1330/jenkins-ci' into 'master'
Add Jenkins file for CI build steps

See merge request nvidia/container-toolkit/container-toolkit!28
2021-03-18 10:06:44 +00:00
Evan Lezar
1c7d6a233a Add golang check targets
This change adds check targets for Golang to the make file. These are also
added as stages to the to the Jenkinsfile definition and the GitLab CI
is modified to use them too.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-03-17 16:58:39 +01:00
Evan Lezar
635aeb8343 Add Jenkinsfile definition for build targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-03-17 13:52:19 +01:00
Evan Lezar
ec9d296afe Move docker.mk to docker folder
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-03-17 13:52:14 +01:00
Evan Lezar
ff44395b31 Merge branch 'upstream-bump-v1.4.2' into 'master'
Bump version to 1.4.2

See merge request nvidia/container-toolkit/container-toolkit!27
2021-02-05 12:47:01 +00:00
Kevin Klues
8571e5ac5d Bump version to 1.4.2
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-02-05 10:26:10 +00:00
Kevin Klues
108c99bb9b Merge branch 'upstream-bump-v1.4.1' into 'master'
Bump version to 1.4.1

See merge request nvidia/container-toolkit/container-toolkit!26
2021-01-25 13:35:42 +00:00
Kevin Klues
dfb5daf200 Bump version to 1.4.1
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-01-25 10:42:32 +00:00
Kevin Klues
e8aa3cc8c3 Merge branch 'ignore-nvidia-visible-devices' into 'master'
Ignore NVIDIA_VISIBLE_DEVICES for containers with insufficent privileges

See merge request nvidia/container-toolkit/container-toolkit!25
2021-01-25 10:25:00 +00:00
Evan Lezar
fc408a32c7 Add utility function to get config name from struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-01-22 16:08:45 +01:00
Evan Lezar
f6b1b1afad Ignore NVIDIA_VISIBLE_DEVICES for containers with insufficent privileges
This change ignores the value of NVIDIA_VISIBLE_DEVICES instead of
raising an error when launching a container with insufficient permissions.

This changes the behaviour under the following conditions:

NVIDIA_VISIBLE_DEVICES is set
and

accept-nvidia-visible-devices-envvar-when-unprivileged = false (default: true)

or

privileged = false (default: false)

This means that a user need not explicitly clear the NVIDIA_VISIBLE_DEVICES
environment variable if no GPUs are to be used in unprivileged containers.
Note that this envvar is set to 'all' by default in many CUDA images that
are used as base images.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2021-01-22 15:34:52 +01:00
Kevin Klues
97516467c0 Merge branch 'upstream-bump-v1.4.0' into 'master'
Bump version to 1.4.0

See merge request nvidia/container-toolkit/container-toolkit!24
2020-12-14 14:41:02 +00:00
Kevin Klues
01063c0433 Bump version to 1.4.0
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-12-11 18:05:49 +00:00
Kevin Klues
119f75dcf8 Merge branch 'upstream-add-compute-to-default-capabilities' into 'master'
Add 'compute' capability to list of defaults.

See merge request nvidia/container-toolkit/container-toolkit!23
2020-12-08 11:31:27 +00:00
Kevin Klues
20604621e4 Add 'compute' capability to list of defaults.
For most practical purposes, it should be fine to set
NVIDIA_DRIVER_CAPABILITIES=all nowadays.

Historically, these different capabilities exist because they were added
incrementally, with varying degrees of stability. It's fairly common to
run with GPUs in containers today, but a few years ago the driver didn't
support them very well, and it was important to make sure the libraries
being injected into the container actually worked in a containerized
environment. When they didn't, it was common to get information leaks,
crashes, or even silent failures.

In the past, whenever a new set of libraries was being vetted for
injected, a new capability was added to make sure that users had control
to explicitly include only those libraries they were comfortable having
injected into their containers.

The idea being that whoever puts together a container image for use with
GPUs should have the knowledge of what capabilities the software in that
container image requires, and can set the NVIDIA_DRIVER_CAPABILITIES
envvar in that image appropriately.

After some back and forth, we've decided it doesn't quite make sense to
set it to "all" just yet, but we should set it to "utility, compute"
instead of just "utility", so that at least the core CUDA libraries work
by default (once installed in the container).

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2020-12-07 12:10:23 +00:00
945 changed files with 361474 additions and 943 deletions

255
.common-ci.yml Normal file
View File

@@ -0,0 +1,255 @@
# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
default:
image: docker
services:
- name: docker:dind
command: ["--experimental"]
variables:
GIT_SUBMODULE_STRATEGY: recursive
BUILDIMAGE: "${CI_REGISTRY_IMAGE}/build:${CI_COMMIT_SHORT_SHA}"
BUILD_MULTI_ARCH_IMAGES: "true"
stages:
- image
- lint
- go-checks
- go-build
- unit-tests
- package-build
- image-build
- test
- scan
- release
.main-or-manual:
rules:
- if: $CI_COMMIT_BRANCH == "main"
- if: $CI_COMMIT_BRANCH =~ /^release-.*$/
- if: $CI_COMMIT_TAG && $CI_COMMIT_TAG != ""
- if: $CI_PIPELINE_SOURCE == "schedule"
when: manual
# Define the distribution targets
.dist-amazonlinux2:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: amazonlinux2
PACKAGE_REPO_TYPE: rpm
.dist-centos7:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: centos7
CVE_UPDATES: "cyrus-sasl-lib"
PACKAGE_REPO_TYPE: rpm
.dist-centos8:
variables:
DIST: centos8
CVE_UPDATES: "cyrus-sasl-lib"
PACKAGE_REPO_TYPE: rpm
.dist-debian10:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: debian10
PACKAGE_REPO_TYPE: debian
.dist-opensuse-leap15.1:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: opensuse-leap15.1
PACKAGE_REPO_TYPE: rpm
.dist-ubi8:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: ubi8
CVE_UPDATES: "cyrus-sasl-lib"
PACKAGE_REPO_TYPE: rpm
.dist-ubuntu18.04:
variables:
DIST: ubuntu18.04
CVE_UPDATES: "libsasl2-2 libsasl2-modules-db"
PACKAGE_REPO_TYPE: debian
.dist-ubuntu20.04:
variables:
DIST: ubuntu20.04
CVE_UPDATES: "libsasl2-2 libsasl2-modules-db"
PACKAGE_REPO_TYPE: debian
.dist-packaging:
variables:
DIST: packaging
# Define architecture targets
.arch-aarch64:
variables:
ARCH: aarch64
.arch-amd64:
variables:
ARCH: amd64
.arch-arm64:
variables:
ARCH: arm64
.arch-ppc64le:
rules:
- !reference [.main-or-manual, rules]
variables:
ARCH: ppc64le
.arch-x86_64:
variables:
ARCH: x86_64
# Define the platform targets
.platform-amd64:
variables:
PLATFORM: linux/amd64
.platform-arm64:
variables:
PLATFORM: linux/arm64
# Define test helpers
.integration:
stage: test
variables:
IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
VERSION: "${CI_COMMIT_SHORT_SHA}"
before_script:
- apk add --no-cache make bash jq
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
- docker pull "${IMAGE_NAME}:${VERSION}-${DIST}"
script:
- make -f build/container/Makefile test-${DIST}
# Define the test targets
test-packaging:
extends:
- .integration
- .dist-packaging
needs:
- image-packaging
# Download the regctl binary for use in the release steps
.regctl-setup:
before_script:
- export REGCTL_VERSION=v0.4.5
- apk add --no-cache curl
- mkdir -p bin
- curl -sSLo bin/regctl https://github.com/regclient/regclient/releases/download/${REGCTL_VERSION}/regctl-linux-amd64
- chmod a+x bin/regctl
- export PATH=$(pwd)/bin:${PATH}
# .release forms the base of the deployment jobs which push images to the CI registry.
# This is extended with the version to be deployed (e.g. the SHA or TAG) and the
# target os.
.release:
stage: release
variables:
# Define the source image for the release
IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
VERSION: "${CI_COMMIT_SHORT_SHA}"
# OUT_IMAGE_VERSION is overridden for external releases
OUT_IMAGE_VERSION: "${CI_COMMIT_SHORT_SHA}"
before_script:
- !reference [.regctl-setup, before_script]
# We ensure that the OUT_IMAGE_VERSION is set
- 'echo Version: ${OUT_IMAGE_VERSION} ; [[ -n "${OUT_IMAGE_VERSION}" ]] || exit 1'
# In the case where we are deploying a different version to the CI_COMMIT_SHA, we
# need to tag the image.
# Note: a leading 'v' is stripped from the version if present
- apk add --no-cache make bash
script:
# Log in to the "output" registry, tag the image and push the image
- 'echo "Logging in to CI registry ${CI_REGISTRY}"'
- regctl registry login "${CI_REGISTRY}" -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}"
- '[ ${CI_REGISTRY} = ${OUT_REGISTRY} ] || echo "Logging in to output registry ${OUT_REGISTRY}"'
- '[ ${CI_REGISTRY} = ${OUT_REGISTRY} ] || regctl registry login "${OUT_REGISTRY}" -u "${OUT_REGISTRY_USER}" -p "${OUT_REGISTRY_TOKEN}"'
# Since OUT_IMAGE_NAME and OUT_IMAGE_VERSION are set, this will push the CI image to the
# Target
- make -f build/container/Makefile push-${DIST}
# Define a staging release step that pushes an image to an internal "staging" repository
# This is triggered for all pipelines (i.e. not only tags) to test the pipeline steps
# outside of the release process.
.release:staging:
extends:
- .release
variables:
OUT_REGISTRY_USER: "${CI_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${CI_REGISTRY_PASSWORD}"
OUT_REGISTRY: "${CI_REGISTRY}"
OUT_IMAGE_NAME: "${CI_REGISTRY_IMAGE}/staging/container-toolkit"
# Define an external release step that pushes an image to an external repository.
# This includes a devlopment image off main.
.release:external:
extends:
- .release
rules:
- if: $CI_COMMIT_TAG
variables:
OUT_IMAGE_VERSION: "${CI_COMMIT_TAG}"
- if: $CI_COMMIT_BRANCH == $RELEASE_DEVEL_BRANCH
variables:
OUT_IMAGE_VERSION: "${DEVEL_RELEASE_IMAGE_VERSION}"
# Define the release jobs
release:staging-centos7:
extends:
- .release:staging
- .dist-centos7
needs:
- image-centos7
release:staging-ubi8:
extends:
- .release:staging
- .dist-ubi8
needs:
- image-ubi8
release:staging-ubuntu20.04:
extends:
- .release:staging
- .dist-ubuntu20.04
needs:
- test-toolkit-ubuntu20.04
- test-containerd-ubuntu20.04
- test-crio-ubuntu20.04
- test-docker-ubuntu20.04
release:staging-packaging:
extends:
- .release:staging
- .dist-packaging
needs:
- test-packaging

View File

@@ -1,2 +1,2 @@
.git
dist
/shared-*

10
.gitignore vendored
View File

@@ -1,3 +1,13 @@
dist
artifacts
*.swp
*.swo
/coverage.out*
/test/output/
/nvidia-container-runtime
/nvidia-container-runtime.*
/nvidia-container-runtime-hook
/nvidia-container-toolkit
/nvidia-ctk
/shared-*
/release-*

View File

@@ -1,161 +1,314 @@
# Build packages for all supported OS / ARCH combinations
# Copyright (c) 2019-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
stages:
- tests
- build-one
- build-all
include:
- .common-ci.yml
.tests-setup: &tests-setup
image: golang:1.14.4
rules:
- when: always
variables:
GITHUB_ROOT: "github.com/NVIDIA"
PROJECT_GOPATH: "${GITHUB_ROOT}/nvidia-container-toolkit"
before_script:
- mkdir -p ${GOPATH}/src/${GITHUB_ROOT}
- ln -s ${CI_PROJECT_DIR} ${GOPATH}/src/${PROJECT_GOPATH}
.build-setup: &build-setup
image: docker:19.03.8
services:
- name: docker:19.03.8-dind
command: ["--experimental"]
before_script:
- apk update
- apk upgrade
- apk add coreutils build-base sed git bash make
- docker run --rm --privileged multiarch/qemu-user-static --reset -p yes -c yes
# Run a series of sanity-check tests over the code
lint:
<<: *tests-setup
stage: tests
build-dev-image:
stage: image
script:
- go get -u golang.org/x/lint/golint
- golint -set_exit_status ${PROJECT_GOPATH}/pkg
- apk --no-cache add make bash
- make .build-image
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
- make .push-build-image
vet:
<<: *tests-setup
stage: tests
script:
- go vet ${PROJECT_GOPATH}/pkg
.requires-build-image:
image: "${BUILDIMAGE}"
unit_test:
<<: *tests-setup
stage: tests
script:
- go test ${PROJECT_GOPATH}/pkg
.go-check:
extends:
- .requires-build-image
stage: go-checks
fmt:
<<: *tests-setup
stage: tests
extends:
- .go-check
script:
- res=$(gofmt -l pkg/*.go)
- echo "$res"
- test -z "$res"
- make assert-fmt
vet:
extends:
- .go-check
script:
- make vet
lint:
extends:
- .go-check
script:
- make lint
allow_failure: true
ineffassign:
<<: *tests-setup
stage: tests
extends:
- .go-check
script:
- go get -u github.com/gordonklaus/ineffassign
- ineffassign pkg/*.go
- make ineffassign
allow_failure: true
misspell:
<<: *tests-setup
stage: tests
extends:
- .go-check
script:
- go get -u github.com/client9/misspell/cmd/misspell
- misspell pkg/*.go
- make misspell
# build-one jobs build packages for a single OS / ARCH combination.
#
# They are run during the first stage of the pipeline as a smoke test to ensure
# that we can successfully build packages on all of our architectures for a
# single OS. They are triggered on any change to an MR. No artifacts are
# produced as part of build-one jobs.
.build-one-setup: &build-one-setup
<<: *build-setup
stage: build-one
only:
- merge_requests
go-build:
extends:
- .requires-build-image
stage: go-build
script:
- make build
# build-all jobs build packages for every OS / ARCH combination we support.
#
# They are run under two conditions:
# 1) Automatically whenever a new tag is pushed to the repo (e.g. v1.1.0)
# 2) Manually by a reviewer just before merging a MR.
#
# Unlike build-one jobs, it takes a long time to build the full suite
# OS / ARCH combinations, so this is optimized to only run once per MR
# (assuming it all passes). A full set of artifacts including the packages
# built for each OS / ARCH are produced as a result of these jobs.
.build-all-setup: &build-all-setup
<<: *build-setup
stage: build-all
timeout: 2h 30m
rules:
- if: $CI_COMMIT_TAG
when: always
- if: $CI_MERGE_REQUEST_ID
when: always
unit-tests:
extends:
- .requires-build-image
stage: unit-tests
script:
- make coverage
# Define the package build helpers
.multi-arch-build:
before_script:
- apk add --no-cache coreutils build-base sed git bash make
- '[[ -n "${SKIP_QEMU_SETUP}" ]] || docker run --rm --privileged multiarch/qemu-user-static --reset -p yes -c yes'
.package-artifacts:
variables:
ARTIFACTS_NAME: "${CI_PROJECT_NAME}-${CI_COMMIT_REF_SLUG}-${CI_JOB_NAME}-artifacts-${CI_PIPELINE_ID}"
ARTIFACTS_DIR: "${CI_PROJECT_NAME}-${CI_COMMIT_REF_SLUG}-artifacts-${CI_PIPELINE_ID}"
DIST_DIR: "${CI_PROJECT_DIR}/${ARTIFACTS_DIR}"
ARTIFACTS_NAME: "toolkit-container-${CI_PIPELINE_ID}"
ARTIFACTS_ROOT: "toolkit-container-${CI_PIPELINE_ID}"
DIST_DIR: ${CI_PROJECT_DIR}/${ARTIFACTS_ROOT}
.package-build:
extends:
- .multi-arch-build
- .package-artifacts
stage: package-build
timeout: 3h
script:
- ./scripts/build-packages.sh ${DIST}-${ARCH}
artifacts:
name: ${ARTIFACTS_NAME}
paths:
- ${ARTIFACTS_DIR}
- ${ARTIFACTS_ROOT}
# The full set of build-one jobs organizes to build
# ubuntu18.04 in parallel on each of our supported ARCHs.
build-one-amd64:
<<: *build-one-setup
script:
- make ubuntu18.04-amd64
# Define the package build targets
package-amazonlinux2-aarch64:
extends:
- .package-build
- .dist-amazonlinux2
- .arch-aarch64
build-one-ppc64le:
<<: *build-one-setup
script:
- make ubuntu18.04-ppc64le
package-amazonlinux2-x86_64:
extends:
- .package-build
- .dist-amazonlinux2
- .arch-x86_64
build-one-arm64:
<<: *build-one-setup
script:
- make ubuntu18.04-arm64
package-centos7-x86_64:
extends:
- .package-build
- .dist-centos7
- .arch-x86_64
# The full set of build-all jobs organized to
# have builds for each ARCH run in parallel.
build-all-amd64:
<<: *build-all-setup
script:
- make docker-amd64
package-centos8-aarch64:
extends:
- .package-build
- .dist-centos8
- .arch-aarch64
build-all-x86_64:
<<: *build-all-setup
script:
- make docker-x86_64
package-centos8-ppc64le:
extends:
- .package-build
- .dist-centos8
- .arch-ppc64le
build-all-ppc64le:
<<: *build-all-setup
script:
- make docker-ppc64le
package-centos8-x86_64:
extends:
- .package-build
- .dist-centos8
- .arch-x86_64
build-all-arm64:
<<: *build-all-setup
script:
- make docker-arm64
package-debian10-amd64:
extends:
- .package-build
- .dist-debian10
- .arch-amd64
build-all-aarch64:
<<: *build-all-setup
package-opensuse-leap15.1-x86_64:
extends:
- .package-build
- .dist-opensuse-leap15.1
- .arch-x86_64
package-ubuntu18.04-amd64:
extends:
- .package-build
- .dist-ubuntu18.04
- .arch-amd64
package-ubuntu18.04-arm64:
extends:
- .package-build
- .dist-ubuntu18.04
- .arch-arm64
package-ubuntu18.04-ppc64le:
extends:
- .package-build
- .dist-ubuntu18.04
- .arch-ppc64le
.buildx-setup:
before_script:
- export BUILDX_VERSION=v0.6.3
- apk add --no-cache curl
- mkdir -p ~/.docker/cli-plugins
- curl -sSLo ~/.docker/cli-plugins/docker-buildx "https://github.com/docker/buildx/releases/download/${BUILDX_VERSION}/buildx-${BUILDX_VERSION}.linux-amd64"
- chmod a+x ~/.docker/cli-plugins/docker-buildx
- docker buildx create --use --platform=linux/amd64,linux/arm64
- '[[ -n "${SKIP_QEMU_SETUP}" ]] || docker run --rm --privileged multiarch/qemu-user-static --reset -p yes'
# Define the image build targets
.image-build:
stage: image-build
variables:
IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
VERSION: "${CI_COMMIT_SHORT_SHA}"
PUSH_ON_BUILD: "true"
before_script:
- !reference [.buildx-setup, before_script]
- apk add --no-cache bash make git
- 'echo "Logging in to CI registry ${CI_REGISTRY}"'
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
script:
- make docker-aarch64
- make -f build/container/Makefile build-${DIST}
image-centos7:
extends:
- .image-build
- .package-artifacts
- .dist-centos7
needs:
- package-centos7-x86_64
image-ubi8:
extends:
- .image-build
- .package-artifacts
- .dist-ubi8
needs:
# Note: The ubi8 image uses the centos8 packages
- package-centos8-aarch64
- package-centos8-x86_64
- package-centos8-ppc64le
image-ubuntu20.04:
extends:
- .image-build
- .package-artifacts
- .dist-ubuntu20.04
needs:
- package-ubuntu18.04-amd64
- package-ubuntu18.04-arm64
- job: package-ubuntu18.04-ppc64le
optional: true
# The DIST=packaging target creates an image containing all built packages
image-packaging:
extends:
- .image-build
- .package-artifacts
- .dist-packaging
needs:
- job: package-centos8-aarch64
- job: package-centos8-x86_64
- job: package-ubuntu18.04-amd64
- job: package-ubuntu18.04-arm64
- job: package-amazonlinux2-aarch64
optional: true
- job: package-amazonlinux2-x86_64
optional: true
- job: package-centos7-x86_64
optional: true
- job: package-centos8-ppc64le
optional: true
- job: package-debian10-amd64
optional: true
- job: package-opensuse-leap15.1-x86_64
optional: true
- job: package-ubuntu18.04-ppc64le
optional: true
# Define publish test helpers
.test:toolkit:
extends:
- .integration
variables:
TEST_CASES: "toolkit"
.test:docker:
extends:
- .integration
variables:
TEST_CASES: "docker"
.test:containerd:
# TODO: The containerd tests fail due to issues with SIGHUP.
# Until this is resolved with retry up to twice and allow failure here.
retry: 2
allow_failure: true
extends:
- .integration
variables:
TEST_CASES: "containerd"
.test:crio:
extends:
- .integration
variables:
TEST_CASES: "crio"
# Define the test targets
test-toolkit-ubuntu20.04:
extends:
- .test:toolkit
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04
test-containerd-ubuntu20.04:
extends:
- .test:containerd
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04
test-crio-ubuntu20.04:
extends:
- .test:crio
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04
test-docker-ubuntu20.04:
extends:
- .test:docker
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04

12
.gitmodules vendored Normal file
View File

@@ -0,0 +1,12 @@
[submodule "third_party/libnvidia-container"]
path = third_party/libnvidia-container
url = https://gitlab.com/nvidia/container-toolkit/libnvidia-container.git
branch = main
[submodule "third_party/nvidia-container-runtime"]
path = third_party/nvidia-container-runtime
url = https://gitlab.com/nvidia/container-toolkit/container-runtime.git
branch = main
[submodule "third_party/nvidia-docker"]
path = third_party/nvidia-docker
url = https://gitlab.com/nvidia/container-toolkit/nvidia-docker.git
branch = main

241
.nvidia-ci.yml Normal file
View File

@@ -0,0 +1,241 @@
# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
include:
- local: '.common-ci.yml'
default:
tags:
- cnt
- container-dev
- docker/multi-arch
- docker/privileged
- os/linux
- type/docker
variables:
DOCKER_DRIVER: overlay2
DOCKER_TLS_CERTDIR: "/certs"
# Release "devel"-tagged images off the main branch
RELEASE_DEVEL_BRANCH: "main"
DEVEL_RELEASE_IMAGE_VERSION: "devel"
# On the multi-arch builder we don't need the qemu setup.
SKIP_QEMU_SETUP: "1"
# Define the public staging registry
STAGING_REGISTRY: registry.gitlab.com/nvidia/container-toolkit/container-toolkit/staging
STAGING_VERSION: ${CI_COMMIT_SHORT_SHA}
ARTIFACTORY_REPO_BASE: "https://urm.nvidia.com/artifactory/sw-gpu-cloudnative"
KITMAKER_RELEASE_FOLDER: "kitmaker"
.image-pull:
stage: image-build
variables:
IN_REGISTRY: "${STAGING_REGISTRY}"
IN_IMAGE_NAME: container-toolkit
IN_VERSION: "${STAGING_VERSION}"
OUT_REGISTRY_USER: "${CI_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${CI_REGISTRY_PASSWORD}"
OUT_REGISTRY: "${CI_REGISTRY}"
OUT_IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
PUSH_MULTIPLE_TAGS: "false"
# We delay the job start to allow the public pipeline to generate the required images.
rules:
- when: delayed
start_in: 30 minutes
timeout: 30 minutes
retry:
max: 2
when:
- job_execution_timeout
- stuck_or_timeout_failure
before_script:
- !reference [.regctl-setup, before_script]
- apk add --no-cache make bash
- >
regctl manifest get ${IN_REGISTRY}/${IN_IMAGE_NAME}:${IN_VERSION}-${DIST} --list > /dev/null && echo "${IN_REGISTRY}/${IN_IMAGE_NAME}:${IN_VERSION}-${DIST}" || ( echo "${IN_REGISTRY}/${IN_IMAGE_NAME}:${IN_VERSION}-${DIST} does not exist" && sleep infinity )
script:
- regctl registry login "${OUT_REGISTRY}" -u "${OUT_REGISTRY_USER}" -p "${OUT_REGISTRY_TOKEN}"
- make -f build/container/Makefile IMAGE=${IN_REGISTRY}/${IN_IMAGE_NAME}:${IN_VERSION}-${DIST} OUT_IMAGE=${OUT_IMAGE_NAME}:${CI_COMMIT_SHORT_SHA}-${DIST} push-${DIST}
image-centos7:
extends:
- .dist-centos7
- .image-pull
image-ubi8:
extends:
- .dist-ubi8
- .image-pull
image-ubuntu20.04:
extends:
- .dist-ubuntu20.04
- .image-pull
# The DIST=packaging target creates an image containing all built packages
image-packaging:
extends:
- .dist-packaging
- .image-pull
# We skip the integration tests for the internal CI:
.integration:
stage: test
before_script:
- echo "Skipped in internal CI"
script:
- echo "Skipped in internal CI"
# The .scan step forms the base of the image scan operation performed before releasing
# images.
.scan:
stage: scan
image: "${PULSE_IMAGE}"
variables:
IMAGE: "${CI_REGISTRY_IMAGE}/container-toolkit:${CI_COMMIT_SHORT_SHA}-${DIST}"
IMAGE_ARCHIVE: "container-toolkit-${DIST}-${ARCH}-${CI_JOB_ID}.tar"
rules:
- if: $SKIP_SCANS != "yes"
- when: manual
before_script:
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
# TODO: We should specify the architecture here and scan all architectures
- docker pull --platform="${PLATFORM}" "${IMAGE}"
- docker save "${IMAGE}" -o "${IMAGE_ARCHIVE}"
- AuthHeader=$(echo -n $SSA_CLIENT_ID:$SSA_CLIENT_SECRET | base64 -w0)
- >
export SSA_TOKEN=$(curl --request POST --header "Authorization: Basic $AuthHeader" --header "Content-Type: application/x-www-form-urlencoded" ${SSA_ISSUER_URL} | jq ".access_token" | tr -d '"')
- if [ -z "$SSA_TOKEN" ]; then exit 1; else echo "SSA_TOKEN set!"; fi
script:
- pulse-cli -n $NSPECT_ID --ssa $SSA_TOKEN scan -i $IMAGE_ARCHIVE -p $CONTAINER_POLICY -o
artifacts:
when: always
expire_in: 1 week
paths:
- pulse-cli.log
- licenses.json
- sbom.json
- vulns.json
- policy_evaluation.json
# Define the scan targets
scan-centos7-amd64:
extends:
- .dist-centos7
- .platform-amd64
- .scan
needs:
- image-centos7
scan-centos7-arm64:
extends:
- .dist-centos7
- .platform-arm64
- .scan
needs:
- image-centos7
- scan-centos7-amd64
scan-ubuntu20.04-amd64:
extends:
- .dist-ubuntu20.04
- .platform-amd64
- .scan
needs:
- image-ubuntu20.04
scan-ubuntu20.04-arm64:
extends:
- .dist-ubuntu20.04
- .platform-arm64
- .scan
needs:
- image-ubuntu20.04
- scan-ubuntu20.04-amd64
scan-ubi8-amd64:
extends:
- .dist-ubi8
- .platform-amd64
- .scan
needs:
- image-ubi8
scan-ubi8-arm64:
extends:
- .dist-ubi8
- .platform-arm64
- .scan
needs:
- image-ubi8
- scan-ubi8-amd64
# Define external release helpers
.release:ngc:
extends:
- .release:external
variables:
OUT_REGISTRY_USER: "${NGC_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${NGC_REGISTRY_TOKEN}"
OUT_REGISTRY: "${NGC_REGISTRY}"
OUT_IMAGE_NAME: "${NGC_REGISTRY_IMAGE}"
.release:packages:
stage: release
needs:
- image-packaging
variables:
VERSION: "${CI_COMMIT_SHORT_SHA}"
PACKAGE_REGISTRY: "${CI_REGISTRY}"
PACKAGE_REGISTRY_USER: "${CI_REGISTRY_USER}"
PACKAGE_REGISTRY_TOKEN: "${CI_REGISTRY_PASSWORD}"
PACKAGE_IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
PACKAGE_IMAGE_TAG: "${CI_COMMIT_SHORT_SHA}-packaging"
KITMAKER_ARTIFACTORY_REPO: "${ARTIFACTORY_REPO_BASE}-generic-local/${KITMAKER_RELEASE_FOLDER}"
script:
- !reference [.regctl-setup, before_script]
- apk add --no-cache bash git
- regctl registry login "${PACKAGE_REGISTRY}" -u "${PACKAGE_REGISTRY_USER}" -p "${PACKAGE_REGISTRY_TOKEN}"
- ./scripts/extract-packages.sh "${PACKAGE_IMAGE_NAME}:${PACKAGE_IMAGE_TAG}"
# TODO: ./scripts/release-packages-artifactory.sh "${DIST}-${ARCH}" "${PACKAGE_ARTIFACTORY_REPO}"
- ./scripts/release-kitmaker-artifactory.sh "${KITMAKER_ARTIFACTORY_REPO}"
# Define the package release targets
release:packages:kitmaker:
extends:
- .release:packages
release:staging-ubuntu20.04:
extends:
- .release:staging
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04
# Define the external release targets
# Release to NGC
release:ngc-centos7:
extends:
- .dist-centos7
- .release:ngc
release:ngc-ubuntu20.04:
extends:
- .dist-ubuntu20.04
- .release:ngc
release:ngc-ubi8:
extends:
- .dist-ubi8
- .release:ngc

278
CHANGELOG.md Normal file
View File

@@ -0,0 +1,278 @@
# NVIDIA Container Toolkit Changelog
## v1.13.0-rc.2
* Don't fail chmod hook if paths are not injected
* Only create `by-path` symlinks if CDI devices are actually requested.
* Fix possible blank `nvidia-ctk` path in generated CDI specifications
* Fix error in postun scriplet on RPM-based systems
* Only check `NVIDIA_VISIBLE_DEVICES` for environment variables if no annotations are specified.
* Add `cdi.default-kind` config option for constructing fully-qualified CDI device names in CDI mode
* Add support for `accept-nvidia-visible-devices-envvar-unprivileged` config setting in CDI mode
* Add `nvidia-container-runtime-hook.skip-mode-detection` config option to bypass mode detection. This allows `legacy` and `cdi` mode, for example, to be used at the same time.
* Add support for generating CDI specifications for GDS and MOFED devices
* Ensure CDI specification is validated on save when generating a spec
* Rename `--discovery-mode` argument to `--mode` for `nvidia-ctk cdi generate`
* [libnvidia-container] Fix segfault on WSL2 systems
* [toolkit-container] Add `--cdi-enabled` flag to toolkit config
* [toolkit-container] Install `nvidia-ctk` from toolkit container
* [toolkit-container] Use installed `nvidia-ctk` path in NVIDIA Container Toolkit config
* [toolkit-container] Bump CUDA base images to 12.1.0
* [toolkit-container] Set `nvidia-ctk` path in the
* [toolkit-container] Add `cdi.k8s.io/*` to set of allowed annotations in containerd config
* [toolkit-container] Generate CDI specification for use in management containers
* [toolkit-container] Install experimental runtime as `nvidia-container-runtime.experimental` instead of `nvidia-container-runtime-experimental`
* [toolkit-container] Install and configure mode-specific runtimes for `cdi` and `legacy` modes
## v1.13.0-rc.1
* Include MIG-enabled devices as GPUs when generating CDI specification
* Fix missing NVML symbols when running `nvidia-ctk` on some platforms [#49]
* Add CDI spec generation for WSL2-based systems to `nvidia-ctk cdi generate` command
* Add `auto` mode to `nvidia-ctk cdi generate` command to automatically detect a WSL2-based system over a standard NVML-based system.
* Add mode-specific (`.cdi` and `.legacy`) NVIDIA Container Runtime binaries for use in the GPU Operator
* Discover all `gsb*.bin` GSP firmware files when generating CDI specification.
* Align `.deb` and `.rpm` release candidate package versions
* Remove `fedora35` packaging targets
* [libnvidia-container] Include all `gsp*.bin` firmware files if present
* [libnvidia-container] Align `.deb` and `.rpm` release candidate package versions
* [libnvidia-container] Remove `fedora35` packaging targets
* [toolkit-container] Install `nvidia-container-toolkit-operator-extensions` package for mode-specific executables.
* [toolkit-container] Allow `nvidia-container-runtime.mode` to be set when configuring the NVIDIA Container Toolkit
## v1.12.0
* Promote `v1.12.0-rc.5` to `v1.12.0`
* Rename `nvidia cdi generate` `--root` flag to `--driver-root` to better indicate intent
* [libnvidia-container] Add nvcubins.bin to DriverStore components under WSL2
* [toolkit-container] Bump CUDA base images to 12.0.1
## v1.12.0-rc.5
* Fix bug here the `nvidia-ctk` path was not properly resolved. This causes failures to run containers when the runtime is configured in `csv` mode or if the `NVIDIA_DRIVER_CAPABILITIES` includes `graphics` or `display` (e.g. `all`).
## v1.12.0-rc.4
* Generate a minimum CDI spec version for improved compatibility.
* Add `--device-name-strategy` options to the `nvidia-ctk cdi generate` command that can be used to control how device names are constructed.
* Set default for CDI device name generation to `index` to generate device names such as `nvidia.com/gpu=0` or `nvidia.com/gpu=1:0` by default.
## v1.12.0-rc.3
* Don't fail if by-path symlinks for DRM devices do not exist
* Replace the --json flag with a --format [json|yaml] flag for the nvidia-ctk cdi generate command
* Ensure that the CDI output folder is created if required
* When generating a CDI specification use a blank host path for devices to ensure compatibility with the v0.4.0 CDI specification
* Add injection of Wayland JSON files
* Add GSP firmware paths to generated CDI specification
* Add --root flag to nvidia-ctk cdi generate command
## v1.12.0-rc.2
* Inject Direct Rendering Manager (DRM) devices into a container using the NVIDIA Container Runtime
* Improve logging of errors from the NVIDIA Container Runtime
* Improve CDI specification generation to support rootless podman
* Use `nvidia-ctk cdi generate` to generate CDI specifications instead of `nvidia-ctk info generate-cdi`
* [libnvidia-container] Skip creation of existing files when these are already mounted
## v1.12.0-rc.1
* Add support for multiple Docker Swarm resources
* Improve injection of Vulkan configurations and libraries
* Add `nvidia-ctk info generate-cdi` command to generated CDI specification for available devices
* [libnvidia-container] Include NVVM compiler library in compute libs
## v1.11.0
* Promote v1.11.0-rc.3 to v1.11.0
## v1.11.0-rc.3
* Build fedora35 packages
* Introduce an `nvidia-container-toolkit-base` package for better dependency management
* Fix removal of `nvidia-container-runtime-hook` on RPM-based systems
* Inject platform files into container on Tegra-based systems
* [toolkit container] Update CUDA base images to 11.7.1
* [libnvidia-container] Preload libgcc_s.so.1 on arm64 systems
## v1.11.0-rc.2
* Allow `accept-nvidia-visible-devices-*` config options to be set by toolkit container
* [libnvidia-container] Fix bug where LDCache was not updated when the `--no-pivot-root` option was specified
## v1.11.0-rc.1
* Add discovery of GPUDirect Storage (`nvidia-fs*`) devices if the `NVIDIA_GDS` environment variable of the container is set to `enabled`
* Add discovery of MOFED Infiniband devices if the `NVIDIA_MOFED` environment variable of the container is set to `enabled`
* Fix bug in CSV mode where libraries listed as `sym` entries in mount specification are not added to the LDCache.
* Rename `nvidia-container-toolkit` executable to `nvidia-container-runtime-hook` and create `nvidia-container-toolkit` as a symlink to `nvidia-container-runtime-hook` instead.
* Add `nvidia-ctk runtime configure` command to configure the Docker config file (e.g. `/etc/docker/daemon.json`) for use with the NVIDIA Container Runtime.
## v1.10.0
* Promote v1.10.0-rc.3 to v1.10.0
## v1.10.0-rc.3
* Use default config instead of raising an error if config file cannot be found
* Ignore NVIDIA_REQUIRE_JETPACK* environment variables for requirement checks
* Fix bug in detection of Tegra systems where `/sys/devices/soc0/family` is ignored
* Fix bug where links to devices were detected as devices
* [libnvida-container] Fix bug introduced when adding libcudadebugger.so to list of libraries
## v1.10.0-rc.2
* Add support for NVIDIA_REQUIRE_* checks for cuda version and arch to csv mode
* Switch to debug logging to reduce log verbosity
* Support logging to logs requested in command line
* Fix bug when launching containers with relative root path (e.g. using containerd)
* Allow low-level runtime path to be set explicitly as nvidia-container-runtime.runtimes option
* Fix failure to locate low-level runtime if PATH envvar is unset
* Replace experimental option for NVIDIA Container Runtime with nvidia-container-runtime.mode = csv option
* Use csv as default mode on Tegra systems without NVML
* Add --version flag to all CLIs
* [libnvidia-container] Bump libtirpc to 1.3.2
* [libnvidia-container] Fix bug when running host ldconfig using glibc compiled with a non-standard prefix
* [libnvidia-container] Add libcudadebugger.so to list of compute libraries
## v1.10.0-rc.1
* Include nvidia-ctk CLI in installed binaries
* Add experimental option to NVIDIA Container Runtime
## v1.9.0
* [libnvidia-container] Add additional check for Tegra in /sys/.../family file in CLI
* [libnvidia-container] Update jetpack-specific CLI option to only load Base CSV files by default
* [libnvidia-container] Fix bug (from 1.8.0) when mounting GSP firmware into containers without /lib to /usr/lib symlinks
* [libnvidia-container] Update nvml.h to CUDA 11.6.1 nvML_DEV 11.6.55
* [libnvidia-container] Update switch statement to include new brands from latest nvml.h
* [libnvidia-container] Process all --require flags on Jetson platforms
* [libnvidia-container] Fix long-standing issue with running ldconfig on Debian systems
## v1.8.1
* [libnvidia-container] Fix bug in determining cgroup root when running in nested containers
* [libnvidia-container] Fix permission issue when determining cgroup version
## v1.8.0
* Promote 1.8.0-rc.2-1 to 1.8.0
## v1.8.0-rc.2
* Remove support for building amazonlinux1 packages
## v1.8.0-rc.1
* [libnvidia-container] Add support for cgroupv2
* Release toolkit-container images from nvidia-container-toolkit repository
## v1.7.0
* Promote 1.7.0-rc.1-1 to 1.7.0
* Bump Golang version to 1.16.4
## v1.7.0-rc.1
* Specify containerd runtime type as string in config tools to remove dependency on containerd package
* Add supported-driver-capabilities config option to allow for a subset of all driver capabilities to be specified
## v1.6.0
* Promote 1.6.0-rc.3-1 to 1.6.0
* Fix unnecessary logging to stderr instead of configured nvidia-container-runtime log file
## v1.6.0-rc.3
* Add supported-driver-capabilities config option to the nvidia-container-toolkit
* Move OCI and command line checks for runtime to internal oci package
## v1.6.0-rc.2
* Use relative path to OCI specification file (config.json) if bundle path is not specified as an argument to the nvidia-container-runtime
## v1.6.0-rc.1
* Add AARCH64 package for Amazon Linux 2
* Include nvidia-container-runtime into nvidia-container-toolkit package
## v1.5.1
* Fix bug where Docker Swarm device selection is ignored if NVIDIA_VISIBLE_DEVICES is also set
* Improve unit testing by using require package and adding coverage reports
* Remove unneeded go dependencies by running go mod tidy
* Move contents of pkg directory to cmd for CLI tools
* Ensure make binary target explicitly sets GOOS
## v1.5.0
* Add dependence on libnvidia-container-tools >= 1.4.0
* Add golang check targets to Makefile
* Add Jenkinsfile definition for build targets
* Move docker.mk to docker folder
## v1.4.2
* Add dependence on libnvidia-container-tools >= 1.3.3
## v1.4.1
* Ignore NVIDIA_VISIBLE_DEVICES for containers with insufficent privileges
* Add dependence on libnvidia-container-tools >= 1.3.2
## v1.4.0
* Add 'compute' capability to list of defaults
* Add dependence on libnvidia-container-tools >= 1.3.1
## v1.3.0
* Promote 1.3.0-rc.2-1 to 1.3.0
* Add dependence on libnvidia-container-tools >= 1.3.0
## v1.3.0-rc.2
* 2c180947 Add more tests for new semantics with device list from volume mounts
* 7c003857 Refactor accepting device lists from volume mounts as a boolean
## v1.3.0-rc.1
* b50d86c1 Update build system to accept a TAG variable for things like rc.x
* fe65573b Add common CI tests for things like golint, gofmt, unit tests, etc.
* da6fbb34 Revert "Add ability to merge envars of the form NVIDIA_VISIBLE_DEVICES_*"
* a7fb3330 Flip build-all targets to run automatically on merge requests
* 8b248b66 Rename github.com/NVIDIA/container-toolkit to nvidia-container-toolkit
* da36874e Add new config options to pull device list from mounted files instead of ENVVAR
## v1.2.1
* 4e6e0ed4 Add 'ngx' to list of*all* driver capabilities
* 2f4af743 List config.toml as a config file in the RPM SPEC
## v1.2.0
* 8e0aab46 Fix repo listed in changelog for debian distributions
* 320bb6e4 Update dependence on libnvidia-container to 1.2.0
* 6cfc8097 Update package license to match source license
* e7dc3cbb Fix debian copyright file
* d3aee3e0 Add the 'ngx' driver capability
## v1.1.2
* c32237f3 Add support for parsing Linux Capabilities for older OCI specs
## v1.1.1
* d202aded Update dependence to libnvidia-container 1.1.1
## v1.1.0
* 4e4de762 Update build system to support multi-arch builds
* fcc1d116 Add support for MIG (Multi-Instance GPUs)
* d4ff0416 Add ability to merge envars of the form NVIDIA_VISIBLE_DEVICES_*
* 60f165ad Add no-pivot option to toolkit
## v1.0.5
* Initial release. Replaces older package nvidia-container-runtime-hook. (Closes: #XXXXXX)

61
DEVELOPMENT.md Normal file
View File

@@ -0,0 +1,61 @@
# NVIDIA Container Toolkit Release Tooling
This repository allows for the components of the NVIDIA container stack to be
built and released as the NVIDIA Container Toolkit from a single repository. The components:
* `libnvidia-container`
* `nvidia-container-runtime`
* `nvidia-docker`
are included as submodules in the `third_party` folder.
The `nvidia-container-toolkit` resides in this repo directly.
## Building
In oder to build the packages, the following command is executed
```sh
./scripts/build-packages.sh TARGET
```
where `TARGET` is a make target that is valid for each of the sub-components.
These include:
* `ubuntu18.04-amd64`
* `centos8-x86_64`
If no `TARGET` is specified, all valid release targets are built.
The packages are generated in the `dist` folder.
## Testing local changes
In oder to use the same build logic to be used to generate packages with local changes,
the location of the individual components can be overridded using the: `LIBNVIDIA_CONTAINER_ROOT`,
`NVIDIA_CONTAINER_TOOLKIT_ROOT`, `NVIDIA_CONTAINER_RUNTIME_ROOT`, and `NVIDIA_DOCKER_ROOT`
environment variables.
## Testing packages locally
The [test/release](./test/release/) folder contains documentation on how the installation of local or staged packages can be tested.
## Releasing
In order to release packages required for a release, a utility script
[`scripts/release-packages.sh`](./scripts/release-packages.sh) is provided.
This script can be executed as follows:
```bash
GPG_LOCAL_USER="GPG_USER" \
MASTER_KEY_PATH=/path/to/gpg-master.key \
SUB_KEY_PATH=/path/to/gpg-subkey.key \
./scripts/release-packages.sh REPO PACKAGE_REPO_ROOT [REFERENCE]
```
Where `REPO` is one of `stable` or `experimental`, `PACKAGE_REPO_ROOT` is the local path to the `libnvidia-container` repository checked out to the `gh-pages` branch, and `REFERENCE` is the git SHA that is to be released. If reference is not specified `HEAD` is assumed.
This scripts performs the following basic functions:
* Pulls the package image defined by the `REFERENCE` git SHA from the staging registry,
* Copies the required packages to the package repository at `PACKAGE_REPO_ROOT/REPO`,
* Signs the packages using the specified GPG keys
While the last two are performed, commits are added to the package repository. These can be pushed to the relevant repository.

142
Jenkinsfile vendored Normal file
View File

@@ -0,0 +1,142 @@
/*
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
podTemplate (cloud:'sw-gpu-cloudnative',
containers: [
containerTemplate(name: 'docker', image: 'docker:dind', ttyEnabled: true, privileged: true),
containerTemplate(name: 'golang', image: 'golang:1.16.3', ttyEnabled: true)
]) {
node(POD_LABEL) {
def scmInfo
stage('checkout') {
scmInfo = checkout(scm)
}
stage('dependencies') {
container('golang') {
sh 'GO111MODULE=off go get -u github.com/client9/misspell/cmd/misspell'
sh 'GO111MODULE=off go get -u github.com/gordonklaus/ineffassign'
sh 'GO111MODULE=off go get -u golang.org/x/lint/golint'
}
container('docker') {
sh 'apk add --no-cache make bash git'
}
}
stage('check') {
parallel (
getGolangStages(["assert-fmt", "lint", "vet", "ineffassign", "misspell"])
)
}
stage('test') {
parallel (
getGolangStages(["test"])
)
}
def versionInfo
stage('version') {
container('docker') {
versionInfo = getVersionInfo(scmInfo)
println "versionInfo=${versionInfo}"
}
}
def dist = 'ubuntu20.04'
def arch = 'amd64'
def stageLabel = "${dist}-${arch}"
stage('build-one') {
container('docker') {
stage (stageLabel) {
sh "make ${dist}-${arch}"
}
}
}
stage('release') {
container('docker') {
stage (stageLabel) {
def component = 'main'
def repository = 'sw-gpu-cloudnative-debian-local/pool/main/'
def uploadSpec = """{
"files":
[ {
"pattern": "./dist/${dist}/${arch}/*.deb",
"target": "${repository}",
"props": "deb.distribution=${dist};deb.component=${component};deb.architecture=${arch}"
}
]
}"""
sh "echo starting release with versionInfo=${versionInfo}"
if (versionInfo.isTag) {
// upload to artifactory repository
def server = Artifactory.server 'sw-gpu-artifactory'
server.upload spec: uploadSpec
} else {
sh "echo skipping release for non-tagged build"
}
}
}
}
}
}
def getGolangStages(def targets) {
stages = [:]
for (t in targets) {
stages[t] = getLintClosure(t)
}
return stages
}
def getLintClosure(def target) {
return {
container('golang') {
stage(target) {
sh "make ${target}"
}
}
}
}
// getVersionInfo returns a hash of version info
def getVersionInfo(def scmInfo) {
def versionInfo = [
isTag: isTag(scmInfo.GIT_BRANCH)
]
scmInfo.each { k, v -> versionInfo[k] = v }
return versionInfo
}
def isTag(def branch) {
if (!branch.startsWith('v')) {
return false
}
def version = shOutput('git describe --all --exact-match --always')
return version == "tags/${branch}"
}
def shOuptut(def script) {
return sh(script: script, returnStdout: true).trim()
}

159
Makefile
View File

@@ -1,19 +1,160 @@
# Copyright (c) 2017-2020, NVIDIA CORPORATION. All rights reserved.
# Copyright (c) 2017-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
DOCKER ?= docker
MKDIR ?= mkdir
DIST_DIR ?= $(CURDIR)/dist
LIB_NAME := nvidia-container-toolkit
LIB_VERSION := 1.3.0
LIB_TAG ?=
include $(CURDIR)/versions.mk
GOLANG_VERSION := 1.14.2
GOLANG_PKG_PATH := github.com/NVIDIA/nvidia-container-toolkit/pkg
MODULE := github.com/NVIDIA/nvidia-container-toolkit
# By default run all native docker-based targets
docker-native:
include $(CURDIR)/docker.mk
include $(CURDIR)/docker/docker.mk
binary:
go build -ldflags "-s -w" -o "$(LIB_NAME)" $(GOLANG_PKG_PATH)
ifeq ($(IMAGE_NAME),)
REGISTRY ?= nvidia
IMAGE_NAME = $(REGISTRY)/container-toolkit
endif
BUILDIMAGE_TAG ?= golang$(GOLANG_VERSION)
BUILDIMAGE ?= $(IMAGE_NAME)-build:$(BUILDIMAGE_TAG)
EXAMPLES := $(patsubst ./examples/%/,%,$(sort $(dir $(wildcard ./examples/*/))))
EXAMPLE_TARGETS := $(patsubst %,example-%, $(EXAMPLES))
CMDS := $(patsubst ./cmd/%/,%,$(sort $(dir $(wildcard ./cmd/*/))))
CMD_TARGETS := $(patsubst %,cmd-%, $(CMDS))
CHECK_TARGETS := assert-fmt vet lint ineffassign misspell
MAKE_TARGETS := binaries build check fmt lint-internal test examples cmds coverage generate licenses $(CHECK_TARGETS)
TARGETS := $(MAKE_TARGETS) $(EXAMPLE_TARGETS) $(CMD_TARGETS)
DOCKER_TARGETS := $(patsubst %,docker-%, $(TARGETS))
.PHONY: $(TARGETS) $(DOCKER_TARGETS)
ifeq ($(VERSION),)
CLI_VERSION = $(LIB_VERSION)$(if $(LIB_TAG),-$(LIB_TAG))
else
CLI_VERSION = $(VERSION)
endif
CLI_VERSION_PACKAGE = github.com/NVIDIA/nvidia-container-toolkit/internal/info
GOOS ?= linux
binaries: cmds
ifneq ($(PREFIX),)
cmd-%: COMMAND_BUILD_OPTIONS = -o $(PREFIX)/$(*)
endif
cmds: $(CMD_TARGETS)
$(CMD_TARGETS): cmd-%:
GOOS=$(GOOS) go build -ldflags "-extldflags=-Wl,-z,lazy -s -w -X $(CLI_VERSION_PACKAGE).gitCommit=$(GIT_COMMIT) -X $(CLI_VERSION_PACKAGE).version=$(CLI_VERSION)" $(COMMAND_BUILD_OPTIONS) $(MODULE)/cmd/$(*)
build:
GOOS=$(GOOS) go build ./...
examples: $(EXAMPLE_TARGETS)
$(EXAMPLE_TARGETS): example-%:
GOOS=$(GOOS) go build ./examples/$(*)
all: check test build binary
check: $(CHECK_TARGETS)
# Apply go fmt to the codebase
fmt:
go list -f '{{.Dir}}' $(MODULE)/... \
| xargs gofmt -s -l -w
assert-fmt:
go list -f '{{.Dir}}' $(MODULE)/... \
| xargs gofmt -s -l > fmt.out
@if [ -s fmt.out ]; then \
echo "\nERROR: The following files are not formatted:\n"; \
cat fmt.out; \
rm fmt.out; \
exit 1; \
else \
rm fmt.out; \
fi
ineffassign:
ineffassign $(MODULE)/...
lint:
# We use `go list -f '{{.Dir}}' $(MODULE)/...` to skip the `vendor` folder.
go list -f '{{.Dir}}' $(MODULE)/... | xargs golint -set_exit_status
misspell:
misspell $(MODULE)/...
vet:
go vet $(MODULE)/...
licenses:
go-licenses csv $(MODULE)/...
COVERAGE_FILE := coverage.out
test: build cmds
go test -v -coverprofile=$(COVERAGE_FILE) $(MODULE)/...
coverage: test
cat $(COVERAGE_FILE) | grep -v "_mock.go" > $(COVERAGE_FILE).no-mocks
go tool cover -func=$(COVERAGE_FILE).no-mocks
generate:
go generate $(MODULE)/...
# Generate an image for containerized builds
# Note: This image is local only
.PHONY: .build-image .pull-build-image .push-build-image
.build-image: docker/Dockerfile.devel
if [ x"$(SKIP_IMAGE_BUILD)" = x"" ]; then \
$(DOCKER) build \
--progress=plain \
--build-arg GOLANG_VERSION="$(GOLANG_VERSION)" \
--tag $(BUILDIMAGE) \
-f $(^) \
docker; \
fi
.pull-build-image:
$(DOCKER) pull $(BUILDIMAGE)
.push-build-image:
$(DOCKER) push $(BUILDIMAGE)
$(DOCKER_TARGETS): docker-%: .build-image
@echo "Running 'make $(*)' in docker container $(BUILDIMAGE)"
$(DOCKER) run \
--rm \
-e GOCACHE=/tmp/.cache \
-v $(PWD):$(PWD) \
-w $(PWD) \
--user $$(id -u):$$(id -g) \
$(BUILDIMAGE) \
make $(*)
# Start an interactive shell using the development image.
PHONY: .shell
.shell:
$(DOCKER) run \
--rm \
-ti \
-e GOCACHE=/tmp/.cache \
-v $(PWD):$(PWD) \
-w $(PWD) \
--user $$(id -u):$$(id -g) \
$(BUILDIMAGE)

31
README.md Normal file
View File

@@ -0,0 +1,31 @@
# NVIDIA Container Toolkit
[![GitHub license](https://img.shields.io/github/license/NVIDIA/nvidia-container-toolkit?style=flat-square)](https://raw.githubusercontent.com/NVIDIA/nvidia-container-toolkit/main/LICENSE)
[![Documentation](https://img.shields.io/badge/documentation-wiki-blue.svg?style=flat-square)](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/overview.html)
[![Package repository](https://img.shields.io/badge/packages-repository-b956e8.svg?style=flat-square)](https://nvidia.github.io/libnvidia-container)
![nvidia-container-stack](https://cloud.githubusercontent.com/assets/3028125/12213714/5b208976-b632-11e5-8406-38d379ec46aa.png)
## Introduction
The NVIDIA Container Toolkit allows users to build and run GPU accelerated containers. The toolkit includes a container runtime [library](https://github.com/NVIDIA/libnvidia-container) and utilities to automatically configure containers to leverage NVIDIA GPUs.
Product documentation including an architecture overview, platform support, and installation and usage guides can be found in the [documentation repository](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/overview.html).
## Getting Started
**Make sure you have installed the [NVIDIA driver](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#nvidia-drivers) for your Linux Distribution**
**Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver needs to be installed**
For instructions on getting started with the NVIDIA Container Toolkit, refer to the [installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installation-guide).
## Usage
The [user guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html) provides information on the configuration and command line options available when running GPU containers with Docker.
## Issues and Contributing
[Checkout the Contributing document!](CONTRIBUTING.md)
* Please let us know by [filing a new issue](https://github.com/NVIDIA/nvidia-container-toolkit/issues/new)
* You can contribute by creating a [merge request](https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/merge_requests/new) to our public GitLab repository

View File

@@ -0,0 +1,97 @@
# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASE_DIST
ARG CUDA_VERSION
ARG GOLANG_VERSION=x.x.x
ARG VERSION="N/A"
# NOTE: In cases where the libc version is a concern, we would have to use an
# image based on the target OS to build the golang executables here -- especially
# if cgo code is included.
FROM golang:${GOLANG_VERSION} as build
# We override the GOPATH to ensure that the binaries are installed to
# /artifacts/bin
ARG GOPATH=/artifacts
# Install the experiemental nvidia-container-runtime
# NOTE: This will be integrated into the nvidia-container-toolkit package / repo
ARG NVIDIA_CONTAINER_RUNTIME_EXPERIMENTAL_VERSION=experimental
RUN GOPATH=/artifacts go install github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-container-runtime.experimental@${NVIDIA_CONTAINER_RUNTIME_EXPERIMENTAL_VERSION}
WORKDIR /build
COPY . .
# NOTE: Until the config utilities are properly integrated into the
# nvidia-container-toolkit repository, these are built from the `tools` folder
# and not `cmd`.
RUN GOPATH=/artifacts go install -ldflags="-s -w -X 'main.Version=${VERSION}'" ./tools/...
FROM nvidia/cuda:${CUDA_VERSION}-base-${BASE_DIST}
ARG BASE_DIST
# See https://www.centos.org/centos-linux-eol/
# and https://stackoverflow.com/a/70930049 for move to vault.centos.org
# and https://serverfault.com/questions/1093922/failing-to-run-yum-update-in-centos-8 for move to vault.epel.cloud
RUN [[ "${BASE_DIST}" != "centos8" ]] || \
( \
sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-Linux-* && \
sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.epel.cloud|g' /etc/yum.repos.d/CentOS-Linux-* \
)
ENV NVIDIA_DISABLE_REQUIRE="true"
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=utility
ARG ARTIFACTS_ROOT
ARG PACKAGE_DIST
COPY ${ARTIFACTS_ROOT}/${PACKAGE_DIST} /artifacts/packages/${PACKAGE_DIST}
WORKDIR /artifacts/packages
ARG PACKAGE_VERSION
ARG TARGETARCH
ENV PACKAGE_ARCH ${TARGETARCH}
RUN PACKAGE_ARCH=${PACKAGE_ARCH/amd64/x86_64} && PACKAGE_ARCH=${PACKAGE_ARCH/arm64/aarch64} && \
yum localinstall -y \
${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container1-1.*.rpm \
${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container-tools-1.*.rpm \
${PACKAGE_DIST}/${PACKAGE_ARCH}/nvidia-container-toolkit*-${PACKAGE_VERSION}*.rpm
WORKDIR /work
COPY --from=build /artifacts/bin /work
ENV PATH=/work:$PATH
LABEL io.k8s.display-name="NVIDIA Container Runtime Config"
LABEL name="NVIDIA Container Runtime Config"
LABEL vendor="NVIDIA"
LABEL version="${VERSION}"
LABEL release="N/A"
LABEL summary="Automatically Configure your Container Runtime for GPU support."
LABEL description="See summary"
RUN mkdir /licenses && mv /NGC-DL-CONTAINER-LICENSE /licenses/NGC-DL-CONTAINER-LICENSE
# Install / upgrade packages here that are required to resolve CVEs
ARG CVE_UPDATES
RUN if [ -n "${CVE_UPDATES}" ]; then \
yum update -y ${CVE_UPDATES} && \
rm -rf /var/cache/yum/*; \
fi
ENTRYPOINT ["/work/nvidia-toolkit"]

View File

@@ -0,0 +1,41 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASE_DIST
ARG CUDA_VERSION
ARG GOLANG_VERSION=x.x.x
FROM nvidia/cuda:${CUDA_VERSION}-base-${BASE_DIST}
ARG ARTIFACTS_ROOT
COPY ${ARTIFACTS_ROOT} /artifacts/packages/
WORKDIR /artifacts/packages
# build-args are added to the manifest.txt file below.
ARG BASE_DIST
ARG PACKAGE_DIST
ARG PACKAGE_VERSION
ARG GIT_BRANCH
ARG GIT_COMMIT
ARG GIT_COMMIT_SHORT
ARG SOURCE_DATE_EPOCH
ARG VERSION
# Create a manifest.txt file with the absolute paths of all deb and rpm packages in the container
RUN echo "#IMAGE_EPOCH=$(date '+%s')" > /artifacts/manifest.txt && \
env | sed 's/^/#/g' >> /artifacts/manifest.txt && \
find /artifacts/packages -iname '*.deb' -o -iname '*.rpm' >> /artifacts/manifest.txt
RUN mkdir /licenses && mv /NGC-DL-CONTAINER-LICENSE /licenses/NGC-DL-CONTAINER-LICENSE

View File

@@ -0,0 +1,105 @@
# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASE_DIST
ARG CUDA_VERSION
ARG GOLANG_VERSION=x.x.x
ARG VERSION="N/A"
# NOTE: In cases where the libc version is a concern, we would have to use an
# image based on the target OS to build the golang executables here -- especially
# if cgo code is included.
FROM golang:${GOLANG_VERSION} as build
# We override the GOPATH to ensure that the binaries are installed to
# /artifacts/bin
ARG GOPATH=/artifacts
# Install the experiemental nvidia-container-runtime
# NOTE: This will be integrated into the nvidia-container-toolkit package / repo
ARG NVIDIA_CONTAINER_RUNTIME_EXPERIMENTAL_VERSION=experimental
RUN GOPATH=/artifacts go install github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-container-runtime.experimental@${NVIDIA_CONTAINER_RUNTIME_EXPERIMENTAL_VERSION}
WORKDIR /build
COPY . .
# NOTE: Until the config utilities are properly integrated into the
# nvidia-container-toolkit repository, these are built from the `tools` folder
# and not `cmd`.
RUN GOPATH=/artifacts go install -ldflags="-s -w -X 'main.Version=${VERSION}'" ./tools/...
FROM nvcr.io/nvidia/cuda:${CUDA_VERSION}-base-${BASE_DIST}
# Remove the CUDA repository configurations to avoid issues with rotated GPG keys
RUN rm -f /etc/apt/sources.list.d/cuda.list
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
libcap2 \
curl \
&& \
rm -rf /var/lib/apt/lists/*
ENV NVIDIA_DISABLE_REQUIRE="true"
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=utility
ARG ARTIFACTS_ROOT
ARG PACKAGE_DIST
COPY ${ARTIFACTS_ROOT}/${PACKAGE_DIST} /artifacts/packages/${PACKAGE_DIST}
WORKDIR /artifacts/packages
ARG PACKAGE_VERSION
ARG TARGETARCH
ENV PACKAGE_ARCH ${TARGETARCH}
ARG LIBNVIDIA_CONTAINER_REPO="https://nvidia.github.io/libnvidia-container"
ARG LIBNVIDIA_CONTAINER0_VERSION
RUN if [ "${PACKAGE_ARCH}" = "arm64" ]; then \
curl -L ${LIBNVIDIA_CONTAINER_REPO}/${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container0_${LIBNVIDIA_CONTAINER0_VERSION}_${PACKAGE_ARCH}.deb \
--output ${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container0_${LIBNVIDIA_CONTAINER0_VERSION}_${PACKAGE_ARCH}.deb && \
dpkg -i ${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container0_${LIBNVIDIA_CONTAINER0_VERSION}_${PACKAGE_ARCH}.deb; \
fi
RUN dpkg -i \
${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container1_1.*.deb \
${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container-tools_1.*.deb \
${PACKAGE_DIST}/${PACKAGE_ARCH}/nvidia-container-toolkit*_${PACKAGE_VERSION}*.deb
WORKDIR /work
COPY --from=build /artifacts/bin /work/
ENV PATH=/work:$PATH
LABEL io.k8s.display-name="NVIDIA Container Runtime Config"
LABEL name="NVIDIA Container Runtime Config"
LABEL vendor="NVIDIA"
LABEL version="${VERSION}"
LABEL release="N/A"
LABEL summary="Automatically Configure your Container Runtime for GPU support."
LABEL description="See summary"
RUN mkdir /licenses && mv /NGC-DL-CONTAINER-LICENSE /licenses/NGC-DL-CONTAINER-LICENSE
# Install / upgrade packages here that are required to resolve CVEs
ARG CVE_UPDATES
RUN if [ -n "${CVE_UPDATES}" ]; then \
apt-get update && apt-get upgrade -y ${CVE_UPDATES} && \
rm -rf /var/lib/apt/lists/*; \
fi
ENTRYPOINT ["/work/nvidia-toolkit"]

149
build/container/Makefile Normal file
View File

@@ -0,0 +1,149 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
BUILD_MULTI_ARCH_IMAGES ?= false
DOCKER ?= docker
BUILDX =
ifeq ($(BUILD_MULTI_ARCH_IMAGES),true)
BUILDX = buildx
endif
MKDIR ?= mkdir
DIST_DIR ?= $(CURDIR)/dist
##### Global variables #####
include $(CURDIR)/versions.mk
ifeq ($(IMAGE_NAME),)
REGISTRY ?= nvidia
IMAGE_NAME := $(REGISTRY)/container-toolkit
endif
VERSION ?= $(LIB_VERSION)$(if $(LIB_TAG),-$(LIB_TAG))
IMAGE_VERSION := $(VERSION)
IMAGE_TAG ?= $(VERSION)-$(DIST)
IMAGE = $(IMAGE_NAME):$(IMAGE_TAG)
OUT_IMAGE_NAME ?= $(IMAGE_NAME)
OUT_IMAGE_VERSION ?= $(IMAGE_VERSION)
OUT_IMAGE_TAG = $(OUT_IMAGE_VERSION)-$(DIST)
OUT_IMAGE = $(OUT_IMAGE_NAME):$(OUT_IMAGE_TAG)
##### Public rules #####
DEFAULT_PUSH_TARGET := ubuntu20.04
DISTRIBUTIONS := ubuntu20.04 ubi8 centos7
META_TARGETS := packaging
BUILD_TARGETS := $(patsubst %,build-%,$(DISTRIBUTIONS) $(META_TARGETS))
PUSH_TARGETS := $(patsubst %,push-%,$(DISTRIBUTIONS) $(META_TARGETS))
TEST_TARGETS := $(patsubst %,test-%,$(DISTRIBUTIONS))
.PHONY: $(DISTRIBUTIONS) $(PUSH_TARGETS) $(BUILD_TARGETS) $(TEST_TARGETS)
ifneq ($(BUILD_MULTI_ARCH_IMAGES),true)
include $(CURDIR)/build/container/native-only.mk
else
include $(CURDIR)/build/container/multi-arch.mk
endif
# For the default push target we also push a short tag equal to the version.
# We skip this for the development release
DEVEL_RELEASE_IMAGE_VERSION ?= devel
PUSH_MULTIPLE_TAGS ?= true
ifeq ($(strip $(OUT_IMAGE_VERSION)),$(DEVEL_RELEASE_IMAGE_VERSION))
PUSH_MULTIPLE_TAGS = false
endif
ifeq ($(PUSH_MULTIPLE_TAGS),true)
push-$(DEFAULT_PUSH_TARGET): push-short
endif
push-%: DIST = $(*)
push-short: DIST = $(DEFAULT_PUSH_TARGET)
build-%: DIST = $(*)
build-%: DOCKERFILE = $(CURDIR)/build/container/Dockerfile.$(DOCKERFILE_SUFFIX)
ARTIFACTS_ROOT ?= $(shell realpath --relative-to=$(CURDIR) $(DIST_DIR))
# Use a generic build target to build the relevant images
$(BUILD_TARGETS): build-%: $(ARTIFACTS_ROOT)
DOCKER_BUILDKIT=1 \
$(DOCKER) $(BUILDX) build --pull \
$(DOCKER_BUILD_OPTIONS) \
$(DOCKER_BUILD_PLATFORM_OPTIONS) \
--tag $(IMAGE) \
--build-arg ARTIFACTS_ROOT="$(ARTIFACTS_ROOT)" \
--build-arg BASE_DIST="$(BASE_DIST)" \
--build-arg CUDA_VERSION="$(CUDA_VERSION)" \
--build-arg GOLANG_VERSION="$(GOLANG_VERSION)" \
--build-arg LIBNVIDIA_CONTAINER0_VERSION="$(LIBNVIDIA_CONTAINER0_DEPENDENCY)" \
--build-arg PACKAGE_DIST="$(PACKAGE_DIST)" \
--build-arg PACKAGE_VERSION="$(PACKAGE_VERSION)" \
--build-arg VERSION="$(VERSION)" \
--build-arg GIT_COMMIT="$(GIT_COMMIT)" \
--build-arg GIT_COMMIT_SHORT="$(GIT_COMMIT_SHORT)" \
--build-arg GIT_BRANCH="$(GIT_BRANCH)" \
--build-arg SOURCE_DATE_EPOCH="$(SOURCE_DATE_EPOCH)" \
--build-arg CVE_UPDATES="$(CVE_UPDATES)" \
-f $(DOCKERFILE) \
$(CURDIR)
build-ubuntu%: BASE_DIST = $(*)
build-ubuntu%: DOCKERFILE_SUFFIX := ubuntu
build-ubuntu%: PACKAGE_DIST = ubuntu18.04
build-ubuntu%: LIBNVIDIA_CONTAINER0_DEPENDENCY=$(LIBNVIDIA_CONTAINER0_VERSION)
build-ubi8: BASE_DIST := ubi8
build-ubi8: DOCKERFILE_SUFFIX := centos
build-ubi8: PACKAGE_DIST = centos8
build-centos7: BASE_DIST = $(*)
build-centos7: DOCKERFILE_SUFFIX := centos
build-centos7: PACKAGE_DIST = $(BASE_DIST)
build-packaging: BASE_DIST := ubuntu20.04
build-packaging: DOCKERFILE_SUFFIX := packaging
build-packaging: PACKAGE_ARCH := amd64
build-packaging: PACKAGE_DIST = all
# Test targets
test-%: DIST = $(*)
TEST_CASES ?= toolkit docker crio containerd
$(TEST_TARGETS): test-%:
TEST_CASES="$(TEST_CASES)" bash -x $(CURDIR)/test/container/main.sh run \
$(CURDIR)/shared-$(*) \
$(IMAGE) \
--no-cleanup-on-error
.PHONY: test-packaging
test-packaging: DIST = packaging
test-packaging:
@echo "Testing package image contents"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/amazonlinux2/aarch64" || echo "Missing amazonlinux2/aarch64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/amazonlinux2/x86_64" || echo "Missing amazonlinux2/x86_64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/centos7/ppc64le" || echo "Missing centos7/ppc64le"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/centos7/x86_64" || echo "Missing centos7/x86_64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/centos8/aarch64" || echo "Missing centos8/aarch64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/centos8/ppc64le" || echo "Missing centos8/ppc64le"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/centos8/x86_64" || echo "Missing centos8/x86_64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/debian10/amd64" || echo "Missing debian10/amd64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/opensuse-leap15.1/x86_64" || echo "Missing opensuse-leap15.1/x86_64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/ubuntu18.04/amd64" || echo "Missing ubuntu18.04/amd64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/ubuntu18.04/arm64" || echo "Missing ubuntu18.04/arm64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/ubuntu18.04/ppc64le" || echo "Missing ubuntu18.04/ppc64le"

View File

@@ -0,0 +1,4 @@
# NVIDIA Container Toolkit Container
This folder contains make and docker files for building the NVIDIA Container Toolkit Container.

View File

@@ -0,0 +1,37 @@
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
PUSH_ON_BUILD ?= false
DOCKER_BUILD_OPTIONS = --output=type=image,push=$(PUSH_ON_BUILD)
DOCKER_BUILD_PLATFORM_OPTIONS = --platform=linux/amd64,linux/arm64
REGCTL ?= regctl
$(PUSH_TARGETS): push-%:
$(REGCTL) \
image copy \
$(IMAGE) $(OUT_IMAGE)
push-short:
$(REGCTL) \
image copy \
$(IMAGE) $(OUT_IMAGE_NAME):$(OUT_IMAGE_VERSION)
# We only have x86_64 packages for centos7
build-centos7: DOCKER_BUILD_PLATFORM_OPTIONS = --platform=linux/amd64
# We only generate amd64 image for ubuntu18.04
build-ubuntu18.04: DOCKER_BUILD_PLATFORM_OPTIONS = --platform=linux/amd64
# We only generate a single image for packaging targets
build-packaging: DOCKER_BUILD_PLATFORM_OPTIONS = --platform=linux/amd64

View File

@@ -0,0 +1,23 @@
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
DOCKER_BUILD_PLATFORM_OPTIONS = --platform=linux/amd64
$(PUSH_TARGETS): push-%:
$(DOCKER) tag "$(IMAGE)" "$(OUT_IMAGE)"
$(DOCKER) push "$(OUT_IMAGE)"
push-short:
$(DOCKER) tag "$(IMAGE_NAME):$(VERSION)-$(DEFAULT_PUSH_TARGET)" "$(OUT_IMAGE_NAME):$(OUT_IMAGE_VERSION)"
$(DOCKER) push "$(OUT_IMAGE_NAME):$(OUT_IMAGE_VERSION)"

View File

@@ -0,0 +1,83 @@
package main
import (
"log"
"strings"
)
const (
allDriverCapabilities = DriverCapabilities("compute,compat32,graphics,utility,video,display,ngx")
defaultDriverCapabilities = DriverCapabilities("utility,compute")
none = DriverCapabilities("")
all = DriverCapabilities("all")
)
func capabilityToCLI(cap string) string {
switch cap {
case "compute":
return "--compute"
case "compat32":
return "--compat32"
case "graphics":
return "--graphics"
case "utility":
return "--utility"
case "video":
return "--video"
case "display":
return "--display"
case "ngx":
return "--ngx"
default:
log.Panicln("unknown driver capability:", cap)
}
return ""
}
// DriverCapabilities is used to process the NVIDIA_DRIVER_CAPABILITIES environment
// variable. Operations include default values, filtering, and handling meta values such as "all"
type DriverCapabilities string
// Intersection returns intersection between two sets of capabilities.
func (d DriverCapabilities) Intersection(capabilities DriverCapabilities) DriverCapabilities {
if capabilities == all {
return d
}
if d == all {
return capabilities
}
lookup := make(map[string]bool)
for _, c := range d.list() {
lookup[c] = true
}
var found []string
for _, c := range capabilities.list() {
if lookup[c] {
found = append(found, c)
}
}
intersection := DriverCapabilities(strings.Join(found, ","))
return intersection
}
// String returns the string representation of the driver capabilities
func (d DriverCapabilities) String() string {
return string(d)
}
// list returns the driver capabilities as a list
func (d DriverCapabilities) list() []string {
var caps []string
for _, c := range strings.Split(string(d), ",") {
trimmed := strings.TrimSpace(c)
if len(trimmed) == 0 {
continue
}
caps = append(caps, trimmed)
}
return caps
}

View File

@@ -0,0 +1,134 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package main
import (
"fmt"
"testing"
"github.com/stretchr/testify/require"
)
func TestDriverCapabilitiesIntersection(t *testing.T) {
testCases := []struct {
capabilities DriverCapabilities
supportedCapabilities DriverCapabilities
expectedIntersection DriverCapabilities
}{
{
capabilities: none,
supportedCapabilities: none,
expectedIntersection: none,
},
{
capabilities: all,
supportedCapabilities: none,
expectedIntersection: none,
},
{
capabilities: all,
supportedCapabilities: allDriverCapabilities,
expectedIntersection: allDriverCapabilities,
},
{
capabilities: allDriverCapabilities,
supportedCapabilities: all,
expectedIntersection: allDriverCapabilities,
},
{
capabilities: none,
supportedCapabilities: all,
expectedIntersection: none,
},
{
capabilities: none,
supportedCapabilities: DriverCapabilities("cap1"),
expectedIntersection: none,
},
{
capabilities: DriverCapabilities("cap0,cap1"),
supportedCapabilities: DriverCapabilities("cap1,cap0"),
expectedIntersection: DriverCapabilities("cap0,cap1"),
},
{
capabilities: defaultDriverCapabilities,
supportedCapabilities: allDriverCapabilities,
expectedIntersection: defaultDriverCapabilities,
},
{
capabilities: DriverCapabilities("compute,compat32,graphics,utility,video,display"),
supportedCapabilities: DriverCapabilities("compute,compat32,graphics,utility,video,display,ngx"),
expectedIntersection: DriverCapabilities("compute,compat32,graphics,utility,video,display"),
},
{
capabilities: DriverCapabilities("cap1"),
supportedCapabilities: none,
expectedIntersection: none,
},
{
capabilities: DriverCapabilities("compute,compat32,graphics,utility,video,display,ngx"),
supportedCapabilities: DriverCapabilities("compute,compat32,graphics,utility,video,display"),
expectedIntersection: DriverCapabilities("compute,compat32,graphics,utility,video,display"),
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("test case %d", i), func(t *testing.T) {
intersection := tc.supportedCapabilities.Intersection(tc.capabilities)
require.EqualValues(t, tc.expectedIntersection, intersection)
})
}
}
func TestDriverCapabilitiesList(t *testing.T) {
testCases := []struct {
capabilities DriverCapabilities
expected []string
}{
{
capabilities: DriverCapabilities(""),
},
{
capabilities: DriverCapabilities(" "),
},
{
capabilities: DriverCapabilities(","),
},
{
capabilities: DriverCapabilities(",cap"),
expected: []string{"cap"},
},
{
capabilities: DriverCapabilities("cap,"),
expected: []string{"cap"},
},
{
capabilities: DriverCapabilities("cap0,,cap1"),
expected: []string{"cap0", "cap1"},
},
{
capabilities: DriverCapabilities("cap1,cap0,cap3"),
expected: []string{"cap1", "cap0", "cap3"},
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("test case %d", i), func(t *testing.T) {
require.EqualValues(t, tc.expected, tc.capabilities.list())
})
}
}

View File

@@ -7,14 +7,13 @@ import (
"os"
"path"
"path/filepath"
"strconv"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
"github.com/opencontainers/runtime-spec/specs-go"
"golang.org/x/mod/semver"
)
var envSwarmGPU *string
const (
envCUDAVersion = "CUDA_VERSION"
envNVRequirePrefix = "NVIDIA_REQUIRE_"
@@ -26,11 +25,6 @@ const (
envNVDriverCapabilities = "NVIDIA_DRIVER_CAPABILITIES"
)
const (
allDriverCapabilities = "compute,compat32,graphics,utility,video,display,ngx"
defaultDriverCapabilities = "utility"
)
const (
capSysAdmin = "CAP_SYS_ADMIN"
)
@@ -109,32 +103,6 @@ type HookState struct {
BundlePath string `json:"bundlePath"`
}
func parseCudaVersion(cudaVersion string) (vmaj, vmin, vpatch uint32) {
if _, err := fmt.Sscanf(cudaVersion, "%d.%d.%d\n", &vmaj, &vmin, &vpatch); err != nil {
vpatch = 0
if _, err := fmt.Sscanf(cudaVersion, "%d.%d\n", &vmaj, &vmin); err != nil {
vmin = 0
if _, err := fmt.Sscanf(cudaVersion, "%d\n", &vmaj); err != nil {
log.Panicln("invalid CUDA version:", cudaVersion)
}
}
}
return
}
func getEnvMap(e []string) (m map[string]string) {
m = make(map[string]string)
for _, s := range e {
p := strings.SplitN(s, "=", 2)
if len(p) != 2 {
log.Panicln("environment error")
}
m[p[0]] = p[1]
}
return
}
func loadSpec(path string) (spec *Spec) {
f, err := os.Open(path)
if err != nil {
@@ -163,7 +131,7 @@ func isPrivileged(s *Spec) bool {
}
var caps []string
// If v1.1.0-rc1 <= OCI version < v1.0.0-rc5 parse s.Process.Capabilities as:
// If v1.0.0-rc1 <= OCI version < v1.0.0-rc5 parse s.Process.Capabilities as:
// github.com/opencontainers/runtime-spec/blob/v1.0.0-rc1/specs-go/config.go#L30-L54
rc1cmp := semver.Compare("v"+*s.Version, "v1.0.0-rc1")
rc5cmp := semver.Compare("v"+*s.Version, "v1.0.0-rc5")
@@ -172,72 +140,58 @@ func isPrivileged(s *Spec) bool {
if err != nil {
log.Panicln("could not decode Process.Capabilities in OCI spec:", err)
}
// Otherwise, parse s.Process.Capabilities as:
// github.com/opencontainers/runtime-spec/blob/v1.0.0/specs-go/config.go#L30-L54
for _, c := range caps {
if c == capSysAdmin {
return true
}
}
return false
}
// Otherwise, parse s.Process.Capabilities as:
// github.com/opencontainers/runtime-spec/blob/v1.0.0/specs-go/config.go#L30-L54
process := specs.Process{
Env: s.Process.Env,
}
err := json.Unmarshal(*s.Process.Capabilities, &process.Capabilities)
if err != nil {
log.Panicln("could not decode Process.Capabilities in OCI spec:", err)
}
fullSpec := specs.Spec{
Version: *s.Version,
Process: &process,
}
return image.IsPrivileged(&fullSpec)
}
func getDevicesFromEnvvar(image image.CUDA, swarmResourceEnvvars []string) *string {
// We check if the image has at least one of the Swarm resource envvars defined and use this
// if specified.
var hasSwarmEnvvar bool
for _, envvar := range swarmResourceEnvvars {
if _, exists := image[envvar]; exists {
hasSwarmEnvvar = true
break
}
}
var devices []string
if hasSwarmEnvvar {
devices = image.DevicesFromEnvvars(swarmResourceEnvvars...).List()
} else {
var lc LinuxCapabilities
err := json.Unmarshal(*s.Process.Capabilities, &lc)
if err != nil {
log.Panicln("could not decode Process.Capabilities in OCI spec:", err)
}
// We only make sure that the bounding capabibility set has
// CAP_SYS_ADMIN. This allows us to make sure that the container was
// actually started as '--privileged', but also allow non-root users to
// access the privileged NVIDIA capabilities.
caps = lc.Bounding
devices = image.DevicesFromEnvvars(envNVVisibleDevices).List()
}
for _, c := range caps {
if c == capSysAdmin {
return true
}
}
return false
}
func isLegacyCUDAImage(env map[string]string) bool {
legacyCudaVersion := env[envCUDAVersion]
cudaRequire := env[envNVRequireCUDA]
return len(legacyCudaVersion) > 0 && len(cudaRequire) == 0
}
func getDevicesFromEnvvar(env map[string]string, legacyImage bool) *string {
// Build a list of envvars to consider.
envVars := []string{envNVVisibleDevices}
if envSwarmGPU != nil {
// The Swarm envvar has higher precedence.
envVars = append([]string{*envSwarmGPU}, envVars...)
}
// Grab a reference to devices from the first envvar
// in the list that actually exists in the environment.
var devices *string
for _, envVar := range envVars {
if devs, ok := env[envVar]; ok {
devices = &devs
}
}
// Environment variable unset with legacy image: default to "all".
if devices == nil && legacyImage {
all := "all"
return &all
}
// Environment variable unset or empty or "void": return nil
if devices == nil || len(*devices) == 0 || *devices == "void" {
if len(devices) == 0 {
return nil
}
// Environment variable set to "none": reset to "".
if *devices == "none" {
empty := ""
return &empty
}
devicesString := strings.Join(devices, ",")
// Any other value.
return devices
return &devicesString
}
func getDevicesFromMounts(mounts []Mount) *string {
@@ -277,7 +231,7 @@ func getDevicesFromMounts(mounts []Mount) *string {
return &ret
}
func getDevices(hookConfig *HookConfig, env map[string]string, mounts []Mount, privileged bool, legacyImage bool) *string {
func getDevices(hookConfig *HookConfig, image image.CUDA, mounts []Mount, privileged bool) *string {
// If enabled, try and get the device list from volume mounts first
if hookConfig.AcceptDeviceListAsVolumeMounts {
devices := getDevicesFromMounts(mounts)
@@ -287,7 +241,7 @@ func getDevices(hookConfig *HookConfig, env map[string]string, mounts []Mount, p
}
// Fallback to reading from the environment variable if privileges are correct
devices := getDevicesFromEnvvar(env, legacyImage)
devices := getDevicesFromEnvvar(image, hookConfig.getSwarmResourceEnvvars())
if devices == nil {
return nil
}
@@ -295,8 +249,8 @@ func getDevices(hookConfig *HookConfig, env map[string]string, mounts []Mount, p
return devices
}
// Error out otherwise
log.Panicln("insufficient privileges to read device list from NVIDIA_VISIBLE_DEVICES envvar")
configName := hookConfig.getConfigOption("AcceptEnvvarUnprivileged")
log.Printf("Ignoring devices specified in NVIDIA_VISIBLE_DEVICES (privileged=%v, %v=%v) ", privileged, configName, hookConfig.AcceptEnvvarUnprivileged)
return nil
}
@@ -315,57 +269,35 @@ func getMigMonitorDevices(env map[string]string) *string {
return nil
}
func getDriverCapabilities(env map[string]string, legacyImage bool) *string {
// Grab a reference to the capabilities from the envvar
// if it actually exists in the environment.
var capabilities *string
if caps, ok := env[envNVDriverCapabilities]; ok {
capabilities = &caps
func getDriverCapabilities(env map[string]string, supportedDriverCapabilities DriverCapabilities, legacyImage bool) DriverCapabilities {
// We use the default driver capabilities by default. This is filtered to only include the
// supported capabilities
capabilities := supportedDriverCapabilities.Intersection(defaultDriverCapabilities)
capsEnv, capsEnvSpecified := env[envNVDriverCapabilities]
if !capsEnvSpecified && legacyImage {
// Environment variable unset with legacy image: set all capabilities.
return supportedDriverCapabilities
}
// Environment variable unset with legacy image: set all capabilities.
if capabilities == nil && legacyImage {
allCaps := allDriverCapabilities
return &allCaps
if capsEnvSpecified && len(capsEnv) > 0 {
// If the envvironment variable is specified and is non-empty, use the capabilities value
envCapabilities := DriverCapabilities(capsEnv)
capabilities = supportedDriverCapabilities.Intersection(envCapabilities)
if envCapabilities != all && capabilities != envCapabilities {
log.Panicln(fmt.Errorf("unsupported capabilities found in '%v' (allowed '%v')", envCapabilities, capabilities))
}
}
// Environment variable unset or set but empty: set default capabilities.
if capabilities == nil || len(*capabilities) == 0 {
defaultCaps := defaultDriverCapabilities
return &defaultCaps
}
// Environment variable set to "all": set all capabilities.
if *capabilities == "all" {
allCaps := allDriverCapabilities
return &allCaps
}
// Any other value
return capabilities
}
func getRequirements(env map[string]string, legacyImage bool) []string {
// All variables with the "NVIDIA_REQUIRE_" prefix are passed to nvidia-container-cli
var requirements []string
for name, value := range env {
if strings.HasPrefix(name, envNVRequirePrefix) {
requirements = append(requirements, value)
}
}
if legacyImage {
vmaj, vmin, _ := parseCudaVersion(env[envCUDAVersion])
cudaRequire := fmt.Sprintf("cuda>=%d.%d", vmaj, vmin)
requirements = append(requirements, cudaRequire)
}
return requirements
}
func getNvidiaConfig(hookConfig *HookConfig, env map[string]string, mounts []Mount, privileged bool) *nvidiaConfig {
legacyImage := isLegacyCUDAImage(env)
func getNvidiaConfig(hookConfig *HookConfig, image image.CUDA, mounts []Mount, privileged bool) *nvidiaConfig {
legacyImage := image.IsLegacy()
var devices string
if d := getDevices(hookConfig, env, mounts, privileged, legacyImage); d != nil {
if d := getDevices(hookConfig, image, mounts, privileged); d != nil {
devices = *d
} else {
// 'nil' devices means this is not a GPU container.
@@ -373,7 +305,7 @@ func getNvidiaConfig(hookConfig *HookConfig, env map[string]string, mounts []Mou
}
var migConfigDevices string
if d := getMigConfigDevices(env); d != nil {
if d := getMigConfigDevices(image); d != nil {
migConfigDevices = *d
}
if !privileged && migConfigDevices != "" {
@@ -381,22 +313,21 @@ func getNvidiaConfig(hookConfig *HookConfig, env map[string]string, mounts []Mou
}
var migMonitorDevices string
if d := getMigMonitorDevices(env); d != nil {
if d := getMigMonitorDevices(image); d != nil {
migMonitorDevices = *d
}
if !privileged && migMonitorDevices != "" {
log.Panicln("cannot set MIG_MONITOR_DEVICES in non privileged container")
}
var driverCapabilities string
if c := getDriverCapabilities(env, legacyImage); c != nil {
driverCapabilities = *c
driverCapabilities := getDriverCapabilities(image, hookConfig.SupportedDriverCapabilities, legacyImage).String()
requirements, err := image.GetRequirements()
if err != nil {
log.Panicln("failed to get requirements", err)
}
requirements := getRequirements(env, legacyImage)
// Don't fail on invalid values.
disableRequire, _ := strconv.ParseBool(env[envNVDisableRequire])
disableRequire := image.HasDisableRequire()
return &nvidiaConfig{
Devices: devices,
@@ -422,13 +353,16 @@ func getContainerConfig(hook HookConfig) (config containerConfig) {
s := loadSpec(path.Join(b, "config.json"))
env := getEnvMap(s.Process.Env)
image, err := image.NewCUDAImageFromEnv(s.Process.Env)
if err != nil {
log.Panicln(err)
}
privileged := isPrivileged(s)
envSwarmGPU = hook.SwarmResource
return containerConfig{
Pid: h.Pid,
Rootfs: s.Root.Path,
Env: env,
Nvidia: getNvidiaConfig(&hook, env, s.Mounts, privileged),
Env: image,
Nvidia: getNvidiaConfig(&hook, image, s.Mounts, privileged),
}
}

View File

@@ -1,9 +1,12 @@
package main
import (
"fmt"
"path/filepath"
"reflect"
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
"github.com/stretchr/testify/require"
)
func TestGetNvidiaConfig(t *testing.T) {
@@ -11,6 +14,7 @@ func TestGetNvidiaConfig(t *testing.T) {
description string
env map[string]string
privileged bool
hookConfig *HookConfig
expectedConfig *nvidiaConfig
expectedPanic bool
}{
@@ -34,7 +38,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "all",
DriverCapabilities: allDriverCapabilities,
DriverCapabilities: allDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -48,7 +52,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "all",
DriverCapabilities: allDriverCapabilities,
DriverCapabilities: allDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -66,7 +70,7 @@ func TestGetNvidiaConfig(t *testing.T) {
description: "Legacy image, devices 'void', no capabilities, no requirements",
env: map[string]string{
envCUDAVersion: "9.0",
envNVVisibleDevices: "",
envNVVisibleDevices: "void",
},
privileged: false,
expectedConfig: nil,
@@ -80,7 +84,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "",
DriverCapabilities: allDriverCapabilities,
DriverCapabilities: allDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -94,7 +98,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: allDriverCapabilities,
DriverCapabilities: allDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -109,7 +113,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: defaultDriverCapabilities,
DriverCapabilities: defaultDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -124,7 +128,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: allDriverCapabilities,
DriverCapabilities: allDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -134,12 +138,12 @@ func TestGetNvidiaConfig(t *testing.T) {
env: map[string]string{
envCUDAVersion: "9.0",
envNVVisibleDevices: "gpu0,gpu1",
envNVDriverCapabilities: "cap0,cap1",
envNVDriverCapabilities: "video,display",
},
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: "cap0,cap1",
DriverCapabilities: "video,display",
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -149,14 +153,14 @@ func TestGetNvidiaConfig(t *testing.T) {
env: map[string]string{
envCUDAVersion: "9.0",
envNVVisibleDevices: "gpu0,gpu1",
envNVDriverCapabilities: "cap0,cap1",
envNVDriverCapabilities: "video,display",
envNVRequirePrefix + "REQ0": "req0=true",
envNVRequirePrefix + "REQ1": "req1=false",
},
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: "cap0,cap1",
DriverCapabilities: "video,display",
Requirements: []string{"cuda>=9.0", "req0=true", "req1=false"},
DisableRequire: false,
},
@@ -166,7 +170,7 @@ func TestGetNvidiaConfig(t *testing.T) {
env: map[string]string{
envCUDAVersion: "9.0",
envNVVisibleDevices: "gpu0,gpu1",
envNVDriverCapabilities: "cap0,cap1",
envNVDriverCapabilities: "video,display",
envNVRequirePrefix + "REQ0": "req0=true",
envNVRequirePrefix + "REQ1": "req1=false",
envNVDisableRequire: "true",
@@ -174,7 +178,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: "cap0,cap1",
DriverCapabilities: "video,display",
Requirements: []string{"cuda>=9.0", "req0=true", "req1=false"},
DisableRequire: true,
},
@@ -205,7 +209,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "all",
DriverCapabilities: defaultDriverCapabilities,
DriverCapabilities: defaultDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -223,7 +227,7 @@ func TestGetNvidiaConfig(t *testing.T) {
description: "Modern image, devices 'void', no capabilities, no requirements",
env: map[string]string{
envNVRequireCUDA: "cuda>=9.0",
envNVVisibleDevices: "",
envNVVisibleDevices: "void",
},
privileged: false,
expectedConfig: nil,
@@ -237,7 +241,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "",
DriverCapabilities: defaultDriverCapabilities,
DriverCapabilities: defaultDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -251,7 +255,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: defaultDriverCapabilities,
DriverCapabilities: defaultDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -266,7 +270,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: defaultDriverCapabilities,
DriverCapabilities: defaultDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -281,7 +285,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: allDriverCapabilities,
DriverCapabilities: allDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -291,12 +295,12 @@ func TestGetNvidiaConfig(t *testing.T) {
env: map[string]string{
envNVRequireCUDA: "cuda>=9.0",
envNVVisibleDevices: "gpu0,gpu1",
envNVDriverCapabilities: "cap0,cap1",
envNVDriverCapabilities: "video,display",
},
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: "cap0,cap1",
DriverCapabilities: "video,display",
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -306,14 +310,14 @@ func TestGetNvidiaConfig(t *testing.T) {
env: map[string]string{
envNVRequireCUDA: "cuda>=9.0",
envNVVisibleDevices: "gpu0,gpu1",
envNVDriverCapabilities: "cap0,cap1",
envNVDriverCapabilities: "video,display",
envNVRequirePrefix + "REQ0": "req0=true",
envNVRequirePrefix + "REQ1": "req1=false",
},
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: "cap0,cap1",
DriverCapabilities: "video,display",
Requirements: []string{"cuda>=9.0", "req0=true", "req1=false"},
DisableRequire: false,
},
@@ -323,7 +327,7 @@ func TestGetNvidiaConfig(t *testing.T) {
env: map[string]string{
envNVRequireCUDA: "cuda>=9.0",
envNVVisibleDevices: "gpu0,gpu1",
envNVDriverCapabilities: "cap0,cap1",
envNVDriverCapabilities: "video,display",
envNVRequirePrefix + "REQ0": "req0=true",
envNVRequirePrefix + "REQ1": "req1=false",
envNVDisableRequire: "true",
@@ -331,7 +335,7 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedConfig: &nvidiaConfig{
Devices: "gpu0,gpu1",
DriverCapabilities: "cap0,cap1",
DriverCapabilities: "video,display",
Requirements: []string{"cuda>=9.0", "req0=true", "req1=false"},
DisableRequire: true,
},
@@ -345,7 +349,7 @@ func TestGetNvidiaConfig(t *testing.T) {
expectedConfig: &nvidiaConfig{
Devices: "all",
DriverCapabilities: defaultDriverCapabilities,
DriverCapabilities: defaultDriverCapabilities.String(),
Requirements: []string{},
DisableRequire: false,
},
@@ -361,7 +365,7 @@ func TestGetNvidiaConfig(t *testing.T) {
expectedConfig: &nvidiaConfig{
Devices: "all",
MigConfigDevices: "mig0,mig1",
DriverCapabilities: defaultDriverCapabilities,
DriverCapabilities: defaultDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -387,7 +391,7 @@ func TestGetNvidiaConfig(t *testing.T) {
expectedConfig: &nvidiaConfig{
Devices: "all",
MigMonitorDevices: "mig0,mig1",
DriverCapabilities: defaultDriverCapabilities,
DriverCapabilities: defaultDriverCapabilities.String(),
Requirements: []string{"cuda>=9.0"},
DisableRequire: false,
},
@@ -402,19 +406,105 @@ func TestGetNvidiaConfig(t *testing.T) {
privileged: false,
expectedPanic: true,
},
{
description: "Hook config set as driver-capabilities-all",
env: map[string]string{
envNVVisibleDevices: "all",
envNVDriverCapabilities: "all",
},
privileged: true,
hookConfig: &HookConfig{
SupportedDriverCapabilities: "video,display",
},
expectedConfig: &nvidiaConfig{
Devices: "all",
DriverCapabilities: "video,display",
},
},
{
description: "Hook config set, envvar sets driver-capabilities",
env: map[string]string{
envNVVisibleDevices: "all",
envNVDriverCapabilities: "video,display",
},
privileged: true,
hookConfig: &HookConfig{
SupportedDriverCapabilities: "video,display,compute,utility",
},
expectedConfig: &nvidiaConfig{
Devices: "all",
DriverCapabilities: "video,display",
},
},
{
description: "Hook config set, envvar unset sets default driver-capabilities",
env: map[string]string{
envNVVisibleDevices: "all",
},
privileged: true,
hookConfig: &HookConfig{
SupportedDriverCapabilities: "video,display,utility,compute",
},
expectedConfig: &nvidiaConfig{
Devices: "all",
DriverCapabilities: defaultDriverCapabilities.String(),
},
},
{
description: "Hook config set, swarmResource overrides device selection",
env: map[string]string{
envNVVisibleDevices: "all",
"DOCKER_SWARM_RESOURCE": "GPU1,GPU2",
},
privileged: true,
hookConfig: &HookConfig{
SwarmResource: func() *string {
s := "DOCKER_SWARM_RESOURCE"
return &s
}(),
SupportedDriverCapabilities: "video,display,utility,compute",
},
expectedConfig: &nvidiaConfig{
Devices: "GPU1,GPU2",
DriverCapabilities: defaultDriverCapabilities.String(),
},
},
{
description: "Hook config set, comma separated swarmResource is split and overrides device selection",
env: map[string]string{
envNVVisibleDevices: "all",
"DOCKER_SWARM_RESOURCE": "GPU1,GPU2",
},
privileged: true,
hookConfig: &HookConfig{
SwarmResource: func() *string {
s := "NOT_DOCKER_SWARM_RESOURCE,DOCKER_SWARM_RESOURCE"
return &s
}(),
SupportedDriverCapabilities: "video,display,utility,compute",
},
expectedConfig: &nvidiaConfig{
Devices: "GPU1,GPU2",
DriverCapabilities: defaultDriverCapabilities.String(),
},
},
}
for _, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
// Wrap the call to getNvidiaConfig() in a closure.
var config *nvidiaConfig
getConfig := func() {
hookConfig := getDefaultHookConfig()
config = getNvidiaConfig(&hookConfig, tc.env, nil, tc.privileged)
hookConfig := tc.hookConfig
if hookConfig == nil {
defaultConfig := getDefaultHookConfig()
hookConfig = &defaultConfig
}
config = getNvidiaConfig(hookConfig, tc.env, nil, tc.privileged)
}
// For any tests that are expected to panic, make sure they do.
if tc.expectedPanic {
mustPanic(t, getConfig)
require.Panics(t, getConfig)
return
}
@@ -422,31 +512,20 @@ func TestGetNvidiaConfig(t *testing.T) {
getConfig()
// And start comparing the test results to the expected results.
if config == nil && tc.expectedConfig == nil {
if tc.expectedConfig == nil {
require.Nil(t, config, tc.description)
return
}
if config != nil && tc.expectedConfig != nil {
if !reflect.DeepEqual(config.Devices, tc.expectedConfig.Devices) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !reflect.DeepEqual(config.MigConfigDevices, tc.expectedConfig.MigConfigDevices) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !reflect.DeepEqual(config.MigMonitorDevices, tc.expectedConfig.MigMonitorDevices) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !reflect.DeepEqual(config.DriverCapabilities, tc.expectedConfig.DriverCapabilities) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !elementsMatch(config.Requirements, tc.expectedConfig.Requirements) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
if !reflect.DeepEqual(config.DisableRequire, tc.expectedConfig.DisableRequire) {
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
}
return
}
t.Errorf("Unexpected nvidiaConfig (got: %v, wanted: %v)", config, tc.expectedConfig)
require.NotNil(t, config, tc.description)
require.Equal(t, tc.expectedConfig.Devices, config.Devices)
require.Equal(t, tc.expectedConfig.MigConfigDevices, config.MigConfigDevices)
require.Equal(t, tc.expectedConfig.MigMonitorDevices, config.MigMonitorDevices)
require.Equal(t, tc.expectedConfig.DriverCapabilities, config.DriverCapabilities)
require.ElementsMatch(t, tc.expectedConfig.Requirements, config.Requirements)
require.Equal(t, tc.expectedConfig.DisableRequire, config.DisableRequire)
})
}
}
@@ -524,9 +603,7 @@ func TestGetDevicesFromMounts(t *testing.T) {
for _, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
devices := getDevicesFromMounts(tc.mounts)
if !reflect.DeepEqual(devices, tc.expectedDevices) {
t.Errorf("Unexpected devices (got: %v, wanted: %v)", *devices, *tc.expectedDevices)
}
require.Equal(t, tc.expectedDevices, devices)
})
}
}
@@ -540,7 +617,6 @@ func TestDeviceListSourcePriority(t *testing.T) {
acceptUnprivileged bool
acceptMounts bool
expectedDevices *string
expectedPanic bool
}{
{
description: "Mount devices, unprivileged, no accept unprivileged",
@@ -567,7 +643,7 @@ func TestDeviceListSourcePriority(t *testing.T) {
privileged: false,
acceptUnprivileged: false,
acceptMounts: true,
expectedPanic: true,
expectedDevices: nil,
},
{
description: "No mount devices, privileged, no accept unprivileged",
@@ -621,7 +697,7 @@ func TestDeviceListSourcePriority(t *testing.T) {
privileged: false,
acceptUnprivileged: false,
acceptMounts: false,
expectedPanic: true,
expectedDevices: nil,
},
}
for _, tc := range tests {
@@ -635,47 +711,351 @@ func TestDeviceListSourcePriority(t *testing.T) {
hookConfig := getDefaultHookConfig()
hookConfig.AcceptEnvvarUnprivileged = tc.acceptUnprivileged
hookConfig.AcceptDeviceListAsVolumeMounts = tc.acceptMounts
devices = getDevices(&hookConfig, env, tc.mountDevices, tc.privileged, false)
}
// For any tests that are expected to panic, make sure they do.
if tc.expectedPanic {
mustPanic(t, getDevices)
return
devices = getDevices(&hookConfig, env, tc.mountDevices, tc.privileged)
}
// For all other tests, just grab the devices and check the results
getDevices()
if !reflect.DeepEqual(devices, tc.expectedDevices) {
t.Errorf("Unexpected devices (got: %v, wanted: %v)", *devices, *tc.expectedDevices)
}
require.Equal(t, tc.expectedDevices, devices)
})
}
}
func elementsMatch(slice0, slice1 []string) bool {
map0 := make(map[string]int)
map1 := make(map[string]int)
func TestGetDevicesFromEnvvar(t *testing.T) {
all := "all"
empty := ""
envDockerResourceGPUs := "DOCKER_RESOURCE_GPUS"
gpuID := "GPU-12345"
anotherGPUID := "GPU-67890"
thirdGPUID := "MIG-12345"
for _, e := range slice0 {
map0[e]++
var tests = []struct {
description string
swarmResourceEnvvars []string
env map[string]string
expectedDevices *string
}{
{
description: "empty env returns nil for non-legacy image",
},
{
description: "blank NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "",
},
},
{
description: "'void' NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "void",
},
},
{
description: "'none' NVIDIA_VISIBLE_DEVICES returns empty for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "none",
},
expectedDevices: &empty,
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for non-legacy image",
env: map[string]string{
envNVVisibleDevices: gpuID,
},
expectedDevices: &gpuID,
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for legacy image",
env: map[string]string{
envNVVisibleDevices: gpuID,
envCUDAVersion: "legacy",
},
expectedDevices: &gpuID,
},
{
description: "empty env returns all for legacy image",
env: map[string]string{
envCUDAVersion: "legacy",
},
expectedDevices: &all,
},
// Add the `DOCKER_RESOURCE_GPUS` envvar and ensure that this is ignored when
// not enabled
{
description: "missing NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "blank NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "",
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "'void' NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "void",
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "'none' NVIDIA_VISIBLE_DEVICES returns empty for non-legacy image",
env: map[string]string{
envNVVisibleDevices: "none",
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: &empty,
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for non-legacy image",
env: map[string]string{
envNVVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: &gpuID,
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for legacy image",
env: map[string]string{
envNVVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
envCUDAVersion: "legacy",
},
expectedDevices: &gpuID,
},
{
description: "empty env returns all for legacy image",
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
envCUDAVersion: "legacy",
},
expectedDevices: &all,
},
// Add the `DOCKER_RESOURCE_GPUS` envvar and ensure that this is selected when
// enabled
{
description: "empty env returns nil for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
},
{
description: "blank DOCKER_RESOURCE_GPUS returns nil for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: "",
},
},
{
description: "'void' DOCKER_RESOURCE_GPUS returns nil for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: "void",
},
},
{
description: "'none' DOCKER_RESOURCE_GPUS returns empty for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: "none",
},
expectedDevices: &empty,
},
{
description: "DOCKER_RESOURCE_GPUS set returns value for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: gpuID,
},
expectedDevices: &gpuID,
},
{
description: "DOCKER_RESOURCE_GPUS set returns value for legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: gpuID,
envCUDAVersion: "legacy",
},
expectedDevices: &gpuID,
},
{
description: "DOCKER_RESOURCE_GPUS is selected if present",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: &anotherGPUID,
},
{
description: "DOCKER_RESOURCE_GPUS overrides NVIDIA_VISIBLE_DEVICES if present",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envNVVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: &anotherGPUID,
},
{
description: "DOCKER_RESOURCE_GPUS_ADDITIONAL overrides NVIDIA_VISIBLE_DEVICES if present",
swarmResourceEnvvars: []string{"DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
envNVVisibleDevices: gpuID,
"DOCKER_RESOURCE_GPUS_ADDITIONAL": anotherGPUID,
},
expectedDevices: &anotherGPUID,
},
{
description: "All available swarm resource envvars are selected and override NVIDIA_VISIBLE_DEVICES if present",
swarmResourceEnvvars: []string{"DOCKER_RESOURCE_GPUS", "DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
envNVVisibleDevices: gpuID,
"DOCKER_RESOURCE_GPUS": thirdGPUID,
"DOCKER_RESOURCE_GPUS_ADDITIONAL": anotherGPUID,
},
expectedDevices: func() *string {
result := fmt.Sprintf("%s,%s", thirdGPUID, anotherGPUID)
return &result
}(),
},
{
description: "DOCKER_RESOURCE_GPUS_ADDITIONAL or DOCKER_RESOURCE_GPUS override NVIDIA_VISIBLE_DEVICES if present",
swarmResourceEnvvars: []string{"DOCKER_RESOURCE_GPUS", "DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
envNVVisibleDevices: gpuID,
"DOCKER_RESOURCE_GPUS_ADDITIONAL": anotherGPUID,
},
expectedDevices: &anotherGPUID,
},
}
for _, e := range slice1 {
map1[e]++
}
for i, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
devices := getDevicesFromEnvvar(image.CUDA(tc.env), tc.swarmResourceEnvvars)
if tc.expectedDevices == nil {
require.Nil(t, devices, "%d: %v", i, tc)
return
}
for k0, v0 := range map0 {
if map1[k0] != v0 {
return false
}
require.NotNil(t, devices, "%d: %v", i, tc)
require.Equal(t, *tc.expectedDevices, *devices, "%d: %v", i, tc)
})
}
}
func TestGetDriverCapabilities(t *testing.T) {
supportedCapabilities := "compute,utility,display,video"
testCases := []struct {
description string
env map[string]string
legacyImage bool
supportedCapabilities string
expectedPanic bool
expectedCapabilities string
}{
{
description: "Env is set for legacy image",
env: map[string]string{
envNVDriverCapabilities: "display,video",
},
legacyImage: true,
supportedCapabilities: supportedCapabilities,
expectedCapabilities: "display,video",
},
{
description: "Env is all for legacy image",
env: map[string]string{
envNVDriverCapabilities: "all",
},
legacyImage: true,
supportedCapabilities: supportedCapabilities,
expectedCapabilities: supportedCapabilities,
},
{
description: "Env is empty for legacy image",
env: map[string]string{
envNVDriverCapabilities: "",
},
legacyImage: true,
supportedCapabilities: supportedCapabilities,
expectedCapabilities: defaultDriverCapabilities.String(),
},
{
description: "Env unset for legacy image is 'all'",
env: map[string]string{},
legacyImage: true,
supportedCapabilities: supportedCapabilities,
expectedCapabilities: supportedCapabilities,
},
{
description: "Env is set for modern image",
env: map[string]string{
envNVDriverCapabilities: "display,video",
},
legacyImage: false,
supportedCapabilities: supportedCapabilities,
expectedCapabilities: "display,video",
},
{
description: "Env unset for modern image is default",
env: map[string]string{},
legacyImage: false,
supportedCapabilities: supportedCapabilities,
expectedCapabilities: defaultDriverCapabilities.String(),
},
{
description: "Env is all for modern image",
env: map[string]string{
envNVDriverCapabilities: "all",
},
legacyImage: false,
supportedCapabilities: supportedCapabilities,
expectedCapabilities: supportedCapabilities,
},
{
description: "Env is empty for modern image",
env: map[string]string{
envNVDriverCapabilities: "",
},
legacyImage: false,
supportedCapabilities: supportedCapabilities,
expectedCapabilities: defaultDriverCapabilities.String(),
},
{
description: "Invalid capabilities panic",
env: map[string]string{
envNVDriverCapabilities: "compute,utility",
},
supportedCapabilities: "not-compute,not-utility",
expectedPanic: true,
},
{
description: "Default is restricted for modern image",
legacyImage: false,
supportedCapabilities: "compute",
expectedCapabilities: "compute",
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
var capabilites DriverCapabilities
getDriverCapabilities := func() {
supportedCapabilities := DriverCapabilities(tc.supportedCapabilities)
capabilites = getDriverCapabilities(tc.env, supportedCapabilities, tc.legacyImage)
}
if tc.expectedPanic {
require.Panics(t, getDriverCapabilities)
return
}
getDriverCapabilities()
require.EqualValues(t, tc.expectedCapabilities, capabilites)
})
}
for k1, v1 := range map1 {
if map0[k1] != v1 {
return false
}
}
return true
}

View File

@@ -0,0 +1,140 @@
package main
import (
"log"
"os"
"path"
"reflect"
"strings"
"github.com/BurntSushi/toml"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
)
const (
configPath = "/etc/nvidia-container-runtime/config.toml"
driverPath = "/run/nvidia/driver"
)
var defaultPaths = [...]string{
path.Join(driverPath, configPath),
configPath,
}
// CLIConfig : options for nvidia-container-cli.
type CLIConfig struct {
Root *string `toml:"root"`
Path *string `toml:"path"`
Environment []string `toml:"environment"`
Debug *string `toml:"debug"`
Ldcache *string `toml:"ldcache"`
LoadKmods bool `toml:"load-kmods"`
NoPivot bool `toml:"no-pivot"`
NoCgroups bool `toml:"no-cgroups"`
User *string `toml:"user"`
Ldconfig *string `toml:"ldconfig"`
}
// HookConfig : options for the nvidia-container-runtime-hook.
type HookConfig struct {
DisableRequire bool `toml:"disable-require"`
SwarmResource *string `toml:"swarm-resource"`
AcceptEnvvarUnprivileged bool `toml:"accept-nvidia-visible-devices-envvar-when-unprivileged"`
AcceptDeviceListAsVolumeMounts bool `toml:"accept-nvidia-visible-devices-as-volume-mounts"`
SupportedDriverCapabilities DriverCapabilities `toml:"supported-driver-capabilities"`
NvidiaContainerCLI CLIConfig `toml:"nvidia-container-cli"`
NVIDIAContainerRuntime config.RuntimeConfig `toml:"nvidia-container-runtime"`
NVIDIAContainerRuntimeHook config.RuntimeHookConfig `toml:"nvidia-container-runtime-hook"`
}
func getDefaultHookConfig() HookConfig {
return HookConfig{
DisableRequire: false,
SwarmResource: nil,
AcceptEnvvarUnprivileged: true,
AcceptDeviceListAsVolumeMounts: false,
SupportedDriverCapabilities: allDriverCapabilities,
NvidiaContainerCLI: CLIConfig{
Root: nil,
Path: nil,
Environment: []string{},
Debug: nil,
Ldcache: nil,
LoadKmods: true,
NoPivot: false,
NoCgroups: false,
User: nil,
Ldconfig: nil,
},
NVIDIAContainerRuntime: *config.GetDefaultRuntimeConfig(),
NVIDIAContainerRuntimeHook: *config.GetDefaultRuntimeHookConfig(),
}
}
func getHookConfig() (config HookConfig) {
var err error
if len(*configflag) > 0 {
config = getDefaultHookConfig()
_, err = toml.DecodeFile(*configflag, &config)
if err != nil {
log.Panicln("couldn't open configuration file:", err)
}
} else {
for _, p := range defaultPaths {
config = getDefaultHookConfig()
_, err = toml.DecodeFile(p, &config)
if err == nil {
break
} else if !os.IsNotExist(err) {
log.Panicln("couldn't open default configuration file:", err)
}
}
}
if config.SupportedDriverCapabilities == all {
config.SupportedDriverCapabilities = allDriverCapabilities
}
// We ensure that the supported-driver-capabilites option is a subset of allDriverCapabilities
if intersection := allDriverCapabilities.Intersection(config.SupportedDriverCapabilities); intersection != config.SupportedDriverCapabilities {
configName := config.getConfigOption("SupportedDriverCapabilities")
log.Panicf("Invalid value for config option '%v'; %v (supported: %v)\n", configName, config.SupportedDriverCapabilities, allDriverCapabilities)
}
return config
}
// getConfigOption returns the toml config option associated with the
// specified struct field.
func (c HookConfig) getConfigOption(fieldName string) string {
t := reflect.TypeOf(c)
f, ok := t.FieldByName(fieldName)
if !ok {
return fieldName
}
v, ok := f.Tag.Lookup("toml")
if !ok {
return fieldName
}
return v
}
// getSwarmResourceEnvvars returns the swarm resource envvars for the config.
func (c *HookConfig) getSwarmResourceEnvvars() []string {
if c.SwarmResource == nil {
return nil
}
candidates := strings.Split(*c.SwarmResource, ",")
var envvars []string
for _, c := range candidates {
trimmed := strings.TrimSpace(c)
if len(trimmed) > 0 {
envvars = append(envvars, trimmed)
}
}
return envvars
}

View File

@@ -0,0 +1,161 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package main
import (
"fmt"
"os"
"testing"
"github.com/stretchr/testify/require"
)
func TestGetHookConfig(t *testing.T) {
testCases := []struct {
lines []string
expectedPanic bool
expectedDriverCapabilities DriverCapabilities
}{
{
expectedDriverCapabilities: allDriverCapabilities,
},
{
lines: []string{
"supported-driver-capabilities = \"all\"",
},
expectedDriverCapabilities: allDriverCapabilities,
},
{
lines: []string{
"supported-driver-capabilities = \"compute,utility,not-compute\"",
},
expectedPanic: true,
},
{
lines: []string{},
expectedDriverCapabilities: allDriverCapabilities,
},
{
lines: []string{
"supported-driver-capabilities = \"\"",
},
expectedDriverCapabilities: none,
},
{
lines: []string{
"supported-driver-capabilities = \"utility,compute\"",
},
expectedDriverCapabilities: DriverCapabilities("utility,compute"),
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("test case %d", i), func(t *testing.T) {
var filename string
defer func() {
if len(filename) > 0 {
os.Remove(filename)
}
configflag = nil
}()
if tc.lines != nil {
configFile, err := os.CreateTemp("", "*.toml")
require.NoError(t, err)
defer configFile.Close()
filename = configFile.Name()
configflag = &filename
for _, line := range tc.lines {
_, err := configFile.WriteString(fmt.Sprintf("%s\n", line))
require.NoError(t, err)
}
}
var config HookConfig
getHookConfig := func() {
config = getHookConfig()
}
if tc.expectedPanic {
require.Panics(t, getHookConfig)
return
}
getHookConfig()
require.EqualValues(t, tc.expectedDriverCapabilities, config.SupportedDriverCapabilities)
})
}
}
func TestGetSwarmResourceEnvvars(t *testing.T) {
testCases := []struct {
value string
expected []string
}{
{
value: "nil",
expected: nil,
},
{
value: "",
expected: nil,
},
{
value: " ",
expected: nil,
},
{
value: "single",
expected: []string{"single"},
},
{
value: "single ",
expected: []string{"single"},
},
{
value: "one,two",
expected: []string{"one", "two"},
},
{
value: "one ,two",
expected: []string{"one", "two"},
},
{
value: "one, two",
expected: []string{"one", "two"},
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
c := &HookConfig{
SwarmResource: func() *string {
if tc.value == "nil" {
return nil
}
return &tc.value
}(),
}
envvars := c.getSwarmResourceEnvvars()
require.EqualValues(t, tc.expected, envvars)
})
}
}

View File

@@ -0,0 +1,89 @@
package main
import (
"encoding/json"
"testing"
"github.com/stretchr/testify/require"
)
func TestIsPrivileged(t *testing.T) {
var tests = []struct {
spec string
expected bool
}{
{
`
{
"ociVersion": "1.0.0",
"process": {
"capabilities": {
"bounding": [ "CAP_SYS_ADMIN" ]
}
}
}
`,
true,
},
{
`
{
"ociVersion": "1.0.0",
"process": {
"capabilities": {
"bounding": [ "CAP_SYS_OTHER" ]
}
}
}
`,
false,
},
{
`
{
"ociVersion": "1.0.0",
"process": {}
}
`,
false,
},
{
`
{
"ociVersion": "1.0.0-rc2-dev",
"process": {
"capabilities": [ "CAP_SYS_ADMIN" ]
}
}
`,
true,
},
{
`
{
"ociVersion": "1.0.0-rc2-dev",
"process": {
"capabilities": [ "CAP_SYS_OTHER" ]
}
}
`,
false,
},
{
`
{
"ociVersion": "1.0.0-rc2-dev",
"process": {}
}
`,
false,
},
}
for i, tc := range tests {
var spec Spec
_ = json.Unmarshal([]byte(tc.spec), &spec)
privileged := isPrivileged(&spec)
require.Equal(t, tc.expected, privileged, "%d: %v", i, tc)
}
}

View File

@@ -6,20 +6,21 @@ import (
"log"
"os"
"os/exec"
"path"
"path/filepath"
"runtime"
"runtime/debug"
"strconv"
"strings"
"syscall"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
)
var (
debugflag = flag.Bool("debug", false, "enable debug output")
configflag = flag.String("config", "", "configuration file")
defaultPATH = []string{"/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"}
debugflag = flag.Bool("debug", false, "enable debug output")
versionflag = flag.Bool("version", false, "enable version output")
configflag = flag.String("config", "", "configuration file")
)
func exit() {
@@ -35,28 +36,16 @@ func exit() {
os.Exit(0)
}
func getPATH(config CLIConfig) string {
dirs := filepath.SplitList(os.Getenv("PATH"))
// directories from the hook environment have higher precedence
dirs = append(dirs, defaultPATH...)
if config.Root != nil {
rootDirs := []string{}
for _, dir := range dirs {
rootDirs = append(rootDirs, path.Join(*config.Root, dir))
}
// directories with the root prefix have higher precedence
dirs = append(rootDirs, dirs...)
}
return strings.Join(dirs, ":")
}
func getCLIPath(config CLIConfig) string {
if config.Path != nil {
return *config.Path
}
if err := os.Setenv("PATH", getPATH(config)); err != nil {
var root string
if config.Root != nil {
root = *config.Root
}
if err := os.Setenv("PATH", lookup.GetPath(root)); err != nil {
log.Panicln("couldn't set PATH variable:", err)
}
@@ -85,6 +74,10 @@ func doPrestart() {
hook := getHookConfig()
cli := hook.NvidiaContainerCLI
if !hook.NVIDIAContainerRuntimeHook.SkipModeDetection && info.ResolveAutoMode(&logInterceptor{}, hook.NVIDIAContainerRuntime.Mode) != "legacy" {
log.Panicln("invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime (e.g. specify the --runtime=nvidia flag) instead.")
}
container := getContainerConfig(hook)
nvidia := container.Nvidia
if nvidia == nil {
@@ -167,6 +160,11 @@ func main() {
flag.Usage = usage
flag.Parse()
if *versionflag {
fmt.Printf("%v version %v\n", "NVIDIA Container Runtime Hook", info.GetVersionString())
return
}
args := flag.Args()
if len(args) == 0 {
flag.Usage()
@@ -186,3 +184,12 @@ func main() {
os.Exit(2)
}
}
// logInterceptor implements the info.Logger interface to allow for logging from this function.
type logInterceptor struct{}
func (l *logInterceptor) Infof(format string, args ...interface{}) {
log.Printf(format, args...)
}
func (l *logInterceptor) Debugf(format string, args ...interface{}) {}

View File

@@ -0,0 +1,34 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package main
import (
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/runtime"
)
func main() {
rt := runtime.New(
runtime.WithModeOverride("cdi"),
)
err := rt.Run(os.Args)
if err != nil {
os.Exit(1)
}
}

View File

@@ -0,0 +1,34 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package main
import (
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/runtime"
)
func main() {
rt := runtime.New(
runtime.WithModeOverride("legacy"),
)
err := rt.Run(os.Args)
if err != nil {
os.Exit(1)
}
}

View File

@@ -0,0 +1,87 @@
# The NVIDIA Container Runtime
The NVIDIA Container Runtime is a shim for OCI-compliant low-level runtimes such as [runc](https://github.com/opencontainers/runc). When a `create` command is detected, the incoming [OCI runtime specification](https://github.com/opencontainers/runtime-spec) is modified in place and the command is forwarded to the low-level runtime.
## Configuration
The NVIDIA Container Runtime uses file-based configuration, with the config stored in `/etc/nvidia-container-runtime/config.toml`. The `/etc` path can be overridden using the `XDG_CONFIG_HOME` environment variable with the `${XDG_CONFIG_HOME}/nvidia-container-runtime/config.toml` file used instead if this environment variable is set.
This config file may contain options for other components of the NVIDIA container stack and for the NVIDIA Container Runtime, the relevant config section is `nvidia-container-runtime`
### Logging
The `log-level` config option (default: `"info"`) specifies the log level to use and the `debug` option, if set, specifies a log file to which logs for the NVIDIA Container Runtime must be written.
In addition to this, the NVIDIA Container Runtime considers the value of `--log` and `--log-format` flags that may be passed to it by a container runtime such as docker or containerd. If the `--debug` flag is present the log-level specified in the config file is overridden as `"debug"`.
### Low-level Runtime Path
The `runtimes` config option allows for the low-level runtime to be specified. The first entry in this list that is an existing executable file is used as the low-level runtime. If the entry is not a path, the `PATH` is searched for a matching executable. If the entry is a path this is checked instead.
The default value for this setting is:
```toml
runtimes = [
"docker-runc",
"runc",
]
```
and if, for example, `crun` is to be used instead this can be changed to:
```toml
runtimes = [
"crun",
]
```
### Runtime Mode
The `mode` config option (default `"auto"`) controls the high-level behaviour of the runtime.
#### Auto Mode
When `mode` is set to `"auto"`, the runtime employs heuristics to determine which mode to use based on, for example, the platform where the runtime is being run.
#### Legacy Mode
When `mode` is set to `"legacy"`, the NVIDIA Container Runtime adds a [`prestart` hook](https://github.com/opencontainers/runtime-spec/blob/master/config.md#prestart) to the incomming OCI specification that invokes the NVIDIA Container Runtime Hook for all containers created. This hook checks whether NVIDIA devices are requested and ensures GPU access is configured using the `nvidia-container-cli` from the [libnvidia-container](https://github.com/NVIDIA/libnvidia-container) project.
#### CSV Mode
When `mode` is set to `"csv"`, CSV files at `/etc/nvidia-container-runtime/host-files-for-container.d` define the devices and mounts that are to be injected into a container when it is created. The search path for the files can be overridden by modifying the `nvidia-container-runtime.modes.csv.mount-spec-path` in the config as below:
```toml
[nvidia-container-runtime]
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
```
This mode is primarily targeted at Tegra-based systems without NVML available.
### Notes on using the docker CLI
Note that only the `"legacy"` NVIDIA Container Runtime mode is directly compatible with the `--gpus` flag implemented by the `docker` CLI (assuming the NVIDIA Container Runtime is not used). The reason for this is that `docker` inserts the same NVIDIA Container Runtime Hook into the OCI runtime specification.
If a different mode is explicitly set or detected, the NVIDIA Container Runtime Hook will raise the following error when `--gpus` is set:
```
$ docker run --rm --gpus all ubuntu:18.04
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'csv'
invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime instead.: unknown.
```
Here NVIDIA Container Runtime must be used explicitly. The recommended way to do this is to specify the `--runtime=nvidia` command line argument as part of the `docker run` commmand as follows:
```
$ docker run --rm --gpus all --runtime=nvidia ubuntu:18.04
```
Alternatively the NVIDIA Container Runtime can be set as the default runtime for docker. This can be done by modifying the `/etc/docker/daemon.json` file as follows:
```json
{
"default-runtime": "nvidia",
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
}
}
```

View File

@@ -0,0 +1,15 @@
package main
import (
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/runtime"
)
func main() {
r := runtime.New()
err := r.Run(os.Args)
if err != nil {
os.Exit(1)
}
}

View File

@@ -0,0 +1,248 @@
package main
import (
"bytes"
"encoding/json"
"io/ioutil"
"os"
"os/exec"
"path/filepath"
"strings"
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/modifier"
"github.com/NVIDIA/nvidia-container-toolkit/internal/test"
"github.com/opencontainers/runtime-spec/specs-go"
"github.com/sirupsen/logrus"
"github.com/stretchr/testify/require"
)
const (
nvidiaRuntime = "nvidia-container-runtime"
nvidiaHook = "nvidia-container-runtime-hook"
bundlePathSuffix = "test/output/bundle/"
specFile = "config.json"
unmodifiedSpecFileSuffix = "test/input/test_spec.json"
)
const (
runcExecutableName = "runc"
)
type testConfig struct {
root string
binPath string
}
var cfg *testConfig
func TestMain(m *testing.M) {
// TEST SETUP
// Determine the module root and the test binary path
var err error
moduleRoot, err := test.GetModuleRoot()
if err != nil {
logrus.Fatalf("error in test setup: could not get module root: %v", err)
}
testBinPath := filepath.Join(moduleRoot, "test", "bin")
testInputPath := filepath.Join(moduleRoot, "test", "input")
// Set the environment variables for the test
os.Setenv("PATH", test.PrependToPath(testBinPath, moduleRoot))
os.Setenv("XDG_CONFIG_HOME", testInputPath)
// Confirm that the environment is configured correctly
runcPath, err := exec.LookPath(runcExecutableName)
if err != nil || filepath.Join(testBinPath, runcExecutableName) != runcPath {
logrus.Fatalf("error in test setup: mock runc path set incorrectly in TestMain(): %v", err)
}
hookPath, err := exec.LookPath(nvidiaHook)
if err != nil || filepath.Join(testBinPath, nvidiaHook) != hookPath {
logrus.Fatalf("error in test setup: mock hook path set incorrectly in TestMain(): %v", err)
}
// Store the root and binary paths in the test Config
cfg = &testConfig{
root: moduleRoot,
binPath: testBinPath,
}
// RUN TESTS
exitCode := m.Run()
// TEST CLEANUP
os.Remove(specFile)
os.Exit(exitCode)
}
// case 1) nvidia-container-runtime run --bundle
// case 2) nvidia-container-runtime create --bundle
// - Confirm the runtime handles bad input correctly
func TestBadInput(t *testing.T) {
err := cfg.generateNewRuntimeSpec()
if err != nil {
t.Fatal(err)
}
cmdCreate := exec.Command(nvidiaRuntime, "create", "--bundle")
t.Logf("executing: %s\n", strings.Join(cmdCreate.Args, " "))
err = cmdCreate.Run()
require.Error(t, err, "runtime should return an error")
}
// case 1) nvidia-container-runtime run --bundle <bundle-name> <ctr-name>
// - Confirm the runtime runs with no errors
//
// case 2) nvidia-container-runtime create --bundle <bundle-name> <ctr-name>
// - Confirm the runtime inserts the NVIDIA prestart hook correctly
func TestGoodInput(t *testing.T) {
err := cfg.generateNewRuntimeSpec()
if err != nil {
t.Fatalf("error generating runtime spec: %v", err)
}
cmdRun := exec.Command(nvidiaRuntime, "run", "--bundle", cfg.bundlePath(), "testcontainer")
t.Logf("executing: %s\n", strings.Join(cmdRun.Args, " "))
output, err := cmdRun.CombinedOutput()
require.NoErrorf(t, err, "runtime should not return an error", "output=%v", string(output))
// Check config.json and confirm there are no hooks
spec, err := cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.Empty(t, spec.Hooks, "there should be no hooks in config.json")
cmdCreate := exec.Command(nvidiaRuntime, "create", "--bundle", cfg.bundlePath(), "testcontainer")
t.Logf("executing: %s\n", strings.Join(cmdCreate.Args, " "))
err = cmdCreate.Run()
require.NoError(t, err, "runtime should not return an error")
// Check config.json for NVIDIA prestart hook
spec, err = cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.NotEmpty(t, spec.Hooks, "there should be hooks in config.json")
require.Equal(t, 1, nvidiaHookCount(spec.Hooks), "exactly one nvidia prestart hook should be inserted correctly into config.json")
}
// NVIDIA prestart hook already present in config file
func TestDuplicateHook(t *testing.T) {
err := cfg.generateNewRuntimeSpec()
if err != nil {
t.Fatal(err)
}
var spec specs.Spec
spec, err = cfg.getRuntimeSpec()
if err != nil {
t.Fatal(err)
}
t.Logf("inserting nvidia prestart hook to config.json")
if err = addNVIDIAHook(&spec); err != nil {
t.Fatal(err)
}
jsonOutput, err := json.MarshalIndent(spec, "", "\t")
if err != nil {
t.Fatal(err)
}
jsonFile, err := os.OpenFile(cfg.specFilePath(), os.O_RDWR, 0644)
if err != nil {
t.Fatal(err)
}
_, err = jsonFile.WriteAt(jsonOutput, 0)
if err != nil {
t.Fatal(err)
}
// Test how runtime handles already existing prestart hook in config.json
cmdCreate := exec.Command(nvidiaRuntime, "create", "--bundle", cfg.bundlePath(), "testcontainer")
t.Logf("executing: %s\n", strings.Join(cmdCreate.Args, " "))
output, err := cmdCreate.CombinedOutput()
require.NoErrorf(t, err, "runtime should not return an error", "output=%v", string(output))
// Check config.json for NVIDIA prestart hook
spec, err = cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.NotEmpty(t, spec.Hooks, "there should be hooks in config.json")
require.Equal(t, 1, nvidiaHookCount(spec.Hooks), "exactly one nvidia prestart hook should be inserted correctly into config.json")
}
// addNVIDIAHook is a basic wrapper for an addHookModifier that is used for
// testing.
func addNVIDIAHook(spec *specs.Spec) error {
m := modifier.NewStableRuntimeModifier(logrus.StandardLogger())
return m.Modify(spec)
}
func (c testConfig) getRuntimeSpec() (specs.Spec, error) {
filePath := c.specFilePath()
var spec specs.Spec
jsonFile, err := os.OpenFile(filePath, os.O_RDWR, 0644)
if err != nil {
return spec, err
}
defer jsonFile.Close()
jsonContent, err := ioutil.ReadAll(jsonFile)
if err != nil {
return spec, err
} else if json.Valid(jsonContent) {
err = json.Unmarshal(jsonContent, &spec)
if err != nil {
return spec, err
}
} else {
err = json.NewDecoder(bytes.NewReader(jsonContent)).Decode(&spec)
if err != nil {
return spec, err
}
}
return spec, err
}
func (c testConfig) bundlePath() string {
return filepath.Join(c.root, bundlePathSuffix)
}
func (c testConfig) specFilePath() string {
return filepath.Join(c.bundlePath(), specFile)
}
func (c testConfig) unmodifiedSpecFile() string {
return filepath.Join(c.root, unmodifiedSpecFileSuffix)
}
func (c testConfig) generateNewRuntimeSpec() error {
var err error
err = os.MkdirAll(c.bundlePath(), 0755)
if err != nil {
return err
}
cmd := exec.Command("cp", c.unmodifiedSpecFile(), c.specFilePath())
err = cmd.Run()
if err != nil {
return err
}
return nil
}
// Return number of valid NVIDIA prestart hooks in runtime spec
func nvidiaHookCount(hooks *specs.Hooks) int {
if hooks == nil {
return 0
}
count := 0
for _, hook := range hooks.Prestart {
if strings.Contains(hook.Path, nvidiaHook) {
count++
}
}
return count
}

48
cmd/nvidia-ctk/README.md Normal file
View File

@@ -0,0 +1,48 @@
# NVIDIA Container Toolkit CLI
The NVIDIA Container Toolkit CLI `nvidia-ctk` provides a number of utilities that are useful for working with the NVIDIA Container Toolkit.
## Functionality
### Configure runtimes
The `runtime` command of the `nvidia-ctk` CLI provides a set of utilities to related to the configuration
and management of supported container engines.
For example, running the following command:
```bash
nvidia-ctk runtime configure --set-as-default
```
will ensure that the NVIDIA Container Runtime is added as the default runtime to the default container
engine.
### Generate CDI specifications
The [Container Device Interface (CDI)](https://github.com/container-orchestrated-devices/container-device-interface) provides
a vendor-agnostic mechanism to make arbitrary devices accessible in containerized environments. To allow NVIDIA devices to be
used in these environments, the NVIDIA Container Toolkit CLI includes functionality to generate a CDI specification for the
available NVIDIA GPUs in a system.
In order to generate the CDI specification for the available devices, run the following command:\
```bash
nvidia-ctk cdi generate
```
The default is to print the specification to STDOUT and a filename can be specified using the `--output` flag.
The specification will contain a device entries as follows (where applicable):
* An `nvidia.com/gpu=gpu{INDEX}` device for each non-MIG-enabled full GPU in the system
* An `nvidia.com/gpu=mig{GPU_INDEX}:{MIG_INDEX}` device for each MIG-device in the system
* A special device called `nvidia.com/gpu=all` which represents all available devices.
For example, to generate the CDI specification in the default location where CDI-enabled tools such as `podman`, `containerd`, `cri-o`, or the NVIDIA Container Runtime can be configured to load it, the following command can be run:
```bash
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```
(Note that `sudo` is used to ensure the correct permissions to write to the `/etc/cdi` folder)
With the specification generated, a GPU can be requested by specifying the fully-qualified CDI device name. With `podman` as an exmaple:
```bash
podman run --rm -ti --device=nvidia.com/gpu=gpu0 ubuntu nvidia-smi -L
```

50
cmd/nvidia-ctk/cdi/cdi.go Normal file
View File

@@ -0,0 +1,50 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package cdi
import (
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/generate"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type command struct {
logger *logrus.Logger
}
// NewCommand constructs an info command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build
func (m command) build() *cli.Command {
// Create the 'hook' command
hook := cli.Command{
Name: "cdi",
Usage: "Provide tools for interacting with Container Device Interface specifications",
}
hook.Subcommands = []*cli.Command{
generate.NewCommand(m.logger),
}
return &hook
}

View File

@@ -0,0 +1,270 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package generate
import (
"fmt"
"os"
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover"
"github.com/NVIDIA/nvidia-container-toolkit/internal/edits"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
specs "github.com/container-orchestrated-devices/container-device-interface/specs-go"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/device"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvml"
)
const (
allDeviceName = "all"
)
type command struct {
logger *logrus.Logger
}
type config struct {
output string
format string
deviceNameStrategy string
driverRoot string
nvidiaCTKPath string
mode string
}
// NewCommand constructs a generate-cdi command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build creates the CLI command
func (m command) build() *cli.Command {
cfg := config{}
// Create the 'generate-cdi' command
c := cli.Command{
Name: "generate",
Usage: "Generate CDI specifications for use with CDI-enabled runtimes",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
},
}
c.Flags = []cli.Flag{
&cli.StringFlag{
Name: "output",
Usage: "Specify the file to output the generated CDI specification to. If this is '' the specification is output to STDOUT",
Destination: &cfg.output,
},
&cli.StringFlag{
Name: "format",
Usage: "The output format for the generated spec [json | yaml]. This overrides the format defined by the output file extension (if specified).",
Value: spec.FormatYAML,
Destination: &cfg.format,
},
&cli.StringFlag{
Name: "mode",
Aliases: []string{"discovery-mode"},
Usage: "The mode to use when discovering the available entities. One of [auto | nvml | wsl]. If mode is set to 'auto' the mode will be determined based on the system configuration.",
Value: nvcdi.ModeAuto,
Destination: &cfg.mode,
},
&cli.StringFlag{
Name: "device-name-strategy",
Usage: "Specify the strategy for generating device names. One of [index | uuid | type-index]",
Value: nvcdi.DeviceNameStrategyIndex,
Destination: &cfg.deviceNameStrategy,
},
&cli.StringFlag{
Name: "driver-root",
Usage: "Specify the NVIDIA GPU driver root to use when discovering the entities that should be included in the CDI specification.",
Destination: &cfg.driverRoot,
},
&cli.StringFlag{
Name: "nvidia-ctk-path",
Usage: "Specify the path to use for the nvidia-ctk in the generated CDI specification. If this is left empty, the path will be searched.",
Destination: &cfg.nvidiaCTKPath,
},
}
return &c
}
func (m command) validateFlags(c *cli.Context, cfg *config) error {
cfg.format = strings.ToLower(cfg.format)
switch cfg.format {
case spec.FormatJSON:
case spec.FormatYAML:
default:
return fmt.Errorf("invalid output format: %v", cfg.format)
}
cfg.mode = strings.ToLower(cfg.mode)
switch cfg.mode {
case nvcdi.ModeAuto:
case nvcdi.ModeNvml:
case nvcdi.ModeWsl:
case nvcdi.ModeManagement:
default:
return fmt.Errorf("invalid discovery mode: %v", cfg.mode)
}
_, err := nvcdi.NewDeviceNamer(cfg.deviceNameStrategy)
if err != nil {
return err
}
cfg.nvidiaCTKPath = discover.FindNvidiaCTK(m.logger, cfg.nvidiaCTKPath)
if outputFileFormat := formatFromFilename(cfg.output); outputFileFormat != "" {
m.logger.Debugf("Inferred output format as %q from output file name", outputFileFormat)
if !c.IsSet("format") {
cfg.format = outputFileFormat
} else if outputFileFormat != cfg.format {
m.logger.Warningf("Requested output format %q does not match format implied by output file name: %q", cfg.format, outputFileFormat)
}
}
return nil
}
func (m command) run(c *cli.Context, cfg *config) error {
spec, err := m.generateSpec(cfg)
if err != nil {
return fmt.Errorf("failed to generate CDI spec: %v", err)
}
m.logger.Infof("Generated CDI spec with version %v", spec.Raw().Version)
if cfg.output == "" {
_, err := spec.WriteTo(os.Stdout)
if err != nil {
return fmt.Errorf("failed to write CDI spec to STDOUT: %v", err)
}
return nil
}
return spec.Save(cfg.output)
}
func formatFromFilename(filename string) string {
ext := filepath.Ext(filename)
switch strings.ToLower(ext) {
case ".json":
return spec.FormatJSON
case ".yaml", ".yml":
return spec.FormatYAML
}
return ""
}
func (m command) generateSpec(cfg *config) (spec.Interface, error) {
deviceNamer, err := nvcdi.NewDeviceNamer(cfg.deviceNameStrategy)
if err != nil {
return nil, fmt.Errorf("failed to create device namer: %v", err)
}
nvmllib := nvml.New()
if r := nvmllib.Init(); r != nvml.SUCCESS {
return nil, r
}
defer nvmllib.Shutdown()
devicelib := device.New(device.WithNvml(nvmllib))
cdilib := nvcdi.New(
nvcdi.WithLogger(m.logger),
nvcdi.WithDriverRoot(cfg.driverRoot),
nvcdi.WithNVIDIACTKPath(cfg.nvidiaCTKPath),
nvcdi.WithDeviceNamer(deviceNamer),
nvcdi.WithDeviceLib(devicelib),
nvcdi.WithNvmlLib(nvmllib),
nvcdi.WithMode(string(cfg.mode)),
)
deviceSpecs, err := cdilib.GetAllDeviceSpecs()
if err != nil {
return nil, fmt.Errorf("failed to create device CDI specs: %v", err)
}
var hasAll bool
for _, deviceSpec := range deviceSpecs {
if deviceSpec.Name == allDeviceName {
hasAll = true
break
}
}
if !hasAll {
allDevice, err := MergeDeviceSpecs(deviceSpecs, allDeviceName)
if err != nil {
return nil, fmt.Errorf("failed to create CDI specification for %q device: %v", allDeviceName, err)
}
deviceSpecs = append(deviceSpecs, allDevice)
}
commonEdits, err := cdilib.GetCommonEdits()
if err != nil {
return nil, fmt.Errorf("failed to create edits common for entities: %v", err)
}
return spec.New(
spec.WithVendor("nvidia.com"),
spec.WithClass("gpu"),
spec.WithDeviceSpecs(deviceSpecs),
spec.WithEdits(*commonEdits.ContainerEdits),
spec.WithFormat(cfg.format),
)
}
// MergeDeviceSpecs creates a device with the specified name which combines the edits from the previous devices.
// If a device of the specified name already exists, an error is returned.
func MergeDeviceSpecs(deviceSpecs []specs.Device, mergedDeviceName string) (specs.Device, error) {
if err := cdi.ValidateDeviceName(mergedDeviceName); err != nil {
return specs.Device{}, fmt.Errorf("invalid device name %q: %v", mergedDeviceName, err)
}
for _, d := range deviceSpecs {
if d.Name == mergedDeviceName {
return specs.Device{}, fmt.Errorf("device %q already exists", mergedDeviceName)
}
}
mergedEdits := edits.NewContainerEdits()
for _, d := range deviceSpecs {
edit := cdi.ContainerEdits{
ContainerEdits: &d.ContainerEdits,
}
mergedEdits.Append(&edit)
}
merged := specs.Device{
Name: mergedDeviceName,
ContainerEdits: *mergedEdits.ContainerEdits,
}
return merged, nil
}

View File

@@ -0,0 +1,117 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package generate
import (
"fmt"
"testing"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"github.com/stretchr/testify/require"
)
func TestMergeDeviceSpecs(t *testing.T) {
testCases := []struct {
description string
deviceSpecs []specs.Device
mergedDeviceName string
expectedError error
expected specs.Device
}{
{
description: "no devices",
mergedDeviceName: "all",
expected: specs.Device{
Name: "all",
},
},
{
description: "one device",
mergedDeviceName: "all",
deviceSpecs: []specs.Device{
{
Name: "gpu0",
ContainerEdits: specs.ContainerEdits{
Env: []string{"GPU=0"},
},
},
},
expected: specs.Device{
Name: "all",
ContainerEdits: specs.ContainerEdits{
Env: []string{"GPU=0"},
},
},
},
{
description: "two devices",
mergedDeviceName: "all",
deviceSpecs: []specs.Device{
{
Name: "gpu0",
ContainerEdits: specs.ContainerEdits{
Env: []string{"GPU=0"},
},
},
{
Name: "gpu1",
ContainerEdits: specs.ContainerEdits{
Env: []string{"GPU=1"},
},
},
},
expected: specs.Device{
Name: "all",
ContainerEdits: specs.ContainerEdits{
Env: []string{"GPU=0", "GPU=1"},
},
},
},
{
description: "has merged device",
mergedDeviceName: "gpu0",
deviceSpecs: []specs.Device{
{
Name: "gpu0",
ContainerEdits: specs.ContainerEdits{
Env: []string{"GPU=0"},
},
},
},
expectedError: fmt.Errorf("device %q already exists", "gpu0"),
},
{
description: "invalid merged device name",
mergedDeviceName: ".-not-valid",
expectedError: fmt.Errorf("invalid device name %q", ".-not-valid"),
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
mergedDevice, err := MergeDeviceSpecs(tc.deviceSpecs, tc.mergedDeviceName)
if tc.expectedError != nil {
require.Error(t, err)
return
}
require.NoError(t, err)
require.EqualValues(t, tc.expected, mergedDevice)
})
}
}

View File

@@ -0,0 +1,146 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package chmod
import (
"fmt"
"os"
"path/filepath"
"strings"
"syscall"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type command struct {
logger *logrus.Logger
}
type config struct {
paths cli.StringSlice
mode string
containerSpec string
}
// NewCommand constructs a chmod command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build the chmod command
func (m command) build() *cli.Command {
cfg := config{}
// Create the 'chmod' command
c := cli.Command{
Name: "chmod",
Usage: "Set the permissions of folders in the container by running chmod. The container root is prefixed to the specified paths.",
Before: func(c *cli.Context) error {
return validateFlags(c, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
},
}
c.Flags = []cli.Flag{
&cli.StringSliceFlag{
Name: "path",
Usage: "Specifiy a path to apply the specified mode to",
Destination: &cfg.paths,
},
&cli.StringFlag{
Name: "mode",
Usage: "Specify the file mode",
Destination: &cfg.mode,
},
&cli.StringFlag{
Name: "container-spec",
Usage: "Specify the path to the OCI container spec. If empty or '-' the spec will be read from STDIN",
Destination: &cfg.containerSpec,
},
}
return &c
}
func validateFlags(c *cli.Context, cfg *config) error {
if strings.TrimSpace(cfg.mode) == "" {
return fmt.Errorf("a non-empty mode must be specified")
}
for _, p := range cfg.paths.Value() {
if strings.TrimSpace(p) == "" {
return fmt.Errorf("paths must not be empty")
}
}
return nil
}
func (m command) run(c *cli.Context, cfg *config) error {
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %v", err)
}
containerRoot, err := s.GetContainerRoot()
if err != nil {
return fmt.Errorf("failed to determined container root: %v", err)
}
if containerRoot == "" {
return fmt.Errorf("empty container root detected")
}
paths := m.getPaths(containerRoot, cfg.paths.Value())
if len(paths) == 0 {
m.logger.Debugf("No paths specified; exiting")
return nil
}
locator := lookup.NewExecutableLocator(m.logger, "")
targets, err := locator.Locate("chmod")
if err != nil {
return fmt.Errorf("failed to locate chmod: %v", err)
}
chmodPath := targets[0]
args := append([]string{filepath.Base(chmodPath), cfg.mode}, paths...)
return syscall.Exec(chmodPath, args, nil)
}
// getPaths updates the specified paths relative to the root.
func (m command) getPaths(root string, paths []string) []string {
var pathsInRoot []string
for _, f := range paths {
path := filepath.Join(root, f)
if _, err := os.Stat(path); err != nil {
m.logger.Debugf("Skipping path %q: %v", path, err)
continue
}
pathsInRoot = append(pathsInRoot, path)
}
return pathsInRoot
}

View File

@@ -0,0 +1,229 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package symlinks
import (
"fmt"
"os"
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover/csv"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type command struct {
logger *logrus.Logger
}
type config struct {
hostRoot string
filenames cli.StringSlice
links cli.StringSlice
containerSpec string
}
// NewCommand constructs a hook command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build
func (m command) build() *cli.Command {
cfg := config{}
// Create the '' command
c := cli.Command{
Name: "create-symlinks",
Usage: "A hook to create symlinks in the container. This can be used to proces CSV mount specs",
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
},
}
c.Flags = []cli.Flag{
&cli.StringFlag{
Name: "host-root",
Usage: "The root on the host filesystem to use to resolve symlinks",
Destination: &cfg.hostRoot,
},
&cli.StringSliceFlag{
Name: "csv-filename",
Usage: "Specify a (CSV) filename to process",
Destination: &cfg.filenames,
},
&cli.StringSliceFlag{
Name: "link",
Usage: "Specify a specific link to create. The link is specified as target::link",
Destination: &cfg.links,
},
&cli.StringFlag{
Name: "container-spec",
Usage: "Specify the path to the OCI container spec. If empty or '-' the spec will be read from STDIN",
Destination: &cfg.containerSpec,
},
}
return &c
}
func (m command) run(c *cli.Context, cfg *config) error {
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %v", err)
}
containerRoot, err := s.GetContainerRoot()
if err != nil {
return fmt.Errorf("failed to determined container root: %v", err)
}
csvFiles := cfg.filenames.Value()
chainLocator := lookup.NewSymlinkChainLocator(m.logger, cfg.hostRoot)
var candidates []string
for _, file := range csvFiles {
mountSpecs, err := csv.NewCSVFileParser(m.logger, file).Parse()
if err != nil {
m.logger.Debugf("Skipping CSV file %v: %v", file, err)
continue
}
for _, ms := range mountSpecs {
if ms.Type != csv.MountSpecSym {
continue
}
targets, err := chainLocator.Locate(ms.Path)
if err != nil {
m.logger.Warnf("Failed to locate symlink %v", ms.Path)
}
candidates = append(candidates, targets...)
}
}
created := make(map[string]bool)
// candidates is a list of absolute paths to symlinks in a chain, or the final target of the chain.
for _, candidate := range candidates {
targets, err := m.Locate(candidate)
if err != nil {
m.logger.Debugf("Skipping invalid link: %v", err)
continue
} else if len(targets) != 1 {
m.logger.Debugf("Unexepected number of targets: %v", targets)
continue
} else if targets[0] == candidate {
m.logger.Debugf("%v is not a symlink", candidate)
continue
}
err = m.createLink(created, cfg.hostRoot, containerRoot, targets[0], candidate)
if err != nil {
m.logger.Warnf("Failed to create link %v: %v", []string{targets[0], candidate}, err)
}
}
links := cfg.links.Value()
for _, l := range links {
parts := strings.Split(l, "::")
if len(parts) != 2 {
m.logger.Warnf("Invalid link specification %v", l)
continue
}
err := m.createLink(created, cfg.hostRoot, containerRoot, parts[0], parts[1])
if err != nil {
m.logger.Warnf("Failed to create link %v: %v", parts, err)
}
}
return nil
}
func (m command) createLink(created map[string]bool, hostRoot string, containerRoot string, target string, link string) error {
linkPath, err := changeRoot(hostRoot, containerRoot, link)
if err != nil {
m.logger.Warnf("Failed to resolve path for link %v relative to %v: %v", link, containerRoot, err)
}
if created[linkPath] {
m.logger.Debugf("Link %v already created", linkPath)
return nil
}
targetPath, err := changeRoot(hostRoot, "/", target)
if err != nil {
m.logger.Warnf("Failed to resolve path for target %v relative to %v: %v", target, "/", err)
}
m.logger.Infof("Symlinking %v to %v", linkPath, targetPath)
err = os.MkdirAll(filepath.Dir(linkPath), 0755)
if err != nil {
return fmt.Errorf("failed to create directory: %v", err)
}
err = os.Symlink(target, linkPath)
if err != nil {
return fmt.Errorf("failed to create symlink: %v", err)
}
return nil
}
func changeRoot(current string, new string, path string) (string, error) {
if !filepath.IsAbs(path) {
return path, nil
}
relative := path
if current != "" {
r, err := filepath.Rel(current, path)
if err != nil {
return "", err
}
relative = r
}
return filepath.Join(new, relative), nil
}
// Locate returns the link target of the specified filename or an empty slice if the
// specified filename is not a symlink.
func (m command) Locate(filename string) ([]string, error) {
info, err := os.Lstat(filename)
if err != nil {
return nil, fmt.Errorf("failed to get file info: %v", info)
}
if info.Mode()&os.ModeSymlink == 0 {
m.logger.Debugf("%v is not a symlink", filename)
return nil, nil
}
target, err := os.Readlink(filename)
if err != nil {
return nil, fmt.Errorf("error checking symlink: %v", err)
}
m.logger.Debugf("Resolved link: '%v' => '%v'", filename, target)
return []string{target}, nil
}

View File

@@ -0,0 +1,55 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package hook
import (
chmod "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/hook/chmod"
symlinks "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/hook/create-symlinks"
ldcache "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/hook/update-ldcache"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type hookCommand struct {
logger *logrus.Logger
}
// NewCommand constructs a hook command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := hookCommand{
logger: logger,
}
return c.build()
}
// build
func (m hookCommand) build() *cli.Command {
// Create the 'hook' command
hook := cli.Command{
Name: "hook",
Usage: "A collection of hooks that may be injected into an OCI spec",
}
hook.Subcommands = []*cli.Command{
ldcache.NewCommand(m.logger),
symlinks.NewCommand(m.logger),
chmod.NewCommand(m.logger),
}
return &hook
}

View File

@@ -0,0 +1,129 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package ldcache
import (
"fmt"
"os"
"path/filepath"
"syscall"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type command struct {
logger *logrus.Logger
}
type config struct {
folders cli.StringSlice
containerSpec string
}
// NewCommand constructs an update-ldcache command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build the update-ldcache command
func (m command) build() *cli.Command {
cfg := config{}
// Create the 'update-ldcache' command
c := cli.Command{
Name: "update-ldcache",
Usage: "Update ldcache in a container by running ldconfig",
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
},
}
c.Flags = []cli.Flag{
&cli.StringSliceFlag{
Name: "folder",
Usage: "Specifiy a folder to add to /etc/ld.so.conf before updating the ld cache",
Destination: &cfg.folders,
},
&cli.StringFlag{
Name: "container-spec",
Usage: "Specify the path to the OCI container spec. If empty or '-' the spec will be read from STDIN",
Destination: &cfg.containerSpec,
},
}
return &c
}
func (m command) run(c *cli.Context, cfg *config) error {
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %v", err)
}
containerRoot, err := s.GetContainerRoot()
if err != nil {
return fmt.Errorf("failed to determined container root: %v", err)
}
err = m.createConfig(containerRoot, cfg.folders.Value())
if err != nil {
return fmt.Errorf("failed to update ld.so.conf: %v", err)
}
args := []string{"/sbin/ldconfig"}
if containerRoot != "" {
args = append(args, "-r", containerRoot)
}
return syscall.Exec(args[0], args, nil)
}
// createConfig creates (or updates) /etc/ld.so.conf.d/nvcr-<RANDOM_STRING>.conf in the container
// to include the required paths.
func (m command) createConfig(root string, folders []string) error {
if len(folders) == 0 {
m.logger.Debugf("No folders to add to /etc/ld.so.conf")
return nil
}
configFile, err := os.CreateTemp(filepath.Join(root, "/etc/ld.so.conf.d"), "nvcr-*.conf")
if err != nil {
return fmt.Errorf("failed to create config file: %v", err)
}
defer configFile.Close()
m.logger.Debugf("Adding folders %v to %v", folders, configFile.Name())
configured := make(map[string]bool)
for _, folder := range folders {
if configured[folder] {
continue
}
_, err = configFile.WriteString(fmt.Sprintf("%s\n", folder))
if err != nil {
return fmt.Errorf("failed to update ld.so.conf.d: %v", err)
}
configured[folder] = true
}
return nil
}

View File

@@ -0,0 +1,47 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package info
import (
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type command struct {
logger *logrus.Logger
}
// NewCommand constructs an info command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build
func (m command) build() *cli.Command {
// Create the 'hook' command
hook := cli.Command{
Name: "info",
Usage: "Provide information about the system",
}
hook.Subcommands = []*cli.Command{}
return &hook
}

90
cmd/nvidia-ctk/main.go Normal file
View File

@@ -0,0 +1,90 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package main
import (
"os"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/hook"
infoCLI "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/info"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/runtime"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/system"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
)
var logger = log.New()
// config defines the options that can be set for the CLI through config files,
// environment variables, or command line flags
type config struct {
// Debug indicates whether the CLI is started in "debug" mode
Debug bool
}
func main() {
// Create a config struct to hold the parsed environment variables or command line flags
config := config{}
// Create the top-level CLI
c := cli.NewApp()
c.Name = "NVIDIA Container Toolkit CLI"
c.UseShortOptionHandling = true
c.EnableBashCompletion = true
c.Usage = "Tools to configure the NVIDIA Container Toolkit"
c.Version = info.GetVersionString()
// Setup the flags for this command
c.Flags = []cli.Flag{
&cli.BoolFlag{
Name: "debug",
Aliases: []string{"d"},
Usage: "Enable debug-level logging",
Destination: &config.Debug,
EnvVars: []string{"NVIDIA_CTK_DEBUG"},
},
}
// Set log-level for all subcommands
c.Before = func(c *cli.Context) error {
logLevel := log.InfoLevel
if config.Debug {
logLevel = log.DebugLevel
}
logger.SetLevel(logLevel)
return nil
}
// Define the subcommands
c.Commands = []*cli.Command{
hook.NewCommand(logger),
runtime.NewCommand(logger),
infoCLI.NewCommand(logger),
cdi.NewCommand(logger),
system.NewCommand(logger),
}
// Run the CLI
err := c.Run(os.Args)
if err != nil {
log.Errorf("%v", err)
log.Exit(1)
}
}

View File

@@ -0,0 +1,213 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package configure
import (
"encoding/json"
"fmt"
"os"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/runtime/nvidia"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/crio"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/docker"
"github.com/pelletier/go-toml"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
const (
defaultRuntime = "docker"
defaultDockerConfigFilePath = "/etc/docker/daemon.json"
defaultCrioConfigFilePath = "/etc/crio/crio.conf"
)
type command struct {
logger *logrus.Logger
}
// NewCommand constructs an configure command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// config defines the options that can be set for the CLI through config files,
// environment variables, or command line config
type config struct {
dryRun bool
runtime string
configFilePath string
nvidiaOptions nvidia.Options
}
func (m command) build() *cli.Command {
// Create a config struct to hold the parsed environment variables or command line flags
config := config{}
// Create the 'configure' command
configure := cli.Command{
Name: "configure",
Usage: "Add a runtime to the specified container engine",
Action: func(c *cli.Context) error {
return m.configureWrapper(c, &config)
},
}
configure.Flags = []cli.Flag{
&cli.BoolFlag{
Name: "dry-run",
Usage: "update the runtime configuration as required but don't write changes to disk",
Destination: &config.dryRun,
},
&cli.StringFlag{
Name: "runtime",
Usage: "the target runtime engine. One of [crio, docker]",
Value: defaultRuntime,
Destination: &config.runtime,
},
&cli.StringFlag{
Name: "config",
Usage: "path to the config file for the target runtime",
Destination: &config.configFilePath,
},
&cli.StringFlag{
Name: "nvidia-runtime-name",
Usage: "specify the name of the NVIDIA runtime that will be added",
Value: nvidia.RuntimeName,
Destination: &config.nvidiaOptions.RuntimeName,
},
&cli.StringFlag{
Name: "runtime-path",
Usage: "specify the path to the NVIDIA runtime executable",
Value: nvidia.RuntimeExecutable,
Destination: &config.nvidiaOptions.RuntimePath,
},
&cli.BoolFlag{
Name: "set-as-default",
Usage: "set the specified runtime as the default runtime",
Destination: &config.nvidiaOptions.SetAsDefault,
},
}
return &configure
}
func (m command) configureWrapper(c *cli.Context, config *config) error {
switch config.runtime {
case "crio":
return m.configureCrio(c, config)
case "docker":
return m.configureDocker(c, config)
}
return fmt.Errorf("unrecognized runtime '%v'", config.runtime)
}
// configureDocker updates the docker config to enable the NVIDIA Container Runtime
func (m command) configureDocker(c *cli.Context, config *config) error {
configFilePath := config.configFilePath
if configFilePath == "" {
configFilePath = defaultDockerConfigFilePath
}
cfg, err := docker.New(
docker.WithPath(configFilePath),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
err = cfg.AddRuntime(
config.nvidiaOptions.RuntimeName,
config.nvidiaOptions.RuntimePath,
config.nvidiaOptions.SetAsDefault,
)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
if config.dryRun {
output, err := json.MarshalIndent(cfg, "", " ")
if err != nil {
return fmt.Errorf("unable to convert to JSON: %v", err)
}
os.Stdout.WriteString(fmt.Sprintf("%s\n", output))
return nil
}
n, err := cfg.Save(configFilePath)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
if n == 0 {
m.logger.Infof("Removed empty config from %v", configFilePath)
} else {
m.logger.Infof("Wrote updated config to %v", configFilePath)
}
m.logger.Infof("It is recommended that the docker daemon be restarted.")
return nil
}
// configureCrio updates the crio config to enable the NVIDIA Container Runtime
func (m command) configureCrio(c *cli.Context, config *config) error {
configFilePath := config.configFilePath
if configFilePath == "" {
configFilePath = defaultCrioConfigFilePath
}
cfg, err := crio.New(
crio.WithPath(configFilePath),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
err = cfg.AddRuntime(
config.nvidiaOptions.RuntimeName,
config.nvidiaOptions.RuntimePath,
config.nvidiaOptions.SetAsDefault,
)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
if config.dryRun {
output, err := toml.Marshal(cfg)
if err != nil {
return fmt.Errorf("unable to convert to TOML: %v", err)
}
os.Stdout.WriteString(fmt.Sprintf("%s\n", output))
return nil
}
n, err := cfg.Save(configFilePath)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
if n == 0 {
m.logger.Infof("Removed empty config from %v", configFilePath)
} else {
m.logger.Infof("Wrote updated config to %v", configFilePath)
}
m.logger.Infof("It is recommended that the cri-o daemon be restarted.")
return nil
}

View File

@@ -0,0 +1,75 @@
/*
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package nvidia
const (
// RuntimeName is the default name to use in configs for the NVIDIA Container Runtime
RuntimeName = "nvidia"
// RuntimeExecutable is the default NVIDIA Container Runtime executable file name
RuntimeExecutable = "nvidia-container-runtime"
)
// Options specifies the options for the NVIDIA Container Runtime w.r.t a container engine such as docker.
type Options struct {
SetAsDefault bool
RuntimeName string
RuntimePath string
}
// Runtime defines an NVIDIA runtime with a name and a executable
type Runtime struct {
Name string
Path string
}
// DefaultRuntime returns the default runtime for the configured options.
// If the configuration is invalid or the default runtimes should not be set
// the empty string is returned.
func (o Options) DefaultRuntime() string {
if !o.SetAsDefault {
return ""
}
return o.RuntimeName
}
// Runtime creates a runtime struct based on the options.
func (o Options) Runtime() Runtime {
path := o.RuntimePath
if o.RuntimePath == "" {
path = RuntimeExecutable
}
r := Runtime{
Name: o.RuntimeName,
Path: path,
}
return r
}
// DockerRuntimesConfig generatest the expected docker config for the specified runtime
func (r Runtime) DockerRuntimesConfig() map[string]interface{} {
runtimes := make(map[string]interface{})
runtimes[r.Name] = map[string]interface{}{
"path": r.Path,
"args": []string{},
}
return runtimes
}

View File

@@ -0,0 +1,49 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package runtime
import (
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/runtime/configure"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type runtimeCommand struct {
logger *logrus.Logger
}
// NewCommand constructs a runtime command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := runtimeCommand{
logger: logger,
}
return c.build()
}
func (m runtimeCommand) build() *cli.Command {
// Create the 'runtime' command
runtime := cli.Command{
Name: "runtime",
Usage: "A collection of runtime-related utilities for the NVIDIA Container Toolkit",
}
runtime.Subcommands = []*cli.Command{
configure.NewCommand(m.logger),
}
return &runtime
}

View File

@@ -0,0 +1,175 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package devchar
import (
"fmt"
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info/proc/devices"
"github.com/NVIDIA/nvidia-container-toolkit/internal/nvcaps"
"github.com/sirupsen/logrus"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvpci"
)
type allPossible struct {
logger *logrus.Logger
driverRoot string
deviceMajors devices.Devices
migCaps nvcaps.MigCaps
}
// newAllPossible returns a new allPossible device node lister.
// This lister lists all possible device nodes for NVIDIA GPUs, control devices, and capability devices.
func newAllPossible(logger *logrus.Logger, driverRoot string) (nodeLister, error) {
deviceMajors, err := devices.GetNVIDIADevices()
if err != nil {
return nil, fmt.Errorf("failed reading device majors: %v", err)
}
migCaps, err := nvcaps.NewMigCaps()
if err != nil {
return nil, fmt.Errorf("failed to read MIG caps: %v", err)
}
if migCaps == nil {
migCaps = make(nvcaps.MigCaps)
}
l := allPossible{
logger: logger,
driverRoot: driverRoot,
deviceMajors: deviceMajors,
migCaps: migCaps,
}
return l, nil
}
// DeviceNodes returns a list of all possible device nodes for NVIDIA GPUs, control devices, and capability devices.
func (m allPossible) DeviceNodes() ([]deviceNode, error) {
gpus, err := nvpci.NewFrom(
filepath.Join(m.driverRoot, nvpci.PCIDevicesRoot),
).GetGPUs()
if err != nil {
return nil, fmt.Errorf("failed to get GPU information: %v", err)
}
count := len(gpus)
if count == 0 {
m.logger.Infof("No NVIDIA devices found in %s", m.driverRoot)
return nil, nil
}
deviceNodes, err := m.getControlDeviceNodes()
if err != nil {
return nil, fmt.Errorf("failed to get control device nodes: %v", err)
}
for gpu := 0; gpu < count; gpu++ {
deviceNodes = append(deviceNodes, m.getGPUDeviceNodes(gpu)...)
deviceNodes = append(deviceNodes, m.getNVCapDeviceNodes(gpu)...)
}
return deviceNodes, nil
}
// getControlDeviceNodes generates a list of control devices
func (m allPossible) getControlDeviceNodes() ([]deviceNode, error) {
var deviceNodes []deviceNode
// Define the control devices for standard GPUs.
controlDevices := []deviceNode{
m.newDeviceNode(devices.NVIDIAGPU, "/dev/nvidia-modeset", devices.NVIDIAModesetMinor),
m.newDeviceNode(devices.NVIDIAGPU, "/dev/nvidiactl", devices.NVIDIACTLMinor),
m.newDeviceNode(devices.NVIDIAUVM, "/dev/nvidia-uvm", devices.NVIDIAUVMMinor),
m.newDeviceNode(devices.NVIDIAUVM, "/dev/nvidia-uvm-tools", devices.NVIDIAUVMToolsMinor),
}
deviceNodes = append(deviceNodes, controlDevices...)
for _, migControlDevice := range []nvcaps.MigCap{"config", "monitor"} {
migControlMinor, exist := m.migCaps[migControlDevice]
if !exist {
continue
}
d := m.newDeviceNode(
devices.NVIDIACaps,
migControlMinor.DevicePath(),
int(migControlMinor),
)
deviceNodes = append(deviceNodes, d)
}
return deviceNodes, nil
}
// getGPUDeviceNodes generates a list of device nodes for a given GPU.
func (m allPossible) getGPUDeviceNodes(gpu int) []deviceNode {
d := m.newDeviceNode(
devices.NVIDIAGPU,
fmt.Sprintf("/dev/nvidia%d", gpu),
gpu,
)
return []deviceNode{d}
}
// getNVCapDeviceNodes generates a list of cap device nodes for a given GPU.
func (m allPossible) getNVCapDeviceNodes(gpu int) []deviceNode {
var selectedCapMinors []nvcaps.MigMinor
for gi := 0; ; gi++ {
giCap := nvcaps.NewGPUInstanceCap(gpu, gi)
giMinor, exist := m.migCaps[giCap]
if !exist {
break
}
selectedCapMinors = append(selectedCapMinors, giMinor)
for ci := 0; ; ci++ {
ciCap := nvcaps.NewComputeInstanceCap(gpu, gi, ci)
ciMinor, exist := m.migCaps[ciCap]
if !exist {
break
}
selectedCapMinors = append(selectedCapMinors, ciMinor)
}
}
var deviceNodes []deviceNode
for _, capMinor := range selectedCapMinors {
d := m.newDeviceNode(
devices.NVIDIACaps,
capMinor.DevicePath(),
int(capMinor),
)
deviceNodes = append(deviceNodes, d)
}
return deviceNodes
}
// newDeviceNode creates a new device node with the specified path and major/minor numbers.
// The path is adjusted for the specified driver root.
func (m allPossible) newDeviceNode(deviceName devices.Name, path string, minor int) deviceNode {
major, _ := m.deviceMajors.Get(deviceName)
return deviceNode{
path: filepath.Join(m.driverRoot, path),
major: uint32(major),
minor: uint32(minor),
}
}

View File

@@ -0,0 +1,332 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package devchar
import (
"fmt"
"os"
"os/signal"
"path/filepath"
"strings"
"syscall"
"github.com/fsnotify/fsnotify"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
const (
defaultDevCharPath = "/dev/char"
)
type command struct {
logger *logrus.Logger
}
type config struct {
devCharPath string
driverRoot string
dryRun bool
watch bool
createAll bool
}
// NewCommand constructs a command sub-command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build
func (m command) build() *cli.Command {
cfg := config{}
// Create the 'create-dev-char-symlinks' command
c := cli.Command{
Name: "create-dev-char-symlinks",
Usage: "A utility to create symlinks to possible /dev/nv* devices in /dev/char",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
},
}
c.Flags = []cli.Flag{
&cli.StringFlag{
Name: "dev-char-path",
Usage: "The path at which the symlinks will be created. Symlinks will be created as `DEV_CHAR`/MAJOR:MINOR where MAJOR and MINOR are the major and minor numbers of a corresponding device node.",
Value: defaultDevCharPath,
Destination: &cfg.devCharPath,
EnvVars: []string{"DEV_CHAR_PATH"},
},
&cli.StringFlag{
Name: "driver-root",
Usage: "The path to the driver root. `DRIVER_ROOT`/dev is searched for NVIDIA device nodes.",
Value: "/",
Destination: &cfg.driverRoot,
EnvVars: []string{"DRIVER_ROOT"},
},
&cli.BoolFlag{
Name: "watch",
Usage: "If set, the command will watch for changes to the driver root and recreate the symlinks when changes are detected.",
Value: false,
Destination: &cfg.watch,
EnvVars: []string{"WATCH"},
},
&cli.BoolFlag{
Name: "create-all",
Usage: "Create all possible /dev/char symlinks instead of limiting these to existing device nodes.",
Destination: &cfg.createAll,
EnvVars: []string{"CREATE_ALL"},
},
&cli.BoolFlag{
Name: "dry-run",
Usage: "If set, the command will not create any symlinks.",
Value: false,
Destination: &cfg.dryRun,
EnvVars: []string{"DRY_RUN"},
},
}
return &c
}
func (m command) validateFlags(r *cli.Context, cfg *config) error {
if cfg.createAll && cfg.watch {
return fmt.Errorf("create-all and watch are mutually exclusive")
}
return nil
}
func (m command) run(c *cli.Context, cfg *config) error {
var watcher *fsnotify.Watcher
var sigs chan os.Signal
if cfg.watch {
watcher, err := newFSWatcher(filepath.Join(cfg.driverRoot, "dev"))
if err != nil {
return fmt.Errorf("failed to create FS watcher: %v", err)
}
defer watcher.Close()
sigs = newOSWatcher(syscall.SIGHUP, syscall.SIGINT, syscall.SIGTERM, syscall.SIGQUIT)
}
l, err := NewSymlinkCreator(
WithLogger(m.logger),
WithDevCharPath(cfg.devCharPath),
WithDriverRoot(cfg.driverRoot),
WithDryRun(cfg.dryRun),
WithCreateAll(cfg.createAll),
)
if err != nil {
return fmt.Errorf("failed to create symlink creator: %v", err)
}
create:
err = l.CreateLinks()
if err != nil {
return fmt.Errorf("failed to create links: %v", err)
}
if !cfg.watch {
return nil
}
for {
select {
case event := <-watcher.Events:
deviceNode := filepath.Base(event.Name)
if !strings.HasPrefix(deviceNode, "nvidia") {
continue
}
if event.Op&fsnotify.Create == fsnotify.Create {
m.logger.Infof("%s created, restarting.", event.Name)
goto create
}
if event.Op&fsnotify.Create == fsnotify.Remove {
m.logger.Infof("%s removed. Ignoring", event.Name)
}
// Watch for any other fs errors and log them.
case err := <-watcher.Errors:
m.logger.Errorf("inotify: %s", err)
// React to signals
case s := <-sigs:
switch s {
case syscall.SIGHUP:
m.logger.Infof("Received SIGHUP, recreating symlinks.")
goto create
default:
m.logger.Infof("Received signal %q, shutting down.", s)
return nil
}
}
}
}
type linkCreator struct {
logger *logrus.Logger
lister nodeLister
driverRoot string
devCharPath string
dryRun bool
createAll bool
}
// Creator is an interface for creating symlinks to /dev/nv* devices in /dev/char.
type Creator interface {
CreateLinks() error
}
// Option is a functional option for configuring the linkCreator.
type Option func(*linkCreator)
// NewSymlinkCreator creates a new linkCreator.
func NewSymlinkCreator(opts ...Option) (Creator, error) {
c := linkCreator{}
for _, opt := range opts {
opt(&c)
}
if c.logger == nil {
c.logger = logrus.StandardLogger()
}
if c.driverRoot == "" {
c.driverRoot = "/"
}
if c.devCharPath == "" {
c.devCharPath = defaultDevCharPath
}
if c.createAll {
lister, err := newAllPossible(c.logger, c.driverRoot)
if err != nil {
return nil, fmt.Errorf("failed to create all possible device lister: %v", err)
}
c.lister = lister
} else {
c.lister = existing{c.logger, c.driverRoot}
}
return c, nil
}
// WithDriverRoot sets the driver root path.
func WithDriverRoot(root string) Option {
return func(c *linkCreator) {
c.driverRoot = root
}
}
// WithDevCharPath sets the path at which the symlinks will be created.
func WithDevCharPath(path string) Option {
return func(c *linkCreator) {
c.devCharPath = path
}
}
// WithDryRun sets the dry run flag.
func WithDryRun(dryRun bool) Option {
return func(c *linkCreator) {
c.dryRun = dryRun
}
}
// WithLogger sets the logger.
func WithLogger(logger *logrus.Logger) Option {
return func(c *linkCreator) {
c.logger = logger
}
}
// WithCreateAll sets the createAll flag for the linkCreator.
func WithCreateAll(createAll bool) Option {
return func(lc *linkCreator) {
lc.createAll = createAll
}
}
// CreateLinks creates symlinks for all NVIDIA device nodes found in the driver root.
func (m linkCreator) CreateLinks() error {
deviceNodes, err := m.lister.DeviceNodes()
if err != nil {
return fmt.Errorf("failed to get device nodes: %v", err)
}
if len(deviceNodes) != 0 && !m.dryRun {
err := os.MkdirAll(m.devCharPath, 0755)
if err != nil {
return fmt.Errorf("failed to create directory %s: %v", m.devCharPath, err)
}
}
for _, deviceNode := range deviceNodes {
target := deviceNode.path
linkPath := filepath.Join(m.devCharPath, deviceNode.devCharName())
m.logger.Infof("Creating link %s => %s", linkPath, target)
if m.dryRun {
continue
}
err = os.Symlink(target, linkPath)
if err != nil {
m.logger.Warnf("Could not create symlink: %v", err)
}
}
return nil
}
type deviceNode struct {
path string
major uint32
minor uint32
}
func (d deviceNode) devCharName() string {
return fmt.Sprintf("%d:%d", d.major, d.minor)
}
func newFSWatcher(files ...string) (*fsnotify.Watcher, error) {
watcher, err := fsnotify.NewWatcher()
if err != nil {
return nil, err
}
for _, f := range files {
err = watcher.Add(f)
if err != nil {
watcher.Close()
return nil, err
}
}
return watcher, nil
}
func newOSWatcher(sigs ...os.Signal) chan os.Signal {
sigChan := make(chan os.Signal, 1)
signal.Notify(sigChan, sigs...)
return sigChan
}

View File

@@ -0,0 +1,95 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package devchar
import (
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/sirupsen/logrus"
"golang.org/x/sys/unix"
)
type nodeLister interface {
DeviceNodes() ([]deviceNode, error)
}
type existing struct {
logger *logrus.Logger
driverRoot string
}
// DeviceNodes returns a list of NVIDIA device nodes in the specified root.
// The nvidia-nvswitch* and nvidia-nvlink devices are excluded.
func (m existing) DeviceNodes() ([]deviceNode, error) {
locator := lookup.NewCharDeviceLocator(
lookup.WithLogger(m.logger),
lookup.WithRoot(m.driverRoot),
lookup.WithOptional(true),
)
devices, err := locator.Locate("/dev/nvidia*")
if err != nil {
m.logger.Warnf("Error while locating device: %v", err)
}
capDevices, err := locator.Locate("/dev/nvidia-caps/nvidia-*")
if err != nil {
m.logger.Warnf("Error while locating caps device: %v", err)
}
if len(devices) == 0 && len(capDevices) == 0 {
m.logger.Infof("No NVIDIA devices found in %s", m.driverRoot)
return nil, nil
}
var deviceNodes []deviceNode
for _, d := range append(devices, capDevices...) {
if m.nodeIsBlocked(d) {
continue
}
var stat unix.Stat_t
err := unix.Stat(d, &stat)
if err != nil {
m.logger.Warnf("Could not stat device: %v", err)
continue
}
deviceNode := deviceNode{
path: d,
major: unix.Major(uint64(stat.Rdev)),
minor: unix.Minor(uint64(stat.Rdev)),
}
deviceNodes = append(deviceNodes, deviceNode)
}
return deviceNodes, nil
}
// nodeIsBlocked returns true if the specified device node should be ignored.
func (m existing) nodeIsBlocked(path string) bool {
blockedPrefixes := []string{"nvidia-fs", "nvidia-nvswitch", "nvidia-nvlink"}
nodeName := filepath.Base(path)
for _, prefix := range blockedPrefixes {
if strings.HasPrefix(nodeName, prefix) {
return true
}
}
return false
}

View File

@@ -0,0 +1,49 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package system
import (
devchar "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/system/create-dev-char-symlinks"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type command struct {
logger *logrus.Logger
}
// NewCommand constructs a runtime command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
func (m command) build() *cli.Command {
// Create the 'system' command
system := cli.Command{
Name: "system",
Usage: "A collection of system-related utilities for the NVIDIA Container Toolkit",
}
system.Subcommands = []*cli.Command{
devchar.NewCommand(m.logger),
}
return &system
}

View File

@@ -1,18 +0,0 @@
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false
[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
#no-cgroups = false
#user = "root:video"
ldconfig = "@/sbin/ldconfig"
[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"

View File

@@ -16,3 +16,17 @@ ldconfig = "@/sbin/ldconfig"
[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"
# Specify the runtimes to consider. This list is processed in order and the PATH
# searched for matching executables unless the entry is an absolute path.
runtimes = [
"docker-runc",
"runc",
]
mode = "auto"
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"

View File

@@ -16,3 +16,17 @@ ldconfig = "@/sbin/ldconfig"
[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"
# Specify the runtimes to consider. This list is processed in order and the PATH
# searched for matching executables unless the entry is an absolute path.
runtimes = [
"docker-runc",
"runc",
]
mode = "auto"
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"

View File

@@ -16,3 +16,17 @@ ldconfig = "@/sbin/ldconfig"
[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"
# Specify the runtimes to consider. This list is processed in order and the PATH
# searched for matching executables unless the entry is an absolute path.
runtimes = [
"docker-runc",
"runc",
]
mode = "auto"
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"

View File

@@ -16,3 +16,17 @@ ldconfig = "@/sbin/ldconfig.real"
[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"
# Specify the runtimes to consider. This list is processed in order and the PATH
# searched for matching executables unless the entry is an absolute path.
runtimes = [
"docker-runc",
"runc",
]
mode = "auto"
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"

View File

@@ -1,65 +0,0 @@
ARG BASEIMAGE
FROM ${BASEIMAGE}
RUN yum install -y \
ca-certificates \
wget \
git \
rpm-build \
make && \
rm -rf /var/cache/yum/*
ARG GOLANG_VERSION=0.0.0
RUN set -eux; \
\
arch="$(uname -m)"; \
case "${arch##*-}" in \
x86_64 | amd64) ARCH='amd64' ;; \
ppc64el | ppc64le) ARCH='ppc64le' ;; \
aarch64) ARCH='arm64' ;; \
*) echo "unsupported architecture"; exit 1 ;; \
esac; \
wget -nv -O - https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-${ARCH}.tar.gz \
| tar -C /usr/local -xz
ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
# packaging
ARG PKG_VERS
ARG PKG_REV
ENV VERSION $PKG_VERS
ENV RELEASE $PKG_REV
# output directory
ENV DIST_DIR=/tmp/nvidia-container-toolkit-$PKG_VERS/SOURCES
RUN mkdir -p $DIST_DIR /dist
# nvidia-container-toolkit
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
COPY config/config.toml.amzn $DIST_DIR/config.toml
# Hook for Project Atomic's fork of Docker: https://github.com/projectatomic/docker/tree/docker-1.13.1-rhel#add-dockerhooks-exec-custom-hooks-for-prestartpoststop-containerspatch
# This might not be useful on Amazon Linux, but it's simpler to keep the RHEL
# and Amazon Linux packages identical.
COPY oci-nvidia-hook $DIST_DIR/oci-nvidia-hook
# Hook for libpod/CRI-O: https://github.com/containers/libpod/blob/v0.8.5/pkg/hooks/docs/oci-hooks.5.md
COPY oci-nvidia-hook.json $DIST_DIR/oci-nvidia-hook.json
WORKDIR $DIST_DIR/..
COPY packaging/rpm .
CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
-D "_topdir $PWD" \
-D "version $VERSION" \
-D "release $RELEASE" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist

View File

@@ -32,6 +32,7 @@ ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
# packaging
ARG PKG_NAME
ARG PKG_VERS
ARG PKG_REV
@@ -48,10 +49,13 @@ RUN mkdir -p $DIST_DIR /dist
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
ARG GIT_COMMIT
ENV GIT_COMMIT ${GIT_COMMIT}
RUN make PREFIX=${DIST_DIR} cmds
COPY config/config.toml.debian $DIST_DIR/config.toml
ARG CONFIG_TOML_SUFFIX
ENV CONFIG_TOML_SUFFIX ${CONFIG_TOML_SUFFIX}
COPY config/config.toml.${CONFIG_TOML_SUFFIX} $DIST_DIR/config.toml
# Debian Jessie still had ldconfig.real
RUN if [ "$(lsb_release -cs)" = "jessie" ]; then \
@@ -61,9 +65,17 @@ RUN if [ "$(lsb_release -cs)" = "jessie" ]; then \
WORKDIR $DIST_DIR
COPY packaging/debian ./debian
RUN sed -i "s;@VERSION@;${REVISION};" debian/changelog && \
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
RUN dch --create --package="${PKG_NAME}" \
--newversion "${REVISION}" \
"See https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/blob/${GIT_COMMIT}/CHANGELOG.md for the changelog" && \
dch --append "Bump libnvidia-container dependency to ${LIBNVIDIA_CONTAINER1_VERSION}" && \
dch -r "" && \
if [ "$REVISION" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
CMD export DISTRIB="$(lsb_release -cs)" && \
debuild -eDISTRIB -eSECTION --dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
mv /tmp/nvidia-container-toolkit_*.deb /dist
debuild -eDISTRIB -eSECTION -eLIBNVIDIA_CONTAINER_TOOLS_VERSION -eVERSION="${REVISION}" \
--dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
mv /tmp/*.deb /dist

21
docker/Dockerfile.devel Normal file
View File

@@ -0,0 +1,21 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG GOLANG_VERSION=x.x.x
FROM golang:${GOLANG_VERSION}
RUN go install golang.org/x/lint/golint@6edffad5e6160f5949cdefc81710b2706fbcd4f6
RUN go install github.com/matryer/moq@latest
RUN go install github.com/gordonklaus/ineffassign@d2c82e48359b033cde9cf1307f6d5550b8d61321
RUN go install github.com/client9/misspell/cmd/misspell@latest
RUN go install github.com/google/go-licenses@latest

View File

@@ -25,11 +25,12 @@ ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
# packaging
ARG PKG_NAME
ARG PKG_VERS
ARG PKG_REV
ENV VERSION $PKG_VERS
ENV RELEASE $PKG_REV
ENV PKG_NAME ${PKG_NAME}
ENV PKG_VERS ${PKG_VERS}
ENV PKG_REV ${PKG_REV}
# output directory
ENV DIST_DIR=/tmp/nvidia-container-toolkit-$PKG_VERS/SOURCES
@@ -39,8 +40,9 @@ RUN mkdir -p $DIST_DIR /dist
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
ARG GIT_COMMIT
ENV GIT_COMMIT ${GIT_COMMIT}
RUN make PREFIX=${DIST_DIR} cmds
# Hook for Project Atomic's fork of Docker: https://github.com/projectatomic/docker/tree/docker-1.13.1-rhel#add-dockerhooks-exec-custom-hooks-for-prestartpoststop-containerspatch
COPY oci-nvidia-hook $DIST_DIR/oci-nvidia-hook
@@ -48,15 +50,23 @@ COPY oci-nvidia-hook $DIST_DIR/oci-nvidia-hook
# Hook for libpod/CRI-O: https://github.com/containers/libpod/blob/v0.8.5/pkg/hooks/docs/oci-hooks.5.md
COPY oci-nvidia-hook.json $DIST_DIR/oci-nvidia-hook.json
COPY config/config.toml.opensuse-leap $DIST_DIR/config.toml
ARG CONFIG_TOML_SUFFIX
ENV CONFIG_TOML_SUFFIX ${CONFIG_TOML_SUFFIX}
COPY config/config.toml.${CONFIG_TOML_SUFFIX} $DIST_DIR/config.toml
WORKDIR $DIST_DIR/..
COPY packaging/rpm .
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
-D "_topdir $PWD" \
-D "version $VERSION" \
-D "release $RELEASE" \
-D "release_date $(date +'%a %b %d %Y')" \
-D "git_commit ${GIT_COMMIT}" \
-D "version ${PKG_VERS}" \
-D "libnvidia_container_tools_version ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}" \
-D "release ${PKG_REV}" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist

View File

@@ -1,8 +1,25 @@
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This is the dockerfile for building packages on yum-based RPM systems.
ARG BASEIMAGE
FROM ${BASEIMAGE}
RUN yum install -y \
ca-certificates \
gcc \
wget \
git \
make \
@@ -26,11 +43,12 @@ ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
# packaging
ARG PKG_NAME
ARG PKG_VERS
ARG PKG_REV
ENV VERSION $PKG_VERS
ENV RELEASE $PKG_REV
ENV PKG_NAME ${PKG_NAME}
ENV PKG_VERS ${PKG_VERS}
ENV PKG_REV ${PKG_REV}
# output directory
ENV DIST_DIR=/tmp/nvidia-container-toolkit-$PKG_VERS/SOURCES
@@ -40,10 +58,13 @@ RUN mkdir -p $DIST_DIR /dist
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
ARG GIT_COMMIT
ENV GIT_COMMIT ${GIT_COMMIT}
RUN make PREFIX=${DIST_DIR} cmds
COPY config/config.toml.centos $DIST_DIR/config.toml
ARG CONFIG_TOML_SUFFIX
ENV CONFIG_TOML_SUFFIX ${CONFIG_TOML_SUFFIX}
COPY config/config.toml.${CONFIG_TOML_SUFFIX} $DIST_DIR/config.toml
# Hook for Project Atomic's fork of Docker: https://github.com/projectatomic/docker/tree/docker-1.13.1-rhel#add-dockerhooks-exec-custom-hooks-for-prestartpoststop-containerspatch
COPY oci-nvidia-hook $DIST_DIR/oci-nvidia-hook
@@ -54,10 +75,16 @@ COPY oci-nvidia-hook.json $DIST_DIR/oci-nvidia-hook.json
WORKDIR $DIST_DIR/..
COPY packaging/rpm .
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
-D "_topdir $PWD" \
-D "version $VERSION" \
-D "release $RELEASE" \
-D "release_date $(date +'%a %b %d %Y')" \
-D "git_commit ${GIT_COMMIT}" \
-D "version ${PKG_VERS}" \
-D "libnvidia_container_tools_version ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}" \
-D "release ${PKG_REV}" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist

View File

@@ -30,6 +30,7 @@ ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
# packaging
ARG PKG_NAME
ARG PKG_VERS
ARG PKG_REV
@@ -46,17 +47,28 @@ RUN mkdir -p $DIST_DIR /dist
WORKDIR $GOPATH/src/nvidia-container-toolkit
COPY . .
RUN make binary && \
mv ./nvidia-container-toolkit $DIST_DIR/nvidia-container-toolkit
ARG GIT_COMMIT
ENV GIT_COMMIT ${GIT_COMMIT}
RUN make PREFIX=${DIST_DIR} cmds
COPY config/config.toml.ubuntu $DIST_DIR/config.toml
ARG CONFIG_TOML_SUFFIX
ENV CONFIG_TOML_SUFFIX ${CONFIG_TOML_SUFFIX}
COPY config/config.toml.${CONFIG_TOML_SUFFIX} $DIST_DIR/config.toml
WORKDIR $DIST_DIR
COPY packaging/debian ./debian
RUN sed -i "s;@VERSION@;${REVISION};" debian/changelog && \
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
RUN dch --create --package="${PKG_NAME}" \
--newversion "${REVISION}" \
"See https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/blob/${GIT_COMMIT}/CHANGELOG.md for the changelog" && \
dch --append "Bump libnvidia-container dependency to ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}" && \
dch -r "" && \
if [ "$REVISION" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
CMD export DISTRIB="$(lsb_release -cs)" && \
debuild -eDISTRIB -eSECTION --dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
debuild -eDISTRIB -eSECTION -eLIBNVIDIA_CONTAINER_TOOLS_VERSION -eVERSION="${REVISION}" \
--dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
mv /tmp/*.deb /dist

View File

@@ -1,11 +1,23 @@
# Copyright (c) 2017-2020, NVIDIA CORPORATION. All rights reserved.
# Copyright (c) 2017-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Supported OSs by architecture
AMD64_TARGETS := ubuntu20.04 ubuntu18.04 ubuntu16.04 debian10 debian9
X86_64_TARGETS := centos7 centos8 rhel7 rhel8 amazonlinux1 amazonlinux2 opensuse-leap15.1
X86_64_TARGETS := fedora35 centos7 centos8 rhel7 rhel8 amazonlinux2 opensuse-leap15.1
PPC64LE_TARGETS := ubuntu18.04 ubuntu16.04 centos7 centos8 rhel7 rhel8
ARM64_TARGETS := ubuntu20.04 ubuntu18.04
AARCH64_TARGETS := centos8 rhel8
AARCH64_TARGETS := fedora35 centos8 rhel8 amazonlinux2
# Define top-level build targets
docker%: SHELL:=/bin/bash
@@ -73,48 +85,71 @@ docker-all: $(AMD64_TARGETS) $(X86_64_TARGETS) \
--%: docker-build-%
@
LIBNVIDIA_CONTAINER_VERSION ?= $(LIB_VERSION)
LIBNVIDIA_CONTAINER_TAG ?= $(LIB_TAG)
LIBNVIDIA_CONTAINER_TOOLS_VERSION := $(LIBNVIDIA_CONTAINER_VERSION)$(if $(LIBNVIDIA_CONTAINER_TAG),~$(LIBNVIDIA_CONTAINER_TAG))-1
# private ubuntu target
--ubuntu%: OS := ubuntu
--ubuntu%: LIB_VERSION := $(LIB_VERSION)$(if $(LIB_TAG),~$(LIB_TAG))
--ubuntu%: PKG_REV := 1
# private debian target
--debian%: OS := debian
--debian%: LIB_VERSION := $(LIB_VERSION)$(if $(LIB_TAG),~$(LIB_TAG))
--debian%: PKG_REV := 1
# private centos target
--centos%: OS := centos
--centos%: PKG_REV := $(if $(LIB_TAG),0.1.$(LIB_TAG),2)
--centos%: DOCKERFILE = $(CURDIR)/docker/Dockerfile.rpm-yum
--centos%: CONFIG_TOML_SUFFIX := rpm-yum
--centos8%: BASEIMAGE = quay.io/centos/centos:stream8
# private fedora target
--fedora%: OS := fedora
--fedora%: DOCKERFILE = $(CURDIR)/docker/Dockerfile.rpm-yum
--fedora%: CONFIG_TOML_SUFFIX := rpm-yum
# The fedora(35) base image has very slow performance when building aarch64 packages.
# Since our primary concern here is glibc versions, we use the older glibc version available in centos8.
--fedora35%: BASEIMAGE = quay.io/centos/centos:stream8
# private amazonlinux target
--amazonlinux%: OS := amazonlinux
--amazonlinux%: PKG_REV = $(if $(LIB_TAG),0.1.$(LIB_TAG).amzn$(VERSION),2.amzn$(VERSION))
--amazonlinux%: DOCKERFILE = $(CURDIR)/docker/Dockerfile.rpm-yum
--amazonlinux%: CONFIG_TOML_SUFFIX := rpm-yum
# private opensuse-leap target
--opensuse-leap%: OS = opensuse-leap
--opensuse-leap%: BASEIMAGE = opensuse/leap:$(VERSION)
--opensuse-leap%: PKG_REV := $(if $(LIB_TAG),0.1.$(LIB_TAG),1)
# private rhel target (actually built on centos)
--rhel%: OS := centos
--rhel%: PKG_REV := $(if $(LIB_TAG),0.1.$(LIB_TAG),2)
--rhel%: VERSION = $(patsubst rhel%-$(ARCH),%,$(TARGET_PLATFORM))
--rhel%: ARTIFACTS_DIR = $(DIST_DIR)/rhel$(VERSION)/$(ARCH)
--rhel%: DOCKERFILE = $(CURDIR)/docker/Dockerfile.rpm-yum
--rhel%: CONFIG_TOML_SUFFIX := rpm-yum
--rhel8%: BASEIMAGE = quay.io/centos/centos:stream8
# We allow the CONFIG_TOML_SUFFIX to be overridden.
CONFIG_TOML_SUFFIX ?= $(OS)
docker-build-%:
@echo "Building for $(TARGET_PLATFORM)"
docker pull --platform=linux/$(ARCH) $(BASEIMAGE)
DOCKER_BUILDKIT=1 \
$(DOCKER) build \
--platform=linux/$(ARCH) \
--progress=plain \
--build-arg BASEIMAGE=$(BASEIMAGE) \
--build-arg BASEIMAGE="$(BASEIMAGE)" \
--build-arg GOLANG_VERSION="$(GOLANG_VERSION)" \
--build-arg PKG_VERS="$(LIB_VERSION)" \
--build-arg PKG_REV="$(PKG_REV)" \
--build-arg PKG_NAME="$(LIB_NAME)" \
--build-arg PKG_VERS="$(PACKAGE_VERSION)" \
--build-arg PKG_REV="$(PACKAGE_REVISION)" \
--build-arg LIBNVIDIA_CONTAINER_TOOLS_VERSION="$(LIBNVIDIA_CONTAINER_TOOLS_VERSION)" \
--build-arg CONFIG_TOML_SUFFIX="$(CONFIG_TOML_SUFFIX)" \
--build-arg GIT_COMMIT="$(GIT_COMMIT)" \
--tag $(BUILDIMAGE) \
--file $(DOCKERFILE) .
$(DOCKER) run \
--platform=linux/$(ARCH) \
-e DISTRIB \
-e SECTION \
-v $(ARTIFACTS_DIR):/dist \

35
go.mod
View File

@@ -1,9 +1,36 @@
module github.com/NVIDIA/nvidia-container-toolkit
go 1.14
go 1.18
require (
github.com/BurntSushi/toml v0.3.1
github.com/stretchr/testify v1.6.0
golang.org/x/mod v0.3.0
github.com/BurntSushi/toml v1.0.0
github.com/NVIDIA/go-nvml v0.12.0-0
github.com/container-orchestrated-devices/container-device-interface v0.5.4-0.20230111111500-5b3b5d81179a
github.com/fsnotify/fsnotify v1.5.4
github.com/opencontainers/runtime-spec v1.0.3-0.20220825212826-86290f6a00fb
github.com/pelletier/go-toml v1.9.4
github.com/sirupsen/logrus v1.9.0
github.com/stretchr/testify v1.7.0
github.com/urfave/cli/v2 v2.3.0
gitlab.com/nvidia/cloud-native/go-nvlib v0.0.0-20230209143738-95328d8c4438
golang.org/x/mod v0.5.0
golang.org/x/sys v0.0.0-20220927170352-d9d178bc13c6
sigs.k8s.io/yaml v1.3.0
)
require (
github.com/cpuguy83/go-md2man/v2 v2.0.1 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/hashicorp/errwrap v1.1.0 // indirect
github.com/kr/text v0.2.0 // indirect
github.com/opencontainers/runc v1.1.4 // indirect
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626 // indirect
github.com/opencontainers/selinux v1.10.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/russross/blackfriday/v2 v2.1.0 // indirect
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635 // indirect
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb // indirect
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)

138
go.sum
View File

@@ -1,25 +1,127 @@
github.com/BurntSushi/toml v0.3.1 h1:WXkYYl6Yr3qBf1K79EBnL4mak0OimBfB0XUf9Vl28OQ=
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
github.com/davecgh/go-spew v1.1.0 h1:ZDRjVQ15GmhC3fiQ8ni8+OwkZQO4DARzQgrnXU1Liz8=
github.com/BurntSushi/toml v1.0.0 h1:dtDWrepsVPfW9H/4y7dDgFc2MBUSeJhlaDtK13CxFlU=
github.com/BurntSushi/toml v1.0.0/go.mod h1:CxXYINrC8qIiEnFrOxCa7Jy5BFHlXnUU2pbicEuybxQ=
github.com/NVIDIA/go-nvml v0.11.6-0.0.20220823120812-7e2082095e82 h1:x751Xx1tdxkiA/sdkv2J769n21UbYKzVOpe9S/h1M3k=
github.com/NVIDIA/go-nvml v0.11.6-0.0.20220823120812-7e2082095e82/go.mod h1:hy7HYeQy335x6nEss0Ne3PYqleRa6Ct+VKD9RQ4nyFs=
github.com/NVIDIA/go-nvml v0.12.0-0 h1:eHYNHbzAsMgWYshf6dEmTY66/GCXnORJFnzm3TNH4mc=
github.com/NVIDIA/go-nvml v0.12.0-0/go.mod h1:hy7HYeQy335x6nEss0Ne3PYqleRa6Ct+VKD9RQ4nyFs=
github.com/blang/semver/v4 v4.0.0 h1:1PFHFE6yCCTv8C1TeyNNarDzntLi7wMI5i/pzqYIsAM=
github.com/blang/semver/v4 v4.0.0/go.mod h1:IbckMUScFkM3pff0VJDNKRiT6TG/YpiHIM2yvyW5YoQ=
github.com/checkpoint-restore/go-criu/v5 v5.3.0/go.mod h1:E/eQpaFtUKGOOSEBZgmKAcn+zUUwWxqcaKZlF54wK8E=
github.com/cilium/ebpf v0.7.0/go.mod h1:/oI2+1shJiTGAMgl6/RgJr36Eo1jzrRcAWbcXO2usCA=
github.com/container-orchestrated-devices/container-device-interface v0.5.4-0.20230111111500-5b3b5d81179a h1:sP3PcgyIkRlHqfF3Jfpe/7G8kf/qpzG4C8r94y9hLbE=
github.com/container-orchestrated-devices/container-device-interface v0.5.4-0.20230111111500-5b3b5d81179a/go.mod h1:xMRa4fJgXzSDFUCURSimOUgoSc+odohvO3uXT9xjqH0=
github.com/containerd/console v1.0.3/go.mod h1:7LqA/THxQ86k76b8c/EMSiaJ3h1eZkMkXar0TQ1gf3U=
github.com/coreos/go-systemd/v22 v22.3.2/go.mod h1:Y58oyj3AT4RCenI/lSvhwexgC+NSVTIJ3seZv2GcEnc=
github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU=
github.com/cpuguy83/go-md2man/v2 v2.0.1 h1:r/myEWzV9lfsM1tFLgDyu0atFtJ1fXn261LKYj/3DxU=
github.com/cpuguy83/go-md2man/v2 v2.0.1/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
github.com/cyphar/filepath-securejoin v0.2.3/go.mod h1:aPGpWjXOXUn2NCNjFvBE6aRxGGx79pTxQpKOJNYHHl4=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/docker/go-units v0.4.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/frankban/quicktest v1.11.3/go.mod h1:wRf/ReqHper53s+kmmSZizM8NamnL3IM0I9ntUbOk+k=
github.com/fsnotify/fsnotify v1.5.4 h1:jRbGcIw6P2Meqdwuo0H1p6JVLbL5DHKAKlYndzMwVZI=
github.com/fsnotify/fsnotify v1.5.4/go.mod h1:OVB6XrOHzAwXMpEM7uPOzcehqUV2UqJxmVXmkdnm1bU=
github.com/godbus/dbus/v5 v5.0.4/go.mod h1:xhWf0FNVPg57R7Z0UbKHbJfkEywrmjJnf7w5xrFpKfA=
github.com/godbus/dbus/v5 v5.0.6/go.mod h1:xhWf0FNVPg57R7Z0UbKHbJfkEywrmjJnf7w5xrFpKfA=
github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaSAoJOfIk=
github.com/google/go-cmp v0.5.4/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/uuid v1.3.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/hashicorp/errwrap v1.0.0/go.mod h1:YH+1FKiLXxHSkmPseP+kNlulaMuP3n2brvKWEqk/Jc4=
github.com/hashicorp/errwrap v1.1.0 h1:OxrOeh75EUXMY8TBjag2fzXGZ40LB6IKw45YeGUDY2I=
github.com/hashicorp/errwrap v1.1.0/go.mod h1:YH+1FKiLXxHSkmPseP+kNlulaMuP3n2brvKWEqk/Jc4=
github.com/hashicorp/go-multierror v1.1.1 h1:H5DkEtf6CXdFp0N0Em5UCwQpXMWke8IA0+lD48awMYo=
github.com/hashicorp/go-multierror v1.1.1/go.mod h1:iw975J/qwKPdAO1clOe2L8331t/9/fmwbPZ6JB6eMoM=
github.com/kr/pretty v0.2.1 h1:Fmg33tUaq4/8ym9TJN1x7sLJnHVwhP33CNkpYV/7rwI=
github.com/kr/pretty v0.2.1/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI=
github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/mndrix/tap-go v0.0.0-20171203230836-629fa407e90b/go.mod h1:pzzDgJWZ34fGzaAZGFW22KVZDfyrYW+QABMrWnJBnSs=
github.com/moby/sys/mountinfo v0.5.0/go.mod h1:3bMD3Rg+zkqx8MRYPi7Pyb0Ie97QEBmdxbhnCLlSvSU=
github.com/mrunalp/fileutils v0.5.0/go.mod h1:M1WthSahJixYnrXQl/DFQuteStB1weuxD2QJNHXfbSQ=
github.com/opencontainers/runc v1.1.4 h1:nRCz/8sKg6K6jgYAFLDlXzPeITBZJyX28DBVhWD+5dg=
github.com/opencontainers/runc v1.1.4/go.mod h1:1J5XiS+vdZ3wCyZybsuxXZWGrgSr8fFJHLXuG2PsnNg=
github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/runtime-spec v1.0.3-0.20220825212826-86290f6a00fb h1:1xSVPOd7/UA+39/hXEGnBJ13p6JFB0E1EvQFlrRDOXI=
github.com/opencontainers/runtime-spec v1.0.3-0.20220825212826-86290f6a00fb/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626 h1:DmNGcqH3WDbV5k8OJ+esPWbqUOX5rMLR2PMvziDMJi0=
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626/go.mod h1:BRHJJd0E+cx42OybVYSgUvZmU0B8P9gZuRXlZUP7TKI=
github.com/opencontainers/selinux v1.9.1/go.mod h1:2i0OySw99QjzBBQByd1Gr9gSjvuho1lHsJxIJ3gGbJI=
github.com/opencontainers/selinux v1.10.0/go.mod h1:2i0OySw99QjzBBQByd1Gr9gSjvuho1lHsJxIJ3gGbJI=
github.com/opencontainers/selinux v1.10.1 h1:09LIPVRP3uuZGQvgR+SgMSNBd1Eb3vlRbGqQpoHsF8w=
github.com/opencontainers/selinux v1.10.1/go.mod h1:2i0OySw99QjzBBQByd1Gr9gSjvuho1lHsJxIJ3gGbJI=
github.com/pelletier/go-toml v1.9.4 h1:tjENF6MfZAg8e4ZmZTeWaWiT2vXtsoO6+iuOjFhECwM=
github.com/pelletier/go-toml v1.9.4/go.mod h1:u1nR/EPcESfeI/szUZKdtJ0xRNbUoANCkoOuaOx1Y+c=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/seccomp/libseccomp-golang v0.9.2-0.20220502022130-f33da4d89646/go.mod h1:JA8cRccbGaA1s33RQf7Y1+q9gHmZX1yB/z9WDN1C6fg=
github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc=
github.com/sirupsen/logrus v1.8.1/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0=
github.com/sirupsen/logrus v1.9.0 h1:trlNQbNUG3OdDrDil03MCb1H2o9nJ1x4/5LYw7byDE0=
github.com/sirupsen/logrus v1.9.0/go.mod h1:naHLuLoDiP4jHNo9R0sCBMtWGeIprob74mVsIT4qYEQ=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/testify v1.6.0 h1:jlIyCplCJFULU/01vCkhKuTyc3OorI3bJFuw6obfgho=
github.com/stretchr/testify v1.6.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/mod v0.3.0 h1:RM4zey1++hCTbCVQfnWeKs9/IEsaBLA8vTkd0WVtmH4=
golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.7.0 h1:nwc3DEeHmmLAfoZucVR881uASk0Mfjw8xYJ99tb5CcY=
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635 h1:kdXcSzyDtseVEc4yCz2qF8ZrQvIDBJLl4S1c3GCXmoI=
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635/go.mod h1:hkRG7XYTFWNJGYcbNJQlaLq0fg1yr4J4t/NcTQtrfww=
github.com/urfave/cli v1.19.1/go.mod h1:70zkFmudgCuE/ngEzBv17Jvp/497gISqfk5gWijbERA=
github.com/urfave/cli v1.22.1/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0=
github.com/urfave/cli/v2 v2.3.0 h1:qph92Y649prgesehzOrQjdWyxFOp/QVM+6imKHad91M=
github.com/urfave/cli/v2 v2.3.0/go.mod h1:LJmUH05zAU44vOAcrfzZQKsZbVcdbOG8rtL3/XcUArI=
github.com/vishvananda/netlink v1.1.0/go.mod h1:cTgwzPIzzgDAYoQrMm0EdrjRUBkTqKYppBueQtXaqoE=
github.com/vishvananda/netns v0.0.0-20191106174202-0a2b9b5464df/go.mod h1:JP3t17pCcGlemwknint6hfoeCVQrEMVwxRLRjXpq+BU=
github.com/xeipuuv/gojsonpointer v0.0.0-20180127040702-4e3ac2762d5f/go.mod h1:N2zxlSyiKSe5eX1tZViRH5QA0qijqEDrYZiPEAiq3wU=
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb h1:zGWFAtiMcyryUHoUjUJX0/lt1H2+i2Ka2n+D3DImSNo=
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb/go.mod h1:N2zxlSyiKSe5eX1tZViRH5QA0qijqEDrYZiPEAiq3wU=
github.com/xeipuuv/gojsonreference v0.0.0-20180127040603-bd5ef7bd5415 h1:EzJWgHovont7NscjpAxXsDA8S8BMYve8Y5+7cuRE7R0=
github.com/xeipuuv/gojsonreference v0.0.0-20180127040603-bd5ef7bd5415/go.mod h1:GwrjFmJcFw6At/Gs6z4yjiIwzuJ1/+UwLxMQDVQXShQ=
github.com/xeipuuv/gojsonschema v1.2.0 h1:LhYJRs+L4fBtjZUfuSZIKGeVu0QRy8e5Xi7D17UxZ74=
github.com/xeipuuv/gojsonschema v1.2.0/go.mod h1:anYRn/JVcOK2ZgGU+IjEV4nwlhoK5sQluxsYJ78Id3Y=
gitlab.com/nvidia/cloud-native/go-nvlib v0.0.0-20230119114711-6fe07bb33342 h1:083n9fJt2dWOpJd/X/q9Xgl5XtQLL22uSFYbzVqJssg=
gitlab.com/nvidia/cloud-native/go-nvlib v0.0.0-20230119114711-6fe07bb33342/go.mod h1:GStidGxhaqJhYFW1YpOnLvYCbL2EsM0od7IW4u7+JgU=
gitlab.com/nvidia/cloud-native/go-nvlib v0.0.0-20230209143738-95328d8c4438 h1:+qRai7XRl8omFQVCeHcaWzL542Yw64vfmuXG+79ZCIc=
gitlab.com/nvidia/cloud-native/go-nvlib v0.0.0-20230209143738-95328d8c4438/go.mod h1:GStidGxhaqJhYFW1YpOnLvYCbL2EsM0od7IW4u7+JgU=
golang.org/x/mod v0.5.0 h1:UG21uOlmZabA4fW5i7ZX6bjw1xELEGg/ZLgZq9auk/Q=
golang.org/x/mod v0.5.0/go.mod h1:5OXOZSfqPIIbmVBIIKWRFfZjPR0E5r58TLhUjH0a2Ro=
golang.org/x/net v0.0.0-20201224014010-6772e930b67b/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/sys v0.0.0-20190606203320-7fc4e5ec1444/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191115151921-52ab43148777/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210124154548-22da62e12c0c/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210906170528-6f6e22806c34/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20211025201205-69cdffdb9359/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20211116061358-0a5406a5449c/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220412211240-33da011f77ad/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220715151400-c0bba94af5f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220927170352-d9d178bc13c6 h1:cy1ko5847T/lJ45eyg/7uLprIE/amW5IXxGtEnQdYMI=
golang.org/x/sys v0.0.0-20220927170352-d9d178bc13c6/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw=
google.golang.org/protobuf v1.27.1/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c h1:dUUwHk2QECo/6vqA44rthZ8ie2QXMNeKRTHCNY2nXvo=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.3/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
sigs.k8s.io/yaml v1.3.0 h1:a2VclLzOGrwOHDiV8EfBGhvjHvP46CtW5j6POvhYGGo=
sigs.k8s.io/yaml v1.3.0/go.mod h1:GeOyir5tyXNByN85N/dRIT9es5UQNerPYEKK56eTBm8=

48
internal/config/cli.go Normal file
View File

@@ -0,0 +1,48 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package config
import (
"github.com/pelletier/go-toml"
)
// ContainerCLIConfig stores the options for the nvidia-container-cli
type ContainerCLIConfig struct {
Root string
}
// getContainerCLIConfigFrom reads the nvidia container runtime config from the specified toml Tree.
func getContainerCLIConfigFrom(toml *toml.Tree) *ContainerCLIConfig {
cfg := getDefaultContainerCLIConfig()
if toml == nil {
return cfg
}
cfg.Root = toml.GetDefault("nvidia-container-cli.root", cfg.Root).(string)
return cfg
}
// getDefaultContainerCLIConfig defines the default values for the config
func getDefaultContainerCLIConfig() *ContainerCLIConfig {
c := ContainerCLIConfig{
Root: "",
}
return &c
}

126
internal/config/config.go Normal file
View File

@@ -0,0 +1,126 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package config
import (
"fmt"
"io"
"os"
"path"
"github.com/pelletier/go-toml"
)
const (
configOverride = "XDG_CONFIG_HOME"
configFilePath = "nvidia-container-runtime/config.toml"
)
var (
// DefaultExecutableDir specifies the default path to use for executables if they cannot be located in the path.
DefaultExecutableDir = "/usr/bin"
// NVIDIAContainerRuntimeHookExecutable is the executable name for the NVIDIA Container Runtime Hook
NVIDIAContainerRuntimeHookExecutable = "nvidia-container-runtime-hook"
// NVIDIAContainerToolkitExecutable is the executable name for the NVIDIA Container Toolkit (an alias for the NVIDIA Container Runtime Hook)
NVIDIAContainerToolkitExecutable = "nvidia-container-toolkit"
configDir = "/etc/"
)
// Config represents the contents of the config.toml file for the NVIDIA Container Toolkit
// Note: This is currently duplicated by the HookConfig in cmd/nvidia-container-toolkit/hook_config.go
type Config struct {
AcceptEnvvarUnprivileged bool `toml:"accept-nvidia-visible-devices-envvar-when-unprivileged"`
NVIDIAContainerCLIConfig ContainerCLIConfig `toml:"nvidia-container-cli"`
NVIDIACTKConfig CTKConfig `toml:"nvidia-ctk"`
NVIDIAContainerRuntimeConfig RuntimeConfig `toml:"nvidia-container-runtime"`
NVIDIAContainerRuntimeHookConfig RuntimeHookConfig `toml:"nvidia-container-runtime-hook"`
}
// GetConfig sets up the config struct. Values are read from a toml file
// or set via the environment.
func GetConfig() (*Config, error) {
if XDGConfigDir := os.Getenv(configOverride); len(XDGConfigDir) != 0 {
configDir = XDGConfigDir
}
configFilePath := path.Join(configDir, configFilePath)
tomlFile, err := os.Open(configFilePath)
if err != nil {
return getDefaultConfig(), nil
}
defer tomlFile.Close()
cfg, err := loadConfigFrom(tomlFile)
if err != nil {
return nil, fmt.Errorf("failed to read config values: %v", err)
}
return cfg, nil
}
// loadRuntimeConfigFrom reads the config from the specified Reader
func loadConfigFrom(reader io.Reader) (*Config, error) {
toml, err := toml.LoadReader(reader)
if err != nil {
return nil, err
}
return getConfigFrom(toml)
}
// getConfigFrom reads the nvidia container runtime config from the specified toml Tree.
func getConfigFrom(toml *toml.Tree) (*Config, error) {
cfg := getDefaultConfig()
if toml == nil {
return cfg, nil
}
cfg.AcceptEnvvarUnprivileged = toml.GetDefault("accept-nvidia-visible-devices-envvar-when-unprivileged", cfg.AcceptEnvvarUnprivileged).(bool)
cfg.NVIDIAContainerCLIConfig = *getContainerCLIConfigFrom(toml)
cfg.NVIDIACTKConfig = *getCTKConfigFrom(toml)
runtimeConfig, err := getRuntimeConfigFrom(toml)
if err != nil {
return nil, fmt.Errorf("failed to load nvidia-container-runtime config: %v", err)
}
cfg.NVIDIAContainerRuntimeConfig = *runtimeConfig
runtimeHookConfig, err := getRuntimeHookConfigFrom(toml)
if err != nil {
return nil, fmt.Errorf("failed to load nvidia-container-runtime-hook config: %v", err)
}
cfg.NVIDIAContainerRuntimeHookConfig = *runtimeHookConfig
return cfg, nil
}
// getDefaultConfig defines the default values for the config
func getDefaultConfig() *Config {
c := Config{
AcceptEnvvarUnprivileged: true,
NVIDIAContainerCLIConfig: *getDefaultContainerCLIConfig(),
NVIDIACTKConfig: *getDefaultCTKConfig(),
NVIDIAContainerRuntimeConfig: *GetDefaultRuntimeConfig(),
}
return &c
}

View File

@@ -0,0 +1,182 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package config
import (
"io/ioutil"
"os"
"path/filepath"
"strings"
"testing"
"github.com/stretchr/testify/require"
)
func TestGetConfigWithCustomConfig(t *testing.T) {
wd, err := os.Getwd()
require.NoError(t, err)
// By default debug is disabled
contents := []byte("[nvidia-container-runtime]\ndebug = \"/nvidia-container-toolkit.log\"")
testDir := filepath.Join(wd, "test")
filename := filepath.Join(testDir, configFilePath)
os.Setenv(configOverride, testDir)
require.NoError(t, os.MkdirAll(filepath.Dir(filename), 0766))
require.NoError(t, ioutil.WriteFile(filename, contents, 0766))
defer func() { require.NoError(t, os.RemoveAll(testDir)) }()
cfg, err := GetConfig()
require.NoError(t, err)
require.Equal(t, cfg.NVIDIAContainerRuntimeConfig.DebugFilePath, "/nvidia-container-toolkit.log")
}
func TestGetConfig(t *testing.T) {
testCases := []struct {
description string
contents []string
expectedError error
expectedConfig *Config
}{
{
description: "empty config is default",
expectedConfig: &Config{
AcceptEnvvarUnprivileged: true,
NVIDIAContainerCLIConfig: ContainerCLIConfig{
Root: "",
},
NVIDIAContainerRuntimeConfig: RuntimeConfig{
DebugFilePath: "/dev/null",
LogLevel: "info",
Runtimes: []string{"docker-runc", "runc"},
Mode: "auto",
Modes: modesConfig{
CSV: csvModeConfig{
MountSpecPath: "/etc/nvidia-container-runtime/host-files-for-container.d",
},
CDI: cdiModeConfig{
DefaultKind: "nvidia.com/gpu",
},
},
},
NVIDIACTKConfig: CTKConfig{
Path: "nvidia-ctk",
},
},
},
{
description: "config options set inline",
contents: []string{
"accept-nvidia-visible-devices-envvar-when-unprivileged = false",
"nvidia-container-cli.root = \"/bar/baz\"",
"nvidia-container-runtime.debug = \"/foo/bar\"",
"nvidia-container-runtime.experimental = true",
"nvidia-container-runtime.discover-mode = \"not-legacy\"",
"nvidia-container-runtime.log-level = \"debug\"",
"nvidia-container-runtime.runtimes = [\"/some/runtime\",]",
"nvidia-container-runtime.mode = \"not-auto\"",
"nvidia-container-runtime.modes.cdi.default-kind = \"example.vendor.com/device\"",
"nvidia-container-runtime.modes.csv.mount-spec-path = \"/not/etc/nvidia-container-runtime/host-files-for-container.d\"",
"nvidia-ctk.path = \"/foo/bar/nvidia-ctk\"",
},
expectedConfig: &Config{
AcceptEnvvarUnprivileged: false,
NVIDIAContainerCLIConfig: ContainerCLIConfig{
Root: "/bar/baz",
},
NVIDIAContainerRuntimeConfig: RuntimeConfig{
DebugFilePath: "/foo/bar",
LogLevel: "debug",
Runtimes: []string{"/some/runtime"},
Mode: "not-auto",
Modes: modesConfig{
CSV: csvModeConfig{
MountSpecPath: "/not/etc/nvidia-container-runtime/host-files-for-container.d",
},
CDI: cdiModeConfig{
DefaultKind: "example.vendor.com/device",
},
},
},
NVIDIACTKConfig: CTKConfig{
Path: "/foo/bar/nvidia-ctk",
},
},
},
{
description: "config options set in section",
contents: []string{
"accept-nvidia-visible-devices-envvar-when-unprivileged = false",
"[nvidia-container-cli]",
"root = \"/bar/baz\"",
"[nvidia-container-runtime]",
"debug = \"/foo/bar\"",
"experimental = true",
"discover-mode = \"not-legacy\"",
"log-level = \"debug\"",
"runtimes = [\"/some/runtime\",]",
"mode = \"not-auto\"",
"[nvidia-container-runtime.modes.cdi]",
"default-kind = \"example.vendor.com/device\"",
"[nvidia-container-runtime.modes.csv]",
"mount-spec-path = \"/not/etc/nvidia-container-runtime/host-files-for-container.d\"",
"[nvidia-ctk]",
"path = \"/foo/bar/nvidia-ctk\"",
},
expectedConfig: &Config{
AcceptEnvvarUnprivileged: false,
NVIDIAContainerCLIConfig: ContainerCLIConfig{
Root: "/bar/baz",
},
NVIDIAContainerRuntimeConfig: RuntimeConfig{
DebugFilePath: "/foo/bar",
LogLevel: "debug",
Runtimes: []string{"/some/runtime"},
Mode: "not-auto",
Modes: modesConfig{
CSV: csvModeConfig{
MountSpecPath: "/not/etc/nvidia-container-runtime/host-files-for-container.d",
},
CDI: cdiModeConfig{
DefaultKind: "example.vendor.com/device",
},
},
},
NVIDIACTKConfig: CTKConfig{
Path: "/foo/bar/nvidia-ctk",
},
},
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
reader := strings.NewReader(strings.Join(tc.contents, "\n"))
cfg, err := loadConfigFrom(reader)
if tc.expectedError != nil {
require.Error(t, err)
} else {
require.NoError(t, err)
}
require.EqualValues(t, tc.expectedConfig, cfg)
})
}
}

View File

@@ -0,0 +1,25 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package engine
// Interface defines the API for a runtime config updater.
type Interface interface {
DefaultRuntime() string
AddRuntime(string, string, bool) error
RemoveRuntime(string) error
Save(string) (int64, error)
}

View File

@@ -0,0 +1,137 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package containerd
import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/pelletier/go-toml"
)
// ConfigV1 represents a version 1 containerd config
type ConfigV1 Config
var _ engine.Interface = (*ConfigV1)(nil)
// AddRuntime adds a runtime to the containerd config
func (c *ConfigV1) AddRuntime(name string, path string, setAsDefault bool) error {
if c == nil || c.Tree == nil {
return fmt.Errorf("config is nil")
}
config := *c.Tree
config.Set("version", int64(1))
switch runc := config.GetPath([]string{"plugins", "cri", "containerd", "runtimes", "runc"}).(type) {
case *toml.Tree:
runc, _ = toml.Load(runc.String())
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name}, runc)
}
if config.GetPath([]string{"plugins", "cri", "containerd", "runtimes", name}) == nil {
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "runtime_type"}, c.RuntimeType)
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "runtime_root"}, "")
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "runtime_engine"}, "")
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "privileged_without_host_devices"}, false)
}
cdiAnnotations := []interface{}{"cdi.k8s.io/*"}
containerAnnotations, ok := config.GetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "container_annotations"}).([]interface{})
if ok && containerAnnotations != nil {
cdiAnnotations = append(containerAnnotations, cdiAnnotations...)
}
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "container_annotations"}, cdiAnnotations)
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "options", "BinaryName"}, path)
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "options", "Runtime"}, path)
if setAsDefault && c.UseDefaultRuntimeName {
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime_name"}, name)
} else if setAsDefault {
// Note: This is deprecated in containerd 1.4.0 and will be removed in 1.5.0
if config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime"}) == nil {
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "runtime_type"}, c.RuntimeType)
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "runtime_root"}, "")
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "runtime_engine"}, "")
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "privileged_without_host_devices"}, false)
}
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "options", "BinaryName"}, path)
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "options", "Runtime"}, path)
}
*c.Tree = config
return nil
}
// DefaultRuntime returns the default runtime for the cri-o config
func (c ConfigV1) DefaultRuntime() string {
if runtime, ok := c.GetPath([]string{"plugins", "cri", "containerd", "default_runtime_name"}).(string); ok {
return runtime
}
return ""
}
// RemoveRuntime removes a runtime from the docker config
func (c *ConfigV1) RemoveRuntime(name string) error {
if c == nil || c.Tree == nil {
return nil
}
config := *c.Tree
// If the specified runtime was set as the default runtime we need to remove the default runtime too.
runtimePath, ok := config.GetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "options", "BinaryName"}).(string)
if !ok || runtimePath == "" {
runtimePath, _ = config.GetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "options", "Runtime"}).(string)
}
defaultRuntimePath, ok := config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime", "options", "BinaryName"}).(string)
if !ok || defaultRuntimePath == "" {
defaultRuntimePath, _ = config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime", "options", "Runtime"}).(string)
}
if runtimePath != "" && defaultRuntimePath != "" && runtimePath == defaultRuntimePath {
config.DeletePath([]string{"plugins", "cri", "containerd", "default_runtime"})
}
config.DeletePath([]string{"plugins", "cri", "containerd", "runtimes", name})
if runtime, ok := config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime_name"}).(string); ok {
if runtime == name {
config.DeletePath([]string{"plugins", "cri", "containerd", "default_runtime_name"})
}
}
runtimeConfigPath := []string{"plugins", "cri", "containerd", "runtimes", name}
for i := 0; i < len(runtimeConfigPath); i++ {
if runtimes, ok := config.GetPath(runtimeConfigPath[:len(runtimeConfigPath)-i]).(*toml.Tree); ok {
if len(runtimes.Keys()) == 0 {
config.DeletePath(runtimeConfigPath[:len(runtimeConfigPath)-i])
}
}
}
if len(config.Keys()) == 1 && config.Keys()[0] == "version" {
config.Delete("version")
}
*c.Tree = config
return nil
}
// Save wrotes the config to a file
func (c ConfigV1) Save(path string) (int64, error) {
return (Config)(c).Save(path)
}

View File

@@ -0,0 +1,133 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package containerd
import (
"fmt"
"os"
"github.com/pelletier/go-toml"
)
// AddRuntime adds a runtime to the containerd config
func (c *Config) AddRuntime(name string, path string, setAsDefault bool) error {
if c == nil || c.Tree == nil {
return fmt.Errorf("config is nil")
}
config := *c.Tree
config.Set("version", int64(2))
switch runc := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", "runc"}).(type) {
case *toml.Tree:
runc, _ = toml.Load(runc.String())
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name}, runc)
}
if config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name}) == nil {
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "runtime_type"}, c.RuntimeType)
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "runtime_root"}, "")
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "runtime_engine"}, "")
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "privileged_without_host_devices"}, false)
}
cdiAnnotations := []interface{}{"cdi.k8s.io/*"}
containerAnnotations, ok := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "container_annotations"}).([]interface{})
if ok && containerAnnotations != nil {
cdiAnnotations = append(containerAnnotations, cdiAnnotations...)
}
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "container_annotations"}, cdiAnnotations)
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "options", "BinaryName"}, path)
if setAsDefault {
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"}, name)
}
*c.Tree = config
return nil
}
// DefaultRuntime returns the default runtime for the cri-o config
func (c Config) DefaultRuntime() string {
if runtime, ok := c.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"}).(string); ok {
return runtime
}
return ""
}
// RemoveRuntime removes a runtime from the docker config
func (c *Config) RemoveRuntime(name string) error {
if c == nil || c.Tree == nil {
return nil
}
config := *c.Tree
config.DeletePath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name})
if runtime, ok := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"}).(string); ok {
if runtime == name {
config.DeletePath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"})
}
}
runtimePath := []string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name}
for i := 0; i < len(runtimePath); i++ {
if runtimes, ok := config.GetPath(runtimePath[:len(runtimePath)-i]).(*toml.Tree); ok {
if len(runtimes.Keys()) == 0 {
config.DeletePath(runtimePath[:len(runtimePath)-i])
}
}
}
if len(config.Keys()) == 1 && config.Keys()[0] == "version" {
config.Delete("version")
}
*c.Tree = config
return nil
}
// Save writes the config to the specified path
func (c Config) Save(path string) (int64, error) {
config := c.Tree
output, err := config.ToTomlString()
if err != nil {
return 0, fmt.Errorf("unable to convert to TOML: %v", err)
}
if len(output) == 0 {
err := os.Remove(path)
if err != nil {
return 0, fmt.Errorf("unable to remove empty file: %v", err)
}
return 0, nil
}
f, err := os.Create(path)
if err != nil {
return 0, fmt.Errorf("unable to open '%v' for writing: %v", path, err)
}
defer f.Close()
n, err := f.WriteString(output)
if err != nil {
return 0, fmt.Errorf("unable to write output: %v", err)
}
return int64(n), err
}

View File

@@ -0,0 +1,39 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package containerd
import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/pelletier/go-toml"
)
// Config represents the containerd config
type Config struct {
*toml.Tree
RuntimeType string
UseDefaultRuntimeName bool
}
// New creates a containerd config with the specified options
func New(opts ...Option) (engine.Interface, error) {
b := &builder{}
for _, opt := range opts {
opt(b)
}
return b.build()
}

View File

@@ -0,0 +1,140 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package containerd
import (
"fmt"
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
)
const (
defaultRuntimeType = "io.containerd.runc.v2"
)
type builder struct {
path string
runtimeType string
useLegacyConfig bool
}
// Option defines a function that can be used to configure the config builder
type Option func(*builder)
// WithPath sets the path for the config builder
func WithPath(path string) Option {
return func(b *builder) {
b.path = path
}
}
// WithRuntimeType sets the runtime type for the config builder
func WithRuntimeType(runtimeType string) Option {
return func(b *builder) {
b.runtimeType = runtimeType
}
}
// WithUseLegacyConfig sets the useLegacyConfig flag for the config builder
func WithUseLegacyConfig(useLegacyConfig bool) Option {
return func(b *builder) {
b.useLegacyConfig = useLegacyConfig
}
}
func (b *builder) build() (engine.Interface, error) {
if b.path == "" {
return nil, fmt.Errorf("config path is empty")
}
if b.runtimeType == "" {
b.runtimeType = defaultRuntimeType
}
config, err := loadConfig(b.path)
if err != nil {
return nil, fmt.Errorf("failed to load config: %v", err)
}
config.RuntimeType = b.runtimeType
config.UseDefaultRuntimeName = !b.useLegacyConfig
version, err := config.parseVersion(b.useLegacyConfig)
if err != nil {
return nil, fmt.Errorf("failed to parse config version: %v", err)
}
switch version {
case 1:
return (*ConfigV1)(config), nil
case 2:
return config, nil
}
return nil, fmt.Errorf("unsupported config version: %v", version)
}
// loadConfig loads the containerd config from disk
func loadConfig(config string) (*Config, error) {
log.Infof("Loading config: %v", config)
info, err := os.Stat(config)
if os.IsExist(err) && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
configFile := config
if os.IsNotExist(err) {
configFile = "/dev/null"
log.Infof("Config file does not exist, creating new one")
}
tomlConfig, err := toml.LoadFile(configFile)
if err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
cfg := Config{
Tree: tomlConfig,
}
return &cfg, nil
}
// parseVersion returns the version of the config
func (c *Config) parseVersion(useLegacyConfig bool) (int, error) {
defaultVersion := 2
if useLegacyConfig {
defaultVersion = 1
}
switch v := c.Get("version").(type) {
case nil:
switch len(c.Keys()) {
case 0: // No config exists, or the config file is empty, use version inferred from containerd
return defaultVersion, nil
default: // A config file exists, has content, and no version is set
return 1, nil
}
case int64:
return int(v), nil
default:
return -1, fmt.Errorf("unsupported type for version field: %v", v)
}
}

View File

@@ -0,0 +1,131 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package crio
import (
"fmt"
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/pelletier/go-toml"
)
// Config represents the cri-o config
type Config toml.Tree
// New creates a cri-o config with the specified options
func New(opts ...Option) (engine.Interface, error) {
b := &builder{}
for _, opt := range opts {
opt(b)
}
return b.build()
}
// AddRuntime adds a new runtime to the crio config
func (c *Config) AddRuntime(name string, path string, setAsDefault bool) error {
if c == nil {
return fmt.Errorf("config is nil")
}
config := (toml.Tree)(*c)
switch runc := config.Get("crio.runtime.runtimes.runc").(type) {
case *toml.Tree:
runc, _ = toml.Load(runc.String())
config.SetPath([]string{"crio", "runtime", "runtimes", name}, runc)
}
config.SetPath([]string{"crio", "runtime", "runtimes", name, "runtime_path"}, path)
config.SetPath([]string{"crio", "runtime", "runtimes", name, "runtime_type"}, "oci")
if setAsDefault {
config.SetPath([]string{"crio", "runtime", "default_runtime"}, name)
}
*c = (Config)(config)
return nil
}
// DefaultRuntime returns the default runtime for the cri-o config
func (c Config) DefaultRuntime() string {
config := (toml.Tree)(c)
if runtime, ok := config.GetPath([]string{"crio", "runtime", "default_runtime"}).(string); ok {
return runtime
}
return ""
}
// RemoveRuntime removes a runtime from the cri-o config
func (c *Config) RemoveRuntime(name string) error {
if c == nil {
return nil
}
config := (toml.Tree)(*c)
if runtime, ok := config.GetPath([]string{"crio", "runtime", "default_runtime"}).(string); ok {
if runtime == name {
config.DeletePath([]string{"crio", "runtime", "default_runtime"})
}
}
runtimeClassPath := []string{"crio", "runtime", "runtimes", name}
config.DeletePath(runtimeClassPath)
for i := 0; i < len(runtimeClassPath); i++ {
remainingPath := runtimeClassPath[:len(runtimeClassPath)-i]
if entry, ok := config.GetPath(remainingPath).(*toml.Tree); ok {
if len(entry.Keys()) != 0 {
break
}
config.DeletePath(remainingPath)
}
}
*c = (Config)(config)
return nil
}
// Save writes the config to the specified path
func (c Config) Save(path string) (int64, error) {
config := (toml.Tree)(c)
output, err := config.ToTomlString()
if err != nil {
return 0, fmt.Errorf("unable to convert to TOML: %v", err)
}
if len(output) == 0 {
err := os.Remove(path)
if err != nil {
return 0, fmt.Errorf("unable to remove empty file: %v", err)
}
return 0, nil
}
f, err := os.Create(path)
if err != nil {
return 0, fmt.Errorf("unable to open '%v' for writing: %v", path, err)
}
defer f.Close()
n, err := f.WriteString(output)
if err != nil {
return 0, fmt.Errorf("unable to write output: %v", err)
}
return int64(n), err
}

View File

@@ -0,0 +1,73 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package crio
import (
"fmt"
"os"
"github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
)
type builder struct {
path string
}
// Option defines a function that can be used to configure the config builder
type Option func(*builder)
// WithPath sets the path for the config builder
func WithPath(path string) Option {
return func(b *builder) {
b.path = path
}
}
func (b *builder) build() (*Config, error) {
if b.path == "" {
empty := toml.Tree{}
return (*Config)(&empty), nil
}
return loadConfig(b.path)
}
// loadConfig loads the cri-o config from disk
func loadConfig(config string) (*Config, error) {
log.Infof("Loading config: %v", config)
info, err := os.Stat(config)
if os.IsExist(err) && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
configFile := config
if os.IsNotExist(err) {
configFile = "/dev/null"
log.Infof("Config file does not exist, creating new one")
}
cfg, err := toml.LoadFile(configFile)
if err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
return (*Config)(cfg), nil
}

View File

@@ -0,0 +1,140 @@
/**
# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package docker
import (
"encoding/json"
"fmt"
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
)
const (
defaultDockerRuntime = "runc"
)
// Config defines a docker config file.
// TODO: This should not be public, but we need to access it from the tests in tools/container/docker
type Config map[string]interface{}
// New creates a docker config with the specified options
func New(opts ...Option) (engine.Interface, error) {
b := &builder{}
for _, opt := range opts {
opt(b)
}
return b.build()
}
// AddRuntime adds a new runtime to the docker config
func (c *Config) AddRuntime(name string, path string, setAsDefault bool) error {
if c == nil {
return fmt.Errorf("config is nil")
}
config := *c
// Read the existing runtimes
runtimes := make(map[string]interface{})
if _, exists := config["runtimes"]; exists {
runtimes = config["runtimes"].(map[string]interface{})
}
// Add / update the runtime definitions
runtimes[name] = map[string]interface{}{
"path": path,
"args": []string{},
}
config["runtimes"] = runtimes
if setAsDefault {
config["default-runtime"] = name
}
*c = config
return nil
}
// DefaultRuntime returns the default runtime for the docker config
func (c Config) DefaultRuntime() string {
r, ok := c["default-runtime"].(string)
if !ok {
return ""
}
return r
}
// RemoveRuntime removes a runtime from the docker config
func (c *Config) RemoveRuntime(name string) error {
if c == nil {
return nil
}
config := *c
if _, exists := config["default-runtime"]; exists {
defaultRuntime := config["default-runtime"].(string)
if defaultRuntime == name {
config["default-runtime"] = defaultDockerRuntime
}
}
if _, exists := config["runtimes"]; exists {
runtimes := config["runtimes"].(map[string]interface{})
delete(runtimes, name)
if len(runtimes) == 0 {
delete(config, "runtimes")
}
}
*c = config
return nil
}
// Save writes the config to the specified path
func (c Config) Save(path string) (int64, error) {
output, err := json.MarshalIndent(c, "", " ")
if err != nil {
return 0, fmt.Errorf("unable to convert to JSON: %v", err)
}
if len(output) == 0 {
err := os.Remove(path)
if err != nil {
return 0, fmt.Errorf("unable to remove empty file: %v", err)
}
return 0, nil
}
f, err := os.Create(path)
if err != nil {
return 0, fmt.Errorf("unable to open %v for writing: %v", path, err)
}
defer f.Close()
n, err := f.WriteString(string(output))
if err != nil {
return 0, fmt.Errorf("unable to write output: %v", err)
}
return int64(n), nil
}

View File

@@ -0,0 +1,215 @@
/**
# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package docker
import (
"encoding/json"
"fmt"
"testing"
"github.com/stretchr/testify/require"
)
func TestUpdateConfigDefaultRuntime(t *testing.T) {
testCases := []struct {
config Config
runtimeName string
setAsDefault bool
expectedDefaultRuntimeName interface{}
}{
{
setAsDefault: false,
expectedDefaultRuntimeName: nil,
},
{
runtimeName: "NAME",
setAsDefault: true,
expectedDefaultRuntimeName: "NAME",
},
{
config: map[string]interface{}{
"default-runtime": "ALREADY_SET",
},
runtimeName: "NAME",
setAsDefault: false,
expectedDefaultRuntimeName: "ALREADY_SET",
},
{
config: map[string]interface{}{
"default-runtime": "ALREADY_SET",
},
runtimeName: "NAME",
setAsDefault: true,
expectedDefaultRuntimeName: "NAME",
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("test case %d", i), func(t *testing.T) {
if tc.config == nil {
tc.config = make(map[string]interface{})
}
err := tc.config.AddRuntime(tc.runtimeName, "", tc.setAsDefault)
require.NoError(t, err)
defaultRuntimeName := tc.config["default-runtime"]
require.EqualValues(t, tc.expectedDefaultRuntimeName, defaultRuntimeName)
})
}
}
func TestUpdateConfigRuntimes(t *testing.T) {
testCases := []struct {
config Config
runtimes map[string]string
expectedConfig map[string]interface{}
}{
{
config: map[string]interface{}{},
runtimes: map[string]string{
"runtime1": "/test/runtime/dir/runtime1",
"runtime2": "/test/runtime/dir/runtime2",
},
expectedConfig: map[string]interface{}{
"runtimes": map[string]interface{}{
"runtime1": map[string]interface{}{
"path": "/test/runtime/dir/runtime1",
"args": []string{},
},
"runtime2": map[string]interface{}{
"path": "/test/runtime/dir/runtime2",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{
"runtimes": map[string]interface{}{
"runtime1": map[string]interface{}{
"path": "runtime1",
"args": []string{},
},
},
},
runtimes: map[string]string{
"runtime1": "/test/runtime/dir/runtime1",
"runtime2": "/test/runtime/dir/runtime2",
},
expectedConfig: map[string]interface{}{
"runtimes": map[string]interface{}{
"runtime1": map[string]interface{}{
"path": "/test/runtime/dir/runtime1",
"args": []string{},
},
"runtime2": map[string]interface{}{
"path": "/test/runtime/dir/runtime2",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{
"runtimes": map[string]interface{}{
"not-nvidia": map[string]interface{}{
"path": "some-other-path",
"args": []string{},
},
},
},
runtimes: map[string]string{
"runtime1": "/test/runtime/dir/runtime1",
},
expectedConfig: map[string]interface{}{
"runtimes": map[string]interface{}{
"not-nvidia": map[string]interface{}{
"path": "some-other-path",
"args": []string{},
},
"runtime1": map[string]interface{}{
"path": "/test/runtime/dir/runtime1",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{
"exec-opts": []string{"native.cgroupdriver=systemd"},
"log-driver": "json-file",
"log-opts": map[string]string{
"max-size": "100m",
},
"storage-driver": "overlay2",
},
runtimes: map[string]string{
"runtime1": "/test/runtime/dir/runtime1",
},
expectedConfig: map[string]interface{}{
"exec-opts": []string{"native.cgroupdriver=systemd"},
"log-driver": "json-file",
"log-opts": map[string]string{
"max-size": "100m",
},
"storage-driver": "overlay2",
"runtimes": map[string]interface{}{
"runtime1": map[string]interface{}{
"path": "/test/runtime/dir/runtime1",
"args": []string{},
},
},
},
},
{
config: map[string]interface{}{
"exec-opts": []string{"native.cgroupdriver=systemd"},
"log-driver": "json-file",
"log-opts": map[string]string{
"max-size": "100m",
},
"storage-driver": "overlay2",
},
expectedConfig: map[string]interface{}{
"exec-opts": []string{"native.cgroupdriver=systemd"},
"log-driver": "json-file",
"log-opts": map[string]string{
"max-size": "100m",
},
"storage-driver": "overlay2",
},
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("test case %d", i), func(t *testing.T) {
for runtimeName, runtimePath := range tc.runtimes {
err := tc.config.AddRuntime(runtimeName, runtimePath, false)
require.NoError(t, err)
}
configContent, err := json.MarshalIndent(tc.config, "", " ")
require.NoError(t, err)
expectedContent, err := json.MarshalIndent(tc.expectedConfig, "", " ")
require.NoError(t, err)
require.EqualValues(t, string(expectedContent), string(configContent))
})
}
}

View File

@@ -0,0 +1,80 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package docker
import (
"bytes"
"encoding/json"
"fmt"
"io/ioutil"
"os"
log "github.com/sirupsen/logrus"
)
type builder struct {
path string
}
// Option defines a function that can be used to configure the config builder
type Option func(*builder)
// WithPath sets the path for the config builder
func WithPath(path string) Option {
return func(b *builder) {
b.path = path
}
}
func (b *builder) build() (*Config, error) {
if b.path == "" {
empty := make(Config)
return &empty, nil
}
return loadConfig(b.path)
}
// loadConfig loads the docker config from disk
func loadConfig(configFilePath string) (*Config, error) {
log.Infof("Loading docker config from %v", configFilePath)
info, err := os.Stat(configFilePath)
if os.IsExist(err) && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
cfg := make(Config)
if os.IsNotExist(err) {
log.Infof("Config file does not exist, creating new one")
return &cfg, nil
}
readBytes, err := ioutil.ReadFile(configFilePath)
if err != nil {
return nil, fmt.Errorf("unable to read config: %v", err)
}
reader := bytes.NewReader(readBytes)
if err := json.NewDecoder(reader).Decode(&cfg); err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
return &cfg, nil
}

62
internal/config/hook.go Normal file
View File

@@ -0,0 +1,62 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package config
import (
"fmt"
"github.com/pelletier/go-toml"
)
// RuntimeHookConfig stores the config options for the NVIDIA Container Runtime
type RuntimeHookConfig struct {
// SkipModeDetection disables the mode check for the runtime hook.
SkipModeDetection bool `toml:"skip-mode-detection"`
}
// dummyHookConfig allows us to unmarshal only a RuntimeHookConfig from a *toml.Tree
type dummyHookConfig struct {
RuntimeHook RuntimeHookConfig `toml:"nvidia-container-runtime-hook"`
}
// getRuntimeHookConfigFrom reads the nvidia container runtime config from the specified toml Tree.
func getRuntimeHookConfigFrom(toml *toml.Tree) (*RuntimeHookConfig, error) {
cfg := GetDefaultRuntimeHookConfig()
if toml == nil {
return cfg, nil
}
d := dummyHookConfig{
RuntimeHook: *cfg,
}
if err := toml.Unmarshal(&d); err != nil {
return nil, fmt.Errorf("failed to unmarshal runtime config: %v", err)
}
return &d.RuntimeHook, nil
}
// GetDefaultRuntimeHookConfig defines the default values for the config
func GetDefaultRuntimeHookConfig() *RuntimeHookConfig {
c := RuntimeHookConfig{
SkipModeDetection: false,
}
return &c
}

View File

@@ -0,0 +1,54 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package image
// DriverCapability represents the possible values of NVIDIA_DRIVER_CAPABILITIES
type DriverCapability string
// Constants for the supported driver capabilities
const (
DriverCapabilityAll DriverCapability = "all"
DriverCapabilityCompat32 DriverCapability = "compat32"
DriverCapabilityCompute DriverCapability = "compute"
DriverCapabilityDisplay DriverCapability = "display"
DriverCapabilityGraphics DriverCapability = "graphics"
DriverCapabilityNgx DriverCapability = "ngx"
DriverCapabilityUtility DriverCapability = "utility"
DriverCapabilityVideo DriverCapability = "video"
)
// DriverCapabilities represents the NVIDIA_DRIVER_CAPABILITIES set for the specified image.
type DriverCapabilities map[DriverCapability]bool
// Has check whether the specified capability is selected.
func (c DriverCapabilities) Has(capability DriverCapability) bool {
if c[DriverCapabilityAll] {
return true
}
return c[capability]
}
// Any checks whether any of the specified capabilites are set
func (c DriverCapabilities) Any(capabilities ...DriverCapability) bool {
for _, cap := range capabilities {
if c.Has(cap) {
return true
}
}
return false
}

View File

@@ -0,0 +1,190 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package image
import (
"fmt"
"strconv"
"strings"
"github.com/opencontainers/runtime-spec/specs-go"
"golang.org/x/mod/semver"
)
const (
envCUDAVersion = "CUDA_VERSION"
envNVRequirePrefix = "NVIDIA_REQUIRE_"
envNVRequireCUDA = envNVRequirePrefix + "CUDA"
envNVRequireJetpack = envNVRequirePrefix + "JETPACK"
envNVDisableRequire = "NVIDIA_DISABLE_REQUIRE"
envNVDriverCapabilities = "NVIDIA_DRIVER_CAPABILITIES"
)
// CUDA represents a CUDA image that can be used for GPU computing. This wraps
// a map of environment variable to values that can be used to perform lookups
// such as requirements.
type CUDA map[string]string
// NewCUDAImageFromSpec creates a CUDA image from the input OCI runtime spec.
// The process environment is read (if present) to construc the CUDA Image.
func NewCUDAImageFromSpec(spec *specs.Spec) (CUDA, error) {
if spec == nil || spec.Process == nil {
return NewCUDAImageFromEnv(nil)
}
return NewCUDAImageFromEnv(spec.Process.Env)
}
// NewCUDAImageFromEnv creates a CUDA image from the input environment. The environment
// is a list of strings of the form ENVAR=VALUE.
func NewCUDAImageFromEnv(env []string) (CUDA, error) {
c := make(CUDA)
for _, e := range env {
parts := strings.SplitN(e, "=", 2)
if len(parts) != 2 {
return nil, fmt.Errorf("invalid environment variable: %v", e)
}
c[parts[0]] = parts[1]
}
return c, nil
}
// IsLegacy returns whether the associated CUDA image is a "legacy" image. An
// image is considered legacy if it has a CUDA_VERSION environment variable defined
// and no NVIDIA_REQUIRE_CUDA environment variable defined.
func (i CUDA) IsLegacy() bool {
legacyCudaVersion := i[envCUDAVersion]
cudaRequire := i[envNVRequireCUDA]
return len(legacyCudaVersion) > 0 && len(cudaRequire) == 0
}
// GetRequirements returns the requirements from all NVIDIA_REQUIRE_ environment
// variables.
func (i CUDA) GetRequirements() ([]string, error) {
// TODO: We need not process this if disable require is set, but this will be done
// in a single follow-up to ensure that the behavioural change is accurately captured.
// if i.HasDisableRequire() {
// return nil, nil
// }
// All variables with the "NVIDIA_REQUIRE_" prefix are passed to nvidia-container-cli
var requirements []string
for name, value := range i {
if strings.HasPrefix(name, envNVRequirePrefix) && !strings.HasPrefix(name, envNVRequireJetpack) {
requirements = append(requirements, value)
}
}
if i.IsLegacy() {
v, err := i.legacyVersion()
if err != nil {
return nil, fmt.Errorf("failed to get version: %v", err)
}
cudaRequire := fmt.Sprintf("cuda>=%s", v)
requirements = append(requirements, cudaRequire)
}
return requirements, nil
}
// HasDisableRequire checks for the value of the NVIDIA_DISABLE_REQUIRE. If set
// to a valid (true) boolean value this can be used to disable the requirement checks
func (i CUDA) HasDisableRequire() bool {
if disable, exists := i[envNVDisableRequire]; exists {
// i.logger.Debugf("NVIDIA_DISABLE_REQUIRE=%v; skipping requirement checks", disable)
d, _ := strconv.ParseBool(disable)
return d
}
return false
}
// DevicesFromEnvvars returns the devices requested by the image through environment variables
func (i CUDA) DevicesFromEnvvars(envVars ...string) VisibleDevices {
// We concantenate all the devices from the specified envvars.
var isSet bool
var devices []string
requested := make(map[string]bool)
for _, envVar := range envVars {
if devs, ok := i[envVar]; ok {
isSet = true
for _, d := range strings.Split(devs, ",") {
trimmed := strings.TrimSpace(d)
if len(trimmed) == 0 {
continue
}
devices = append(devices, trimmed)
requested[trimmed] = true
}
}
}
// Environment variable unset with legacy image: default to "all".
if !isSet && len(devices) == 0 && i.IsLegacy() {
return NewVisibleDevices("all")
}
// Environment variable unset or empty or "void": return nil
if len(devices) == 0 || requested["void"] {
return NewVisibleDevices("void")
}
return NewVisibleDevices(devices...)
}
// GetDriverCapabilities returns the requested driver capabilities.
func (i CUDA) GetDriverCapabilities() DriverCapabilities {
env := i[envNVDriverCapabilities]
capabilites := make(DriverCapabilities)
for _, c := range strings.Split(env, ",") {
capabilites[DriverCapability(c)] = true
}
return capabilites
}
func (i CUDA) legacyVersion() (string, error) {
majorMinor, err := parseMajorMinorVersion(i[envCUDAVersion])
if err != nil {
return "", fmt.Errorf("invalid CUDA version: %v", err)
}
return majorMinor, nil
}
func parseMajorMinorVersion(version string) (string, error) {
vVersion := "v" + strings.TrimPrefix(version, "v")
if !semver.IsValid(vVersion) {
return "", fmt.Errorf("invalid version string")
}
majorMinor := strings.TrimPrefix(semver.MajorMinor(vVersion), "v")
parts := strings.Split(majorMinor, ".")
var err error
_, err = strconv.ParseUint(parts[0], 10, 32)
if err != nil {
return "", fmt.Errorf("invalid major version")
}
_, err = strconv.ParseUint(parts[1], 10, 32)
if err != nil {
return "", fmt.Errorf("invalid minor version")
}
return majorMinor, nil
}

View File

@@ -0,0 +1,123 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package image
import (
"testing"
"github.com/stretchr/testify/require"
)
func TestParseMajorMinorVersionValid(t *testing.T) {
var tests = []struct {
version string
expected string
}{
{"0", "0.0"},
{"8", "8.0"},
{"7.5", "7.5"},
{"9.0.116", "9.0"},
{"4294967295.4294967295.4294967295", "4294967295.4294967295"},
{"v11.6", "11.6"},
}
for _, c := range tests {
t.Run(c.version, func(t *testing.T) {
version, err := parseMajorMinorVersion(c.version)
require.NoError(t, err)
require.Equal(t, c.expected, version)
})
}
}
func TestParseMajorMinorVersionInvalid(t *testing.T) {
var tests = []string{
"foo",
"foo.5.10",
"9.0.116.50",
"9.0.116foo",
"7.foo",
"9.0.bar",
"9.4294967296",
"9.0.116.",
"9..0",
"9.",
".5.10",
"-9",
"+9",
"-9.1.116",
"-9.-1.-116",
}
for _, c := range tests {
t.Run(c, func(t *testing.T) {
_, err := parseMajorMinorVersion(c)
require.Error(t, err)
})
}
}
func TestGetRequirements(t *testing.T) {
testCases := []struct {
description string
env []string
requirements []string
}{
{
description: "NVIDIA_REQUIRE_JETPACK is ignored",
env: []string{"NVIDIA_REQUIRE_JETPACK=csv-mounts=all"},
requirements: nil,
},
{
description: "NVIDIA_REQUIRE_JETPACK_HOST_MOUNTS is ignored",
env: []string{"NVIDIA_REQUIRE_JETPACK_HOST_MOUNTS=base-only"},
requirements: nil,
},
{
description: "single requirement set",
env: []string{"NVIDIA_REQUIRE_CUDA=cuda>=11.6"},
requirements: []string{"cuda>=11.6"},
},
{
description: "requirements are concatenated requirement set",
env: []string{"NVIDIA_REQUIRE_CUDA=cuda>=11.6", "NVIDIA_REQUIRE_BRAND=brand=tesla"},
requirements: []string{"cuda>=11.6", "brand=tesla"},
},
{
description: "legacy image",
env: []string{"CUDA_VERSION=11.6"},
requirements: []string{"cuda>=11.6"},
},
{
description: "legacy image with additional requirement",
env: []string{"CUDA_VERSION=11.6", "NVIDIA_REQUIRE_BRAND=brand=tesla"},
requirements: []string{"cuda>=11.6", "brand=tesla"},
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
image, err := NewCUDAImageFromEnv(tc.env)
require.NoError(t, err)
requirements, err := image.GetRequirements()
require.NoError(t, err)
require.ElementsMatch(t, tc.requirements, requirements)
})
}
}

View File

@@ -0,0 +1,127 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package image
import (
"strings"
)
// VisibleDevices represents the devices selected in a container image
// through the NVIDIA_VISIBLE_DEVICES or other environment variables
type VisibleDevices interface {
List() []string
Has(string) bool
}
var _ VisibleDevices = (*all)(nil)
var _ VisibleDevices = (*none)(nil)
var _ VisibleDevices = (*void)(nil)
var _ VisibleDevices = (*devices)(nil)
// NewVisibleDevices creates a VisibleDevices based on the value of the specified envvar.
func NewVisibleDevices(envvars ...string) VisibleDevices {
for _, envvar := range envvars {
if envvar == "all" {
return all{}
}
if envvar == "none" {
return none{}
}
if envvar == "" || envvar == "void" {
return void{}
}
}
return newDevices(envvars...)
}
type all struct{}
// List returns ["all"] for all devices
func (a all) List() []string {
return []string{"all"}
}
// Has for all devices is true for any id except the empty ID
func (a all) Has(id string) bool {
return id != ""
}
type none struct{}
// List returns [""] for the none devices
func (n none) List() []string {
return []string{""}
}
// Has for none devices is false for any id
func (n none) Has(id string) bool {
return false
}
type void struct {
none
}
// List returns nil for the void devices
func (v void) List() []string {
return nil
}
type devices struct {
len int
lookup map[string]int
}
func newDevices(idOrCommaSeparated ...string) devices {
lookup := make(map[string]int)
i := 0
for _, commaSeparated := range idOrCommaSeparated {
for _, id := range strings.Split(commaSeparated, ",") {
lookup[id] = i
i++
}
}
d := devices{
len: i,
lookup: lookup,
}
return d
}
// List returns the list of requested devices
func (d devices) List() []string {
list := make([]string, d.len)
for id, i := range d.lookup {
list[i] = id
}
return list
}
// Has checks whether the specified ID is in the set of requested devices
func (d devices) Has(id string) bool {
if id == "" {
return false
}
_, exist := d.lookup[id]
return exist
}

View File

@@ -0,0 +1,43 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package image
import (
"github.com/opencontainers/runtime-spec/specs-go"
)
const (
capSysAdmin = "CAP_SYS_ADMIN"
)
// IsPrivileged returns true if the container is a privileged container.
func IsPrivileged(s *specs.Spec) bool {
if s.Process.Capabilities == nil {
return false
}
// We only make sure that the bounding capabibility set has
// CAP_SYS_ADMIN. This allows us to make sure that the container was
// actually started as '--privileged', but also allow non-root users to
// access the privileged NVIDIA capabilities.
for _, c := range s.Process.Capabilities.Bounding {
if c == capSysAdmin {
return true
}
}
return false
}

106
internal/config/runtime.go Normal file
View File

@@ -0,0 +1,106 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package config
import (
"fmt"
"github.com/pelletier/go-toml"
"github.com/sirupsen/logrus"
)
const (
dockerRuncExecutableName = "docker-runc"
runcExecutableName = "runc"
auto = "auto"
)
// RuntimeConfig stores the config options for the NVIDIA Container Runtime
type RuntimeConfig struct {
DebugFilePath string `toml:"debug"`
// LogLevel defines the logging level for the application
LogLevel string `toml:"log-level"`
// Runtimes defines the candidates for the low-level runtime
Runtimes []string `toml:"runtimes"`
Mode string `toml:"mode"`
Modes modesConfig `toml:"modes"`
}
// modesConfig defines (optional) per-mode configs
type modesConfig struct {
CSV csvModeConfig `toml:"csv"`
CDI cdiModeConfig `toml:"cdi"`
}
type cdiModeConfig struct {
// SpecDirs allows for the default spec dirs for CDI to be overridden
SpecDirs []string `toml:"spec-dirs"`
// DefaultKind sets the default kind to be used when constructing fully-qualified CDI device names
DefaultKind string `toml:"default-kind"`
}
type csvModeConfig struct {
MountSpecPath string `toml:"mount-spec-path"`
}
// dummy allows us to unmarshal only a RuntimeConfig from a *toml.Tree
type dummy struct {
Runtime RuntimeConfig `toml:"nvidia-container-runtime"`
}
// getRuntimeConfigFrom reads the nvidia container runtime config from the specified toml Tree.
func getRuntimeConfigFrom(toml *toml.Tree) (*RuntimeConfig, error) {
cfg := GetDefaultRuntimeConfig()
if toml == nil {
return cfg, nil
}
d := dummy{
Runtime: *cfg,
}
if err := toml.Unmarshal(&d); err != nil {
return nil, fmt.Errorf("failed to unmarshal runtime config: %v", err)
}
return &d.Runtime, nil
}
// GetDefaultRuntimeConfig defines the default values for the config
func GetDefaultRuntimeConfig() *RuntimeConfig {
c := RuntimeConfig{
DebugFilePath: "/dev/null",
LogLevel: logrus.InfoLevel.String(),
Runtimes: []string{
dockerRuncExecutableName,
runcExecutableName,
},
Mode: auto,
Modes: modesConfig{
CSV: csvModeConfig{
MountSpecPath: "/etc/nvidia-container-runtime/host-files-for-container.d",
},
CDI: cdiModeConfig{
DefaultKind: "nvidia.com/gpu",
},
},
}
return &c
}

View File

@@ -0,0 +1,46 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package config
import "github.com/pelletier/go-toml"
// CTKConfig stores the config options for the NVIDIA Container Toolkit CLI (nvidia-ctk)
type CTKConfig struct {
Path string `toml:"path"`
}
// getCTKConfigFrom reads the nvidia container runtime config from the specified toml Tree.
func getCTKConfigFrom(toml *toml.Tree) *CTKConfig {
cfg := getDefaultCTKConfig()
if toml == nil {
return cfg
}
cfg.Path = toml.GetDefault("nvidia-ctk.path", cfg.Path).(string)
return cfg
}
// getDefaultCTKConfig defines the default values for the config
func getDefaultCTKConfig() *CTKConfig {
c := CTKConfig{
Path: "nvidia-ctk",
}
return &c
}

137
internal/cuda/cuda.go Normal file
View File

@@ -0,0 +1,137 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package cuda
import (
"fmt"
"github.com/NVIDIA/go-nvml/pkg/dl"
)
/*
#cgo LDFLAGS: -Wl,--unresolved-symbols=ignore-in-object-files
#ifdef _WIN32
#define CUDAAPI __stdcall
#else
#define CUDAAPI
#endif
typedef int CUdevice;
typedef enum CUdevice_attribute_enum {
CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR = 75,
CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR = 76
} CUdevice_attribute;
typedef enum cudaError_enum {
CUDA_SUCCESS = 0
} CUresult;
CUresult CUDAAPI cuInit(unsigned int Flags);
CUresult CUDAAPI cuDriverGetVersion(int *driverVersion);
CUresult CUDAAPI cuDeviceGet(CUdevice *device, int ordinal);
CUresult CUDAAPI cuDeviceGetAttribute(int *pi, CUdevice_attribute attrib, CUdevice dev);
*/
import "C"
const (
libraryName = "libcuda.so.1"
libraryLoadFlags = dl.RTLD_LAZY | dl.RTLD_GLOBAL
)
// cuda stores a reference the cuda dynamic library
var lib *dl.DynamicLibrary
// Version returns the CUDA version of the driver as a string or an error if this
// cannot be determined.
func Version() (string, error) {
lib, err := load()
if err != nil {
return "", err
}
defer lib.Close()
if err := lib.Lookup("cuDriverGetVersion"); err != nil {
return "", fmt.Errorf("failed to lookup symbol: %v", err)
}
var version C.int
if result := C.cuDriverGetVersion(&version); result != C.CUDA_SUCCESS {
return "", fmt.Errorf("failed to get CUDA version: result=%v", result)
}
major := version / 1000
minor := version % 100 / 10
return fmt.Sprintf("%d.%d", major, minor), nil
}
// ComputeCapability returns the CUDA compute capability of a device with the specified index as a string
// or an error if this cannot be determined.
func ComputeCapability(index int) (string, error) {
lib, err := load()
if err != nil {
return "", err
}
defer lib.Close()
if err := lib.Lookup("cuInit"); err != nil {
return "", fmt.Errorf("failed to lookup symbol: %v", err)
}
if err := lib.Lookup("cuDeviceGet"); err != nil {
return "", fmt.Errorf("failed to lookup symbol: %v", err)
}
if err := lib.Lookup("cuDeviceGetAttribute"); err != nil {
return "", fmt.Errorf("failed to lookup symbol: %v", err)
}
if result := C.cuInit(C.uint(0)); result != C.CUDA_SUCCESS {
return "", fmt.Errorf("failed to initialize CUDA: result=%v", result)
}
var device C.CUdevice
// NOTE: We only query the first device
if result := C.cuDeviceGet(&device, C.int(index)); result != C.CUDA_SUCCESS {
return "", fmt.Errorf("failed to get CUDA device %v: result=%v", 0, result)
}
var major C.int
if result := C.cuDeviceGetAttribute(&major, C.CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, device); result != C.CUDA_SUCCESS {
return "", fmt.Errorf("failed to get CUDA compute capability major for device %v : result=%v", 0, result)
}
var minor C.int
if result := C.cuDeviceGetAttribute(&minor, C.CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, device); result != C.CUDA_SUCCESS {
return "", fmt.Errorf("failed to get CUDA compute capability minor for device %v: result=%v", 0, result)
}
return fmt.Sprintf("%d.%d", major, minor), nil
}
func load() (*dl.DynamicLibrary, error) {
lib := dl.New(libraryName, libraryLoadFlags)
if lib == nil {
return nil, fmt.Errorf("error instantiating DynamicLibrary for CUDA")
}
err := lib.Open()
if err != nil {
return nil, fmt.Errorf("error opening DynamicLibrary for CUDA: %v", err)
}
return lib, nil
}

View File

@@ -0,0 +1,69 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/sirupsen/logrus"
)
// charDevices is a discover for a list of character devices
type charDevices mounts
var _ Discover = (*charDevices)(nil)
// NewCharDeviceDiscoverer creates a discoverer which locates the specified set of device nodes.
func NewCharDeviceDiscoverer(logger *logrus.Logger, devices []string, root string) Discover {
locator := lookup.NewCharDeviceLocator(
lookup.WithLogger(logger),
lookup.WithRoot(root),
)
return NewDeviceDiscoverer(logger, locator, root, devices)
}
// NewDeviceDiscoverer creates a discoverer which locates the specified set of device nodes using the specified locator.
func NewDeviceDiscoverer(logger *logrus.Logger, locator lookup.Locator, root string, devices []string) Discover {
m := NewMounts(logger, locator, root, devices).(*mounts)
return (*charDevices)(m)
}
// Mounts returns the discovered mounts for the charDevices.
// Since this explicitly specifies a device list, the mounts are nil.
func (d *charDevices) Mounts() ([]Mount, error) {
return nil, nil
}
// Devices returns the discovered devices for the charDevices.
// Here the device nodes are first discovered as mounts and these are converted to devices.
func (d *charDevices) Devices() ([]Device, error) {
devicesAsMounts, err := (*mounts)(d).Mounts()
if err != nil {
return nil, err
}
var devices []Device
for _, mount := range devicesAsMounts {
device := Device{
HostPath: mount.HostPath,
Path: mount.Path,
}
devices = append(devices, device)
}
return devices, nil
}

View File

@@ -0,0 +1,83 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import (
"fmt"
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
)
func TestCharDevices(t *testing.T) {
logger, logHook := testlog.NewNullLogger()
testCases := []struct {
description string
input *charDevices
expectedMounts []Mount
expectedMountsError error
expectedDevicesError error
expectedDevices []Device
}{
{
description: "dev mounts are empty",
input: (*charDevices)(
&mounts{
lookup: &lookup.LocatorMock{
LocateFunc: func(string) ([]string, error) {
return []string{"located"}, nil
},
},
required: []string{"required"},
},
),
expectedDevices: []Device{{Path: "located", HostPath: "located"}},
},
{
description: "dev devices returns error for nil lookup",
input: &charDevices{},
expectedDevicesError: fmt.Errorf("no lookup defined"),
},
}
for _, tc := range testCases {
logHook.Reset()
t.Run(tc.description, func(t *testing.T) {
tc.input.logger = logger
mounts, err := tc.input.Mounts()
if tc.expectedMountsError != nil {
require.Error(t, err)
} else {
require.NoError(t, err)
}
require.ElementsMatch(t, tc.expectedMounts, mounts)
devices, err := tc.input.Devices()
if tc.expectedDevicesError != nil {
require.Error(t, err)
} else {
require.NoError(t, err)
}
require.ElementsMatch(t, tc.expectedDevices, devices)
})
}
}

107
internal/discover/csv.go Normal file
View File

@@ -0,0 +1,107 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover/csv"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/sirupsen/logrus"
)
// NewFromCSVFiles creates a discoverer for the specified CSV files. A logger is also supplied.
// The constructed discoverer is comprised of a list, with each element in the list being associated with a
// single CSV files.
func NewFromCSVFiles(logger *logrus.Logger, files []string, driverRoot string) (Discover, error) {
if len(files) == 0 {
logger.Warnf("No CSV files specified")
return None{}, nil
}
symlinkLocator := lookup.NewSymlinkLocator(logger, driverRoot)
locators := map[csv.MountSpecType]lookup.Locator{
csv.MountSpecDev: lookup.NewCharDeviceLocator(lookup.WithLogger(logger), lookup.WithRoot(driverRoot)),
csv.MountSpecDir: lookup.NewDirectoryLocator(logger, driverRoot),
// Libraries and symlinks are handled in the same way
csv.MountSpecLib: symlinkLocator,
csv.MountSpecSym: symlinkLocator,
}
var mountSpecs []*csv.MountSpec
for _, filename := range files {
targets, err := loadCSVFile(logger, filename)
if err != nil {
logger.Warnf("Skipping CSV file %v: %v", filename, err)
continue
}
mountSpecs = append(mountSpecs, targets...)
}
return newFromMountSpecs(logger, locators, driverRoot, mountSpecs)
}
// loadCSVFile loads the specified CSV file and returns the list of mount specs
func loadCSVFile(logger *logrus.Logger, filename string) ([]*csv.MountSpec, error) {
// Create a discoverer for each file-kind combination
targets, err := csv.NewCSVFileParser(logger, filename).Parse()
if err != nil {
return nil, fmt.Errorf("failed to parse CSV file: %v", err)
}
if len(targets) == 0 {
return nil, fmt.Errorf("CSV file is empty")
}
return targets, nil
}
// newFromMountSpecs creates a discoverer for the CSV file. A logger is also supplied.
// A list of csvDiscoverers is returned, with each being associated with a single MountSpecType.
func newFromMountSpecs(logger *logrus.Logger, locators map[csv.MountSpecType]lookup.Locator, driverRoot string, targets []*csv.MountSpec) (Discover, error) {
if len(targets) == 0 {
return &None{}, nil
}
var discoverers []Discover
var mountSpecTypes []csv.MountSpecType
candidatesByType := make(map[csv.MountSpecType][]string)
for _, t := range targets {
if _, exists := candidatesByType[t.Type]; !exists {
mountSpecTypes = append(mountSpecTypes, t.Type)
}
candidatesByType[t.Type] = append(candidatesByType[t.Type], t.Path)
}
for _, t := range mountSpecTypes {
locator, exists := locators[t]
if !exists {
return nil, fmt.Errorf("no locator defined for '%v'", t)
}
var m Discover
switch t {
case csv.MountSpecDev:
m = NewDeviceDiscoverer(logger, locator, driverRoot, candidatesByType[t])
default:
m = NewMounts(logger, locator, driverRoot, candidatesByType[t])
}
discoverers = append(discoverers, m)
}
return &list{discoverers: discoverers}, nil
}

View File

@@ -0,0 +1,131 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package csv
import (
"bufio"
"errors"
"fmt"
"io"
"os"
"path/filepath"
"strings"
"github.com/sirupsen/logrus"
)
const (
// DefaultMountSpecPath is default location of CSV files that define the modifications required to the OCI spec
DefaultMountSpecPath = "/etc/nvidia-container-runtime/host-files-for-container.d"
)
// GetFileList returns the (non-recursive) list of CSV files in the specified
// folder
func GetFileList(root string) ([]string, error) {
contents, err := os.ReadDir(root)
if err != nil && errors.Is(err, os.ErrNotExist) {
return nil, nil
} else if err != nil {
return nil, fmt.Errorf("failed to read the contents of %v: %v", root, err)
}
var csvFilePaths []string
for _, c := range contents {
if c.IsDir() {
continue
}
if c.Name() == ".csv" {
continue
}
ext := strings.ToLower(filepath.Ext(c.Name()))
if ext != ".csv" {
continue
}
csvFilePaths = append(csvFilePaths, filepath.Join(root, c.Name()))
}
return csvFilePaths, nil
}
// BaseFilesOnly filters out non-base CSV files from the list of CSV files.
func BaseFilesOnly(filenames []string) []string {
filter := map[string]bool{
"l4t.csv": true,
"drivers.csv": true,
"devices.csv": true,
}
var selected []string
for _, file := range filenames {
base := filepath.Base(file)
if filter[base] {
selected = append(selected, file)
}
}
return selected
}
// Parser specifies an interface for parsing MountSpecs
type Parser interface {
Parse() ([]*MountSpec, error)
}
type csv struct {
logger *logrus.Logger
filename string
}
// NewCSVFileParser creates a new parser for reading MountSpecs from the specified CSV file
func NewCSVFileParser(logger *logrus.Logger, filename string) Parser {
p := csv{
logger: logger,
filename: filename,
}
return &p
}
// Parse parses the csv file and returns a list of MountSpecs in the file
func (p csv) Parse() ([]*MountSpec, error) {
reader, err := os.Open(p.filename)
if err != nil {
return nil, fmt.Errorf("failed to open %v for reading: %v", p.filename, err)
}
defer reader.Close()
return p.parseFromReader(reader), nil
}
// parseFromReader parses the specified file and returns a list of required jetson mounts
func (p csv) parseFromReader(reader io.Reader) []*MountSpec {
var targets []*MountSpec
scanner := bufio.NewScanner(reader)
for scanner.Scan() {
line := scanner.Text()
target, err := NewMountSpecFromLine(line)
if err != nil {
p.logger.Debugf("Skipping invalid mount spec '%v': %v", line, err)
continue
}
targets = append(targets, target)
}
return targets
}

View File

@@ -0,0 +1,83 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package csv
import (
"path/filepath"
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/test"
"github.com/stretchr/testify/require"
)
func TestGetFileList(t *testing.T) {
moduleRoot, _ := test.GetModuleRoot()
testCases := []struct {
description string
root string
files []string
expectedError error
}{
{
description: "returns list of CSV files",
root: "test/input/csv_samples/",
files: []string{
"jetson.csv",
"simple_wrong.csv",
"simple.csv",
"spaced.csv",
},
},
{
description: "handles empty folder",
root: "test/input/csv_samples/empty",
},
{
description: "handles non-existent folder",
root: "test/input/csv_samples/NONEXISTENT",
},
{
description: "handles non-existent folder root",
root: "/NONEXISTENT/test/input/csv_samples/",
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
root := filepath.Join(moduleRoot, tc.root)
files, err := GetFileList(root)
if tc.expectedError != nil {
require.Error(t, err)
require.Empty(t, files)
return
}
require.NoError(t, err)
var foundFiles []string
for _, f := range files {
require.Equal(t, root, filepath.Dir(f))
require.Equal(t, ".csv", filepath.Ext(f))
foundFiles = append(foundFiles, filepath.Base(f))
}
require.ElementsMatch(t, tc.files, foundFiles)
})
}
}

View File

@@ -0,0 +1,74 @@
/**
# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package csv
import (
"fmt"
"strings"
)
// MountSpecType defines the mount types allowed in a CSV file
type MountSpecType string
const (
// MountSpecDev is used for character devices
MountSpecDev = MountSpecType("dev")
// MountSpecDir is used for directories
MountSpecDir = MountSpecType("dir")
// MountSpecLib is used for libraries or regular files
MountSpecLib = MountSpecType("lib")
// MountSpecSym is used for symlinks.
MountSpecSym = MountSpecType("sym")
)
// MountSpec represents a Jetson mount consisting of a type and a path.
type MountSpec struct {
Type MountSpecType
Path string
}
// NewMountSpecFromLine parses the specified line and returns the MountSpec or an error if the line is malformed
func NewMountSpecFromLine(line string) (*MountSpec, error) {
parts := strings.SplitN(strings.TrimSpace(line), ",", 2)
if len(parts) < 2 {
return nil, fmt.Errorf("failed to parse line: %v", line)
}
mountType := strings.TrimSpace(parts[0])
path := strings.TrimSpace(parts[1])
return NewMountSpec(mountType, path)
}
// NewMountSpec creates a MountSpec with the specified type and path. An error is returned if the type is invalid.
func NewMountSpec(mountType string, path string) (*MountSpec, error) {
mt := MountSpecType(mountType)
switch mt {
case MountSpecDev, MountSpecLib, MountSpecSym, MountSpecDir:
default:
return nil, fmt.Errorf("unexpected mount type: %v", mt)
}
if path == "" {
return nil, fmt.Errorf("invalid path: %v", path)
}
mount := MountSpec{
Type: mt,
Path: path,
}
return &mount, nil
}

View File

@@ -0,0 +1,82 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package csv
import (
"fmt"
"testing"
"github.com/stretchr/testify/require"
)
func TestNewMountSpecFromLine(t *testing.T) {
parseError := fmt.Errorf("failed to parse line")
unexpectedError := fmt.Errorf("unexpected mount type")
testCases := []struct {
line string
expectedError error
expectedValue MountSpec
}{
{
line: "",
expectedError: parseError,
},
{
line: "\t",
expectedError: parseError,
},
{
line: ",",
expectedError: parseError,
},
{
line: "dev,",
expectedError: parseError,
},
{
line: "dev ,/a/path",
expectedValue: MountSpec{
Path: "/a/path",
Type: "dev",
},
},
{
line: "dev ,/a/path,with,commas",
expectedValue: MountSpec{
Path: "/a/path,with,commas",
Type: "dev",
},
},
{
line: "not-dev ,/a/path",
expectedError: unexpectedError,
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("test case %d", i), func(t *testing.T) {
target, err := NewMountSpecFromLine(tc.line)
if tc.expectedError != nil {
require.Error(t, err)
return
}
require.NoError(t, err)
require.EqualValues(t, &tc.expectedValue, target)
})
}
}

View File

@@ -0,0 +1,142 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import (
"fmt"
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover/csv"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
)
func TestNewFromMountSpec(t *testing.T) {
logger, _ := testlog.NewNullLogger()
locators := map[csv.MountSpecType]lookup.Locator{
"dev": &lookup.LocatorMock{},
"lib": &lookup.LocatorMock{},
}
testCases := []struct {
description string
root string
targets []*csv.MountSpec
expectedError error
expectedDiscoverer Discover
}{
{
description: "empty targets returns None discoverer list",
expectedDiscoverer: &None{},
},
{
description: "unexpected locator returns error",
targets: []*csv.MountSpec{
{
Type: "foo",
Path: "bar",
},
},
expectedError: fmt.Errorf("no locator defined for foo"),
},
{
description: "creates discoverers based on type",
targets: []*csv.MountSpec{
{
Type: "dev",
Path: "dev0",
},
{
Type: "lib",
Path: "lib0",
},
{
Type: "dev",
Path: "dev1",
},
},
expectedDiscoverer: &list{
discoverers: []Discover{
(*charDevices)(
&mounts{
logger: logger,
lookup: locators["dev"],
root: "/",
required: []string{"dev0", "dev1"},
},
),
&mounts{
logger: logger,
lookup: locators["lib"],
root: "/",
required: []string{"lib0"},
},
},
},
},
{
description: "sets root",
targets: []*csv.MountSpec{
{
Type: "dev",
Path: "dev0",
},
{
Type: "lib",
Path: "lib0",
},
{
Type: "dev",
Path: "dev1",
},
},
root: "/some/root",
expectedDiscoverer: &list{
discoverers: []Discover{
(*charDevices)(
&mounts{
logger: logger,
lookup: locators["dev"],
root: "/some/root",
required: []string{"dev0", "dev1"},
},
),
&mounts{
logger: logger,
lookup: locators["lib"],
root: "/some/root",
required: []string{"lib0"},
},
},
},
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
discoverer, err := newFromMountSpecs(logger, locators, tc.root, tc.targets)
if tc.expectedError != nil {
require.Error(t, err)
return
}
require.NoError(t, err)
require.EqualValues(t, tc.expectedDiscoverer, discoverer)
})
}
}

View File

@@ -0,0 +1,52 @@
/*
# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package discover
// Config represents the configuration options for discovery
type Config struct {
DriverRoot string
NvidiaCTKPath string
}
// Device represents a discovered character device.
type Device struct {
HostPath string
Path string
}
// Mount represents a discovered mount.
type Mount struct {
HostPath string
Path string
Options []string
}
// Hook represents a discovered hook.
type Hook struct {
Lifecycle string
Path string
Args []string
}
// Discover defines an interface for discovering the devices, mounts, and hooks available on a system
//
//go:generate moq -stub -out discover_mock.go . Discover
type Discover interface {
Devices() ([]Device, error)
Mounts() ([]Mount, error)
Hooks() ([]Hook, error)
}

View File

@@ -0,0 +1,150 @@
// Code generated by moq; DO NOT EDIT.
// github.com/matryer/moq
package discover
import (
"sync"
)
// Ensure, that DiscoverMock does implement Discover.
// If this is not the case, regenerate this file with moq.
var _ Discover = &DiscoverMock{}
// DiscoverMock is a mock implementation of Discover.
//
// func TestSomethingThatUsesDiscover(t *testing.T) {
//
// // make and configure a mocked Discover
// mockedDiscover := &DiscoverMock{
// DevicesFunc: func() ([]Device, error) {
// panic("mock out the Devices method")
// },
// HooksFunc: func() ([]Hook, error) {
// panic("mock out the Hooks method")
// },
// MountsFunc: func() ([]Mount, error) {
// panic("mock out the Mounts method")
// },
// }
//
// // use mockedDiscover in code that requires Discover
// // and then make assertions.
//
// }
type DiscoverMock struct {
// DevicesFunc mocks the Devices method.
DevicesFunc func() ([]Device, error)
// HooksFunc mocks the Hooks method.
HooksFunc func() ([]Hook, error)
// MountsFunc mocks the Mounts method.
MountsFunc func() ([]Mount, error)
// calls tracks calls to the methods.
calls struct {
// Devices holds details about calls to the Devices method.
Devices []struct {
}
// Hooks holds details about calls to the Hooks method.
Hooks []struct {
}
// Mounts holds details about calls to the Mounts method.
Mounts []struct {
}
}
lockDevices sync.RWMutex
lockHooks sync.RWMutex
lockMounts sync.RWMutex
}
// Devices calls DevicesFunc.
func (mock *DiscoverMock) Devices() ([]Device, error) {
callInfo := struct {
}{}
mock.lockDevices.Lock()
mock.calls.Devices = append(mock.calls.Devices, callInfo)
mock.lockDevices.Unlock()
if mock.DevicesFunc == nil {
var (
devicesOut []Device
errOut error
)
return devicesOut, errOut
}
return mock.DevicesFunc()
}
// DevicesCalls gets all the calls that were made to Devices.
// Check the length with:
// len(mockedDiscover.DevicesCalls())
func (mock *DiscoverMock) DevicesCalls() []struct {
} {
var calls []struct {
}
mock.lockDevices.RLock()
calls = mock.calls.Devices
mock.lockDevices.RUnlock()
return calls
}
// Hooks calls HooksFunc.
func (mock *DiscoverMock) Hooks() ([]Hook, error) {
callInfo := struct {
}{}
mock.lockHooks.Lock()
mock.calls.Hooks = append(mock.calls.Hooks, callInfo)
mock.lockHooks.Unlock()
if mock.HooksFunc == nil {
var (
hooksOut []Hook
errOut error
)
return hooksOut, errOut
}
return mock.HooksFunc()
}
// HooksCalls gets all the calls that were made to Hooks.
// Check the length with:
// len(mockedDiscover.HooksCalls())
func (mock *DiscoverMock) HooksCalls() []struct {
} {
var calls []struct {
}
mock.lockHooks.RLock()
calls = mock.calls.Hooks
mock.lockHooks.RUnlock()
return calls
}
// Mounts calls MountsFunc.
func (mock *DiscoverMock) Mounts() ([]Mount, error) {
callInfo := struct {
}{}
mock.lockMounts.Lock()
mock.calls.Mounts = append(mock.calls.Mounts, callInfo)
mock.lockMounts.Unlock()
if mock.MountsFunc == nil {
var (
mountsOut []Mount
errOut error
)
return mountsOut, errOut
}
return mock.MountsFunc()
}
// MountsCalls gets all the calls that were made to Mounts.
// Check the length with:
// len(mockedDiscover.MountsCalls())
func (mock *DiscoverMock) MountsCalls() []struct {
} {
var calls []struct {
}
mock.lockMounts.RLock()
calls = mock.calls.Mounts
mock.lockMounts.RUnlock()
return calls
}

View File

@@ -0,0 +1,62 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import "github.com/sirupsen/logrus"
// Filter defines an interface for filtering discovered entities
type Filter interface {
DeviceIsSelected(device Device) bool
}
// filtered represents a filtered discoverer
type filtered struct {
Discover
logger *logrus.Logger
filter Filter
}
// newFilteredDisoverer creates a discoverer that applies the specified filter to the returned entities of the discoverer
func newFilteredDisoverer(logger *logrus.Logger, applyTo Discover, filter Filter) Discover {
return filtered{
Discover: applyTo,
logger: logger,
filter: filter,
}
}
// Devices returns a filtered list of devices based on the specified filter.
func (d filtered) Devices() ([]Device, error) {
devices, err := d.Discover.Devices()
if err != nil {
return nil, err
}
if d.filter == nil {
return devices, nil
}
var selected []Device
for _, device := range devices {
if d.filter.DeviceIsSelected(device) {
selected = append(selected, device)
}
d.logger.Debugf("skipping device %v", device)
}
return selected, nil
}

80
internal/discover/gds.go Normal file
View File

@@ -0,0 +1,80 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/sirupsen/logrus"
)
type gdsDeviceDiscoverer struct {
None
logger *logrus.Logger
devices Discover
mounts Discover
}
// NewGDSDiscoverer creates a discoverer for GPUDirect Storage devices and mounts.
func NewGDSDiscoverer(logger *logrus.Logger, root string) (Discover, error) {
devices := NewCharDeviceDiscoverer(
logger,
[]string{"/dev/nvidia-fs*"},
root,
)
udev := NewMounts(
logger,
lookup.NewDirectoryLocator(logger, root),
root,
[]string{"/run/udev"},
)
cufile := NewMounts(
logger,
lookup.NewFileLocator(
lookup.WithLogger(logger),
lookup.WithRoot(root),
),
root,
[]string{"/etc/cufile.json"},
)
d := gdsDeviceDiscoverer{
logger: logger,
devices: devices,
mounts: Merge(udev, cufile),
}
return &d, nil
}
// Devices discovers the nvidia-fs device nodes for use with GPUDirect Storage
func (d *gdsDeviceDiscoverer) Devices() ([]Device, error) {
return d.devices.Devices()
}
// Mounts discovers the required mounts for GPUDirect Storage.
// If no devices are discovered the discovered mounts are empty
func (d *gdsDeviceDiscoverer) Mounts() ([]Mount, error) {
devices, err := d.Devices()
if err != nil || len(devices) == 0 {
d.logger.Debugf("No nvidia-fs devices detected; skipping detection of mounts")
return nil, nil
}
return d.mounts.Mounts()
}

View File

@@ -0,0 +1,264 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import (
"fmt"
"os"
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info/drm"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info/proc"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/sirupsen/logrus"
)
// NewGraphicsDiscoverer returns the discoverer for graphics tools such as Vulkan.
func NewGraphicsDiscoverer(logger *logrus.Logger, devices image.VisibleDevices, cfg *Config) (Discover, error) {
driverRoot := cfg.DriverRoot
mounts, err := NewGraphicsMountsDiscoverer(logger, driverRoot)
if err != nil {
return nil, fmt.Errorf("failed to create mounts discoverer: %v", err)
}
drmDeviceNodes, err := newDRMDeviceDiscoverer(logger, devices, driverRoot)
if err != nil {
return nil, fmt.Errorf("failed to create DRM device discoverer: %v", err)
}
drmByPathSymlinks := newCreateDRMByPathSymlinks(logger, drmDeviceNodes, cfg)
discover := Merge(
Merge(drmDeviceNodes, drmByPathSymlinks),
mounts,
)
return discover, nil
}
// NewGraphicsMountsDiscoverer creates a discoverer for the mounts required by graphics tools such as vulkan.
func NewGraphicsMountsDiscoverer(logger *logrus.Logger, driverRoot string) (Discover, error) {
locator, err := lookup.NewLibraryLocator(logger, driverRoot)
if err != nil {
return nil, fmt.Errorf("failed to construct library locator: %v", err)
}
libraries := NewMounts(
logger,
locator,
driverRoot,
[]string{
"libnvidia-egl-gbm.so",
},
)
jsonMounts := NewMounts(
logger,
lookup.NewFileLocator(
lookup.WithLogger(logger),
lookup.WithRoot(driverRoot),
lookup.WithSearchPaths("/etc", "/usr/share"),
),
driverRoot,
[]string{
"glvnd/egl_vendor.d/10_nvidia.json",
"vulkan/icd.d/nvidia_icd.json",
"vulkan/implicit_layer.d/nvidia_layers.json",
"egl/egl_external_platform.d/15_nvidia_gbm.json",
"egl/egl_external_platform.d/10_nvidia_wayland.json",
},
)
discover := Merge(
libraries,
jsonMounts,
)
return discover, nil
}
type drmDevicesByPath struct {
None
logger *logrus.Logger
nvidiaCTKPath string
driverRoot string
devicesFrom Discover
}
// newCreateDRMByPathSymlinks creates a discoverer for a hook to create the by-path symlinks for DRM devices discovered by the specified devices discoverer
func newCreateDRMByPathSymlinks(logger *logrus.Logger, devices Discover, cfg *Config) Discover {
d := drmDevicesByPath{
logger: logger,
nvidiaCTKPath: FindNvidiaCTK(logger, cfg.NvidiaCTKPath),
driverRoot: cfg.DriverRoot,
devicesFrom: devices,
}
return &d
}
// Hooks returns a hook to create the symlinks from the required CSV files
func (d drmDevicesByPath) Hooks() ([]Hook, error) {
devices, err := d.devicesFrom.Devices()
if err != nil {
return nil, fmt.Errorf("failed to discover devices for by-path symlinks: %v", err)
}
if len(devices) == 0 {
return nil, nil
}
links, err := d.getSpecificLinkArgs(devices)
if err != nil {
return nil, fmt.Errorf("failed to determine specific links: %v", err)
}
if len(links) == 0 {
return nil, nil
}
var args []string
for _, l := range links {
args = append(args, "--link", l)
}
hook := CreateNvidiaCTKHook(
d.nvidiaCTKPath,
"create-symlinks",
args...,
)
return []Hook{hook}, nil
}
// getSpecificLinkArgs returns the required specic links that need to be created
func (d drmDevicesByPath) getSpecificLinkArgs(devices []Device) ([]string, error) {
selectedDevices := make(map[string]bool)
for _, d := range devices {
selectedDevices[filepath.Base(d.HostPath)] = true
}
linkLocator := lookup.NewFileLocator(
lookup.WithLogger(d.logger),
lookup.WithRoot(d.driverRoot),
)
candidates, err := linkLocator.Locate("/dev/dri/by-path/pci-*-*")
if err != nil {
d.logger.Warningf("Failed to locate by-path links: %v; ignoring", err)
return nil, nil
}
var links []string
for _, c := range candidates {
device, err := os.Readlink(c)
if err != nil {
d.logger.Warningf("Failed to evaluate symlink %v; ignoring", c)
continue
}
if selectedDevices[filepath.Base(device)] {
d.logger.Debugf("adding device symlink %v -> %v", c, device)
links = append(links, fmt.Sprintf("%v::%v", device, c))
}
}
return links, nil
}
// newDRMDeviceDiscoverer creates a discoverer for the DRM devices associated with the requested devices.
func newDRMDeviceDiscoverer(logger *logrus.Logger, devices image.VisibleDevices, driverRoot string) (Discover, error) {
allDevices := NewDeviceDiscoverer(
logger,
lookup.NewCharDeviceLocator(
lookup.WithLogger(logger),
lookup.WithRoot(driverRoot),
),
driverRoot,
[]string{
"/dev/dri/card*",
"/dev/dri/renderD*",
},
)
filter, err := newDRMDeviceFilter(logger, devices, driverRoot)
if err != nil {
return nil, fmt.Errorf("failed to construct DRM device filter: %v", err)
}
// We return a discoverer that applies the DRM device filter created above to all discovered DRM device nodes.
d := newFilteredDisoverer(
logger,
allDevices,
filter,
)
return d, err
}
// newDRMDeviceFilter creates a filter that matches DRM devices nodes for the visible devices.
func newDRMDeviceFilter(logger *logrus.Logger, devices image.VisibleDevices, driverRoot string) (Filter, error) {
gpuInformationPaths, err := proc.GetInformationFilePaths(driverRoot)
if err != nil {
return nil, fmt.Errorf("failed to read GPU information: %v", err)
}
var selectedBusIds []string
for _, f := range gpuInformationPaths {
info, err := proc.ParseGPUInformationFile(f)
if err != nil {
return nil, fmt.Errorf("failed to parse %v: %v", f, err)
}
uuid := info[proc.GPUInfoGPUUUID]
busID := info[proc.GPUInfoBusLocation]
minor := info[proc.GPUInfoDeviceMinor]
if devices.Has(minor) || devices.Has(uuid) || devices.Has(busID) {
selectedBusIds = append(selectedBusIds, busID)
}
}
filter := make(selectDeviceByPath)
for _, busID := range selectedBusIds {
drmDeviceNodes, err := drm.GetDeviceNodesByBusID(busID)
if err != nil {
return nil, fmt.Errorf("failed to determine DRM devices for %v: %v", busID, err)
}
for _, drmDeviceNode := range drmDeviceNodes {
filter[filepath.Join(drmDeviceNode)] = true
}
}
return filter, nil
}
// selectDeviceByPath is a filter that allows devices to be selected by the path
type selectDeviceByPath map[string]bool
var _ Filter = (*selectDeviceByPath)(nil)
// DeviceIsSelected determines whether the device's path has been selected
func (s selectDeviceByPath) DeviceIsSelected(device Device) bool {
return s[device.Path]
}
// MountIsSelected is always true
func (s selectDeviceByPath) MountIsSelected(Mount) bool {
return true
}
// HookIsSelected is always true
func (s selectDeviceByPath) HookIsSelected(Hook) bool {
return true
}

103
internal/discover/hooks.go Normal file
View File

@@ -0,0 +1,103 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import (
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/sirupsen/logrus"
)
const (
nvidiaCTKExecutable = "nvidia-ctk"
nvidiaCTKDefaultFilePath = "/usr/bin/nvidia-ctk"
)
var _ Discover = (*Hook)(nil)
// Devices returns an empty list of devices for a Hook discoverer.
func (h Hook) Devices() ([]Device, error) {
return nil, nil
}
// Mounts returns an empty list of mounts for a Hook discoverer.
func (h Hook) Mounts() ([]Mount, error) {
return nil, nil
}
// Hooks allows the Hook type to also implement the Discoverer interface.
// It returns a single hook
func (h Hook) Hooks() ([]Hook, error) {
return []Hook{h}, nil
}
// CreateCreateSymlinkHook creates a hook which creates a symlink from link -> target.
func CreateCreateSymlinkHook(nvidiaCTKPath string, links []string) Discover {
if len(links) == 0 {
return None{}
}
var args []string
for _, link := range links {
args = append(args, "--link", link)
}
return CreateNvidiaCTKHook(
nvidiaCTKPath,
"create-symlinks",
args...,
)
}
// CreateNvidiaCTKHook creates a hook which invokes the NVIDIA Container CLI hook subcommand.
func CreateNvidiaCTKHook(nvidiaCTKPath string, hookName string, additionalArgs ...string) Hook {
return Hook{
Lifecycle: cdi.CreateContainerHook,
Path: nvidiaCTKPath,
Args: append([]string{filepath.Base(nvidiaCTKPath), "hook", hookName}, additionalArgs...),
}
}
// FindNvidiaCTK locates the nvidia-ctk executable to be used in hooks.
// If an nvidia-ctk path is specified as an absolute path, it is used directly
// without checking for existence of an executable at that path.
func FindNvidiaCTK(logger *logrus.Logger, nvidiaCTKPath string) string {
if filepath.IsAbs(nvidiaCTKPath) {
logger.Debugf("Using specified NVIDIA Container Toolkit CLI path %v", nvidiaCTKPath)
return nvidiaCTKPath
}
if nvidiaCTKPath == "" {
nvidiaCTKPath = nvidiaCTKExecutable
}
logger.Debugf("Locating NVIDIA Container Toolkit CLI as %v", nvidiaCTKPath)
lookup := lookup.NewExecutableLocator(logger, "")
hookPath := nvidiaCTKDefaultFilePath
targets, err := lookup.Locate(nvidiaCTKPath)
if err != nil {
logger.Warnf("Failed to locate %v: %v", nvidiaCTKPath, err)
} else if len(targets) == 0 {
logger.Warnf("%v not found", nvidiaCTKPath)
} else {
logger.Debugf("Found %v candidates: %v", nvidiaCTKPath, targets)
hookPath = targets[0]
}
logger.Debugf("Using NVIDIA Container Toolkit CLI path %v", hookPath)
return hookPath
}

View File

@@ -0,0 +1,60 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import (
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/sirupsen/logrus"
"github.com/stretchr/testify/require"
)
func TestIPCMounts(t *testing.T) {
l := ipcMounts(
mounts{
logger: logrus.New(),
lookup: &lookup.LocatorMock{
LocateFunc: func(path string) ([]string, error) {
return []string{"/host/path"}, nil
},
},
required: []string{"target"},
},
)
mounts, err := l.Mounts()
require.NoError(t, err)
require.EqualValues(
t,
[]Mount{
{
HostPath: "/host/path",
Path: "/host/path",
Options: []string{
"ro",
"nosuid",
"nodev",
"bind",
"noexec",
},
},
},
mounts,
)
}

60
internal/discover/ipc.go Normal file
View File

@@ -0,0 +1,60 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/sirupsen/logrus"
)
type ipcMounts mounts
// NewIPCDiscoverer creats a discoverer for NVIDIA IPC sockets.
func NewIPCDiscoverer(logger *logrus.Logger, driverRoot string) (Discover, error) {
d := newMounts(
logger,
lookup.NewFileLocator(
lookup.WithLogger(logger),
lookup.WithRoot(driverRoot),
),
driverRoot,
[]string{
"/var/run/nvidia-persistenced/socket",
"/var/run/nvidia-fabricmanager/socket",
"/tmp/nvidia-mps",
},
)
return (*ipcMounts)(d), nil
}
// Mounts returns the discovered mounts with "noexec" added to the mount options.
func (d *ipcMounts) Mounts() ([]Mount, error) {
mounts, err := (*mounts)(d).Mounts()
if err != nil {
return nil, err
}
var modifiedMounts []Mount
for _, m := range mounts {
mount := m
mount.Options = append(m.Options, "noexec")
modifiedMounts = append(modifiedMounts, mount)
}
return modifiedMounts, nil
}

Some files were not shown because too many files have changed in this diff Show More