Compare commits

158 Commits

Author SHA1 Message Date
Evan Lezar
f6983969ad Merge branch 'nvidia-ctk-cdi-transform' into 'main'
Add 'target-driver-root' option to 'nvidia-ctk cdi generate' to transform root...

See merge request nvidia/container-toolkit/container-toolkit!363
2023-03-28 20:05:12 +00:00
Evan Lezar
7f7fc35843 Move input and output to transform root subcommand
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 21:12:48 +02:00
Evan Lezar
8eef7e5406 Merge branch 'add-runtimes' into 'main'
Add nvidia-container-runtime.runtimes config option

See merge request nvidia/container-toolkit/container-toolkit!364
2023-03-28 18:58:46 +00:00
Evan Lezar
f27c33b45f Remove target-driver-root from generate
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 11:49:45 -07:00
Evan Lezar
6a83e2ebe5 Add nvidia-ctk cdi transform root command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 11:45:58 -07:00
Christopher Desiniotis
ee5be5e3f2 Merge branch 'CNT-4056/add-cdi-annotations' into 'main'
Add nvidia-container-runtime.modes.cdi.annotation-prefixes config option.

See merge request nvidia/container-toolkit/container-toolkit!356
2023-03-28 16:47:51 +00:00
Evan Lezar
be0cc9dc6e Add nvidia-container-runtime.runtimes config option
This change adds an nvidia-container-runtime.runtimes config option.

If this is unset no changes are made to the config and the default values are used. This
allows this setting to be overridden in cases where this is required. One such example is
crio where crun is set as the default runtime.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 17:39:17 +02:00
Evan Lezar
7c5283bb97 Merge branch 'create-device-nodes' into 'main'
Add nvidia-ctk system create-device-nodes command

See merge request nvidia/container-toolkit/container-toolkit!362
2023-03-28 15:07:04 +00:00
Evan Lezar
4d5ba09d88 Add --ignore-errors option for testing
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 16:24:17 +02:00
Evan Lezar
149236b002 Configure containerd config based on specified annotation prefixes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 16:22:48 +02:00
Evan Lezar
ee141f97dc Reorganise setting toolkit config options
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 16:22:48 +02:00
Evan Lezar
646503ff31 Set nvidia-container-runtime.modes.cdi.annotation-prefixes in toolkit-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 16:22:48 +02:00
Evan Lezar
cdaaf5e46f Generate device nodes when creating management spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 11:29:45 +02:00
Evan Lezar
e774c51c97 Add nvidia-ctk system create-device-nodes command
This change adds an nvidia-ctk system create-device-nodes command for
creating NVIDIA device nodes. Currently this is limited to control devices
(nvidia-uvm, nvidia-uvm-tools, nvidia-modeset, nvidiactl).

A --dry-run mode is included for outputting commands that would be executed and
the driver root can be specified.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 11:29:45 +02:00
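The dry-run behaviour described in this commit message can be sketched in Go as follows. This is a minimal illustration, not the nvidia-ctk implementation: `plan`, `createDeviceNodes`, and the `create` callback are hypothetical names, and the example driver root is an assumption. Only the four control device names come from the commit message.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// controlDevices lists the control device nodes named in the commit message.
var controlDevices = []string{"nvidia-uvm", "nvidia-uvm-tools", "nvidia-modeset", "nvidiactl"}

// plan returns the device node paths that would be created under driverRoot.
func plan(driverRoot string) []string {
	var paths []string
	for _, name := range controlDevices {
		paths = append(paths, filepath.Join(driverRoot, "dev", name))
	}
	return paths
}

// createDeviceNodes calls create for each planned path. In dry-run mode it
// only records what would be done and never calls create.
func createDeviceNodes(driverRoot string, dryRun bool, create func(path string) error) ([]string, error) {
	var log []string
	for _, p := range plan(driverRoot) {
		if dryRun {
			log = append(log, "would create "+p)
			continue
		}
		if err := create(p); err != nil {
			return log, fmt.Errorf("failed to create %s: %w", p, err)
		}
		log = append(log, "created "+p)
	}
	return log, nil
}

func main() {
	// The driver root below is an assumed example value.
	log, err := createDeviceNodes("/run/nvidia/driver", true, nil)
	if err != nil {
		panic(err)
	}
	for _, line := range log {
		fmt.Println(line)
	}
}
```

Passing the creation step as a callback keeps the dry-run path free of side effects, which also makes it easy to test without root privileges.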
Christopher Desiniotis
7f5c9abc1e Add ability to configure CDI kind with 'nvidia-ctk cdi generate'
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-03-27 23:12:00 -07:00
Christopher Desiniotis
92d82ceaee Add 'target-driver-root' option to 'nvidia-ctk cdi generate' to transform root paths in generated spec
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-03-27 22:22:36 -07:00
Evan Lezar
c46b118f37 Add nvidia-container-runtime.modes.cdi.annotation-prefixes config option.
This change adds an nvidia-container-runtime.modes.cdi.annotation-prefixes config
option that defaults to cdi.k8s.io/. This allows the annotation prefixes parsed
for CDI devices to be overridden in cases where the CDI support in container engines such
as containerd or crio needs to be overridden.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-27 16:36:54 +02:00
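As a rough Go sketch of what filtering annotations by prefix could look like: `devicesFromAnnotations` is a hypothetical helper, not the toolkit's code, and it assumes that each matching annotation value is a comma-separated list of CDI device names. Only the `cdi.k8s.io/` default prefix is taken from the commit message.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// devicesFromAnnotations collects device names from every annotation whose
// key starts with one of the given prefixes. Values are assumed to be
// comma-separated lists of CDI device names.
func devicesFromAnnotations(annotations map[string]string, prefixes []string) []string {
	// Sort the keys so that the result is deterministic.
	keys := make([]string, 0, len(annotations))
	for k := range annotations {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var devices []string
	for _, k := range keys {
		for _, prefix := range prefixes {
			if !strings.HasPrefix(k, prefix) {
				continue
			}
			for _, d := range strings.Split(annotations[k], ",") {
				if d = strings.TrimSpace(d); d != "" {
					devices = append(devices, d)
				}
			}
			break // do not count the same annotation twice
		}
	}
	return devices
}

func main() {
	annotations := map[string]string{
		"cdi.k8s.io/gpu":    "nvidia.com/gpu=0,nvidia.com/gpu=1",
		"example.com/other": "ignored",
	}
	fmt.Println(devicesFromAnnotations(annotations, []string{"cdi.k8s.io/"}))
}
```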
Evan Lezar
1722b07615 Merge branch 'CNT-2264/xorg-libs' into 'main'
Inject xorg libs and config in container

See merge request nvidia/container-toolkit/container-toolkit!328
2023-03-27 14:19:52 +00:00
Evan Lezar
c13c6ebadb Inject xorg libs and config in container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-26 17:04:06 +02:00
Evan Lezar
2abe679dd1 Move libcuda locator to internal/lookup package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-26 17:04:06 +02:00
Evan Lezar
9571513601 Merge branch 'update-changelog' into 'main'
Update changelog

See merge request nvidia/container-toolkit/container-toolkit!361
2023-03-26 15:03:28 +00:00
Evan Lezar
ff2767ee7b Reorder changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-26 17:03:05 +02:00
Evan Lezar
56319475a6 Merge branch 'fix-changelog' into 'main'
Reorder changelog

See merge request nvidia/container-toolkit/container-toolkit!360
2023-03-26 14:52:27 +00:00
Evan Lezar
a3ee58a294 Reorder changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-26 16:51:59 +02:00
Evan Lezar
7a533aeff3 Merge branch 'update-nvcdi-new-with-error' into 'main'
Allow nvcdi.Option to return an error

See merge request nvidia/container-toolkit/container-toolkit!352
2023-03-26 14:13:41 +00:00
Evan Lezar
226c54613e Also return an error from nvcdi.New
This change allows nvcdi.New to return an error in addition to the
constructed library instead of panicking.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-26 16:13:12 +02:00
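The pattern of a constructor and its options both returning errors can be sketched in Go as below. This is an illustration of the general functional-options idiom only: the `library` type, `WithMode`, and the `auto` default are hypothetical and do not reflect the real nvcdi signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// library is a hypothetical stand-in for the constructed CDI library.
type library struct {
	mode string
}

// Option configures a library and may reject invalid input.
type Option func(*library) error

// WithMode is a hypothetical option that validates its argument.
func WithMode(mode string) Option {
	return func(l *library) error {
		if mode == "" {
			return errors.New("mode must not be empty")
		}
		l.mode = mode
		return nil
	}
}

// New applies all options and returns an error instead of panicking
// when one of them fails.
func New(opts ...Option) (*library, error) {
	l := &library{mode: "auto"}
	for _, opt := range opts {
		if err := opt(l); err != nil {
			return nil, fmt.Errorf("failed to apply option: %w", err)
		}
	}
	return l, nil
}

func main() {
	if _, err := New(WithMode("")); err != nil {
		fmt.Println("error:", err)
	}
	l, err := New(WithMode("management"))
	if err != nil {
		panic(err)
	}
	fmt.Println("mode:", l.mode)
}
```

Returning the error leaves the decision about how to fail with the caller, which matters when the library is embedded in a long-running process.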
Evan Lezar
1ebbebf5de Merge branch 'CNT-3932/deduplicate-entries-in-cdi-spec' into 'main'
Add transform to deduplicate entities in CDI spec

See merge request nvidia/container-toolkit/container-toolkit!345
2023-03-24 19:04:43 +00:00
Evan Lezar
33f6fe0217 Generate a simplified CDI spec by default
A simplified CDI spec has no duplicate entities in any single set of container edits.
Furthermore, container edits defined at a spec-level are not included in the container
edits for a device.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-24 11:01:46 +02:00
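The two simplification rules in this commit message can be sketched in Go as follows. The sketch models container edits as plain strings, which is an assumption made for brevity; `dedupe` and `simplify` are hypothetical names and real CDI container edits are structured types.

```go
package main

import "fmt"

// dedupe removes duplicates while keeping the first occurrence of each item.
func dedupe(items []string) []string {
	seen := make(map[string]bool, len(items))
	var out []string
	for _, it := range items {
		if seen[it] {
			continue
		}
		seen[it] = true
		out = append(out, it)
	}
	return out
}

// simplify deduplicates the device-level edits and drops any entry that is
// already present in the spec-level edits.
func simplify(specEdits, deviceEdits []string) []string {
	common := make(map[string]bool, len(specEdits))
	for _, e := range specEdits {
		common[e] = true
	}
	var out []string
	for _, e := range dedupe(deviceEdits) {
		if !common[e] {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	spec := []string{"/dev/nvidiactl", "/dev/nvidia-uvm"}
	device := []string{"/dev/nvidia0", "/dev/nvidiactl", "/dev/nvidia0"}
	fmt.Println(simplify(spec, device)) // prints [/dev/nvidia0]
}
```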
Evan Lezar
5ff206e1a9 Add transform to deduplicate entities in CDI spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-24 11:01:23 +02:00
Evan Lezar
df618d3cba Merge branch 'CNT-4052/fix-arm-management-containers' into 'main'
Fix generation of management CDI spec in containers

See merge request nvidia/container-toolkit/container-toolkit!354
2023-03-23 16:39:10 +00:00
Evan Lezar
9506bd9da0 Fix generation of management CDI spec in containers
Since we relied on finding libcuda.so in the LDCache to determine both the CUDA
version and the expected directory for the driver libraries, the generation of the
management CDI specifications fails in containers where the LDCache has not been updated.

This change falls back to searching a set of predefined paths instead when the lookup of
libcuda.so in the cache fails.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-23 15:59:01 +02:00
Evan Lezar
5e0684e99d Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-container

See merge request nvidia/container-toolkit/container-toolkit!353
2023-03-23 08:50:18 +00:00
Evan Lezar
09a0cb24cc Remove fedora make targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-23 10:35:57 +02:00
Evan Lezar
ff92f1d799 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-23 10:33:26 +02:00
Christopher Desiniotis
b87703c503 Merge branch 'fix-nil-logger-in-library-locator' into 'main'
Instantiate a logger when constructing a library Locator

See merge request nvidia/container-toolkit/container-toolkit!351
2023-03-21 21:54:14 +00:00
Christopher Desiniotis
b2aaa21b0a Instantiate a logger when constructing a library Locator
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-03-21 13:38:36 -07:00
Evan Lezar
310c15b046 Merge branch 'CNT-4026/only-init-nvml-when-required' into 'main'
Only init nvml as required when generating CDI specs

See merge request nvidia/container-toolkit/container-toolkit!344
2023-03-20 13:26:07 +00:00
Evan Lezar
685802b1ce Only init nvml as required when generating CDI specs
CDI generation modes such as management and wsl don't require
NVML. This change removes the top-level instantiation of nvmllib
and replaces it with an instantiation in the nvml CDI spec generation
code.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-20 14:24:08 +02:00
Evan Lezar
380eb8340a Merge branch 'blossom-ci' into 'main'
Add blossom-ci github action

See merge request nvidia/container-toolkit/container-toolkit!349
2023-03-20 09:56:23 +00:00
Evan Lezar
f98e1160f5 Update components with blossom-ci
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-20 11:06:44 +02:00
Evan Lezar
1962fd68df Merge branch 'locate-ipc-sockets-at-run' into 'main'
Locate persistenced and fabricmanager sockets at /run instead of /var/run

See merge request nvidia/container-toolkit/container-toolkit!347
2023-03-20 08:08:59 +00:00
Carlos Eduardo Arango Gutierrez
29813c1e14 Add blossom-ci github action
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-03-17 16:16:27 +01:00
Evan Lezar
df40fbe03e Locate persistenced and fabricmanager sockets at /run instead of /var/run
This change prefers (non-symlink) sockets at /run over /var/run for
nvidia-persistenced and nvidia-fabricmanager sockets.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-17 09:23:48 +02:00
Carlos Eduardo Arango Gutierrez
7000c6074e Merge branch 'ci_rules' into 'main'
Rework pipeline triggers for MRs

See merge request nvidia/container-toolkit/container-toolkit!346
2023-03-15 13:15:23 +00:00
Evan Lezar
ef1fe3ab41 Rework pipeline triggers for MRs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-15 14:15:20 +02:00
Evan Lezar
fdd198b0e8 Merge branch 'bump-v1.13.0-rc.3' into 'main'
Bump version to v1.13.0-rc.3

See merge request nvidia/container-toolkit/container-toolkit!343
2023-03-15 07:50:50 +00:00
Evan Lezar
e37f77e02d Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-15 09:49:49 +02:00
Evan Lezar
3fcfee88be Bump version to v1.13.0-rc.3
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-15 09:26:19 +02:00
Evan Lezar
a082413d09 Merge branch 'trigger-ci-on-mrs-only' into 'main'
Add workflow rule to only trigger on MRs

See merge request nvidia/container-toolkit/container-toolkit!342
2023-03-15 07:10:30 +00:00
Evan Lezar
280f40508e Make pipeline manual on MRs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-15 08:51:18 +02:00
Evan Lezar
e2be0e2ff0 Add workflow rule to only trigger on MRs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-15 08:45:26 +02:00
Evan Lezar
dcff3118d9 Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-container

See merge request nvidia/container-toolkit/container-toolkit!340
2023-03-14 13:54:11 +00:00
Evan Lezar
731168ec8d Update changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-14 15:05:36 +02:00
Evan Lezar
7b4435a0f8 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-14 15:05:29 +02:00
Evan Lezar
738af29724 Merge branch 'explicit-cdi-enabled-flag' into 'main'
Add --cdi-enabled option to control generating CDI spec

See merge request nvidia/container-toolkit/container-toolkit!339
2023-03-14 07:00:30 +00:00
Evan Lezar
08ef242afb Add --cdi-enabled option to control generating CDI spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-13 18:19:00 +02:00
Evan Lezar
92ea8be309 Merge branch 'fix-privileged-check-cdi-mode' into 'main'
Return empty list of devices for unprivileged containers when...

See merge request nvidia/container-toolkit/container-toolkit!337
2023-03-13 07:36:25 +00:00
Christopher Desiniotis
48414e97bb Return empty list of devices for unprivileged containers when 'accept-nvidia-visible-devices-envvar-unprivileged=false'
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-03-10 13:11:29 -08:00
Evan Lezar
77a2975524 Merge branch 'fix-kitmaker' into 'main'
Use component name as folder name

See merge request nvidia/container-toolkit/container-toolkit!336
2023-03-10 13:57:24 +00:00
Evan Lezar
ce9477966d Use component name as folder name
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-10 15:51:36 +02:00
Evan Lezar
fe02351c3a Merge branch 'bump-cuda-version' into 'main'
Bump CUDA base image version to 12.1.0

See merge request nvidia/container-toolkit/container-toolkit!335
2023-03-10 10:23:30 +00:00
Evan Lezar
9c2018a0dc Bump CUDA base image version to 12.1.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-10 11:31:23 +02:00
Evan Lezar
33e5b34fa1 Merge branch 'CNT-3999/legacy-cli-doesnt-work-in-cdi-mode' into 'main'
Add nvidia-container-runtime-hook.skip-mode-detection option to config

See merge request nvidia/container-toolkit/container-toolkit!330
2023-03-09 19:18:16 +00:00
Evan Lezar
ccf73f2505 Set skip-mode-detection in the toolkit-container by default
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 20:16:10 +02:00
Evan Lezar
3a11f6ee0a Add nvidia-container-runtime-hook.skip-mode-detection option to config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 20:15:40 +02:00
Evan Lezar
8f694bbfb7 Merge branch 'set-nvidia-ctk-path' into 'main'
Set nvidia-ctk.path config option based on installed path

See merge request nvidia/container-toolkit/container-toolkit!334
2023-03-09 16:44:13 +00:00
Evan Lezar
4c2eff4865 Merge branch 'CNT-3998/cdi-accept-visible-devices-when-privileged' into 'main'
Honor accept-nvidia-visible-devices-envvar-when-unprivileged setting in CDI mode

See merge request nvidia/container-toolkit/container-toolkit!331
2023-03-09 15:59:08 +00:00
Evan Lezar
1fbdc17c40 Set nvidia-ctk.path config option based on installed path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 17:53:08 +02:00
Evan Lezar
965d62f326 Merge branch 'fix-containerd-integration-tests' into 'main'
Fix integration tests failing due to CDI spec generation

See merge request nvidia/container-toolkit/container-toolkit!333
2023-03-09 14:41:52 +00:00
Evan Lezar
25ea7fa98e Remove whitespace in Makefile
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 15:32:07 +02:00
Evan Lezar
5ee040ba95 Disable CDI spec generation for integration tests
2023-03-09 15:32:07 +02:00
Evan Lezar
eb2aec9da8 Allow CDI options to be set by envvars
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 12:25:05 +02:00
Evan Lezar
973e7bda5e Check accept-nvidia-visible-devices-envvar-when-unprivileged option for CDI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
154cd4ecf3 Add to config struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
936fad1d04 Move check for privileged images to config/image/ package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
86dd046c7c Merge branch 'CNT-3928/allow-cdi-container-annotations' into 'main'
Add cdi.k8s.io annotations to runtimes configured in containerd

See merge request nvidia/container-toolkit/container-toolkit!315
2023-03-09 07:52:37 +00:00
Evan Lezar
510fb248fe Add cdi.k8s.io annotations to containerd config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-08 07:23:27 +02:00
Evan Lezar
c7384c6aee Merge branch 'fix-comment' into 'main'
Fix comment

See merge request nvidia/container-toolkit/container-toolkit!329
2023-03-08 05:15:38 +00:00
Evan Lezar
1c3c9143f8 Fix comment
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-08 07:15:05 +02:00
Evan Lezar
1c696b1e39 Merge branch 'CNT-3894/configure-mode-specific-runtimes' into 'main'
Configure .cdi and .legacy executables in Toolkit Container

See merge request nvidia/container-toolkit/container-toolkit!308
2023-03-08 05:12:50 +00:00
Evan Lezar
a2adbc1133 Merge branch 'CNT-3898/improve-cdi-annotations' into 'main'
Improve handling of environment variable devices in CDI mode

See merge request nvidia/container-toolkit/container-toolkit!321
2023-03-08 04:37:41 +00:00
Evan Lezar
36576708f0 Merge branch 'CNT-3896/gds-mofed-devices' into 'main'
Add GDS and MOFED support to the NVCDI API

See merge request nvidia/container-toolkit/container-toolkit!323
2023-03-08 04:36:55 +00:00
Evan Lezar
cc7a6f166b Handle case where runtime name is set to predefined name
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:56 +02:00
Evan Lezar
62d88e7c95 Add cdi and legacy mode runtimes
This change adds .cdi and .legacy mode-specific runtimes to the list of
runtimes supported by the operator. These are also installed as
part of the NVIDIA Container Toolkit.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:55 +02:00
Evan Lezar
dca8e3123f Migrate containerd config to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:55 +02:00
Evan Lezar
3bac4fad09 Migrate cri-o config update to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
9fff19da23 Migrate docker config to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
e5bb4d2718 Move runtime config code from config to config/engine
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
5bfb51f801 Add API for interacting with runtime engine configs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:53 +02:00
Evan Lezar
ece5b29d97 Add tools/container/operator package to handle runtime naming
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:53 +02:00
Evan Lezar
ec8a92c17f Use nvidia-container-runtime.experimental as wrapper
This change switches to using nvidia-container-runtime.experimental as the
wrapper name over nvidia-container-runtime-experimental. This is consistent
with upcoming mode-specific binaries.

The wrapper is created at nvidia-container-runtime.experimental.real.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:53 +02:00
Evan Lezar
868393b7ed Add mofed mode to nvcdi API
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 18:47:52 +02:00
Evan Lezar
ebe18fbb7f Add gds mode to nvcdi API
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 18:47:52 +02:00
Evan Lezar
9435343541 Merge branch 'fix-kitmaker' into 'main'
Include = when extracting manifest information

See merge request nvidia/container-toolkit/container-toolkit!327
2023-03-07 14:44:27 +00:00
Evan Lezar
1cd20afe4f Include = when extracting manifest information
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:43:49 +02:00
Evan Lezar
1e6fe40c76 Allow nvidia-container-runtime.modes.cdi.default-kind to be set
This change allows the nvidia-container-runtime.modes.cdi.default-kind
to be set in the toolkit-container.

The NVIDIA_CONTAINER_RUNTIME_MODES_CDI_DEFAULT_KIND envvar is used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:19:38 +02:00
Evan Lezar
6d220ed9a2 Rework selection of devices in CDI mode
The following changes are made:
* The default-cdi-kind config option is used to convert an envvar entry to a fully-qualified device name
* If annotation devices exist, these are used instead of the envvar devices.
* The `all` device is no longer treated as a special case and MUST exist in the CDI spec.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:18:53 +02:00
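The selection rules listed in this commit message can be sketched in Go as below. The helper names are hypothetical, and the check for an already fully-qualified name (contains both `/` and `=`) is a simplification assumed for this example and not the CDI library's own validation.

```go
package main

import (
	"fmt"
	"strings"
)

// qualify converts an envvar entry such as "0" or "all" to a fully-qualified
// CDI device name using defaultKind. Entries that already look qualified
// are returned unchanged.
func qualify(entry, defaultKind string) string {
	if strings.Contains(entry, "/") && strings.Contains(entry, "=") {
		return entry
	}
	return defaultKind + "=" + entry
}

// selectDevices prefers annotation devices when any exist and otherwise
// qualifies each entry of the comma-separated envvar value.
func selectDevices(annotationDevices []string, envvar, defaultKind string) []string {
	if len(annotationDevices) > 0 {
		return annotationDevices
	}
	var devices []string
	for _, entry := range strings.Split(envvar, ",") {
		if entry = strings.TrimSpace(entry); entry != "" {
			devices = append(devices, qualify(entry, defaultKind))
		}
	}
	return devices
}

func main() {
	// "all" gets no special treatment; it is qualified like any other entry.
	fmt.Println(selectDevices(nil, "0,all", "nvidia.com/gpu"))
	fmt.Println(selectDevices([]string{"nvidia.com/gpu=1"}, "0", "nvidia.com/gpu"))
}
```

Because `all` is qualified like any other name, a device with that name has to be present in the CDI spec for the request to resolve.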
Evan Lezar
f00439c93e Add nvidia-container-runtime.modes.csv.default-kind config option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:18:53 +02:00
Evan Lezar
c59696e30e Merge branch 'fix-kitmaker' into 'main'
Log source file

See merge request nvidia/container-toolkit/container-toolkit!326
2023-03-06 16:26:43 +00:00
Evan Lezar
89c18c73cd Add source and log curl command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 18:26:05 +02:00
Evan Lezar
cb5006c73f Merge branch 'CNT-3897/generate-management-container-spec' into 'main'
Generate CDI specs for management containers

See merge request nvidia/container-toolkit/container-toolkit!314
2023-03-06 16:23:13 +00:00
Evan Lezar
547b71f222 Merge branch 'change-discovery-mode' into 'main'
Rename --discovery-mode to --mode

See merge request nvidia/container-toolkit/container-toolkit!318
2023-03-06 16:21:22 +00:00
Evan Lezar
ae84bfb055 Log source file
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 18:11:12 +02:00
Evan Lezar
9b303d5b89 Merge branch 'fix-changelist' into 'main'
Strip on tilde for kitmaker version

See merge request nvidia/container-toolkit/container-toolkit!325
2023-03-06 14:10:54 +00:00
Evan Lezar
d944f934d7 Strip on tilde for kitmaker version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 16:10:25 +02:00
Evan Lezar
c37209cd09 Merge branch 'fix-changelist' into 'main'
Fix blank changelist in kitmaker properties

See merge request nvidia/container-toolkit/container-toolkit!324
2023-03-06 13:51:19 +00:00
Evan Lezar
863b569a61 Fix blank changelist in kitmaker properties
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 15:50:38 +02:00
Evan Lezar
f36c514f1f Merge branch 'update-kitmaker-folders' into 'main'
Update kitmaker target folder

See merge request nvidia/container-toolkit/container-toolkit!313
2023-03-06 11:16:49 +00:00
Evan Lezar
3ab28c7fa4 Merge branch 'fix-rule-for-release' into 'main'
Run full build on release- branches

See merge request nvidia/container-toolkit/container-toolkit!320
2023-03-06 10:56:58 +00:00
Evan Lezar
c03258325b Run full build on release- branches
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 12:54:27 +02:00
Evan Lezar
20d3bb189b Rename --discovery-mode to --mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 11:00:22 +02:00
Evan Lezar
90acec60bb Skip CDI spec generation in integration tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:57:40 +02:00
Evan Lezar
0565888c03 Generate CDI spec in toolkit container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:57:40 +02:00
Evan Lezar
f7e817cff6 Support management mode in nvidia-ctk cdi generate
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:53:43 +02:00
Evan Lezar
29cbbe83f9 Add management mode to CDI spec generation API
These changes add support for generating a management spec to the nvcdi API.
A management spec consists of a single CDI device (`all`) which includes all expected
NVIDIA device nodes, driver libraries, binaries, and IPC sockets.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:53:43 +02:00
Evan Lezar
64b16acb1f Also install nvidia-ctk in toolkit-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:53:43 +02:00
Evan Lezar
19c20bb422 Merge branch 'CNT-3931/add-spec-validation' into 'main'
Add nvcdi.spec for writing and validating CDI specifications

See merge request nvidia/container-toolkit/container-toolkit!306
2023-03-06 08:52:56 +00:00
Evan Lezar
28b10d2ee0 Merge branch 'fix-toolkit-ctr-envvars' into 'main'
Fix handling of envvars in toolkit container which modify the NVIDIA Container Runtime config

See merge request nvidia/container-toolkit/container-toolkit!317
2023-03-06 07:36:03 +00:00
Christopher Desiniotis
1f5123f72a Fix handling of envvars in toolkit container which modify the NVIDIA Container Runtime config
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-03-05 20:14:04 -08:00
Evan Lezar
ac5b6d097b Use kitmaker folder for releases
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 17:27:07 +02:00
Evan Lezar
a7bf9ddf28 Update kitmaker folder structure
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 17:27:07 +02:00
Evan Lezar
e27479e170 Add GIT_COMMIT_SHORT to packaging image manifest
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 17:27:07 +02:00
Evan Lezar
fa28e738c6 Merge branch 'fix-internal-scans' into 'main'
Fix internal scans

See merge request nvidia/container-toolkit/container-toolkit!316
2023-03-01 15:26:27 +00:00
Evan Lezar
898c5555f6 Fix internal scans
This fixes the internal scans due to the removed ubuntu18.04 images.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 17:25:28 +02:00
Evan Lezar
314059fcf0 Move path manipulation to spec.Save
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:49:04 +02:00
Evan Lezar
221781bd0b Use full path for output spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
9f5e141437 Expose vendor and class as options
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
8be6de177f Move formatJSON and formatYAML to nvcdi/spec package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
890a519121 Use nvcdi.spec package to write and validate spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
89321edae6 Add top-level GetSpec function to nvcdi API
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
6d6cd56196 Return nvcdi.spec.Interface from GetSpec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 12:45:30 +02:00
Evan Lezar
2e95e04359 Add nvcdi.spec package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 12:45:30 +02:00
Evan Lezar
accba4ead5 Merge branch 'CNT-3965/clean-up-by-path-symlinks' into 'main'
Improve handling of /dev/dri devices and nested device paths

See merge request nvidia/container-toolkit/container-toolkit!307
2023-03-01 10:25:48 +00:00
Christopher Desiniotis
1e9b7883cf Merge branch 'CNT-3937/add-target-driver-root' into 'main'
Add a driver root transformer to nvcdi

See merge request nvidia/container-toolkit/container-toolkit!300
2023-02-28 18:04:29 +00:00
Christopher Desiniotis
87e406eee6 Update root transformer tests to ensure container path is not modified
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-02-28 09:00:05 -08:00
Christopher Desiniotis
45ed3b0412 Handle hook arguments for creation of symlinks
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-02-28 09:00:02 -08:00
Christopher Desiniotis
0516fc96ca Add Transform interface and initial implementation for a root transform
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-02-28 08:56:13 -08:00
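A root transform of the kind these commits describe can be sketched in Go as follows: host paths under one driver root are rewritten to another root, while container paths are left untouched. The `mount` type and function names are hypothetical and much simpler than the nvcdi Transform interface, which operates on full CDI specs.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// mount is a hypothetical, minimal stand-in for a CDI mount entry.
type mount struct {
	HostPath      string
	ContainerPath string
}

// transformRoot rewrites p from root to target if p lies under root.
// Paths outside root are returned unchanged.
func transformRoot(p, root, target string) string {
	root = filepath.Clean(root)
	if p == root {
		return filepath.Clean(target)
	}
	// Match on a path boundary so /driver does not match /driver2.
	prefix := strings.TrimSuffix(root, "/") + "/"
	if !strings.HasPrefix(p, prefix) {
		return p
	}
	return filepath.Join(target, strings.TrimPrefix(p, prefix))
}

// transformMounts rewrites only the host side of each mount.
func transformMounts(mounts []mount, root, target string) []mount {
	out := make([]mount, 0, len(mounts))
	for _, m := range mounts {
		m.HostPath = transformRoot(m.HostPath, root, target)
		out = append(out, m)
	}
	return out
}

func main() {
	mounts := []mount{{
		HostPath:      "/run/nvidia/driver/usr/lib64/libcuda.so.1",
		ContainerPath: "/usr/lib64/libcuda.so.1",
	}}
	for _, m := range transformMounts(mounts, "/run/nvidia/driver", "/") {
		fmt.Println(m.HostPath, "->", m.ContainerPath)
	}
}
```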
Evan Lezar
e7a435fd5b Merge branch 'update-libnvidia-container' into 'main'
Update libnvidia-container

See merge request nvidia/container-toolkit/container-toolkit!312
2023-02-27 13:41:26 +00:00
Evan Lezar
7a249d7771 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 15:41:02 +02:00
Evan Lezar
7986ff9cee Merge branch 'CNT-3963/deduplicate-wsl-driverstore-paths' into 'main'
Deduplicate WSL driverstore paths

See merge request nvidia/container-toolkit/container-toolkit!304
2023-02-27 13:27:31 +00:00
Evan Lezar
b74c13d75f Merge branch 'fix-rpm-postun-scriptlet' into 'main'
nvidia-container-toolkit.spec: fix syntax error in postun scriptlet

See merge request nvidia/container-toolkit/container-toolkit!309
2023-02-27 12:36:49 +00:00
Evan Lezar
de8eeb87f4 Merge branch 'remove-outdated-platforms' into 'main'
Remove outdated platforms from CI

See merge request nvidia/container-toolkit/container-toolkit!310
2023-02-27 11:48:33 +00:00
Evan Lezar
36c4174de3 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 13:45:44 +02:00
Evan Lezar
3497936cdf Remove ubuntu18.04 toolkit-container image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:55:17 +02:00
Evan Lezar
81abc92743 Remove fedora35 from 'all' targets
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:31:38 +02:00
Evan Lezar
1ef8dc3137 Remove centos7-ppc64le from CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:30:29 +02:00
Evan Lezar
9a5c1bbe48 Remove ubuntu16.04 packages from CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:29:35 +02:00
Evan Lezar
30dff61376 Remove debian9 packages from CI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-27 12:28:46 +02:00
Claudius Volz
de1bb68d19 nvidia-container-toolkit.spec: fix syntax error in postun scriptlet
Signed-off-by: Claudius Volz <c.volz@gmx.de>
2023-02-27 00:45:21 +01:00
Evan Lezar
06d8bb5019 Merge branch 'CNT-3965/dont-fail-chmod-hook' into 'main'
Skip paths with errors in chmod hook

See merge request nvidia/container-toolkit/container-toolkit!303
2023-02-22 15:20:26 +00:00
Evan Lezar
b4dc1f338d Generate nested device folder permission hooks per device
This change generates device folder permission hooks per device instead of
at a spec level. This ensures that the hook is not injected for a device that
does not have any nested device nodes.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-22 17:16:23 +02:00
Evan Lezar
181128fe73 Only include by-path-symlinks for injected device nodes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-22 16:53:04 +02:00
Evan Lezar
252838e696 Merge branch 'bump-version-v1.13.0-rc.2' into 'main'
Bump version to v1.13.0-rc.2

See merge request nvidia/container-toolkit/container-toolkit!305
2023-02-21 13:11:00 +00:00
Evan Lezar
49f171a8b1 Update libnvidia-container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 14:27:02 +02:00
Evan Lezar
3d12803ab3 Bump version to v1.13.0-rc.2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 14:24:37 +02:00
Evan Lezar
a168091bfb Add v1.13.0-rc.1 Changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 14:23:52 +02:00
Evan Lezar
35fc57291f Deduplicate WSL driverstore paths
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 11:48:56 +02:00
Evan Lezar
2542224d7b Skip paths with errors in chmod hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 11:47:11 +02:00
96 changed files with 6322 additions and 1753 deletions

@@ -23,6 +23,7 @@ variables:
BUILD_MULTI_ARCH_IMAGES: "true"
stages:
- trigger
- image
- lint
- go-checks
@@ -34,13 +35,44 @@ stages:
- scan
- release
.pipeline-trigger-rules:
rules:
# We trigger the pipeline if started manually
- if: $CI_PIPELINE_SOURCE == "web"
# We trigger the pipeline on the main branch
- if: $CI_COMMIT_BRANCH == "main"
# We trigger the pipeline on the release- branches
- if: $CI_COMMIT_BRANCH =~ /^release-.*$/
# We trigger the pipeline on tags
- if: $CI_COMMIT_TAG && $CI_COMMIT_TAG != ""
workflow:
rules:
# We trigger the pipeline on a merge request
- if: $CI_PIPELINE_SOURCE == 'merge_request_event'
# We then add all the regular triggers
- !reference [.pipeline-trigger-rules, rules]
# The main or manual job is used to filter out distributions or architectures that are not required on
# every build.
.main-or-manual:
rules:
- if: $CI_COMMIT_BRANCH == "main"
- if: $CI_COMMIT_TAG && $CI_COMMIT_TAG != ""
- !reference [.pipeline-trigger-rules, rules]
- if: $CI_PIPELINE_SOURCE == "schedule"
when: manual
# The trigger-pipeline job adds a manually triggered job to the pipeline on merge requests.
trigger-pipeline:
stage: trigger
script:
- echo "starting pipeline"
rules:
- !reference [.main-or-manual, rules]
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
when: manual
allow_failure: false
- when: always
# Define the distribution targets
.dist-amazonlinux2:
rules:
@@ -70,13 +102,6 @@ stages:
DIST: debian10
PACKAGE_REPO_TYPE: debian
.dist-debian9:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: debian9
PACKAGE_REPO_TYPE: debian
.dist-opensuse-leap15.1:
rules:
- !reference [.main-or-manual, rules]
@@ -92,13 +117,6 @@ stages:
CVE_UPDATES: "cyrus-sasl-lib"
PACKAGE_REPO_TYPE: rpm
.dist-ubuntu16.04:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: ubuntu16.04
PACKAGE_REPO_TYPE: debian
.dist-ubuntu18.04:
variables:
DIST: ubuntu18.04
@@ -106,8 +124,6 @@ stages:
PACKAGE_REPO_TYPE: debian
.dist-ubuntu20.04:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: ubuntu20.04
CVE_UPDATES: "libsasl2-2 libsasl2-modules-db"
@@ -252,22 +268,15 @@ release:staging-ubi8:
needs:
- image-ubi8
release:staging-ubuntu18.04:
extends:
- .release:staging
- .dist-ubuntu18.04
needs:
- test-toolkit-ubuntu18.04
- test-containerd-ubuntu18.04
- test-crio-ubuntu18.04
- test-docker-ubuntu18.04
release:staging-ubuntu20.04:
extends:
- .release:staging
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04
- test-toolkit-ubuntu20.04
- test-containerd-ubuntu20.04
- test-crio-ubuntu20.04
- test-docker-ubuntu20.04
release:staging-packaging:
extends:

.github/workflows/blossom-ci.yaml

@@ -0,0 +1,113 @@
# Copyright (c) 2020-2023, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# A workflow to trigger ci on hybrid infra (github + self hosted runner)
name: Blossom-CI
on:
issue_comment:
types: [created]
workflow_dispatch:
inputs:
platform:
description: 'runs-on argument'
required: false
args:
description: 'argument'
required: false
jobs:
Authorization:
name: Authorization
runs-on: blossom
outputs:
args: ${{ env.args }}
# This job only runs for pull request comments
if: |
contains( '\
anstockatnv,\
rohitrajani2018,\
cdesiniotis,\
shivamerla,\
ArangoGutierrez,\
elezar,\
klueska,\
zvonkok,\
', format('{0},', github.actor)) &&
github.event.comment.body == '/blossom-ci'
steps:
- name: Check if comment is issued by authorized person
run: blossom-ci
env:
OPERATION: 'AUTH'
REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
REPO_KEY_DATA: ${{ secrets.BLOSSOM_KEY }}
Vulnerability-scan:
name: Vulnerability scan
needs: [Authorization]
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v2
with:
repository: ${{ fromJson(needs.Authorization.outputs.args).repo }}
ref: ${{ fromJson(needs.Authorization.outputs.args).ref }}
lfs: 'true'
# repo specific steps
#- name: Setup java
# uses: actions/setup-java@v1
# with:
# java-version: 1.8
# add blackduck properties https://synopsys.atlassian.net/wiki/spaces/INTDOCS/pages/631308372/Methods+for+Configuring+Analysis#Using-a-configuration-file
#- name: Setup blackduck properties
# run: |
# PROJECTS=$(mvn -am dependency:tree | grep maven-dependency-plugin | awk '{ out="com.nvidia:"$(NF-1);print out }' | grep rapids | xargs | sed -e 's/ /,/g')
# echo detect.maven.build.command="-pl=$PROJECTS -am" >> application.properties
# echo detect.maven.included.scopes=compile >> application.properties
- name: Run blossom action
uses: NVIDIA/blossom-action@main
env:
REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
REPO_KEY_DATA: ${{ secrets.BLOSSOM_KEY }}
with:
args1: ${{ fromJson(needs.Authorization.outputs.args).args1 }}
args2: ${{ fromJson(needs.Authorization.outputs.args).args2 }}
args3: ${{ fromJson(needs.Authorization.outputs.args).args3 }}
Job-trigger:
name: Start ci job
needs: [Vulnerability-scan]
runs-on: blossom
steps:
- name: Start ci job
run: blossom-ci
env:
OPERATION: 'START-CI-JOB'
CI_SERVER: ${{ secrets.CI_SERVER }}
REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Upload-Log:
name: Upload log
runs-on: blossom
if : github.event_name == 'workflow_dispatch'
steps:
- name: Jenkins log for pull request ${{ fromJson(github.event.inputs.args).pr }} (click here)
run: blossom-ci
env:
OPERATION: 'POST-PROCESSING'
CI_SERVER: ${{ secrets.CI_SERVER }}
REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}

@@ -116,12 +116,6 @@ package-amazonlinux2-x86_64:
- .dist-amazonlinux2
- .arch-x86_64
package-centos7-ppc64le:
extends:
- .package-build
- .dist-centos7
- .arch-ppc64le
package-centos7-x86_64:
extends:
- .package-build
@@ -152,30 +146,12 @@ package-debian10-amd64:
- .dist-debian10
- .arch-amd64
package-debian9-amd64:
extends:
- .package-build
- .dist-debian9
- .arch-amd64
package-opensuse-leap15.1-x86_64:
extends:
- .package-build
- .dist-opensuse-leap15.1
- .arch-x86_64
package-ubuntu16.04-amd64:
extends:
- .package-build
- .dist-ubuntu16.04
- .arch-amd64
package-ubuntu16.04-ppc64le:
extends:
- .package-build
- .dist-ubuntu16.04
- .arch-ppc64le
package-ubuntu18.04-amd64:
extends:
- .package-build
@@ -228,7 +204,6 @@ image-centos7:
- .package-artifacts
- .dist-centos7
needs:
- package-centos7-ppc64le
- package-centos7-x86_64
image-ubi8:
@@ -242,17 +217,6 @@ image-ubi8:
- package-centos8-x86_64
- package-centos8-ppc64le
image-ubuntu18.04:
extends:
- .image-build
- .package-artifacts
- .dist-ubuntu18.04
needs:
- package-ubuntu18.04-amd64
- package-ubuntu18.04-arm64
- job: package-ubuntu18.04-ppc64le
optional: true
image-ubuntu20.04:
extends:
- .image-build
@@ -261,7 +225,8 @@ image-ubuntu20.04:
needs:
- package-ubuntu18.04-amd64
- package-ubuntu18.04-arm64
- package-ubuntu18.04-ppc64le
- job: package-ubuntu18.04-ppc64le
optional: true
# The DIST=packaging target creates an image containing all built packages
image-packaging:
@@ -278,22 +243,14 @@ image-packaging:
optional: true
- job: package-amazonlinux2-x86_64
optional: true
- job: package-centos7-ppc64le
optional: true
- job: package-centos7-x86_64
optional: true
- job: package-centos8-ppc64le
optional: true
- job: package-debian10-amd64
optional: true
- job: package-debian9-amd64
optional: true
- job: package-opensuse-leap15.1-x86_64
optional: true
- job: package-ubuntu16.04-amd64
optional: true
- job: package-ubuntu16.04-ppc64le
optional: true
- job: package-ubuntu18.04-ppc64le
optional: true
@@ -327,31 +284,31 @@ image-packaging:
TEST_CASES: "crio"
# Define the test targets
test-toolkit-ubuntu18.04:
test-toolkit-ubuntu20.04:
extends:
- .test:toolkit
- .dist-ubuntu18.04
- .dist-ubuntu20.04
needs:
- image-ubuntu18.04
- image-ubuntu20.04
test-containerd-ubuntu18.04:
test-containerd-ubuntu20.04:
extends:
- .test:containerd
- .dist-ubuntu18.04
- .dist-ubuntu20.04
needs:
- image-ubuntu18.04
- image-ubuntu20.04
test-crio-ubuntu18.04:
test-crio-ubuntu20.04:
extends:
- .test:crio
- .dist-ubuntu18.04
- .dist-ubuntu20.04
needs:
- image-ubuntu18.04
- image-ubuntu20.04
test-docker-ubuntu18.04:
test-docker-ubuntu20.04:
extends:
- .test:docker
- .dist-ubuntu18.04
- .dist-ubuntu20.04
needs:
- image-ubuntu18.04
- image-ubuntu20.04

@@ -36,8 +36,7 @@ variables:
STAGING_REGISTRY: registry.gitlab.com/nvidia/container-toolkit/container-toolkit/staging
STAGING_VERSION: ${CI_COMMIT_SHORT_SHA}
ARTIFACTORY_REPO_BASE: "https://urm.nvidia.com/artifactory/sw-gpu-cloudnative"
# TODO: We should set the kitmaker release folder here once we have the end-to-end workflow set up
KITMAKER_RELEASE_FOLDER: "testing"
KITMAKER_RELEASE_FOLDER: "kitmaker"
.image-pull:
stage: image-build
@@ -79,11 +78,6 @@ image-ubi8:
- .dist-ubi8
- .image-pull
image-ubuntu18.04:
extends:
- .dist-ubuntu18.04
- .image-pull
image-ubuntu20.04:
extends:
- .dist-ubuntu20.04
@@ -153,14 +147,6 @@ scan-centos7-arm64:
- image-centos7
- scan-centos7-amd64
scan-ubuntu18.04-amd64:
extends:
- .dist-ubuntu18.04
- .platform-amd64
- .scan
needs:
- image-ubuntu18.04
scan-ubuntu20.04-amd64:
extends:
- .dist-ubuntu20.04
@@ -230,12 +216,12 @@ release:packages:kitmaker:
extends:
- .release:packages
release:staging-ubuntu18.04:
release:staging-ubuntu20.04:
extends:
- .release:staging
- .dist-ubuntu18.04
- .dist-ubuntu20.04
needs:
- image-ubuntu18.04
- image-ubuntu20.04
# Define the external release targets
# Release to NGC
@@ -244,11 +230,6 @@ release:ngc-centos7:
- .dist-centos7
- .release:ngc
release:ngc-ubuntu18.04:
extends:
- .dist-ubuntu18.04
- .release:ngc
release:ngc-ubuntu20.04:
extends:
- .dist-ubuntu20.04

@@ -1,9 +1,59 @@
# NVIDIA Container Toolkit Changelog
## v1.13.0-rc.3
* Only initialize NVML for modes that require it when running `nvidia-ctk cdi generate`.
* Prefer /run over /var/run when locating nvidia-persistenced and nvidia-fabricmanager sockets.
* Fix the generation of CDI specifications for management containers when the driver libraries are not in the LDCache.
* Add transformers to deduplicate and simplify CDI specifications.
* Generate a simplified CDI specification by default. This means that entities in the common edits in a spec are not included in device definitions.
* Also return an error from the nvcdi.New constructor instead of panicking.
* Detect XOrg libraries for injection and CDI spec generation.
* Add `nvidia-container-runtime.modes.cdi.annotation-prefixes` config option that allows the CDI annotation prefixes that are read to be overridden.
* [libnvidia-container] Fix segmentation fault when RPC initialization fails.
* [libnvidia-container] Build centos variants of the NVIDIA Container Library with static libtirpc v1.3.2.
* [libnvidia-container] Remove make targets for fedora35 as the centos8 packages are compatible.
## v1.13.0-rc.2
* Don't fail chmod hook if paths are not injected
* Only create `by-path` symlinks if CDI devices are actually requested.
* Fix possible blank `nvidia-ctk` path in generated CDI specifications
* Fix error in postun scriptlet on RPM-based systems
* Only check `NVIDIA_VISIBLE_DEVICES` for environment variables if no annotations are specified.
* Add `cdi.default-kind` config option for constructing fully-qualified CDI device names in CDI mode
* Add support for `accept-nvidia-visible-devices-envvar-unprivileged` config setting in CDI mode
* Add `nvidia-container-runtime-hook.skip-mode-detection` config option to bypass mode detection. This allows `legacy` and `cdi` mode, for example, to be used at the same time.
* Add support for generating CDI specifications for GDS and MOFED devices
* Ensure CDI specification is validated on save when generating a spec
* Rename `--discovery-mode` argument to `--mode` for `nvidia-ctk cdi generate`
* [libnvidia-container] Fix segfault on WSL2 systems
* [toolkit-container] Add `--cdi-enabled` flag to toolkit config
* [toolkit-container] Install `nvidia-ctk` from toolkit container
* [toolkit-container] Use installed `nvidia-ctk` path in NVIDIA Container Toolkit config
* [toolkit-container] Bump CUDA base images to 12.1.0
* [toolkit-container] Set `nvidia-ctk` path in the
* [toolkit-container] Add `cdi.k8s.io/*` to set of allowed annotations in containerd config
* [toolkit-container] Generate CDI specification for use in management containers
* [toolkit-container] Install experimental runtime as `nvidia-container-runtime.experimental` instead of `nvidia-container-runtime-experimental`
* [toolkit-container] Install and configure mode-specific runtimes for `cdi` and `legacy` modes
## v1.13.0-rc.1
* Discover gsp*.bin files for GSP firmware when generating CDI specification
* [libnvidia-container] Inject gsp*.bin files for GSP firmware
* Include MIG-enabled devices as GPUs when generating CDI specification
* Fix missing NVML symbols when running `nvidia-ctk` on some platforms [#49]
* Add CDI spec generation for WSL2-based systems to `nvidia-ctk cdi generate` command
* Add `auto` mode to `nvidia-ctk cdi generate` command to automatically detect a WSL2-based system over a standard NVML-based system.
* Add mode-specific (`.cdi` and `.legacy`) NVIDIA Container Runtime binaries for use in the GPU Operator
* Discover all `gsp*.bin` GSP firmware files when generating CDI specification.
* Align `.deb` and `.rpm` release candidate package versions
* Remove `fedora35` packaging targets
* [libnvidia-container] Include all `gsp*.bin` firmware files if present
* [libnvidia-container] Align `.deb` and `.rpm` release candidate package versions
* [libnvidia-container] Remove `fedora35` packaging targets
* [toolkit-container] Install `nvidia-container-toolkit-operator-extensions` package for mode-specific executables.
* [toolkit-container] Allow `nvidia-container-runtime.mode` to be set when configuring the NVIDIA Container Toolkit
## v1.12.0

@@ -29,6 +29,7 @@ ARG PACKAGE_DIST
ARG PACKAGE_VERSION
ARG GIT_BRANCH
ARG GIT_COMMIT
ARG GIT_COMMIT_SHORT
ARG SOURCE_DATE_EPOCH
ARG VERSION

@@ -44,13 +44,13 @@ OUT_IMAGE = $(OUT_IMAGE_NAME):$(OUT_IMAGE_TAG)
##### Public rules #####
DEFAULT_PUSH_TARGET := ubuntu20.04
DISTRIBUTIONS := ubuntu20.04 ubuntu18.04 ubi8 centos7
DISTRIBUTIONS := ubuntu20.04 ubi8 centos7
META_TARGETS := packaging
BUILD_TARGETS := $(patsubst %,build-%,$(DISTRIBUTIONS) $(META_TARGETS))
PUSH_TARGETS := $(patsubst %,push-%,$(DISTRIBUTIONS) $(META_TARGETS))
TEST_TARGETS := $(patsubst %,test-%, $(DISTRIBUTIONS))
TEST_TARGETS := $(patsubst %,test-%,$(DISTRIBUTIONS))
.PHONY: $(DISTRIBUTIONS) $(PUSH_TARGETS) $(BUILD_TARGETS) $(TEST_TARGETS)
@@ -95,6 +95,7 @@ $(BUILD_TARGETS): build-%: $(ARTIFACTS_ROOT)
--build-arg PACKAGE_VERSION="$(PACKAGE_VERSION)" \
--build-arg VERSION="$(VERSION)" \
--build-arg GIT_COMMIT="$(GIT_COMMIT)" \
--build-arg GIT_COMMIT_SHORT="$(GIT_COMMIT_SHORT)" \
--build-arg GIT_BRANCH="$(GIT_BRANCH)" \
--build-arg SOURCE_DATE_EPOCH="$(SOURCE_DATE_EPOCH)" \
--build-arg CVE_UPDATES="$(CVE_UPDATES)" \
@@ -142,10 +143,7 @@ test-packaging:
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/centos8/ppc64le" || echo "Missing centos8/ppc64le"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/centos8/x86_64" || echo "Missing centos8/x86_64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/debian10/amd64" || echo "Missing debian10/amd64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/debian9/amd64" || echo "Missing debian9/amd64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/opensuse-leap15.1/x86_64" || echo "Missing opensuse-leap15.1/x86_64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/ubuntu16.04/amd64" || echo "Missing ubuntu16.04/amd64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/ubuntu16.04/ppc64le" || echo "Missing ubuntu16.04/ppc64le"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/ubuntu18.04/amd64" || echo "Missing ubuntu18.04/amd64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/ubuntu18.04/arm64" || echo "Missing ubuntu18.04/arm64"
@$(DOCKER) run --rm $(IMAGE) test -d "/artifacts/packages/ubuntu18.04/ppc64le" || echo "Missing ubuntu18.04/ppc64le"

@@ -10,6 +10,7 @@ import (
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
"github.com/opencontainers/runtime-spec/specs-go"
"golang.org/x/mod/semver"
)
@@ -130,7 +131,7 @@ func isPrivileged(s *Spec) bool {
}
var caps []string
// If v1.1.0-rc1 <= OCI version < v1.0.0-rc5 parse s.Process.Capabilities as:
// If v1.0.0-rc1 <= OCI version < v1.0.0-rc5 parse s.Process.Capabilities as:
// github.com/opencontainers/runtime-spec/blob/v1.0.0-rc1/specs-go/config.go#L30-L54
rc1cmp := semver.Compare("v"+*s.Version, "v1.0.0-rc1")
rc5cmp := semver.Compare("v"+*s.Version, "v1.0.0-rc5")
@@ -139,28 +140,31 @@ func isPrivileged(s *Spec) bool {
if err != nil {
log.Panicln("could not decode Process.Capabilities in OCI spec:", err)
}
// Otherwise, parse s.Process.Capabilities as:
// github.com/opencontainers/runtime-spec/blob/v1.0.0/specs-go/config.go#L30-L54
} else {
var lc LinuxCapabilities
err := json.Unmarshal(*s.Process.Capabilities, &lc)
if err != nil {
log.Panicln("could not decode Process.Capabilities in OCI spec:", err)
for _, c := range caps {
if c == capSysAdmin {
return true
}
}
// We only make sure that the bounding capability set has
// CAP_SYS_ADMIN. This allows us to make sure that the container was
// actually started as '--privileged', but also allow non-root users to
// access the privileged NVIDIA capabilities.
caps = lc.Bounding
return false
}
for _, c := range caps {
if c == capSysAdmin {
return true
}
// Otherwise, parse s.Process.Capabilities as:
// github.com/opencontainers/runtime-spec/blob/v1.0.0/specs-go/config.go#L30-L54
process := specs.Process{
Env: s.Process.Env,
}
return false
err := json.Unmarshal(*s.Process.Capabilities, &process.Capabilities)
if err != nil {
log.Panicln("could not decode Process.Capabilities in OCI spec:", err)
}
fullSpec := specs.Spec{
Version: *s.Version,
Process: &process,
}
return image.IsPrivileged(&fullSpec)
}
func getDevicesFromEnvvar(image image.CUDA, swarmResourceEnvvars []string) *string {
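The hunk above replaces the local capability loop with a call to `image.IsPrivileged`. As a rough illustration of the check that the removed comment describes (only the bounding set is inspected for `CAP_SYS_ADMIN`), here is a minimal standard-library sketch. The names `linuxCapabilities` and `hasSysAdmin` are hypothetical and are not taken from the toolkit; the sketch does not reproduce the version-dependent parsing for pre-rc5 OCI specs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const capSysAdmin = "CAP_SYS_ADMIN"

// linuxCapabilities mirrors only the field of the OCI capabilities object
// that this sketch inspects. It is a hypothetical stand-in, not the
// runtime-spec type.
type linuxCapabilities struct {
	Bounding []string `json:"bounding,omitempty"`
}

// hasSysAdmin reports whether the bounding set in the raw JSON contains
// CAP_SYS_ADMIN. Other capability sets (effective, permitted, ...) are ignored.
func hasSysAdmin(raw []byte) (bool, error) {
	var lc linuxCapabilities
	if err := json.Unmarshal(raw, &lc); err != nil {
		return false, fmt.Errorf("could not decode capabilities: %w", err)
	}
	for _, c := range lc.Bounding {
		if c == capSysAdmin {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	raw := []byte(`{"bounding": ["CAP_CHOWN", "CAP_SYS_ADMIN"]}`)
	privileged, err := hasSysAdmin(raw)
	fmt.Println(privileged, err)
}
```

Unlike the hook code, which panics on a decode failure, the sketch returns the error so that a caller can decide how to handle it.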

@@ -43,8 +43,9 @@ type HookConfig struct {
AcceptDeviceListAsVolumeMounts bool `toml:"accept-nvidia-visible-devices-as-volume-mounts"`
SupportedDriverCapabilities DriverCapabilities `toml:"supported-driver-capabilities"`
NvidiaContainerCLI CLIConfig `toml:"nvidia-container-cli"`
NVIDIAContainerRuntime config.RuntimeConfig `toml:"nvidia-container-runtime"`
NvidiaContainerCLI CLIConfig `toml:"nvidia-container-cli"`
NVIDIAContainerRuntime config.RuntimeConfig `toml:"nvidia-container-runtime"`
NVIDIAContainerRuntimeHook config.RuntimeHookConfig `toml:"nvidia-container-runtime-hook"`
}
func getDefaultHookConfig() HookConfig {
@@ -66,7 +67,8 @@ func getDefaultHookConfig() HookConfig {
User: nil,
Ldconfig: nil,
},
NVIDIAContainerRuntime: *config.GetDefaultRuntimeConfig(),
NVIDIAContainerRuntime: *config.GetDefaultRuntimeConfig(),
NVIDIAContainerRuntimeHook: *config.GetDefaultRuntimeHookConfig(),
}
}

@@ -74,7 +74,7 @@ func doPrestart() {
hook := getHookConfig()
cli := hook.NvidiaContainerCLI
if info.ResolveAutoMode(&logInterceptor{}, hook.NVIDIAContainerRuntime.Mode) != "legacy" {
if !hook.NVIDIAContainerRuntimeHook.SkipModeDetection && info.ResolveAutoMode(&logInterceptor{}, hook.NVIDIAContainerRuntime.Mode) != "legacy" {
log.Panicln("invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime (e.g. specify the --runtime=nvidia flag) instead.")
}
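The changed condition in `doPrestart` can be read as a small predicate: a direct hook invocation is rejected only when mode detection is not skipped and the resolved mode is something other than `legacy`. The sketch below isolates that boolean logic; `rejectDirectHook` is a hypothetical helper name, and the mode strings are passed in as plain values instead of being resolved through `info.ResolveAutoMode`.

```go
package main

import "fmt"

// rejectDirectHook reports whether invoking the hook directly should be
// refused. It mirrors the guard in the hunk above: skipping mode detection
// always allows the hook to proceed.
func rejectDirectHook(skipModeDetection bool, resolvedMode string) bool {
	return !skipModeDetection && resolvedMode != "legacy"
}

func main() {
	cases := []struct {
		skip bool
		mode string
	}{
		{false, "legacy"},
		{false, "cdi"},
		{true, "cdi"},
	}
	for _, tc := range cases {
		fmt.Printf("skip=%v mode=%s reject=%v\n", tc.skip, tc.mode, rejectDirectHook(tc.skip, tc.mode))
	}
}
```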

@@ -18,6 +18,7 @@ package cdi
import (
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/generate"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/transform"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
@@ -44,6 +45,7 @@ func (m command) build() *cli.Command {
hook.Subcommands = []*cli.Command{
generate.NewCommand(m.logger),
transform.NewCommand(m.logger),
}
return &hook

@@ -18,7 +18,6 @@ package generate
import (
"fmt"
"io"
"os"
"path/filepath"
"strings"
@@ -26,19 +25,14 @@ import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover"
"github.com/NVIDIA/nvidia-container-toolkit/internal/edits"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
specs "github.com/container-orchestrated-devices/container-device-interface/specs-go"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/device"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvml"
"sigs.k8s.io/yaml"
)
const (
formatJSON = "json"
formatYAML = "yaml"
allDeviceName = "all"
)
@@ -52,7 +46,9 @@ type config struct {
deviceNameStrategy string
driverRoot string
nvidiaCTKPath string
discoveryMode string
mode string
vendor string
class string
}
// NewCommand constructs a generate-cdi command with the specified logger
@@ -88,14 +84,15 @@ func (m command) build() *cli.Command {
&cli.StringFlag{
Name: "format",
Usage: "The output format for the generated spec [json | yaml]. This overrides the format defined by the output file extension (if specified).",
Value: formatYAML,
Value: spec.FormatYAML,
Destination: &cfg.format,
},
&cli.StringFlag{
Name: "discovery-mode",
Name: "mode",
Aliases: []string{"discovery-mode"},
Usage: "The mode to use when discovering the available entities. One of [auto | nvml | wsl]. If mode is set to 'auto' the mode will be determined based on the system configuration.",
Value: nvcdi.ModeAuto,
Destination: &cfg.discoveryMode,
Destination: &cfg.mode,
},
&cli.StringFlag{
Name: "device-name-strategy",
@@ -113,27 +110,43 @@ func (m command) build() *cli.Command {
Usage: "Specify the path to use for the nvidia-ctk in the generated CDI specification. If this is left empty, the path will be searched.",
Destination: &cfg.nvidiaCTKPath,
},
&cli.StringFlag{
Name: "vendor",
Aliases: []string{"cdi-vendor"},
Usage: "the vendor string to use for the generated CDI specification.",
Value: "nvidia.com",
Destination: &cfg.vendor,
},
&cli.StringFlag{
Name: "class",
Aliases: []string{"cdi-class"},
Usage: "the class string to use for the generated CDI specification.",
Value: "gpu",
Destination: &cfg.class,
},
}
return &c
}
func (m command) validateFlags(r *cli.Context, cfg *config) error {
func (m command) validateFlags(c *cli.Context, cfg *config) error {
cfg.format = strings.ToLower(cfg.format)
switch cfg.format {
case formatJSON:
case formatYAML:
case spec.FormatJSON:
case spec.FormatYAML:
default:
return fmt.Errorf("invalid output format: %v", cfg.format)
}
cfg.discoveryMode = strings.ToLower(cfg.discoveryMode)
switch cfg.discoveryMode {
cfg.mode = strings.ToLower(cfg.mode)
switch cfg.mode {
case nvcdi.ModeAuto:
case nvcdi.ModeNvml:
case nvcdi.ModeWsl:
case nvcdi.ModeManagement:
default:
return fmt.Errorf("invalid discovery mode: %v", cfg.discoveryMode)
return fmt.Errorf("invalid discovery mode: %v", cfg.mode)
}
_, err := nvcdi.NewDeviceNamer(cfg.deviceNameStrategy)
@@ -143,31 +156,6 @@ func (m command) validateFlags(r *cli.Context, cfg *config) error {
cfg.nvidiaCTKPath = discover.FindNvidiaCTK(m.logger, cfg.nvidiaCTKPath)
return nil
}
func (m command) run(c *cli.Context, cfg *config) error {
spec, err := m.generateSpec(cfg)
if err != nil {
return fmt.Errorf("failed to generate CDI spec: %v", err)
}
var outputTo io.Writer
if cfg.output == "" {
outputTo = os.Stdout
} else {
err := createParentDirsIfRequired(cfg.output)
if err != nil {
return fmt.Errorf("failed to create parent folders for output file: %v", err)
}
outputFile, err := os.Create(cfg.output)
if err != nil {
return fmt.Errorf("failed to create output file: %v", err)
}
defer outputFile.Close()
outputTo = outputFile
}
if outputFileFormat := formatFromFilename(cfg.output); outputFileFormat != "" {
m.logger.Debugf("Inferred output format as %q from output file name", outputFileFormat)
if !c.IsSet("format") {
@@ -177,78 +165,61 @@ func (m command) run(c *cli.Context, cfg *config) error {
}
}
data, err := yaml.Marshal(spec)
if err != nil {
return fmt.Errorf("failed to marshal CDI spec: %v", err)
if err := cdi.ValidateVendorName(cfg.vendor); err != nil {
return fmt.Errorf("invalid CDI vendor name: %v", err)
}
if strings.ToLower(cfg.format) == formatJSON {
data, err = yaml.YAMLToJSONStrict(data)
if err != nil {
return fmt.Errorf("failed to convert CDI spec from YAML to JSON: %v", err)
}
if err := cdi.ValidateClassName(cfg.class); err != nil {
return fmt.Errorf("invalid CDI class name: %v", err)
}
err = writeToOutput(cfg.format, data, outputTo)
if err != nil {
return fmt.Errorf("failed to write output: %v", err)
}
return nil
}
func (m command) run(c *cli.Context, cfg *config) error {
spec, err := m.generateSpec(cfg)
if err != nil {
return fmt.Errorf("failed to generate CDI spec: %v", err)
}
m.logger.Infof("Generated CDI spec with version %v", spec.Raw().Version)
if cfg.output == "" {
_, err := spec.WriteTo(os.Stdout)
if err != nil {
return fmt.Errorf("failed to write CDI spec to STDOUT: %v", err)
}
return nil
}
return spec.Save(cfg.output)
}
func formatFromFilename(filename string) string {
ext := filepath.Ext(filename)
switch strings.ToLower(ext) {
case ".json":
return formatJSON
case ".yaml":
return formatYAML
case ".yml":
return formatYAML
return spec.FormatJSON
case ".yaml", ".yml":
return spec.FormatYAML
}
return ""
}
func writeToOutput(format string, data []byte, output io.Writer) error {
if format == formatYAML {
_, err := output.Write([]byte("---\n"))
if err != nil {
return fmt.Errorf("failed to write YAML separator: %v", err)
}
}
_, err := output.Write(data)
if err != nil {
return fmt.Errorf("failed to write data: %v", err)
}
return nil
}
func (m command) generateSpec(cfg *config) (*specs.Spec, error) {
func (m command) generateSpec(cfg *config) (spec.Interface, error) {
deviceNamer, err := nvcdi.NewDeviceNamer(cfg.deviceNameStrategy)
if err != nil {
return nil, fmt.Errorf("failed to create device namer: %v", err)
}
nvmllib := nvml.New()
if r := nvmllib.Init(); r != nvml.SUCCESS {
return nil, r
}
defer nvmllib.Shutdown()
devicelib := device.New(device.WithNvml(nvmllib))
cdilib := nvcdi.New(
cdilib, err := nvcdi.New(
nvcdi.WithLogger(m.logger),
nvcdi.WithDriverRoot(cfg.driverRoot),
nvcdi.WithNVIDIACTKPath(cfg.nvidiaCTKPath),
nvcdi.WithDeviceNamer(deviceNamer),
nvcdi.WithDeviceLib(devicelib),
nvcdi.WithNvmlLib(nvmllib),
nvcdi.WithMode(string(cfg.discoveryMode)),
nvcdi.WithMode(string(cfg.mode)),
)
if err != nil {
return nil, fmt.Errorf("failed to create CDI library: %v", err)
}
deviceSpecs, err := cdilib.GetAllDeviceSpecs()
if err != nil {
@@ -273,30 +244,14 @@ func (m command) generateSpec(cfg *config) (*specs.Spec, error) {
if err != nil {
return nil, fmt.Errorf("failed to create edits common for entities: %v", err)
}
deviceFolderPermissionEdits, err := GetDeviceFolderPermissionHookEdits(m.logger, cfg.driverRoot, cfg.nvidiaCTKPath, deviceSpecs)
if err != nil {
return nil, fmt.Errorf("failed to generate edits for device folder permissions: %v", err)
}
commonEdits.Append(deviceFolderPermissionEdits)
// We construct the spec and determine the minimum required version based on the specification.
spec := specs.Spec{
Version: "NOT_SET",
Kind: "nvidia.com/gpu",
Devices: deviceSpecs,
ContainerEdits: *commonEdits.ContainerEdits,
}
minVersion, err := cdi.MinimumRequiredVersion(&spec)
if err != nil {
return nil, fmt.Errorf("failed to get minimum required CDI spec version: %v", err)
}
m.logger.Infof("Using minimum required CDI spec version: %s", minVersion)
spec.Version = minVersion
return &spec, nil
return spec.New(
spec.WithVendor(cfg.vendor),
spec.WithClass(cfg.class),
spec.WithDeviceSpecs(deviceSpecs),
spec.WithEdits(*commonEdits.ContainerEdits),
spec.WithFormat(cfg.format),
)
}
// MergeDeviceSpecs creates a device with the specified name which combines the edits from the previous devices.
@@ -326,14 +281,3 @@ func MergeDeviceSpecs(deviceSpecs []specs.Device, mergedDeviceName string) (spec
}
return merged, nil
}
// createParentDirsIfRequired creates the parent folders of the specified path if required.
// Note that MkdirAll does not specifically check whether the specified path is non-empty and raises an error if it is.
// The path will be empty if filename in the current folder is specified, for example
func createParentDirsIfRequired(filename string) error {
dir := filepath.Dir(filename)
if dir == "" {
return nil
}
return os.MkdirAll(dir, 0755)
}
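The `--format` flag and `formatFromFilename` in the diff above interact: per the flag's usage text, an explicitly set `--format` overrides the format implied by the output file extension. The following sketch shows one way that precedence could be expressed. `resolveFormat` is a hypothetical helper, and the local `formatJSON`/`formatYAML` constants stand in for `spec.FormatJSON` and `spec.FormatYAML`, whose values are assumed here to be `"json"` and `"yaml"`.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Assumed stand-ins for spec.FormatJSON and spec.FormatYAML.
const (
	formatJSON = "json"
	formatYAML = "yaml"
)

// formatFromFilename infers the output format from the file extension and
// returns an empty string when the extension is not recognised.
func formatFromFilename(filename string) string {
	switch strings.ToLower(filepath.Ext(filename)) {
	case ".json":
		return formatJSON
	case ".yaml", ".yml":
		return formatYAML
	}
	return ""
}

// resolveFormat picks the effective format: an explicitly set flag wins,
// then the format inferred from the filename, then the flag's default value.
func resolveFormat(flagSet bool, flagValue, filename string) string {
	if flagSet {
		return strings.ToLower(flagValue)
	}
	if inferred := formatFromFilename(filename); inferred != "" {
		return inferred
	}
	return strings.ToLower(flagValue)
}

func main() {
	fmt.Println(resolveFormat(false, formatYAML, "/etc/cdi/nvidia.json"))
	fmt.Println(resolveFormat(true, formatYAML, "/etc/cdi/nvidia.json"))
	fmt.Println(resolveFormat(false, formatYAML, ""))
}
```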

@@ -0,0 +1,159 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package root
import (
"fmt"
"io"
"os"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/transform"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type loadSaver interface {
Load() (spec.Interface, error)
Save(spec.Interface) error
}
type command struct {
logger *logrus.Logger
}
type transformOptions struct {
input string
output string
}
type options struct {
transformOptions
from string
to string
}
// NewCommand constructs a root transform command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build creates the CLI command
func (m command) build() *cli.Command {
opts := options{}
c := cli.Command{
Name: "root",
Usage: "Apply a root transform to a CDI specification",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &opts)
},
Action: func(c *cli.Context) error {
return m.run(c, &opts)
},
}
c.Flags = []cli.Flag{
&cli.StringFlag{
Name: "input",
Usage: "Specify the file to read the CDI specification from. If this is '-' the specification is read from STDIN",
Value: "-",
Destination: &opts.input,
},
&cli.StringFlag{
Name: "output",
Usage: "Specify the file to output the generated CDI specification to. If this is '' the specification is output to STDOUT",
Destination: &opts.output,
},
&cli.StringFlag{
Name: "from",
Usage: "specify the root to be transformed",
Destination: &opts.from,
},
&cli.StringFlag{
Name: "to",
Usage: "specify the replacement root. If this is the same as the from root, the transform is a no-op.",
Value: "",
Destination: &opts.to,
},
}
return &c
}
func (m command) validateFlags(c *cli.Context, opts *options) error {
return nil
}
func (m command) run(c *cli.Context, opts *options) error {
spec, err := opts.Load()
if err != nil {
return fmt.Errorf("failed to load CDI specification: %w", err)
}
err = transform.NewRootTransformer(
opts.from,
opts.to,
).Transform(spec.Raw())
if err != nil {
return fmt.Errorf("failed to transform CDI specification: %w", err)
}
return opts.Save(spec)
}
// Load loads the input CDI specification
func (o transformOptions) Load() (spec.Interface, error) {
contents, err := o.getContents()
if err != nil {
return nil, fmt.Errorf("failed to read spec contents: %v", err)
}
raw, err := cdi.ParseSpec(contents)
if err != nil {
return nil, fmt.Errorf("failed to parse CDI spec: %v", err)
}
return spec.New(
spec.WithRawSpec(raw),
)
}
func (o transformOptions) getContents() ([]byte, error) {
if o.input == "-" {
return io.ReadAll(os.Stdin)
}
return os.ReadFile(o.input)
}
// Save saves the CDI specification to the output file
func (o transformOptions) Save(s spec.Interface) error {
if o.output == "" {
_, err := s.WriteTo(os.Stdout)
if err != nil {
return fmt.Errorf("failed to write CDI spec to STDOUT: %v", err)
}
return nil
}
return s.Save(o.output)
}
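The implementation behind `transform.NewRootTransformer` is not part of this diff. Going only by the `--from` and `--to` flag descriptions above, a root transform can be pictured as a path-prefix rewrite applied to the paths in a spec. The sketch below is an assumption about that behaviour, not the toolkit's code; `replaceRoot` is a hypothetical name.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// replaceRoot rewrites path so that a leading `from` root is replaced by `to`.
// Paths that are not located under `from` are returned unchanged, and the
// transform is a no-op when both roots are the same.
func replaceRoot(path, from, to string) string {
	if from == to {
		return path
	}
	from = filepath.Clean(from)
	if path == from {
		return filepath.Join("/", to)
	}
	// Match on a full path component so that /run/nvidia does not match
	// /run/nvidia-other.
	prefix := strings.TrimSuffix(from, "/") + "/"
	if !strings.HasPrefix(path, prefix) {
		return path
	}
	return filepath.Join("/", to, strings.TrimPrefix(path, prefix))
}

func main() {
	fmt.Println(replaceRoot("/run/nvidia/driver/usr/lib/libcuda.so.1", "/run/nvidia/driver", "/"))
	fmt.Println(replaceRoot("/dev/nvidia0", "/run/nvidia/driver", "/"))
	fmt.Println(replaceRoot("/usr/bin/nvidia-smi", "/", "/driver"))
}
```

Matching on a whole path component is a deliberate choice in this sketch; a plain string-prefix match would also rewrite sibling directories that merely share a name prefix with the root.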

@@ -0,0 +1,51 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/transform/root"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type command struct {
logger *logrus.Logger
}
// NewCommand constructs a command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build creates the CLI command
func (m command) build() *cli.Command {
c := cli.Command{
Name: "transform",
Usage: "Apply a transform to a CDI specification",
}
c.Flags = []cli.Flag{}
c.Subcommands = []*cli.Command{
root.NewCommand(m.logger),
}
return &c
}

@@ -18,6 +18,7 @@ package chmod
import (
"fmt"
"os"
"path/filepath"
"strings"
"syscall"
@@ -133,7 +134,12 @@ func (m command) run(c *cli.Context, cfg *config) error {
func (m command) getPaths(root string, paths []string) []string {
var pathsInRoot []string
for _, f := range paths {
pathsInRoot = append(pathsInRoot, filepath.Join(root, f))
path := filepath.Join(root, f)
if _, err := os.Stat(path); err != nil {
m.logger.Debugf("Skipping path %q: %v", path, err)
continue
}
pathsInRoot = append(pathsInRoot, path)
}
return pathsInRoot
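The change to `getPaths` above skips any path that cannot be stat-ed under the container root, which matches the changelog entry "Don't fail chmod hook if paths are not injected". The sketch below shows the same filtering with the existence check passed in as a function so that it can be exercised without touching the filesystem. `existingPaths` and its `exists` parameter are hypothetical; the hook itself calls `os.Stat` directly and logs the skipped path.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// existingPaths joins each path onto root and keeps only those for which
// exists returns true. Missing paths are dropped silently in this sketch.
func existingPaths(root string, paths []string, exists func(string) bool) []string {
	var pathsInRoot []string
	for _, f := range paths {
		path := filepath.Join(root, f)
		if !exists(path) {
			continue
		}
		pathsInRoot = append(pathsInRoot, path)
	}
	return pathsInRoot
}

func main() {
	// statExists treats any os.Stat error as "not present", as the hook does.
	statExists := func(p string) bool {
		_, err := os.Stat(p)
		return err == nil
	}
	fmt.Println(existingPaths("/", []string{"dev", "does-not-exist"}, statExists))
}
```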

@@ -22,8 +22,8 @@ import (
"os"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/runtime/nvidia"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/crio"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/docker"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/crio"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/docker"
"github.com/pelletier/go-toml"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
@@ -127,13 +127,14 @@ func (m command) configureDocker(c *cli.Context, config *config) error {
configFilePath = defaultDockerConfigFilePath
}
cfg, err := docker.LoadConfig(configFilePath)
cfg, err := docker.New(
docker.WithPath(configFilePath),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
err = docker.UpdateConfig(
cfg,
err = cfg.AddRuntime(
config.nvidiaOptions.RuntimeName,
config.nvidiaOptions.RuntimePath,
config.nvidiaOptions.SetAsDefault,
@@ -150,12 +151,16 @@ func (m command) configureDocker(c *cli.Context, config *config) error {
os.Stdout.WriteString(fmt.Sprintf("%s\n", output))
return nil
}
err = docker.FlushConfig(cfg, configFilePath)
n, err := cfg.Save(configFilePath)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
m.logger.Infof("Wrote updated config to %v", configFilePath)
if n == 0 {
m.logger.Infof("Removed empty config from %v", configFilePath)
} else {
m.logger.Infof("Wrote updated config to %v", configFilePath)
}
m.logger.Infof("It is recommended that the docker daemon be restarted.")
return nil
@@ -168,13 +173,14 @@ func (m command) configureCrio(c *cli.Context, config *config) error {
configFilePath = defaultCrioConfigFilePath
}
cfg, err := crio.LoadConfig(configFilePath)
cfg, err := crio.New(
crio.WithPath(configFilePath),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
err = crio.UpdateConfig(
cfg,
err = cfg.AddRuntime(
config.nvidiaOptions.RuntimeName,
config.nvidiaOptions.RuntimePath,
config.nvidiaOptions.SetAsDefault,
@@ -191,12 +197,16 @@ func (m command) configureCrio(c *cli.Context, config *config) error {
os.Stdout.WriteString(fmt.Sprintf("%s\n", output))
return nil
}
err = crio.FlushConfig(configFilePath, cfg)
n, err := cfg.Save(configFilePath)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
m.logger.Infof("Wrote updated config to %v", configFilePath)
if n == 0 {
m.logger.Infof("Removed empty config from %v", configFilePath)
} else {
m.logger.Infof("Wrote updated config to %v", configFilePath)
}
m.logger.Infof("It is recommended that the cri-o daemon be restarted.")
return nil

View File

@@ -0,0 +1,107 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package createdevicenodes
import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/internal/system"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
type command struct {
logger *logrus.Logger
}
type options struct {
driverRoot string
dryRun bool
control bool
}
// NewCommand constructs a create-device-nodes sub-command with the specified logger
func NewCommand(logger *logrus.Logger) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build creates the CLI command
func (m command) build() *cli.Command {
opts := options{}
c := cli.Command{
Name: "create-device-nodes",
Usage: "A utility to create NVIDIA device nodes",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &opts)
},
Action: func(c *cli.Context) error {
return m.run(c, &opts)
},
}
c.Flags = []cli.Flag{
&cli.StringFlag{
Name: "driver-root",
Usage: "the path to the driver root. Device nodes will be created at `DRIVER_ROOT`/dev",
Value: "/",
Destination: &opts.driverRoot,
EnvVars: []string{"DRIVER_ROOT"},
},
&cli.BoolFlag{
Name: "control-devices",
Usage: "create all control device nodes: nvidiactl, nvidia-modeset, nvidia-uvm, nvidia-uvm-tools",
Destination: &opts.control,
},
&cli.BoolFlag{
Name: "dry-run",
Usage: "if set, the command will not create any device nodes.",
Value: false,
Destination: &opts.dryRun,
EnvVars: []string{"DRY_RUN"},
},
}
return &c
}
func (m command) validateFlags(r *cli.Context, opts *options) error {
return nil
}
func (m command) run(c *cli.Context, opts *options) error {
s, err := system.New(
system.WithLogger(m.logger),
system.WithDryRun(opts.dryRun),
)
if err != nil {
return fmt.Errorf("failed to create library: %v", err)
}
if opts.control {
m.logger.Infof("Creating control device nodes at %s", opts.driverRoot)
if err := s.CreateNVIDIAControlDeviceNodesAt(opts.driverRoot); err != nil {
return fmt.Errorf("failed to create control device nodes: %v", err)
}
}
return nil
}

View File

@@ -18,6 +18,7 @@ package system
import (
devchar "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/system/create-dev-char-symlinks"
devicenodes "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/system/create-device-nodes"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
@@ -43,6 +44,7 @@ func (m command) build() *cli.Command {
system.Subcommands = []*cli.Command{
devchar.NewCommand(m.logger),
devicenodes.NewCommand(m.logger),
}
return &system

View File

@@ -14,10 +14,10 @@
# Supported OSs by architecture
AMD64_TARGETS := ubuntu20.04 ubuntu18.04 ubuntu16.04 debian10 debian9
X86_64_TARGETS := fedora35 centos7 centos8 rhel7 rhel8 amazonlinux2 opensuse-leap15.1
X86_64_TARGETS := centos7 centos8 rhel7 rhel8 amazonlinux2 opensuse-leap15.1
PPC64LE_TARGETS := ubuntu18.04 ubuntu16.04 centos7 centos8 rhel7 rhel8
ARM64_TARGETS := ubuntu20.04 ubuntu18.04
AARCH64_TARGETS := fedora35 centos8 rhel8 amazonlinux2
AARCH64_TARGETS := centos8 rhel8 amazonlinux2
# Define top-level build targets
docker%: SHELL:=/bin/bash
@@ -102,14 +102,6 @@ LIBNVIDIA_CONTAINER_TOOLS_VERSION := $(LIBNVIDIA_CONTAINER_VERSION)$(if $(LIBNVI
--centos%: CONFIG_TOML_SUFFIX := rpm-yum
--centos8%: BASEIMAGE = quay.io/centos/centos:stream8
# private fedora target
--fedora%: OS := fedora
--fedora%: DOCKERFILE = $(CURDIR)/docker/Dockerfile.rpm-yum
--fedora%: CONFIG_TOML_SUFFIX := rpm-yum
# The fedora(35) base image has very slow performance when building aarch64 packages.
# Since our primary concern here is glibc versions, we use the older glibc version available in centos8.
--fedora35%: BASEIMAGE = quay.io/centos/centos:stream8
# private amazonlinux target
--amazonlinux%: OS := amazonlinux
--amazonlinux%: DOCKERFILE = $(CURDIR)/docker/Dockerfile.rpm-yum

View File

@@ -45,9 +45,12 @@ var (
// Config represents the contents of the config.toml file for the NVIDIA Container Toolkit
// Note: This is currently duplicated by the HookConfig in cmd/nvidia-container-toolkit/hook_config.go
type Config struct {
NVIDIAContainerCLIConfig ContainerCLIConfig `toml:"nvidia-container-cli"`
NVIDIACTKConfig CTKConfig `toml:"nvidia-ctk"`
NVIDIAContainerRuntimeConfig RuntimeConfig `toml:"nvidia-container-runtime"`
AcceptEnvvarUnprivileged bool `toml:"accept-nvidia-visible-devices-envvar-when-unprivileged"`
NVIDIAContainerCLIConfig ContainerCLIConfig `toml:"nvidia-container-cli"`
NVIDIACTKConfig CTKConfig `toml:"nvidia-ctk"`
NVIDIAContainerRuntimeConfig RuntimeConfig `toml:"nvidia-container-runtime"`
NVIDIAContainerRuntimeHookConfig RuntimeHookConfig `toml:"nvidia-container-runtime-hook"`
}
// GetConfig sets up the config struct. Values are read from a toml file
@@ -91,6 +94,8 @@ func getConfigFrom(toml *toml.Tree) (*Config, error) {
return cfg, nil
}
cfg.AcceptEnvvarUnprivileged = toml.GetDefault("accept-nvidia-visible-devices-envvar-when-unprivileged", cfg.AcceptEnvvarUnprivileged).(bool)
cfg.NVIDIAContainerCLIConfig = *getContainerCLIConfigFrom(toml)
cfg.NVIDIACTKConfig = *getCTKConfigFrom(toml)
runtimeConfig, err := getRuntimeConfigFrom(toml)
@@ -99,12 +104,19 @@ func getConfigFrom(toml *toml.Tree) (*Config, error) {
}
cfg.NVIDIAContainerRuntimeConfig = *runtimeConfig
runtimeHookConfig, err := getRuntimeHookConfigFrom(toml)
if err != nil {
return nil, fmt.Errorf("failed to load nvidia-container-runtime-hook config: %v", err)
}
cfg.NVIDIAContainerRuntimeHookConfig = *runtimeHookConfig
return cfg, nil
}
// getDefaultConfig defines the default values for the config
func getDefaultConfig() *Config {
c := Config{
AcceptEnvvarUnprivileged: true,
NVIDIAContainerCLIConfig: *getDefaultContainerCLIConfig(),
NVIDIACTKConfig: *getDefaultCTKConfig(),
NVIDIAContainerRuntimeConfig: *GetDefaultRuntimeConfig(),

View File

@@ -57,6 +57,7 @@ func TestGetConfig(t *testing.T) {
{
description: "empty config is default",
expectedConfig: &Config{
AcceptEnvvarUnprivileged: true,
NVIDIAContainerCLIConfig: ContainerCLIConfig{
Root: "",
},
@@ -69,6 +70,10 @@ func TestGetConfig(t *testing.T) {
CSV: csvModeConfig{
MountSpecPath: "/etc/nvidia-container-runtime/host-files-for-container.d",
},
CDI: cdiModeConfig{
DefaultKind: "nvidia.com/gpu",
AnnotationPrefixes: []string{"cdi.k8s.io/"},
},
},
},
NVIDIACTKConfig: CTKConfig{
@@ -79,6 +84,7 @@ func TestGetConfig(t *testing.T) {
{
description: "config options set inline",
contents: []string{
"accept-nvidia-visible-devices-envvar-when-unprivileged = false",
"nvidia-container-cli.root = \"/bar/baz\"",
"nvidia-container-runtime.debug = \"/foo/bar\"",
"nvidia-container-runtime.experimental = true",
@@ -86,10 +92,13 @@ func TestGetConfig(t *testing.T) {
"nvidia-container-runtime.log-level = \"debug\"",
"nvidia-container-runtime.runtimes = [\"/some/runtime\",]",
"nvidia-container-runtime.mode = \"not-auto\"",
"nvidia-container-runtime.modes.cdi.default-kind = \"example.vendor.com/device\"",
"nvidia-container-runtime.modes.cdi.annotation-prefixes = [\"cdi.k8s.io/\", \"example.vendor.com/\",]",
"nvidia-container-runtime.modes.csv.mount-spec-path = \"/not/etc/nvidia-container-runtime/host-files-for-container.d\"",
"nvidia-ctk.path = \"/foo/bar/nvidia-ctk\"",
},
expectedConfig: &Config{
AcceptEnvvarUnprivileged: false,
NVIDIAContainerCLIConfig: ContainerCLIConfig{
Root: "/bar/baz",
},
@@ -102,6 +111,13 @@ func TestGetConfig(t *testing.T) {
CSV: csvModeConfig{
MountSpecPath: "/not/etc/nvidia-container-runtime/host-files-for-container.d",
},
CDI: cdiModeConfig{
DefaultKind: "example.vendor.com/device",
AnnotationPrefixes: []string{
"cdi.k8s.io/",
"example.vendor.com/",
},
},
},
},
NVIDIACTKConfig: CTKConfig{
@@ -112,6 +128,7 @@ func TestGetConfig(t *testing.T) {
{
description: "config options set in section",
contents: []string{
"accept-nvidia-visible-devices-envvar-when-unprivileged = false",
"[nvidia-container-cli]",
"root = \"/bar/baz\"",
"[nvidia-container-runtime]",
@@ -121,12 +138,16 @@ func TestGetConfig(t *testing.T) {
"log-level = \"debug\"",
"runtimes = [\"/some/runtime\",]",
"mode = \"not-auto\"",
"[nvidia-container-runtime.modes.cdi]",
"default-kind = \"example.vendor.com/device\"",
"annotation-prefixes = [\"cdi.k8s.io/\", \"example.vendor.com/\",]",
"[nvidia-container-runtime.modes.csv]",
"mount-spec-path = \"/not/etc/nvidia-container-runtime/host-files-for-container.d\"",
"[nvidia-ctk]",
"path = \"/foo/bar/nvidia-ctk\"",
},
expectedConfig: &Config{
AcceptEnvvarUnprivileged: false,
NVIDIAContainerCLIConfig: ContainerCLIConfig{
Root: "/bar/baz",
},
@@ -139,6 +160,13 @@ func TestGetConfig(t *testing.T) {
CSV: csvModeConfig{
MountSpecPath: "/not/etc/nvidia-container-runtime/host-files-for-container.d",
},
CDI: cdiModeConfig{
DefaultKind: "example.vendor.com/device",
AnnotationPrefixes: []string{
"cdi.k8s.io/",
"example.vendor.com/",
},
},
},
},
NVIDIACTKConfig: CTKConfig{

View File

@@ -1,125 +0,0 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package crio
import (
"fmt"
"os"
"github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
)
// LoadConfig loads the cri-o config from disk
func LoadConfig(config string) (*toml.Tree, error) {
log.Infof("Loading config: %v", config)
info, err := os.Stat(config)
if os.IsExist(err) && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
configFile := config
if os.IsNotExist(err) {
configFile = "/dev/null"
log.Infof("Config file does not exist, creating new one")
}
cfg, err := toml.LoadFile(configFile)
if err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
return cfg, nil
}
// UpdateConfig updates the cri-o config to include the NVIDIA Container Runtime
func UpdateConfig(config *toml.Tree, runtimeClass string, runtimePath string, setAsDefault bool) error {
switch runc := config.Get("crio.runtime.runtimes.runc").(type) {
case *toml.Tree:
runc, _ = toml.Load(runc.String())
config.SetPath([]string{"crio", "runtime", "runtimes", runtimeClass}, runc)
}
config.SetPath([]string{"crio", "runtime", "runtimes", runtimeClass, "runtime_path"}, runtimePath)
config.SetPath([]string{"crio", "runtime", "runtimes", runtimeClass, "runtime_type"}, "oci")
if setAsDefault {
config.SetPath([]string{"crio", "runtime", "default_runtime"}, runtimeClass)
}
return nil
}
// RevertConfig reverts the cri-o config to remove the NVIDIA Container Runtime
func RevertConfig(config *toml.Tree, runtimeClass string) error {
if runtime, ok := config.GetPath([]string{"crio", "runtime", "default_runtime"}).(string); ok {
if runtimeClass == runtime {
config.DeletePath([]string{"crio", "runtime", "default_runtime"})
}
}
runtimeClassPath := []string{"crio", "runtime", "runtimes", runtimeClass}
config.DeletePath(runtimeClassPath)
for i := 0; i < len(runtimeClassPath); i++ {
remainingPath := runtimeClassPath[:len(runtimeClassPath)-i]
if entry, ok := config.GetPath(remainingPath).(*toml.Tree); ok {
if len(entry.Keys()) != 0 {
break
}
config.DeletePath(remainingPath)
}
}
return nil
}
// FlushConfig flushes the updated/reverted config out to disk
func FlushConfig(config string, cfg *toml.Tree) error {
log.Infof("Flushing config")
output, err := cfg.ToTomlString()
if err != nil {
return fmt.Errorf("unable to convert to TOML: %v", err)
}
switch len(output) {
case 0:
err := os.Remove(config)
if err != nil {
return fmt.Errorf("unable to remove empty file: %v", err)
}
log.Infof("Config empty, removing file")
default:
f, err := os.Create(config)
if err != nil {
return fmt.Errorf("unable to open '%v' for writing: %v", config, err)
}
defer f.Close()
_, err = f.WriteString(output)
if err != nil {
return fmt.Errorf("unable to write output: %v", err)
}
}
log.Infof("Successfully flushed config")
return nil
}

View File

@@ -1,117 +0,0 @@
/**
# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package docker
import (
"bytes"
"encoding/json"
"fmt"
"io/ioutil"
"os"
log "github.com/sirupsen/logrus"
)
// LoadConfig loads the docker config from disk
func LoadConfig(configFilePath string) (map[string]interface{}, error) {
log.Infof("Loading docker config from %v", configFilePath)
info, err := os.Stat(configFilePath)
if os.IsExist(err) && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
cfg := make(map[string]interface{})
if os.IsNotExist(err) {
log.Infof("Config file does not exist, creating new one")
return cfg, nil
}
readBytes, err := ioutil.ReadFile(configFilePath)
if err != nil {
return nil, fmt.Errorf("unable to read config: %v", err)
}
reader := bytes.NewReader(readBytes)
if err := json.NewDecoder(reader).Decode(&cfg); err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
return cfg, nil
}
// UpdateConfig updates the docker config to include the nvidia runtimes
func UpdateConfig(config map[string]interface{}, runtimeName string, runtimePath string, setAsDefault bool) error {
// Read the existing runtimes
runtimes := make(map[string]interface{})
if _, exists := config["runtimes"]; exists {
runtimes = config["runtimes"].(map[string]interface{})
}
// Add / update the runtime definitions
runtimes[runtimeName] = map[string]interface{}{
"path": runtimePath,
"args": []string{},
}
// Update the runtimes definition
if len(runtimes) > 0 {
config["runtimes"] = runtimes
}
if setAsDefault {
config["default-runtime"] = runtimeName
}
return nil
}
// FlushConfig flushes the updated/reverted config out to disk
func FlushConfig(cfg map[string]interface{}, configFilePath string) error {
log.Infof("Flushing docker config to %v", configFilePath)
output, err := json.MarshalIndent(cfg, "", " ")
if err != nil {
return fmt.Errorf("unable to convert to JSON: %v", err)
}
switch len(output) {
case 0:
err := os.Remove(configFilePath)
if err != nil {
return fmt.Errorf("unable to remove empty file: %v", err)
}
log.Infof("Config empty, removing file")
default:
f, err := os.Create(configFilePath)
if err != nil {
return fmt.Errorf("unable to open %v for writing: %v", configFilePath, err)
}
defer f.Close()
_, err = f.WriteString(string(output))
if err != nil {
return fmt.Errorf("unable to write output: %v", err)
}
}
log.Infof("Successfully flushed config")
return nil
}

View File

@@ -0,0 +1,25 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package engine
// Interface defines the API for a runtime config updater.
type Interface interface {
DefaultRuntime() string
AddRuntime(string, string, bool) error
RemoveRuntime(string) error
Save(string) (int64, error)
}
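
To illustrate how a caller can depend only on this interface, here is a minimal in-memory sketch. The interface is copied locally so the example compiles on its own; memoryConfig and newMemoryConfig are hypothetical names, and Save here only reports a byte count instead of touching the filesystem.

```go
package main

import "fmt"

// Interface mirrors the method set of engine.Interface shown above.
type Interface interface {
	DefaultRuntime() string
	AddRuntime(string, string, bool) error
	RemoveRuntime(string) error
	Save(string) (int64, error)
}

// memoryConfig is a hypothetical in-memory implementation for illustration.
type memoryConfig struct {
	runtimes       map[string]string
	defaultRuntime string
}

// Compile-time check that memoryConfig satisfies the interface.
var _ Interface = (*memoryConfig)(nil)

func newMemoryConfig() *memoryConfig {
	return &memoryConfig{runtimes: map[string]string{}}
}

func (c *memoryConfig) DefaultRuntime() string { return c.defaultRuntime }

func (c *memoryConfig) AddRuntime(name, path string, setAsDefault bool) error {
	if c == nil {
		return fmt.Errorf("config is nil")
	}
	c.runtimes[name] = path
	if setAsDefault {
		c.defaultRuntime = name
	}
	return nil
}

func (c *memoryConfig) RemoveRuntime(name string) error {
	if c == nil {
		return nil
	}
	delete(c.runtimes, name)
	// Removing the default runtime also clears the default.
	if c.defaultRuntime == name {
		c.defaultRuntime = ""
	}
	return nil
}

// Save ignores path and returns the size of the stored names and paths,
// so that an empty config reports 0 like the real implementations.
func (c *memoryConfig) Save(path string) (int64, error) {
	var n int64
	for name, p := range c.runtimes {
		n += int64(len(name) + len(p))
	}
	return n, nil
}

func main() {
	var cfg Interface = newMemoryConfig()
	if err := cfg.AddRuntime("nvidia", "/usr/bin/nvidia-container-runtime", true); err != nil {
		panic(err)
	}
	fmt.Println("default runtime:", cfg.DefaultRuntime())
	if err := cfg.RemoveRuntime("nvidia"); err != nil {
		panic(err)
	}
	n, _ := cfg.Save("unused")
	fmt.Println("bytes after removal:", n)
}
```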

View File

@@ -0,0 +1,140 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package containerd
import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/pelletier/go-toml"
)
// ConfigV1 represents a version 1 containerd config
type ConfigV1 Config
var _ engine.Interface = (*ConfigV1)(nil)
// AddRuntime adds a runtime to the containerd config
func (c *ConfigV1) AddRuntime(name string, path string, setAsDefault bool) error {
if c == nil || c.Tree == nil {
return fmt.Errorf("config is nil")
}
config := *c.Tree
config.Set("version", int64(1))
switch runc := config.GetPath([]string{"plugins", "cri", "containerd", "runtimes", "runc"}).(type) {
case *toml.Tree:
runc, _ = toml.Load(runc.String())
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name}, runc)
}
if config.GetPath([]string{"plugins", "cri", "containerd", "runtimes", name}) == nil {
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "runtime_type"}, c.RuntimeType)
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "runtime_root"}, "")
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "runtime_engine"}, "")
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "privileged_without_host_devices"}, false)
}
if len(c.ContainerAnnotations) > 0 {
annotations, err := (*Config)(c).getRuntimeAnnotations([]string{"plugins", "cri", "containerd", "runtimes", name, "container_annotations"})
if err != nil {
return err
}
annotations = append(c.ContainerAnnotations, annotations...)
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "container_annotations"}, annotations)
}
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "options", "BinaryName"}, path)
config.SetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "options", "Runtime"}, path)
if setAsDefault && c.UseDefaultRuntimeName {
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime_name"}, name)
} else if setAsDefault {
// Note: This is deprecated in containerd 1.4.0 and will be removed in 1.5.0
if config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime"}) == nil {
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "runtime_type"}, c.RuntimeType)
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "runtime_root"}, "")
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "runtime_engine"}, "")
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "privileged_without_host_devices"}, false)
}
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "options", "BinaryName"}, path)
config.SetPath([]string{"plugins", "cri", "containerd", "default_runtime", "options", "Runtime"}, path)
}
*c.Tree = config
return nil
}
// DefaultRuntime returns the default runtime for the containerd config
func (c ConfigV1) DefaultRuntime() string {
if runtime, ok := c.GetPath([]string{"plugins", "cri", "containerd", "default_runtime_name"}).(string); ok {
return runtime
}
return ""
}
// RemoveRuntime removes a runtime from the containerd config
func (c *ConfigV1) RemoveRuntime(name string) error {
if c == nil || c.Tree == nil {
return nil
}
config := *c.Tree
// If the specified runtime was set as the default runtime we need to remove the default runtime too.
runtimePath, ok := config.GetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "options", "BinaryName"}).(string)
if !ok || runtimePath == "" {
runtimePath, _ = config.GetPath([]string{"plugins", "cri", "containerd", "runtimes", name, "options", "Runtime"}).(string)
}
defaultRuntimePath, ok := config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime", "options", "BinaryName"}).(string)
if !ok || defaultRuntimePath == "" {
defaultRuntimePath, _ = config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime", "options", "Runtime"}).(string)
}
if runtimePath != "" && defaultRuntimePath != "" && runtimePath == defaultRuntimePath {
config.DeletePath([]string{"plugins", "cri", "containerd", "default_runtime"})
}
config.DeletePath([]string{"plugins", "cri", "containerd", "runtimes", name})
if runtime, ok := config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime_name"}).(string); ok {
if runtime == name {
config.DeletePath([]string{"plugins", "cri", "containerd", "default_runtime_name"})
}
}
runtimeConfigPath := []string{"plugins", "cri", "containerd", "runtimes", name}
for i := 0; i < len(runtimeConfigPath); i++ {
if runtimes, ok := config.GetPath(runtimeConfigPath[:len(runtimeConfigPath)-i]).(*toml.Tree); ok {
if len(runtimes.Keys()) == 0 {
config.DeletePath(runtimeConfigPath[:len(runtimeConfigPath)-i])
}
}
}
if len(config.Keys()) == 1 && config.Keys()[0] == "version" {
config.Delete("version")
}
*c.Tree = config
return nil
}
// Save writes the config to a file
func (c ConfigV1) Save(path string) (int64, error) {
return (Config)(c).Save(path)
}

View File

@@ -0,0 +1,161 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package containerd
import (
"fmt"
"os"
"github.com/pelletier/go-toml"
)
// AddRuntime adds a runtime to the containerd config
func (c *Config) AddRuntime(name string, path string, setAsDefault bool) error {
if c == nil || c.Tree == nil {
return fmt.Errorf("config is nil")
}
config := *c.Tree
config.Set("version", int64(2))
switch runc := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", "runc"}).(type) {
case *toml.Tree:
runc, _ = toml.Load(runc.String())
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name}, runc)
}
if config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name}) == nil {
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "runtime_type"}, c.RuntimeType)
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "runtime_root"}, "")
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "runtime_engine"}, "")
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "privileged_without_host_devices"}, false)
}
if len(c.ContainerAnnotations) > 0 {
annotations, err := c.getRuntimeAnnotations([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "container_annotations"})
if err != nil {
return err
}
annotations = append(c.ContainerAnnotations, annotations...)
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "container_annotations"}, annotations)
}
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name, "options", "BinaryName"}, path)
if setAsDefault {
config.SetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"}, name)
}
*c.Tree = config
return nil
}
func (c *Config) getRuntimeAnnotations(path []string) ([]string, error) {
if c == nil || c.Tree == nil {
return nil, nil
}
config := *c.Tree
if !config.HasPath(path) {
return nil, nil
}
annotationsI, ok := config.GetPath(path).([]interface{})
if !ok {
return nil, fmt.Errorf("invalid annotations: %v", annotationsI)
}
var annotations []string
for _, annotation := range annotationsI {
a, ok := annotation.(string)
if !ok {
return nil, fmt.Errorf("invalid annotation: %v", annotation)
}
annotations = append(annotations, a)
}
return annotations, nil
}
// DefaultRuntime returns the default runtime for the containerd config
func (c Config) DefaultRuntime() string {
if runtime, ok := c.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"}).(string); ok {
return runtime
}
return ""
}
// RemoveRuntime removes a runtime from the containerd config
func (c *Config) RemoveRuntime(name string) error {
if c == nil || c.Tree == nil {
return nil
}
config := *c.Tree
config.DeletePath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name})
if runtime, ok := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"}).(string); ok {
if runtime == name {
config.DeletePath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"})
}
}
runtimePath := []string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes", name}
for i := 0; i < len(runtimePath); i++ {
if runtimes, ok := config.GetPath(runtimePath[:len(runtimePath)-i]).(*toml.Tree); ok {
if len(runtimes.Keys()) == 0 {
config.DeletePath(runtimePath[:len(runtimePath)-i])
}
}
}
if len(config.Keys()) == 1 && config.Keys()[0] == "version" {
config.Delete("version")
}
*c.Tree = config
return nil
}
// Save writes the config to the specified path
func (c Config) Save(path string) (int64, error) {
config := c.Tree
output, err := config.ToTomlString()
if err != nil {
return 0, fmt.Errorf("unable to convert to TOML: %v", err)
}
if len(output) == 0 {
err := os.Remove(path)
if err != nil {
return 0, fmt.Errorf("unable to remove empty file: %v", err)
}
return 0, nil
}
f, err := os.Create(path)
if err != nil {
return 0, fmt.Errorf("unable to open '%v' for writing: %v", path, err)
}
defer f.Close()
n, err := f.WriteString(output)
if err != nil {
return 0, fmt.Errorf("unable to write output: %v", err)
}
return int64(n), err
}
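
Save returns the number of bytes written, and a return value of 0 means the config was empty and the file was removed. The sketch below shows that contract on plain strings. saveOrRemove and roundTrip are hypothetical names, and unlike the code above this sketch tolerates a file that is already absent when removing.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// saveOrRemove writes content to path and returns the number of bytes written.
// Empty content removes the file instead and returns 0.
// Assumption: an already-missing file is not treated as an error here.
func saveOrRemove(path, content string) (int64, error) {
	if len(content) == 0 {
		if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
			return 0, fmt.Errorf("unable to remove empty file: %v", err)
		}
		return 0, nil
	}
	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
		return 0, fmt.Errorf("unable to write '%v': %v", path, err)
	}
	return int64(len(content)), nil
}

// roundTrip writes a small config, then saves an empty one, and reports
// both byte counts and whether the file still exists afterwards.
func roundTrip() (written, afterEmpty int64, exists bool, err error) {
	dir, err := os.MkdirTemp("", "save-example")
	if err != nil {
		return 0, 0, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "config.toml")
	if written, err = saveOrRemove(path, "a = 1"); err != nil {
		return 0, 0, false, err
	}
	if afterEmpty, err = saveOrRemove(path, ""); err != nil {
		return 0, 0, false, err
	}
	_, statErr := os.Stat(path)
	return written, afterEmpty, statErr == nil, nil
}

func main() {
	written, afterEmpty, exists, err := roundTrip()
	if err != nil {
		panic(err)
	}
	fmt.Println(written, afterEmpty, exists)
}
```

Callers such as configureDocker and configureCrio use the zero return to choose between the "Removed empty config" and "Wrote updated config" log messages.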

View File

@@ -0,0 +1,40 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package containerd
import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/pelletier/go-toml"
)
// Config represents the containerd config
type Config struct {
*toml.Tree
RuntimeType string
UseDefaultRuntimeName bool
ContainerAnnotations []string
}
// New creates a containerd config with the specified options
func New(opts ...Option) (engine.Interface, error) {
b := &builder{}
for _, opt := range opts {
opt(b)
}
return b.build()
}
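
New applies a list of option functions to a builder before building the config. A minimal sketch of that functional-options pattern follows. The settings type is a hypothetical stand-in for the loaded config, and the path passed in main is only an example value.

```go
package main

import (
	"errors"
	"fmt"
)

const defaultRuntimeType = "io.containerd.runc.v2"

// builder collects the values set by options before anything is built.
type builder struct {
	path        string
	runtimeType string
}

// Option is a function that configures the builder.
type Option func(*builder)

// WithPath sets the config path.
func WithPath(path string) Option {
	return func(b *builder) { b.path = path }
}

// WithRuntimeType overrides the default runtime type.
func WithRuntimeType(runtimeType string) Option {
	return func(b *builder) { b.runtimeType = runtimeType }
}

// settings is a hypothetical stand-in for the config produced by the builder.
type settings struct {
	Path        string
	RuntimeType string
}

// New applies the options in order, validates the result, and fills defaults.
func New(opts ...Option) (*settings, error) {
	b := &builder{}
	for _, opt := range opts {
		opt(b)
	}
	if b.path == "" {
		return nil, errors.New("config path is empty")
	}
	if b.runtimeType == "" {
		b.runtimeType = defaultRuntimeType
	}
	return &settings{Path: b.path, RuntimeType: b.runtimeType}, nil
}

func main() {
	s, err := New(WithPath("/etc/containerd/config.toml"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *s)

	// Omitting the path is rejected during validation.
	_, err = New()
	fmt.Println("error:", err)
}
```

Keeping validation and defaults in one place means new options can be added without changing the signature of New.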

View File

@@ -0,0 +1,149 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package containerd
import (
"fmt"
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
)
const (
defaultRuntimeType = "io.containerd.runc.v2"
)
type builder struct {
path string
runtimeType string
useLegacyConfig bool
containerAnnotations []string
}
// Option defines a function that can be used to configure the config builder
type Option func(*builder)
// WithPath sets the path for the config builder
func WithPath(path string) Option {
return func(b *builder) {
b.path = path
}
}
// WithRuntimeType sets the runtime type for the config builder
func WithRuntimeType(runtimeType string) Option {
return func(b *builder) {
b.runtimeType = runtimeType
}
}
// WithUseLegacyConfig sets the useLegacyConfig flag for the config builder
func WithUseLegacyConfig(useLegacyConfig bool) Option {
return func(b *builder) {
b.useLegacyConfig = useLegacyConfig
}
}
// WithContainerAnnotations sets the container annotations for the config builder
func WithContainerAnnotations(containerAnnotations ...string) Option {
return func(b *builder) {
b.containerAnnotations = containerAnnotations
}
}
func (b *builder) build() (engine.Interface, error) {
if b.path == "" {
return nil, fmt.Errorf("config path is empty")
}
if b.runtimeType == "" {
b.runtimeType = defaultRuntimeType
}
config, err := loadConfig(b.path)
if err != nil {
return nil, fmt.Errorf("failed to load config: %v", err)
}
config.RuntimeType = b.runtimeType
config.UseDefaultRuntimeName = !b.useLegacyConfig
config.ContainerAnnotations = b.containerAnnotations
version, err := config.parseVersion(b.useLegacyConfig)
if err != nil {
return nil, fmt.Errorf("failed to parse config version: %v", err)
}
switch version {
case 1:
return (*ConfigV1)(config), nil
case 2:
return config, nil
}
return nil, fmt.Errorf("unsupported config version: %v", version)
}
// loadConfig loads the containerd config from disk
func loadConfig(config string) (*Config, error) {
log.Infof("Loading config: %v", config)
info, err := os.Stat(config)
if err == nil && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
configFile := config
if os.IsNotExist(err) {
configFile = "/dev/null"
log.Infof("Config file does not exist, creating new one")
}
tomlConfig, err := toml.LoadFile(configFile)
if err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
cfg := Config{
Tree: tomlConfig,
}
return &cfg, nil
}
// parseVersion returns the version of the config
func (c *Config) parseVersion(useLegacyConfig bool) (int, error) {
defaultVersion := 2
if useLegacyConfig {
defaultVersion = 1
}
switch v := c.Get("version").(type) {
case nil:
switch len(c.Keys()) {
case 0: // No config exists, or the config file is empty, use version inferred from containerd
return defaultVersion, nil
default: // A config file exists, has content, and no version is set
return 1, nil
}
case int64:
return int(v), nil
default:
return -1, fmt.Errorf("unsupported type for version field: %v", v)
}
}
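The version inference in parseVersion can be sketched in isolation. This is a minimal illustration that assumes a plain map as a stand-in for the parsed TOML tree; the name inferVersion is hypothetical and not part of the change.

```go
package main

import "fmt"

// inferVersion mirrors the decision table of parseVersion, using a plain map
// as a stand-in for the TOML tree (an assumption made for illustration only).
func inferVersion(cfg map[string]interface{}, useLegacyConfig bool) (int, error) {
	defaultVersion := 2
	if useLegacyConfig {
		defaultVersion = 1
	}
	switch v := cfg["version"].(type) {
	case nil:
		if len(cfg) == 0 {
			// Missing or empty config: use the default chosen by the caller.
			return defaultVersion, nil
		}
		// Content without a version field is treated as a version 1 config.
		return 1, nil
	case int64:
		return int(v), nil
	default:
		return -1, fmt.Errorf("unsupported type for version field: %v", v)
	}
}

func main() {
	empty := map[string]interface{}{}
	unversioned := map[string]interface{}{"plugins": map[string]interface{}{}}
	explicit := map[string]interface{}{"version": int64(2)}

	for _, cfg := range []map[string]interface{}{empty, unversioned, explicit} {
		v, err := inferVersion(cfg, false)
		fmt.Println(v, err)
	}
}
```

Note that the legacy flag only matters for an empty config; any existing content without a version field is read as version 1.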


@@ -0,0 +1,131 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package crio
import (
"fmt"
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/pelletier/go-toml"
)
// Config represents the cri-o config
type Config toml.Tree
// New creates a cri-o config with the specified options
func New(opts ...Option) (engine.Interface, error) {
b := &builder{}
for _, opt := range opts {
opt(b)
}
return b.build()
}
// AddRuntime adds a new runtime to the crio config
func (c *Config) AddRuntime(name string, path string, setAsDefault bool) error {
if c == nil {
return fmt.Errorf("config is nil")
}
config := (toml.Tree)(*c)
switch runc := config.Get("crio.runtime.runtimes.runc").(type) {
case *toml.Tree:
runc, _ = toml.Load(runc.String())
config.SetPath([]string{"crio", "runtime", "runtimes", name}, runc)
}
config.SetPath([]string{"crio", "runtime", "runtimes", name, "runtime_path"}, path)
config.SetPath([]string{"crio", "runtime", "runtimes", name, "runtime_type"}, "oci")
if setAsDefault {
config.SetPath([]string{"crio", "runtime", "default_runtime"}, name)
}
*c = (Config)(config)
return nil
}
// DefaultRuntime returns the default runtime for the cri-o config
func (c Config) DefaultRuntime() string {
config := (toml.Tree)(c)
if runtime, ok := config.GetPath([]string{"crio", "runtime", "default_runtime"}).(string); ok {
return runtime
}
return ""
}
// RemoveRuntime removes a runtime from the cri-o config
func (c *Config) RemoveRuntime(name string) error {
if c == nil {
return nil
}
config := (toml.Tree)(*c)
if runtime, ok := config.GetPath([]string{"crio", "runtime", "default_runtime"}).(string); ok {
if runtime == name {
config.DeletePath([]string{"crio", "runtime", "default_runtime"})
}
}
runtimeClassPath := []string{"crio", "runtime", "runtimes", name}
config.DeletePath(runtimeClassPath)
for i := 0; i < len(runtimeClassPath); i++ {
remainingPath := runtimeClassPath[:len(runtimeClassPath)-i]
if entry, ok := config.GetPath(remainingPath).(*toml.Tree); ok {
if len(entry.Keys()) != 0 {
break
}
config.DeletePath(remainingPath)
}
}
*c = (Config)(config)
return nil
}
// Save writes the config to the specified path
func (c Config) Save(path string) (int64, error) {
config := (toml.Tree)(c)
output, err := config.ToTomlString()
if err != nil {
return 0, fmt.Errorf("unable to convert to TOML: %v", err)
}
if len(output) == 0 {
err := os.Remove(path)
if err != nil {
return 0, fmt.Errorf("unable to remove empty file: %v", err)
}
return 0, nil
}
f, err := os.Create(path)
if err != nil {
return 0, fmt.Errorf("unable to open '%v' for writing: %v", path, err)
}
defer f.Close()
n, err := f.WriteString(output)
if err != nil {
return 0, fmt.Errorf("unable to write output: %v", err)
}
return int64(n), err
}
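The parent-pruning loop in RemoveRuntime can be sketched with nested maps. This is an illustration only: the tree type and the helpers getPath, deletePath and removeAndPrune are hypothetical stand-ins for the go-toml calls used above.

```go
package main

import "fmt"

// tree is a stand-in for a TOML table (assumption for illustration).
type tree map[string]interface{}

// getPath walks nested tables and returns the value at path, or nil.
func getPath(t tree, path []string) interface{} {
	var cur interface{} = t
	for _, key := range path {
		m, ok := cur.(tree)
		if !ok {
			return nil
		}
		cur = m[key]
	}
	return cur
}

// deletePath removes the last key of path from its parent table, if present.
func deletePath(t tree, path []string) {
	if len(path) == 0 {
		return
	}
	parent, ok := getPath(t, path[:len(path)-1]).(tree)
	if !ok {
		return
	}
	delete(parent, path[len(path)-1])
}

// removeAndPrune deletes the entry at path and then removes every parent
// table that has become empty, stopping at the first non-empty one.
func removeAndPrune(t tree, path []string) {
	deletePath(t, path)
	for i := 0; i < len(path); i++ {
		remaining := path[:len(path)-i]
		if entry, ok := getPath(t, remaining).(tree); ok {
			if len(entry) != 0 {
				break
			}
			deletePath(t, remaining)
		}
	}
}

func main() {
	cfg := tree{"crio": tree{"runtime": tree{"runtimes": tree{
		"nvidia": tree{"runtime_type": "oci"},
	}}}}
	removeAndPrune(cfg, []string{"crio", "runtime", "runtimes", "nvidia"})
	fmt.Println(len(cfg)) // all parent tables were empty, so nothing remains
}
```

Pruning up to the root is what allows Save to detect an empty config and remove the file.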


@@ -0,0 +1,73 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package crio
import (
"fmt"
"os"
"github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
)
type builder struct {
path string
}
// Option defines a function that can be used to configure the config builder
type Option func(*builder)
// WithPath sets the path for the config builder
func WithPath(path string) Option {
return func(b *builder) {
b.path = path
}
}
func (b *builder) build() (*Config, error) {
if b.path == "" {
empty := toml.Tree{}
return (*Config)(&empty), nil
}
return loadConfig(b.path)
}
// loadConfig loads the cri-o config from disk
func loadConfig(config string) (*Config, error) {
log.Infof("Loading config: %v", config)
info, err := os.Stat(config)
if err == nil && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
configFile := config
if os.IsNotExist(err) {
configFile = "/dev/null"
log.Infof("Config file does not exist, creating new one")
}
cfg, err := toml.LoadFile(configFile)
if err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
return (*Config)(cfg), nil
}


@@ -0,0 +1,140 @@
/**
# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package docker
import (
"encoding/json"
"fmt"
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
)
const (
defaultDockerRuntime = "runc"
)
// Config defines a docker config file.
// TODO: This should not be public, but we need to access it from the tests in tools/container/docker
type Config map[string]interface{}
// New creates a docker config with the specified options
func New(opts ...Option) (engine.Interface, error) {
b := &builder{}
for _, opt := range opts {
opt(b)
}
return b.build()
}
// AddRuntime adds a new runtime to the docker config
func (c *Config) AddRuntime(name string, path string, setAsDefault bool) error {
if c == nil {
return fmt.Errorf("config is nil")
}
config := *c
// Read the existing runtimes
runtimes := make(map[string]interface{})
if _, exists := config["runtimes"]; exists {
runtimes = config["runtimes"].(map[string]interface{})
}
// Add / update the runtime definitions
runtimes[name] = map[string]interface{}{
"path": path,
"args": []string{},
}
config["runtimes"] = runtimes
if setAsDefault {
config["default-runtime"] = name
}
*c = config
return nil
}
// DefaultRuntime returns the default runtime for the docker config
func (c Config) DefaultRuntime() string {
r, ok := c["default-runtime"].(string)
if !ok {
return ""
}
return r
}
// RemoveRuntime removes a runtime from the docker config
func (c *Config) RemoveRuntime(name string) error {
if c == nil {
return nil
}
config := *c
if _, exists := config["default-runtime"]; exists {
defaultRuntime := config["default-runtime"].(string)
if defaultRuntime == name {
config["default-runtime"] = defaultDockerRuntime
}
}
if _, exists := config["runtimes"]; exists {
runtimes := config["runtimes"].(map[string]interface{})
delete(runtimes, name)
if len(runtimes) == 0 {
delete(config, "runtimes")
}
}
*c = config
return nil
}
// Save writes the config to the specified path
func (c Config) Save(path string) (int64, error) {
output, err := json.MarshalIndent(c, "", " ")
if err != nil {
return 0, fmt.Errorf("unable to convert to JSON: %v", err)
}
if len(output) == 0 {
err := os.Remove(path)
if err != nil {
return 0, fmt.Errorf("unable to remove empty file: %v", err)
}
return 0, nil
}
f, err := os.Create(path)
if err != nil {
return 0, fmt.Errorf("unable to open %v for writing: %v", path, err)
}
defer f.Close()
n, err := f.WriteString(string(output))
if err != nil {
return 0, fmt.Errorf("unable to write output: %v", err)
}
return int64(n), nil
}
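To show the JSON shape that AddRuntime produces, here is a minimal sketch. The dockerConfig type, the addRuntime helper and the runtime path are assumptions for illustration; unlike the code above, the sketch uses a checked type assertion so a malformed "runtimes" entry is replaced instead of causing a panic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dockerConfig is a stand-in for the Config map type (illustrative only).
type dockerConfig map[string]interface{}

// addRuntime registers a runtime and optionally makes it the default.
func (c dockerConfig) addRuntime(name, path string, setAsDefault bool) {
	runtimes, ok := c["runtimes"].(map[string]interface{})
	if !ok {
		// Missing or malformed entry: start from an empty set of runtimes.
		runtimes = make(map[string]interface{})
	}
	runtimes[name] = map[string]interface{}{
		"path": path,
		"args": []string{},
	}
	c["runtimes"] = runtimes
	if setAsDefault {
		c["default-runtime"] = name
	}
}

func main() {
	cfg := dockerConfig{}
	// The path below is an assumed install location, not taken from the diff.
	cfg.addRuntime("nvidia", "/usr/bin/nvidia-container-runtime", true)

	out, err := json.MarshalIndent(cfg, "", "    ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```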


@@ -26,7 +26,7 @@ import (
func TestUpdateConfigDefaultRuntime(t *testing.T) {
testCases := []struct {
config map[string]interface{}
config Config
runtimeName string
setAsDefault bool
expectedDefaultRuntimeName interface{}
@@ -63,7 +63,7 @@ func TestUpdateConfigDefaultRuntime(t *testing.T) {
if tc.config == nil {
tc.config = make(map[string]interface{})
}
err := UpdateConfig(tc.config, tc.runtimeName, "", tc.setAsDefault)
err := tc.config.AddRuntime(tc.runtimeName, "", tc.setAsDefault)
require.NoError(t, err)
defaultRuntimeName := tc.config["default-runtime"]
@@ -74,7 +74,7 @@ func TestUpdateConfigDefaultRuntime(t *testing.T) {
func TestUpdateConfigRuntimes(t *testing.T) {
testCases := []struct {
config map[string]interface{}
config Config
runtimes map[string]string
expectedConfig map[string]interface{}
}{
@@ -198,7 +198,7 @@ func TestUpdateConfigRuntimes(t *testing.T) {
for i, tc := range testCases {
t.Run(fmt.Sprintf("test case %d", i), func(t *testing.T) {
for runtimeName, runtimePath := range tc.runtimes {
err := UpdateConfig(tc.config, runtimeName, runtimePath, false)
err := tc.config.AddRuntime(runtimeName, runtimePath, false)
require.NoError(t, err)
}


@@ -0,0 +1,80 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package docker
import (
"bytes"
"encoding/json"
"fmt"
"io/ioutil"
"os"
log "github.com/sirupsen/logrus"
)
type builder struct {
path string
}
// Option defines a function that can be used to configure the config builder
type Option func(*builder)
// WithPath sets the path for the config builder
func WithPath(path string) Option {
return func(b *builder) {
b.path = path
}
}
func (b *builder) build() (*Config, error) {
if b.path == "" {
empty := make(Config)
return &empty, nil
}
return loadConfig(b.path)
}
// loadConfig loads the docker config from disk
func loadConfig(configFilePath string) (*Config, error) {
log.Infof("Loading docker config from %v", configFilePath)
info, err := os.Stat(configFilePath)
if err == nil && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
cfg := make(Config)
if os.IsNotExist(err) {
log.Infof("Config file does not exist, creating new one")
return &cfg, nil
}
readBytes, err := ioutil.ReadFile(configFilePath)
if err != nil {
return nil, fmt.Errorf("unable to read config: %v", err)
}
reader := bytes.NewReader(readBytes)
if err := json.NewDecoder(reader).Decode(&cfg); err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
return &cfg, nil
}

62
internal/config/hook.go Normal file

@@ -0,0 +1,62 @@
/**
# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package config
import (
"fmt"
"github.com/pelletier/go-toml"
)
// RuntimeHookConfig stores the config options for the NVIDIA Container Runtime
type RuntimeHookConfig struct {
// SkipModeDetection disables the mode check for the runtime hook.
SkipModeDetection bool `toml:"skip-mode-detection"`
}
// dummyHookConfig allows us to unmarshal only a RuntimeHookConfig from a *toml.Tree
type dummyHookConfig struct {
RuntimeHook RuntimeHookConfig `toml:"nvidia-container-runtime-hook"`
}
// getRuntimeHookConfigFrom reads the nvidia container runtime config from the specified toml Tree.
func getRuntimeHookConfigFrom(toml *toml.Tree) (*RuntimeHookConfig, error) {
cfg := GetDefaultRuntimeHookConfig()
if toml == nil {
return cfg, nil
}
d := dummyHookConfig{
RuntimeHook: *cfg,
}
if err := toml.Unmarshal(&d); err != nil {
return nil, fmt.Errorf("failed to unmarshal runtime config: %v", err)
}
return &d.RuntimeHook, nil
}
// GetDefaultRuntimeHookConfig defines the default values for the config
func GetDefaultRuntimeHookConfig() *RuntimeHookConfig {
c := RuntimeHookConfig{
SkipModeDetection: false,
}
return &c
}
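getRuntimeHookConfigFrom seeds a wrapper struct with defaults and then unmarshals over it, so only keys present in the file override the defaults. The sketch below shows the same pattern with encoding/json swapped in for go-toml to stay within the standard library; the LogLevel field and the load helper are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hookConfig stands in for RuntimeHookConfig; LogLevel is a hypothetical
// second field added only to show that untouched defaults survive.
type hookConfig struct {
	SkipModeDetection bool   `json:"skip-mode-detection"`
	LogLevel          string `json:"log-level"`
}

// wrapper selects a single section of a larger document.
type wrapper struct {
	Hook hookConfig `json:"nvidia-container-runtime-hook"`
}

// load applies defaults first and lets the input override them.
func load(data []byte) (*hookConfig, error) {
	w := wrapper{Hook: hookConfig{SkipModeDetection: false, LogLevel: "info"}}
	if len(data) == 0 {
		return &w.Hook, nil
	}
	if err := json.Unmarshal(data, &w); err != nil {
		return nil, fmt.Errorf("failed to unmarshal hook config: %v", err)
	}
	return &w.Hook, nil
}

func main() {
	cfg, err := load([]byte(`{"nvidia-container-runtime-hook": {"skip-mode-detection": true}}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *cfg)
}
```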


@@ -0,0 +1,43 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package image
import (
"github.com/opencontainers/runtime-spec/specs-go"
)
const (
capSysAdmin = "CAP_SYS_ADMIN"
)
// IsPrivileged returns true if the container is a privileged container.
func IsPrivileged(s *specs.Spec) bool {
if s.Process.Capabilities == nil {
return false
}
// We only make sure that the bounding capability set has
// CAP_SYS_ADMIN. This allows us to make sure that the container was
// actually started as '--privileged', but also allow non-root users to
// access the privileged NVIDIA capabilities.
for _, c := range s.Process.Capabilities.Bounding {
if c == capSysAdmin {
return true
}
}
return false
}
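The check in IsPrivileged can be exercised without the runtime-spec dependency. The sketch below uses local stand-in types (spec, process, capabilities) that mirror only the fields involved; it also guards against a nil Process, which is an addition of this sketch rather than behaviour confirmed by the diff.

```go
package main

import "fmt"

// Minimal stand-ins for the OCI spec types (assumptions for illustration).
type capabilities struct{ Bounding []string }
type process struct{ Capabilities *capabilities }
type spec struct{ Process *process }

// isPrivileged reports whether CAP_SYS_ADMIN is in the bounding set.
func isPrivileged(s *spec) bool {
	if s == nil || s.Process == nil || s.Process.Capabilities == nil {
		return false
	}
	for _, c := range s.Process.Capabilities.Bounding {
		if c == "CAP_SYS_ADMIN" {
			return true
		}
	}
	return false
}

func main() {
	privileged := &spec{Process: &process{Capabilities: &capabilities{
		Bounding: []string{"CAP_CHOWN", "CAP_SYS_ADMIN"},
	}}}
	restricted := &spec{Process: &process{Capabilities: &capabilities{
		Bounding: []string{"CAP_CHOWN"},
	}}}
	fmt.Println(isPrivileged(privileged), isPrivileged(restricted))
}
```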


@@ -19,6 +19,7 @@ package config
import (
"fmt"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/pelletier/go-toml"
"github.com/sirupsen/logrus"
)
@@ -50,6 +51,10 @@ type modesConfig struct {
type cdiModeConfig struct {
// SpecDirs allows for the default spec dirs for CDI to be overridden
SpecDirs []string `toml:"spec-dirs"`
// DefaultKind sets the default kind to be used when constructing fully-qualified CDI device names
DefaultKind string `toml:"default-kind"`
// AnnotationPrefixes sets the allowed prefixes for CDI annotation-based device injection
AnnotationPrefixes []string `toml:"annotation-prefixes"`
}
type csvModeConfig struct {
@@ -94,6 +99,12 @@ func GetDefaultRuntimeConfig() *RuntimeConfig {
CSV: csvModeConfig{
MountSpecPath: "/etc/nvidia-container-runtime/host-files-for-container.d",
},
CDI: cdiModeConfig{
DefaultKind: "nvidia.com/gpu",
AnnotationPrefixes: []string{
cdi.AnnotationPrefix,
},
},
},
}


@@ -20,11 +20,13 @@ import (
"fmt"
"os"
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info/drm"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info/proc"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup/cuda"
"github.com/sirupsen/logrus"
)
@@ -44,9 +46,15 @@ func NewGraphicsDiscoverer(logger *logrus.Logger, devices image.VisibleDevices,
drmByPathSymlinks := newCreateDRMByPathSymlinks(logger, drmDeviceNodes, cfg)
xorg, err := newXorgDiscoverer(logger, driverRoot, cfg.NvidiaCTKPath)
if err != nil {
return nil, fmt.Errorf("failed to create Xorg discoverer: %v", err)
}
discover := Merge(
Merge(drmDeviceNodes, drmByPathSymlinks),
mounts,
xorg,
)
return discover, nil
@@ -243,6 +251,112 @@ func newDRMDeviceFilter(logger *logrus.Logger, devices image.VisibleDevices, dri
return filter, nil
}
type xorgHooks struct {
libraries Discover
driverVersion string
nvidiaCTKPath string
}
var _ Discover = (*xorgHooks)(nil)
func newXorgDiscoverer(logger *logrus.Logger, driverRoot string, nvidiaCTKPath string) (Discover, error) {
libCudaPaths, err := cuda.New(
cuda.WithLogger(logger),
cuda.WithDriverRoot(driverRoot),
).Locate(".*.*.*")
if err != nil {
return nil, fmt.Errorf("failed to locate libcuda.so: %v", err)
}
libcudaPath := libCudaPaths[0]
version := strings.TrimPrefix(filepath.Base(libcudaPath), "libcuda.so.")
if version == "" {
return nil, fmt.Errorf("failed to determine libcuda.so version from path: %q", libcudaPath)
}
libRoot := filepath.Dir(libcudaPath)
xorgLibs := NewMounts(
logger,
lookup.NewFileLocator(
lookup.WithLogger(logger),
lookup.WithRoot(driverRoot),
lookup.WithSearchPaths(libRoot, "/usr/lib/x86_64-linux-gnu"),
lookup.WithCount(1),
),
driverRoot,
[]string{
"nvidia/xorg/nvidia_drv.so",
fmt.Sprintf("nvidia/xorg/libglxserver_nvidia.so.%s", version),
},
)
xorgHooks := xorgHooks{
libraries: xorgLibs,
driverVersion: version,
nvidiaCTKPath: FindNvidiaCTK(logger, nvidiaCTKPath),
}
xorgConfg := NewMounts(
logger,
lookup.NewFileLocator(
lookup.WithLogger(logger),
lookup.WithRoot(driverRoot),
lookup.WithSearchPaths("/usr/share"),
),
driverRoot,
[]string{"X11/xorg.conf.d/10-nvidia.conf"},
)
d := Merge(
xorgLibs,
xorgConfg,
xorgHooks,
)
return d, nil
}
// Devices returns no devices for Xorg
func (m xorgHooks) Devices() ([]Device, error) {
return nil, nil
}
// Hooks returns a hook to create symlinks for Xorg libraries
func (m xorgHooks) Hooks() ([]Hook, error) {
mounts, err := m.libraries.Mounts()
if err != nil {
return nil, fmt.Errorf("failed to get mounts: %v", err)
}
if len(mounts) == 0 {
return nil, nil
}
var target string
for _, mount := range mounts {
filename := filepath.Base(mount.HostPath)
if filename == "libglxserver_nvidia.so."+m.driverVersion {
target = mount.Path
}
}
if target == "" {
return nil, nil
}
link := strings.TrimSuffix(target, "."+m.driverVersion)
links := []string{fmt.Sprintf("%s::%s", filepath.Base(target), link)}
symlinkHook := CreateCreateSymlinkHook(
m.nvidiaCTKPath,
links,
)
return symlinkHook.Hooks()
}
// Mounts returns the libraries required for Xorg
func (m xorgHooks) Mounts() ([]Mount, error) {
return nil, nil
}
// selectDeviceByPath is a filter that allows devices to be selected by the path
type selectDeviceByPath map[string]bool
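The string handling in the Xorg discoverer (deriving the driver version from the libcuda path and building the "target::link" argument for the symlink hook) can be sketched as follows. The helper names and the version number 525.85.12 are illustrative assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// driverVersion extracts the version suffix from a libcuda.so.VERSION path.
func driverVersion(libcudaPath string) string {
	return strings.TrimPrefix(filepath.Base(libcudaPath), "libcuda.so.")
}

// symlinkSpec builds the "target::link" string: the link is the container
// path of the versioned library with the version suffix removed.
func symlinkSpec(target, version string) string {
	link := strings.TrimSuffix(target, "."+version)
	return fmt.Sprintf("%s::%s", filepath.Base(target), link)
}

func main() {
	version := driverVersion("/usr/lib/x86_64-linux-gnu/libcuda.so.525.85.12")
	fmt.Println(version)

	target := "/usr/lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so." + version
	fmt.Println(symlinkSpec(target, version))

	// TrimPrefix returns its input unchanged when the prefix is absent, so an
	// unversioned name yields a non-empty string rather than "".
	fmt.Println(driverVersion("/usr/lib64/libcuda.so"))
}
```

The last call suggests that an empty-string check alone may not catch an unversioned libcuda.so path; whether that case can occur depends on the locator pattern used.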


@@ -25,21 +25,39 @@ type ipcMounts mounts
// NewIPCDiscoverer creates a discoverer for NVIDIA IPC sockets.
func NewIPCDiscoverer(logger *logrus.Logger, driverRoot string) (Discover, error) {
d := newMounts(
sockets := newMounts(
logger,
lookup.NewFileLocator(
lookup.WithLogger(logger),
lookup.WithRoot(driverRoot),
lookup.WithSearchPaths("/run", "/var/run"),
lookup.WithCount(1),
),
driverRoot,
[]string{
"/nvidia-persistenced/socket",
"/nvidia-fabricmanager/socket",
},
)
mps := newMounts(
logger,
lookup.NewFileLocator(
lookup.WithLogger(logger),
lookup.WithRoot(driverRoot),
lookup.WithCount(1),
),
driverRoot,
[]string{
"/var/run/nvidia-persistenced/socket",
"/var/run/nvidia-fabricmanager/socket",
"/tmp/nvidia-mps",
},
)
return (*ipcMounts)(d), nil
d := Merge(
(*ipcMounts)(sockets),
(*ipcMounts)(mps),
)
return d, nil
}
// Mounts returns the discovered mounts with "noexec" added to the mount options.


@@ -49,9 +49,15 @@ func Shutdown() error {
// GetDriverStorePaths returns the list of driver store paths
func GetDriverStorePaths() []string {
var paths []string
selected := make(map[string]bool)
for i := 0; i < dxcore.getAdapterCount(); i++ {
adapter := dxcore.getAdapter(i)
paths = append(paths, adapter.getDriverStorePath())
path := dxcore.getAdapter(i).getDriverStorePath()
if selected[path] {
continue
}
selected[path] = true
paths = append(paths, path)
}
return paths
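The deduplication added to GetDriverStorePaths keeps the first occurrence of each path and preserves order. A standalone sketch of the same idea, with a hypothetical helper name and made-up paths:

```go
package main

import "fmt"

// uniqueInOrder returns the input without duplicates, keeping first
// occurrences in their original order.
func uniqueInOrder(in []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, s := range in {
		if seen[s] {
			continue
		}
		seen[s] = true
		out = append(out, s)
	}
	return out
}

func main() {
	// Several adapters may report the same driver store path.
	paths := []string{"/store/a", "/store/b", "/store/a"}
	fmt.Println(uniqueInOrder(paths))
}
```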


@@ -0,0 +1,102 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package cuda
import (
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/sirupsen/logrus"
)
type cudaLocator struct {
logger *logrus.Logger
driverRoot string
}
// Options is a function that configures a cudaLocator.
type Options func(*cudaLocator)
// WithLogger is an option that configures the logger used by the locator.
func WithLogger(logger *logrus.Logger) Options {
return func(c *cudaLocator) {
c.logger = logger
}
}
// WithDriverRoot is an option that configures the driver root used by the locator.
func WithDriverRoot(driverRoot string) Options {
return func(c *cudaLocator) {
c.driverRoot = driverRoot
}
}
// New creates a new CUDA library locator.
func New(opts ...Options) lookup.Locator {
c := &cudaLocator{}
for _, opt := range opts {
opt(c)
}
if c.logger == nil {
c.logger = logrus.StandardLogger()
}
if c.driverRoot == "" {
c.driverRoot = "/"
}
return c
}
// Locate returns the path to the libcuda.so.RMVERSION file.
// libcuda.so is prefixed to the specified pattern.
func (l *cudaLocator) Locate(pattern string) ([]string, error) {
ldcacheLocator, err := lookup.NewLibraryLocator(
l.logger,
l.driverRoot,
)
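if err != nil {
l.logger.Debugf("Failed to create LDCache locator: %v", err)
}
fullPattern := "libcuda.so" + pattern
var candidates []string
if err == nil {
// Only query the LDCache if the locator was created; calling Locate on a nil Locator would panic.
candidates, err = ldcacheLocator.Locate("libcuda.so")
}
if err == nil {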
for _, c := range candidates {
if match, err := filepath.Match(fullPattern, filepath.Base(c)); err != nil || !match {
l.logger.Debugf("Skipping non-matching candidate %v: %v", c, err)
continue
}
return []string{c}, nil
}
}
l.logger.Debugf("Could not locate %q in LDCache: Checking predefined library paths.", pattern)
pathLocator := lookup.NewFileLocator(
lookup.WithLogger(l.logger),
lookup.WithRoot(l.driverRoot),
lookup.WithSearchPaths(
"/usr/lib64",
"/usr/lib/x86_64-linux-gnu",
"/usr/lib/aarch64-linux-gnu",
),
lookup.WithCount(1),
)
return pathLocator.Locate(fullPattern)
}
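The pattern ".*.*.*" passed to Locate is appended to "libcuda.so" and compared with filepath.Match, where each "*" matches any run of non-separator characters. The sketch below shows which file names such a pattern accepts; matchesLibcuda and the example paths are illustrative.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchesLibcuda reports whether the base name of path matches
// "libcuda.so" followed by the given glob pattern.
func matchesLibcuda(path, pattern string) bool {
	ok, err := filepath.Match("libcuda.so"+pattern, filepath.Base(path))
	return err == nil && ok
}

func main() {
	for _, p := range []string{
		"/usr/lib64/libcuda.so.525.85.12", // three version components
		"/usr/lib64/libcuda.so.1",         // soname link only
		"/usr/lib64/libcuda.so",           // unversioned link
	} {
		fmt.Println(p, matchesLibcuda(p, ".*.*.*"))
	}
}
```

Only the fully versioned name matches, which is why the soname and unversioned links found in the LDCache are skipped.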


@@ -40,6 +40,7 @@ func NewLibraryLocator(logger *log.Logger, root string) (Locator, error) {
}
l := library{
logger: logger,
symlink: NewSymlinkLocator(logger, root),
cache: cache,
}


@@ -38,7 +38,7 @@ type cdiModifier struct {
// CDI specifications available on the system. The NVIDIA_VISIBLE_DEVICES environment variable is
// used to select the devices to include.
func NewCDIModifier(logger *logrus.Logger, cfg *config.Config, ociSpec oci.Spec) (oci.SpecModifier, error) {
devices, err := getDevicesFromSpec(ociSpec)
devices, err := getDevicesFromSpec(logger, ociSpec, cfg)
if err != nil {
return nil, fmt.Errorf("failed to get required devices from OCI specification: %v", err)
}
@@ -46,6 +46,7 @@ func NewCDIModifier(logger *logrus.Logger, cfg *config.Config, ociSpec oci.Spec)
logger.Debugf("No devices requested; no modification required.")
return nil, nil
}
logger.Debugf("Creating CDI modifier for devices: %v", devices)
specDirs := cdi.DefaultSpecDirs
if len(cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.SpecDirs) > 0 {
@@ -61,38 +62,82 @@ func NewCDIModifier(logger *logrus.Logger, cfg *config.Config, ociSpec oci.Spec)
return m, nil
}
func getDevicesFromSpec(ociSpec oci.Spec) ([]string, error) {
func getDevicesFromSpec(logger *logrus.Logger, ociSpec oci.Spec, cfg *config.Config) ([]string, error) {
rawSpec, err := ociSpec.Load()
if err != nil {
return nil, fmt.Errorf("failed to load OCI spec: %v", err)
}
image, err := image.NewCUDAImageFromSpec(rawSpec)
if err != nil {
return nil, err
}
envDevices := image.DevicesFromEnvvars(visibleDevicesEnvvar)
_, annotationDevices, err := cdi.ParseAnnotations(rawSpec.Annotations)
annotationDevices, err := getAnnotationDevices(cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.AnnotationPrefixes, rawSpec.Annotations)
if err != nil {
return nil, fmt.Errorf("failed to parse container annotations: %v", err)
}
uniqueDevices := make(map[string]struct{})
for _, name := range append(envDevices.List(), annotationDevices...) {
if !cdi.IsQualifiedName(name) {
name = cdi.QualifiedName("nvidia.com", "gpu", name)
}
uniqueDevices[name] = struct{}{}
if len(annotationDevices) > 0 {
return annotationDevices, nil
}
container, err := image.NewCUDAImageFromSpec(rawSpec)
if err != nil {
return nil, err
}
envDevices := container.DevicesFromEnvvars(visibleDevicesEnvvar)
var devices []string
for name := range uniqueDevices {
seen := make(map[string]bool)
for _, name := range envDevices.List() {
if !cdi.IsQualifiedName(name) {
name = fmt.Sprintf("%s=%s", cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.DefaultKind, name)
}
if seen[name] {
logger.Debugf("Ignoring duplicate device %q", name)
continue
}
seen[name] = true
devices = append(devices, name)
}
return devices, nil
if len(devices) == 0 {
return nil, nil
}
if cfg.AcceptEnvvarUnprivileged || image.IsPrivileged(rawSpec) {
return devices, nil
}
logger.Warningf("Ignoring devices specified in NVIDIA_VISIBLE_DEVICES: %v", devices)
return nil, nil
}
// getAnnotationDevices returns a list of devices specified in the annotations.
// Keys starting with the specified prefixes are considered and expected to contain a comma-separated list of
// fully-qualified CDI device names. If any device name is not fully-qualified an error is returned.
// The list of returned devices is deduplicated.
func getAnnotationDevices(prefixes []string, annotations map[string]string) ([]string, error) {
devicesByKey := make(map[string][]string)
for key, value := range annotations {
for _, prefix := range prefixes {
if strings.HasPrefix(key, prefix) {
devicesByKey[key] = strings.Split(value, ",")
}
}
}
seen := make(map[string]bool)
var annotationDevices []string
for key, devices := range devicesByKey {
for _, device := range devices {
if !cdi.IsQualifiedName(device) {
return nil, fmt.Errorf("invalid device name %q in annotation %q", device, key)
}
if seen[device] {
continue
}
annotationDevices = append(annotationDevices, device)
seen[device] = true
}
}
return annotationDevices, nil
}
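The annotation handling can be sketched without the CDI package. In the sketch below, looksQualified is a simplified, hypothetical stand-in for cdi.IsQualifiedName (it only checks the vendor/class=name shape), and matching keys are sorted before processing so the result order is deterministic. The sorting is a variation introduced here; the code above iterates the map directly, so its output order can vary between runs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// looksQualified is a simplified stand-in for cdi.IsQualifiedName: it only
// checks for a non-empty vendor, class and name in "vendor/class=name".
func looksQualified(device string) bool {
	kind, name, ok := strings.Cut(device, "=")
	if !ok || name == "" {
		return false
	}
	vendor, class, ok := strings.Cut(kind, "/")
	return ok && vendor != "" && class != ""
}

// annotationDevices collects deduplicated devices from annotations whose
// keys start with one of the prefixes.
func annotationDevices(prefixes []string, annotations map[string]string) ([]string, error) {
	var keys []string
	for key := range annotations {
		for _, prefix := range prefixes {
			if strings.HasPrefix(key, prefix) {
				keys = append(keys, key)
				break
			}
		}
	}
	sort.Strings(keys) // deterministic order (a choice made for this sketch)

	seen := make(map[string]bool)
	var devices []string
	for _, key := range keys {
		for _, device := range strings.Split(annotations[key], ",") {
			if !looksQualified(device) {
				return nil, fmt.Errorf("invalid device name %q in annotation %q", device, key)
			}
			if seen[device] {
				continue
			}
			seen[device] = true
			devices = append(devices, device)
		}
	}
	return devices, nil
}

func main() {
	devices, err := annotationDevices(
		[]string{"cdi.k8s.io/"},
		map[string]string{
			"cdi.k8s.io/b": "nvidia.com/gpu=1,nvidia.com/gpu=0",
			"cdi.k8s.io/a": "nvidia.com/gpu=0",
			"other/key":    "ignored",
		},
	)
	fmt.Println(devices, err)
}
```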
// Modify loads the CDI registry and injects the specified CDI devices into the OCI runtime specification.
@@ -105,21 +150,8 @@ func (m cdiModifier) Modify(spec *specs.Spec) error {
m.logger.Debugf("The following error was triggered when refreshing the CDI registry: %v", err)
}
devices := m.devices
for _, d := range devices {
if d == "nvidia.com/gpu=all" {
devices = []string{}
for _, candidate := range registry.DeviceDB().ListDevices() {
if strings.HasPrefix(candidate, "nvidia.com/gpu=") {
devices = append(devices, candidate)
}
}
break
}
}
m.logger.Debugf("Injecting devices using CDI: %v", devices)
_, err := registry.InjectDevices(spec, devices...)
m.logger.Debugf("Injecting devices using CDI: %v", m.devices)
_, err := registry.InjectDevices(spec, m.devices...)
if err != nil {
return fmt.Errorf("failed to inject CDI devices: %v", err)
}


@@ -0,0 +1,92 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package modifier
import (
"fmt"
"testing"
"github.com/stretchr/testify/require"
)
func TestGetAnnotationDevices(t *testing.T) {
testCases := []struct {
description string
prefixes []string
annotations map[string]string
expectedDevices []string
expectedError error
}{
{
description: "no annotations",
},
{
description: "no matching annotations",
prefixes: []string{"not-prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device=bar",
},
},
{
description: "single matching annotation",
prefixes: []string{"prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device=bar",
},
expectedDevices: []string{"example.com/device=bar"},
},
{
description: "multiple matching annotations",
prefixes: []string{"prefix/", "another-prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device=bar",
"another-prefix/bar": "example.com/device=baz",
},
expectedDevices: []string{"example.com/device=bar", "example.com/device=baz"},
},
{
description: "multiple matching annotations with duplicate devices",
prefixes: []string{"prefix/", "another-prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device=bar",
"another-prefix/bar": "example.com/device=bar",
},
expectedDevices: []string{"example.com/device=bar"},
},
{
description: "invalid devices",
prefixes: []string{"prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device",
},
expectedError: fmt.Errorf("invalid device %q", "example.com/device"),
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
devices, err := getAnnotationDevices(tc.prefixes, tc.annotations)
if tc.expectedError != nil {
require.Error(t, err)
return
}
require.NoError(t, err)
require.ElementsMatch(t, tc.expectedDevices, devices)
})
}
}

View File

@@ -0,0 +1,36 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package system
import "github.com/sirupsen/logrus"
// Option is a functional option for the system command
type Option func(*Interface)
// WithLogger sets the logger for the system command
func WithLogger(logger *logrus.Logger) Option {
return func(i *Interface) {
i.logger = logger
}
}
// WithDryRun sets the dry run flag
func WithDryRun(dryRun bool) Option {
return func(i *Interface) {
i.dryRun = dryRun
}
}
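
The option functions above follow the functional options pattern. A minimal, self-contained sketch of how such options are typically consumed is shown below; `settings`, `WithLabel`, and the `"default"` label are hypothetical names used only for illustration and are not part of the toolkit.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for the Interface struct in the diff.
type settings struct {
	dryRun bool
	label  string
}

// Option mutates settings; it mirrors the shape of the Option type above.
type Option func(*settings)

// WithDryRun sets the dry run flag.
func WithDryRun(dryRun bool) Option {
	return func(s *settings) { s.dryRun = dryRun }
}

// WithLabel is a hypothetical option that exists only for this example.
func WithLabel(label string) Option {
	return func(s *settings) { s.label = label }
}

// New sets defaults first and then applies each option in order,
// so a later option overrides an earlier one.
func New(opts ...Option) *settings {
	s := &settings{label: "default"}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	s := New(WithDryRun(true), WithLabel("first"), WithLabel("second"))
	fmt.Println(s.dryRun, s.label)
}
```

Because options are applied in order, callers can layer overrides without the constructor signature changing when new settings are added.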

internal/system/system.go Normal file

@@ -0,0 +1,149 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package system
import (
"fmt"
"os"
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info/proc/devices"
"github.com/sirupsen/logrus"
"golang.org/x/sys/unix"
)
// Interface is the interface for the system command
type Interface struct {
logger *logrus.Logger
dryRun bool
nvidiaDevices nvidiaDevices
}
// New constructs a system command with the specified options
func New(opts ...Option) (*Interface, error) {
i := &Interface{
logger: logrus.StandardLogger(),
}
for _, opt := range opts {
opt(i)
}
devices, err := devices.GetNVIDIADevices()
if err != nil {
return nil, fmt.Errorf("failed to create devices info: %v", err)
}
i.nvidiaDevices = nvidiaDevices{devices}
return i, nil
}
// CreateNVIDIAControlDeviceNodesAt creates the NVIDIA control device nodes associated with the NVIDIA driver at the specified root.
func (m *Interface) CreateNVIDIAControlDeviceNodesAt(root string) error {
controlNodes := []string{"/dev/nvidiactl", "/dev/nvidia-modeset", "/dev/nvidia-uvm", "/dev/nvidia-uvm-tools"}
for _, node := range controlNodes {
path := filepath.Join(root, node)
err := m.CreateNVIDIADeviceNode(path)
if err != nil {
return fmt.Errorf("failed to create device node %s: %v", path, err)
}
}
return nil
}
// CreateNVIDIADeviceNode creates a specified device node associated with the NVIDIA driver.
func (m *Interface) CreateNVIDIADeviceNode(path string) error {
node := filepath.Base(path)
if !strings.HasPrefix(node, "nvidia") {
return fmt.Errorf("invalid device node %q", node)
}
major, err := m.nvidiaDevices.Major(node)
if err != nil {
return fmt.Errorf("failed to determine major: %v", err)
}
minor, err := m.nvidiaDevices.Minor(node)
if err != nil {
return fmt.Errorf("failed to determine minor: %v", err)
}
return m.createDeviceNode(path, int(major), int(minor))
}
func (m *Interface) createDeviceNode(path string, major int, minor int) error {
if m.dryRun {
m.logger.Infof("Running: mknod --mode=0666 %s c %d %d", path, major, minor)
return nil
}
if _, err := os.Stat(path); err == nil {
m.logger.Infof("Skipping: %s already exists", path)
return nil
} else if !os.IsNotExist(err) {
return fmt.Errorf("failed to stat %s: %v", path, err)
}
err := unix.Mknod(path, unix.S_IFCHR, int(unix.Mkdev(uint32(major), uint32(minor))))
if err != nil {
return err
}
return unix.Chmod(path, 0666)
}
type nvidiaDevices struct {
devices.Devices
}
// Major returns the major number for the specified NVIDIA device node.
// If the device node is not supported, an error is returned.
func (n *nvidiaDevices) Major(node string) (int64, error) {
var valid bool
var major devices.Major
switch node {
case "nvidia-uvm", "nvidia-uvm-tools":
major, valid = n.Get(devices.NVIDIAUVM)
case "nvidia-modeset", "nvidiactl":
major, valid = n.Get(devices.NVIDIAGPU)
}
if !valid {
return 0, fmt.Errorf("invalid device node %q", node)
}
return int64(major), nil
}
// Minor returns the minor number for the specified NVIDIA device node.
// If the device node is not supported, an error is returned.
func (n *nvidiaDevices) Minor(node string) (int64, error) {
switch node {
case "nvidia-modeset":
return devices.NVIDIAModesetMinor, nil
case "nvidia-uvm-tools":
return devices.NVIDIAUVMToolsMinor, nil
case "nvidia-uvm":
return devices.NVIDIAUVMMinor, nil
case "nvidiactl":
return devices.NVIDIACTLMinor, nil
}
return 0, fmt.Errorf("invalid device node %q", node)
}
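
createDeviceNode above passes the result of unix.Mkdev to unix.Mknod. As a rough illustration of what that packed device number looks like, the sketch below reimplements the Linux (glibc-style) major/minor encoding with the standard library only. The helper names `mkdev`, `major`, and `minor` are local to this example, and the inputs 195 and 255 are illustrative values, not numbers taken from the diff.

```go
package main

import "fmt"

// mkdev packs a major and minor number into a 64-bit Linux device number.
// The low 12 bits of the major and low 8 bits of the minor occupy the
// traditional positions; the remaining bits are stored higher up.
func mkdev(major, minor uint32) uint64 {
	dev := (uint64(major) & 0x00000fff) << 8
	dev |= (uint64(major) & 0xfffff000) << 32
	dev |= (uint64(minor) & 0x000000ff) << 0
	dev |= (uint64(minor) & 0xffffff00) << 12
	return dev
}

// major extracts the major number from a packed device number.
func major(dev uint64) uint32 {
	m := uint32((dev & 0x00000000000fff00) >> 8)
	m |= uint32((dev & 0xfffff00000000000) >> 32)
	return m
}

// minor extracts the minor number from a packed device number.
func minor(dev uint64) uint32 {
	m := uint32((dev & 0x00000000000000ff) >> 0)
	m |= uint32((dev & 0x00000ffffff00000) >> 12)
	return m
}

func main() {
	dev := mkdev(195, 255) // illustrative major/minor pair
	fmt.Printf("dev=%#x major=%d minor=%d\n", dev, major(dev), minor(dev))
}
```

In the real code the major number is looked up at runtime from the devices info, so hard-coded values like the ones in `main` should only be used for experimentation.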

View File

@@ -69,7 +69,9 @@ rm -rf %{_localstatedir}/lib/rpm-state/nvidia-container-toolkit
ln -sf %{_bindir}/nvidia-container-runtime-hook %{_bindir}/nvidia-container-toolkit
%postun
if [ -L %{_bindir}/nvidia-container-toolkit ] then; rm -f %{_bindir}/nvidia-container-toolkit; fi
if [ "$1" = 0 ]; then # package is uninstalled, not upgraded
if [ -L %{_bindir}/nvidia-container-toolkit ]; then rm -f %{_bindir}/nvidia-container-toolkit; fi
fi
%files
%license LICENSE

View File

@@ -17,6 +17,7 @@
package nvcdi
import (
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/device"
@@ -29,10 +30,17 @@ const (
ModeNvml = "nvml"
// ModeWsl configures the CDI spec generator to generate a WSL spec.
ModeWsl = "wsl"
// ModeManagement configures the CDI spec generator to generate a management spec.
ModeManagement = "management"
// ModeGds configures the CDI spec generator to generate a GDS spec.
ModeGds = "gds"
// ModeMofed configures the CDI spec generator to generate a MOFED spec.
ModeMofed = "mofed"
)
// Interface defines the API for the nvcdi package
type Interface interface {
GetSpec() (spec.Interface, error)
GetCommonEdits() (*cdi.ContainerEdits, error)
GetAllDeviceSpecs() ([]specs.Device, error)
GetGPUDeviceEdits(device.Device) (*cdi.ContainerEdits, error)

View File

@@ -22,8 +22,8 @@ import (
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover"
"github.com/NVIDIA/nvidia-container-toolkit/internal/ldcache"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup/cuda"
"github.com/sirupsen/logrus"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvml"
)
@@ -31,11 +31,20 @@ import (
// NewDriverDiscoverer creates a discoverer for the libraries and binaries associated with a driver installation.
// The supplied NVML Library is used to query the expected driver version.
func NewDriverDiscoverer(logger *logrus.Logger, driverRoot string, nvidiaCTKPath string, nvmllib nvml.Interface) (discover.Discover, error) {
if r := nvmllib.Init(); r != nvml.SUCCESS {
return nil, fmt.Errorf("failed to initialize NVML: %v", r)
}
defer nvmllib.Shutdown()
version, r := nvmllib.SystemGetDriverVersion()
if r != nvml.SUCCESS {
return nil, fmt.Errorf("failed to determine driver version: %v", r)
}
return newDriverVersionDiscoverer(logger, driverRoot, nvidiaCTKPath, version)
}
func newDriverVersionDiscoverer(logger *logrus.Logger, driverRoot string, nvidiaCTKPath string, version string) (discover.Discover, error) {
libraries, err := NewDriverLibraryDiscoverer(logger, driverRoot, nvidiaCTKPath, version)
if err != nil {
return nil, fmt.Errorf("failed to create discoverer for driver libraries: %v", err)
@@ -127,26 +136,24 @@ func NewDriverBinariesDiscoverer(logger *logrus.Logger, driverRoot string) disco
func getVersionLibs(logger *logrus.Logger, driverRoot string, version string) ([]string, error) {
logger.Infof("Using driver version %v", version)
cache, err := ldcache.New(logger, driverRoot)
libCudaPaths, err := cuda.New(
cuda.WithLogger(logger),
cuda.WithDriverRoot(driverRoot),
).Locate("." + version)
if err != nil {
return nil, fmt.Errorf("failed to load ldcache: %v", err)
return nil, fmt.Errorf("failed to locate libcuda.so.%v: %v", version, err)
}
libRoot := filepath.Dir(libCudaPaths[0])
libs32, libs64 := cache.List()
libraries := lookup.NewFileLocator(
lookup.WithLogger(logger),
lookup.WithSearchPaths(libRoot),
lookup.WithOptional(true),
)
var libs []string
for _, l := range libs64 {
if strings.HasSuffix(l, version) {
logger.Infof("found 64-bit driver lib: %v", l)
libs = append(libs, l)
}
}
for _, l := range libs32 {
if strings.HasSuffix(l, version) {
logger.Infof("found 32-bit driver lib: %v", l)
libs = append(libs, l)
}
libs, err := libraries.Locate("*.so." + version)
if err != nil {
return nil, fmt.Errorf("failed to locate libraries for driver version %v: %v", version, err)
}
if driverRoot == "/" || driverRoot == "" {

View File

@@ -73,6 +73,7 @@ type byPathHookDiscoverer struct {
driverRoot string
nvidiaCTKPath string
pciBusID string
deviceNodes discover.Discover
}
var _ discover.Discover = (*byPathHookDiscoverer)(nil)
@@ -111,11 +112,20 @@ func newFullGPUDiscoverer(logger *logrus.Logger, driverRoot string, nvidiaCTKPat
driverRoot: driverRoot,
nvidiaCTKPath: nvidiaCTKPath,
pciBusID: pciBusID,
deviceNodes: deviceNodes,
}
deviceFolderPermissionHooks := newDeviceFolderPermissionHookDiscoverer(
logger,
driverRoot,
nvidiaCTKPath,
deviceNodes,
)
dd := discover.Merge(
deviceNodes,
byPathHooks,
deviceFolderPermissionHooks,
)
return dd, nil
@@ -158,6 +168,20 @@ func (d *byPathHookDiscoverer) Mounts() ([]discover.Mount, error) {
}
func (d *byPathHookDiscoverer) deviceNodeLinks() ([]string, error) {
devices, err := d.deviceNodes.Devices()
if err != nil {
return nil, fmt.Errorf("failed to discover device nodes: %v", err)
}
if len(devices) == 0 {
return nil, nil
}
selectedDevices := make(map[string]bool)
for _, d := range devices {
selectedDevices[d.HostPath] = true
}
candidates := []string{
fmt.Sprintf("/dev/dri/by-path/pci-%s-card", d.pciBusID),
fmt.Sprintf("/dev/dri/by-path/pci-%s-render", d.pciBusID),
@@ -172,6 +196,14 @@ func (d *byPathHookDiscoverer) deviceNodeLinks() ([]string, error) {
continue
}
deviceNode := device
if !filepath.IsAbs(device) {
deviceNode = filepath.Join(filepath.Dir(linkPath), device)
}
if !selectedDevices[deviceNode] {
d.logger.Debugf("ignoring device symlink %v -> %v since %v is not mounted", linkPath, device, deviceNode)
continue
}
d.logger.Debugf("adding device symlink %v -> %v", linkPath, device)
links = append(links, fmt.Sprintf("%v::%v", device, linkPath))
}

pkg/nvcdi/gds.go Normal file

@@ -0,0 +1,82 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package nvcdi
import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover"
"github.com/NVIDIA/nvidia-container-toolkit/internal/edits"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/device"
)
type gdslib nvcdilib
var _ Interface = (*gdslib)(nil)
// GetAllDeviceSpecs returns the device specs for all available devices.
func (l *gdslib) GetAllDeviceSpecs() ([]specs.Device, error) {
discoverer, err := discover.NewGDSDiscoverer(l.logger, l.driverRoot)
if err != nil {
return nil, fmt.Errorf("failed to create GPUDirect Storage discoverer: %v", err)
}
edits, err := edits.FromDiscoverer(discoverer)
if err != nil {
return nil, fmt.Errorf("failed to create container edits for GPUDirect Storage: %v", err)
}
deviceSpec := specs.Device{
Name: "all",
ContainerEdits: *edits.ContainerEdits,
}
return []specs.Device{deviceSpec}, nil
}
// GetCommonEdits returns the container edits common to all devices; no common edits are required for GDS, so these are empty.
func (l *gdslib) GetCommonEdits() (*cdi.ContainerEdits, error) {
return edits.FromDiscoverer(discover.None{})
}
// GetSpec is unsupported for the gdslib specs.
// gdslib is typically wrapped by a spec that implements GetSpec.
func (l *gdslib) GetSpec() (spec.Interface, error) {
return nil, fmt.Errorf("GetSpec is not supported")
}
// GetGPUDeviceEdits is unsupported for the gdslib specs
func (l *gdslib) GetGPUDeviceEdits(device.Device) (*cdi.ContainerEdits, error) {
return nil, fmt.Errorf("GetGPUDeviceEdits is not supported")
}
// GetGPUDeviceSpecs is unsupported for the gdslib specs
func (l *gdslib) GetGPUDeviceSpecs(int, device.Device) (*specs.Device, error) {
return nil, fmt.Errorf("GetGPUDeviceSpecs is not supported")
}
// GetMIGDeviceEdits is unsupported for the gdslib specs
func (l *gdslib) GetMIGDeviceEdits(device.Device, device.MigDevice) (*cdi.ContainerEdits, error) {
return nil, fmt.Errorf("GetMIGDeviceEdits is not supported")
}
// GetMIGDeviceSpecs is unsupported for the gdslib specs
func (l *gdslib) GetMIGDeviceSpecs(int, device.Device, int, device.MigDevice) (*specs.Device, error) {
return nil, fmt.Errorf("GetMIGDeviceSpecs is not supported")
}

View File

@@ -20,19 +20,31 @@ import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/internal/edits"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/device"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvml"
)
type nvmllib nvcdilib
var _ Interface = (*nvmllib)(nil)
// GetSpec should not be called for nvmllib
func (l *nvmllib) GetSpec() (spec.Interface, error) {
return nil, fmt.Errorf("Unexpected call to nvmllib.GetSpec()")
}
// GetAllDeviceSpecs returns the device specs for all available devices.
func (l *nvmllib) GetAllDeviceSpecs() ([]specs.Device, error) {
var deviceSpecs []specs.Device
if r := l.nvmllib.Init(); r != nvml.SUCCESS {
return nil, fmt.Errorf("failed to initialize NVML: %v", r)
}
defer l.nvmllib.Shutdown()
gpuDeviceSpecs, err := l.getGPUDeviceSpecs()
if err != nil {
return nil, err

View File

@@ -20,6 +20,7 @@ import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/internal/edits"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/device"
@@ -29,6 +30,11 @@ type wsllib nvcdilib
var _ Interface = (*wsllib)(nil)
// GetSpec should not be called for wsllib
func (l *wsllib) GetSpec() (spec.Interface, error) {
return nil, fmt.Errorf("Unexpected call to wsllib.GetSpec()")
}
// GetAllDeviceSpecs returns the device specs for all available devices.
func (l *wsllib) GetAllDeviceSpecs() ([]specs.Device, error) {
device := newDXGDeviceDiscoverer(l.logger, l.driverRoot)

View File

@@ -17,12 +17,22 @@
package nvcdi
import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/sirupsen/logrus"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/device"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/info"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvml"
)
type wrapper struct {
Interface
vendor string
class string
}
type nvcdilib struct {
logger *logrus.Logger
nvmllib nvml.Interface
@@ -32,11 +42,14 @@ type nvcdilib struct {
driverRoot string
nvidiaCTKPath string
vendor string
class string
infolib info.Interface
}
// New creates a new nvcdi library
func New(opts ...Option) Interface {
func New(opts ...Option) (Interface, error) {
l := &nvcdilib{}
for _, opt := range opts {
opt(l)
@@ -60,7 +73,13 @@ func New(opts ...Option) Interface {
l.infolib = info.New()
}
var lib Interface
switch l.resolveMode() {
case ModeManagement:
if l.vendor == "" {
l.vendor = "management.nvidia.com"
}
lib = (*managementlib)(l)
case ModeNvml:
if l.nvmllib == nil {
l.nvmllib = nvml.New()
@@ -69,13 +88,50 @@ func New(opts ...Option) Interface {
l.devicelib = device.New(device.WithNvml(l.nvmllib))
}
return (*nvmllib)(l)
lib = (*nvmllib)(l)
case ModeWsl:
return (*wsllib)(l)
lib = (*wsllib)(l)
case ModeGds:
if l.class == "" {
l.class = "gds"
}
lib = (*gdslib)(l)
case ModeMofed:
if l.class == "" {
l.class = "mofed"
}
lib = (*mofedlib)(l)
default:
return nil, fmt.Errorf("unknown mode %q", l.mode)
}
// TODO: We want an error here.
return nil
w := wrapper{
Interface: lib,
vendor: l.vendor,
class: l.class,
}
return &w, nil
}
// GetSpec combines the device specs and common edits from the wrapped Interface to a single spec.Interface.
func (l *wrapper) GetSpec() (spec.Interface, error) {
deviceSpecs, err := l.GetAllDeviceSpecs()
if err != nil {
return nil, err
}
edits, err := l.GetCommonEdits()
if err != nil {
return nil, err
}
return spec.New(
spec.WithDeviceSpecs(deviceSpecs),
spec.WithEdits(*edits.ContainerEdits),
spec.WithVendor(l.vendor),
spec.WithClass(l.class),
)
}
// resolveMode resolves the mode for CDI spec generation based on the current system.
@@ -96,3 +152,24 @@ func (l *nvcdilib) resolveMode() (rmode string) {
return ModeNvml
}
// getCudaVersion returns the driver version of the current system as reported by NVML.
func (l *nvcdilib) getCudaVersion() (string, error) {
if hasNVML, reason := l.infolib.HasNvml(); !hasNVML {
return "", fmt.Errorf("nvml not detected: %v", reason)
}
if l.nvmllib == nil {
return "", fmt.Errorf("nvml library not initialized")
}
r := l.nvmllib.Init()
if r != nvml.SUCCESS {
return "", fmt.Errorf("failed to initialize nvml: %v", r)
}
defer l.nvmllib.Shutdown()
version, r := l.nvmllib.SystemGetDriverVersion()
if r != nvml.SUCCESS {
return "", fmt.Errorf("failed to get driver version: %v", r)
}
return version, nil
}
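
The wrapper type in this file embeds Interface and supplies its own GetSpec, so the mode-specific libraries can leave GetSpec unsupported. The sketch below shows that embedding technique in isolation, under simplified assumptions: `specSource`, `modeLib`, and the string-based spec are hypothetical stand-ins, not the real nvcdi or CDI types.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// specSource is a reduced, hypothetical version of the nvcdi Interface.
type specSource interface {
	GetSpec() (string, error)
	GetAllDeviceSpecs() ([]string, error)
}

// modeLib plays the role of a mode-specific library such as gdslib.
type modeLib struct{}

// GetSpec is deliberately unsupported on the inner library.
func (modeLib) GetSpec() (string, error) {
	return "", errors.New("GetSpec is not supported")
}

// GetAllDeviceSpecs returns a single device named "all".
func (modeLib) GetAllDeviceSpecs() ([]string, error) {
	return []string{"all"}, nil
}

// wrapper embeds the interface and overrides only GetSpec.
type wrapper struct {
	specSource
	vendor string
	class  string
}

// GetSpec shadows the embedded method and builds the result from the
// promoted GetAllDeviceSpecs method of the wrapped value.
func (w *wrapper) GetSpec() (string, error) {
	devices, err := w.GetAllDeviceSpecs()
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s/%s: %s", w.vendor, w.class, strings.Join(devices, ",")), nil
}

func main() {
	w := &wrapper{specSource: modeLib{}, vendor: "nvidia.com", class: "gds"}
	s, err := w.GetSpec()
	fmt.Println(s, err)
}
```

Method promotion means every other method of the wrapped value stays callable on the wrapper without forwarding code, which is presumably why New can return the wrapper wherever Interface is expected.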

pkg/nvcdi/management.go Normal file

@@ -0,0 +1,190 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package nvcdi
import (
"fmt"
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover"
"github.com/NVIDIA/nvidia-container-toolkit/internal/edits"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup/cuda"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/device"
)
type managementlib nvcdilib
var _ Interface = (*managementlib)(nil)
// GetAllDeviceSpecs returns all device specs for use in management containers.
// A single device with the name `all` is returned.
func (m *managementlib) GetAllDeviceSpecs() ([]specs.Device, error) {
devices, err := m.newManagementDeviceDiscoverer()
if err != nil {
return nil, fmt.Errorf("failed to create device discoverer: %v", err)
}
edits, err := edits.FromDiscoverer(devices)
if err != nil {
return nil, fmt.Errorf("failed to create edits from discoverer: %v", err)
}
if len(edits.DeviceNodes) == 0 {
return nil, fmt.Errorf("no NVIDIA device nodes found")
}
device := specs.Device{
Name: "all",
ContainerEdits: *edits.ContainerEdits,
}
return []specs.Device{device}, nil
}
// GetCommonEdits returns the common edits for use in managementlib containers.
func (m *managementlib) GetCommonEdits() (*cdi.ContainerEdits, error) {
version, err := m.getCudaVersion()
if err != nil {
return nil, fmt.Errorf("failed to get CUDA version: %v", err)
}
driver, err := newDriverVersionDiscoverer(m.logger, m.driverRoot, m.nvidiaCTKPath, version)
if err != nil {
return nil, fmt.Errorf("failed to create driver library discoverer: %v", err)
}
edits, err := edits.FromDiscoverer(driver)
if err != nil {
return nil, fmt.Errorf("failed to create edits from discoverer: %v", err)
}
return edits, nil
}
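// getCudaVersion returns the driver version for use in managementlib containers.
// If NVML cannot be queried, the version is taken from the libcuda.so.* filename found under the driver root.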
func (m *managementlib) getCudaVersion() (string, error) {
version, err := (*nvcdilib)(m).getCudaVersion()
if err == nil {
return version, nil
}
libCudaPaths, err := cuda.New(
cuda.WithLogger(m.logger),
cuda.WithDriverRoot(m.driverRoot),
).Locate(".*.*.*")
if err != nil {
return "", fmt.Errorf("failed to locate libcuda.so: %v", err)
}
libCudaPath := libCudaPaths[0]
version = strings.TrimPrefix(filepath.Base(libCudaPath), "libcuda.so.")
return version, nil
}
type managementDiscoverer struct {
discover.Discover
}
// newManagementDeviceDiscoverer returns a discover.Discover that discovers device nodes for use in managementlib containers.
// NVML is not used to query devices and all device nodes are returned.
func (m *managementlib) newManagementDeviceDiscoverer() (discover.Discover, error) {
deviceNodes := discover.NewCharDeviceDiscoverer(
m.logger,
[]string{
"/dev/nvidia*",
"/dev/nvidia-caps/nvidia-cap*",
"/dev/nvidia-modeset",
"/dev/nvidia-uvm-tools",
"/dev/nvidia-uvm",
"/dev/nvidiactl",
},
m.driverRoot,
)
deviceFolderPermissionHooks := newDeviceFolderPermissionHookDiscoverer(
m.logger,
m.driverRoot,
m.nvidiaCTKPath,
deviceNodes,
)
d := discover.Merge(
&managementDiscoverer{deviceNodes},
deviceFolderPermissionHooks,
)
return d, nil
}
func (m *managementDiscoverer) Devices() ([]discover.Device, error) {
devices, err := m.Discover.Devices()
if err != nil {
return devices, err
}
var filteredDevices []discover.Device
for _, device := range devices {
if m.nodeIsBlocked(device.HostPath) {
continue
}
filteredDevices = append(filteredDevices, device)
}
return filteredDevices, nil
}
// nodeIsBlocked returns true if the specified device node should be ignored.
func (m managementDiscoverer) nodeIsBlocked(path string) bool {
blockedPrefixes := []string{"nvidia-fs", "nvidia-nvswitch", "nvidia-nvlink"}
nodeName := filepath.Base(path)
for _, prefix := range blockedPrefixes {
if strings.HasPrefix(nodeName, prefix) {
return true
}
}
return false
}
// GetSpec is unsupported for the managementlib specs.
// managementlib is typically wrapped by a spec that implements GetSpec.
func (m *managementlib) GetSpec() (spec.Interface, error) {
return nil, fmt.Errorf("GetSpec is not supported")
}
// GetGPUDeviceEdits is unsupported for the managementlib specs
func (m *managementlib) GetGPUDeviceEdits(device.Device) (*cdi.ContainerEdits, error) {
return nil, fmt.Errorf("GetGPUDeviceEdits is not supported")
}
// GetGPUDeviceSpecs is unsupported for the managementlib specs
func (m *managementlib) GetGPUDeviceSpecs(int, device.Device) (*specs.Device, error) {
return nil, fmt.Errorf("GetGPUDeviceSpecs is not supported")
}
// GetMIGDeviceEdits is unsupported for the managementlib specs
func (m *managementlib) GetMIGDeviceEdits(device.Device, device.MigDevice) (*cdi.ContainerEdits, error) {
return nil, fmt.Errorf("GetMIGDeviceEdits is not supported")
}
// GetMIGDeviceSpecs is unsupported for the managementlib specs
func (m *managementlib) GetMIGDeviceSpecs(int, device.Device, int, device.MigDevice) (*specs.Device, error) {
return nil, fmt.Errorf("GetMIGDeviceSpecs is not supported")
}
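
Two small pieces of logic in management.go can be tried in isolation: deriving a version string from a libcuda.so path, and filtering out blocked device nodes by name prefix. The sketch below uses only the standard library; the function names `versionFromLibcuda` and `filterNodes` and the sample paths are assumptions made for this example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// versionFromLibcuda strips the "libcuda.so." prefix from the file name,
// leaving the version suffix (for example "525.85.12").
func versionFromLibcuda(path string) string {
	return strings.TrimPrefix(filepath.Base(path), "libcuda.so.")
}

// nodeIsBlocked reports whether the device node name starts with one of
// the blocked prefixes listed in the diff.
func nodeIsBlocked(path string) bool {
	blockedPrefixes := []string{"nvidia-fs", "nvidia-nvswitch", "nvidia-nvlink"}
	nodeName := filepath.Base(path)
	for _, prefix := range blockedPrefixes {
		if strings.HasPrefix(nodeName, prefix) {
			return true
		}
	}
	return false
}

// filterNodes keeps only the paths that are not blocked, preserving order.
func filterNodes(paths []string) []string {
	var kept []string
	for _, p := range paths {
		if !nodeIsBlocked(p) {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	fmt.Println(versionFromLibcuda("/usr/lib/x86_64-linux-gnu/libcuda.so.525.85.12"))
	fmt.Println(filterNodes([]string{
		"/dev/nvidia0", "/dev/nvidia-nvswitch0", "/dev/nvidiactl", "/dev/nvidia-fs3",
	}))
}
```

Note that the version extraction assumes the located file name really begins with "libcuda.so."; a path that does not match is returned unchanged by TrimPrefix, so callers may want to validate the result.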

pkg/nvcdi/mofed.go Normal file

@@ -0,0 +1,82 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package nvcdi
import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover"
"github.com/NVIDIA/nvidia-container-toolkit/internal/edits"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/device"
)
type mofedlib nvcdilib
var _ Interface = (*mofedlib)(nil)
// GetAllDeviceSpecs returns the device specs for all available devices.
func (l *mofedlib) GetAllDeviceSpecs() ([]specs.Device, error) {
discoverer, err := discover.NewMOFEDDiscoverer(l.logger, l.driverRoot)
if err != nil {
return nil, fmt.Errorf("failed to create MOFED discoverer: %v", err)
}
edits, err := edits.FromDiscoverer(discoverer)
if err != nil {
return nil, fmt.Errorf("failed to create container edits for MOFED devices: %v", err)
}
deviceSpec := specs.Device{
Name: "all",
ContainerEdits: *edits.ContainerEdits,
}
return []specs.Device{deviceSpec}, nil
}
// GetCommonEdits returns the container edits common to all devices; no common edits are required for MOFED, so these are empty.
func (l *mofedlib) GetCommonEdits() (*cdi.ContainerEdits, error) {
return edits.FromDiscoverer(discover.None{})
}
// GetSpec is unsupported for the mofedlib specs.
// mofedlib is typically wrapped by a spec that implements GetSpec.
func (l *mofedlib) GetSpec() (spec.Interface, error) {
return nil, fmt.Errorf("GetSpec is not supported")
}
// GetGPUDeviceEdits is unsupported for the mofedlib specs
func (l *mofedlib) GetGPUDeviceEdits(device.Device) (*cdi.ContainerEdits, error) {
return nil, fmt.Errorf("GetGPUDeviceEdits is not supported")
}
// GetGPUDeviceSpecs is unsupported for the mofedlib specs
func (l *mofedlib) GetGPUDeviceSpecs(int, device.Device) (*specs.Device, error) {
return nil, fmt.Errorf("GetGPUDeviceSpecs is not supported")
}
// GetMIGDeviceEdits is unsupported for the mofedlib specs
func (l *mofedlib) GetMIGDeviceEdits(device.Device, device.MigDevice) (*cdi.ContainerEdits, error) {
return nil, fmt.Errorf("GetMIGDeviceEdits is not supported")
}
// GetMIGDeviceSpecs is unsupported for the mofedlib specs
func (l *mofedlib) GetMIGDeviceSpecs(int, device.Device, int, device.MigDevice) (*specs.Device, error) {
return nil, fmt.Errorf("GetMIGDeviceSpecs is not supported")
}

View File

@@ -73,3 +73,17 @@ func WithMode(mode string) Option {
l.mode = mode
}
}
// WithVendor sets the vendor for the library
func WithVendor(vendor string) Option {
return func(o *nvcdilib) {
o.vendor = vendor
}
}
// WithClass sets the class for the library
func WithClass(class string) Option {
return func(o *nvcdilib) {
o.class = class
}
}

pkg/nvcdi/spec/api.go Normal file

@@ -0,0 +1,40 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package spec
import (
"io"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
)
const (
// DetectMinimumVersion is a constant that triggers a spec to detect the minimum required version.
DetectMinimumVersion = "DETECT_MINIMUM_VERSION"
// FormatJSON indicates a JSON output format
FormatJSON = "json"
// FormatYAML indicates a YAML output format
FormatYAML = "yaml"
)
// Interface is the interface for the spec API
type Interface interface {
io.WriterTo
Save(string) error
Raw() *specs.Spec
}

pkg/nvcdi/spec/builder.go Normal file

@@ -0,0 +1,159 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package spec
import (
"fmt"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/transform"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
)
type builder struct {
raw *specs.Spec
version string
vendor string
class string
deviceSpecs []specs.Device
edits specs.ContainerEdits
format string
noSimplify bool
}
// newBuilder creates a new spec builder with the supplied options
func newBuilder(opts ...Option) *builder {
s := &builder{}
for _, opt := range opts {
opt(s)
}
if s.raw != nil {
s.noSimplify = true
vendor, class := cdi.ParseQualifier(s.raw.Kind)
s.vendor = vendor
s.class = class
}
if s.version == "" {
s.version = DetectMinimumVersion
}
if s.vendor == "" {
s.vendor = "nvidia.com"
}
if s.class == "" {
s.class = "gpu"
}
if s.format == "" {
s.format = FormatYAML
}
return s
}
// Build builds a CDI spec from the spec builder.
func (o *builder) Build() (*spec, error) {
raw := o.raw
if raw == nil {
raw = &specs.Spec{
Version: o.version,
Kind: fmt.Sprintf("%s/%s", o.vendor, o.class),
Devices: o.deviceSpecs,
ContainerEdits: o.edits,
}
}
if raw.Version == DetectMinimumVersion {
minVersion, err := cdi.MinimumRequiredVersion(raw)
if err != nil {
return nil, fmt.Errorf("failed to get minimum required CDI spec version: %v", err)
}
raw.Version = minVersion
}
if !o.noSimplify {
err := transform.NewSimplifier().Transform(raw)
if err != nil {
return nil, fmt.Errorf("failed to simplify spec: %v", err)
}
}
s := spec{
Spec: raw,
format: o.format,
}
return &s, nil
}
// Option defines a function that can be used to configure the spec builder.
type Option func(*builder)
// WithDeviceSpecs sets the device specs for the spec builder
func WithDeviceSpecs(deviceSpecs []specs.Device) Option {
return func(o *builder) {
o.deviceSpecs = deviceSpecs
}
}
// WithEdits sets the container edits for the spec builder
func WithEdits(edits specs.ContainerEdits) Option {
return func(o *builder) {
o.edits = edits
}
}
// WithVersion sets the version for the spec builder
func WithVersion(version string) Option {
return func(o *builder) {
o.version = version
}
}
// WithVendor sets the vendor for the spec builder
func WithVendor(vendor string) Option {
return func(o *builder) {
o.vendor = vendor
}
}
// WithClass sets the class for the spec builder
func WithClass(class string) Option {
return func(o *builder) {
o.class = class
}
}
// WithFormat sets the output file format
func WithFormat(format string) Option {
return func(o *builder) {
o.format = format
}
}
// WithNoSimplify sets whether simplification of the spec is skipped
func WithNoSimplify(noSimplify bool) Option {
return func(o *builder) {
o.noSimplify = noSimplify
}
}
// WithRawSpec sets the raw spec for the spec builder
func WithRawSpec(raw *specs.Spec) Option {
return func(o *builder) {
o.raw = raw
}
}
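
The builder above follows the functional-options pattern: options are applied first, and defaults are filled in afterwards for anything left empty. A minimal, self-contained sketch of that ordering follows; the `settings` type, `newSettings`, and `kind` are hypothetical stand-ins for illustration and are not part of the package.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for the builder struct.
type settings struct {
	vendor string
	class  string
}

// Option configures a settings value.
type Option func(*settings)

// WithVendor overrides the default vendor.
func WithVendor(vendor string) Option {
	return func(s *settings) { s.vendor = vendor }
}

// WithClass overrides the default class.
func WithClass(class string) Option {
	return func(s *settings) { s.class = class }
}

// newSettings applies the options first and fills in defaults afterwards,
// so a field that is unset (or set to "") falls back to its default.
func newSettings(opts ...Option) *settings {
	s := &settings{}
	for _, opt := range opts {
		opt(s)
	}
	if s.vendor == "" {
		s.vendor = "nvidia.com"
	}
	if s.class == "" {
		s.class = "gpu"
	}
	return s
}

// kind joins vendor and class into a "vendor/class" string.
func (s *settings) kind() string {
	return fmt.Sprintf("%s/%s", s.vendor, s.class)
}

func main() {
	fmt.Println(newSettings().kind())
	fmt.Println(newSettings(WithVendor("example.com")).kind())
}
```

One consequence of applying defaults after the options is that passing an empty string through an option is indistinguishable from not passing the option at all.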

120
pkg/nvcdi/spec/spec.go Normal file

@@ -0,0 +1,120 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package spec
import (
"fmt"
"io"
"os"
"path/filepath"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
)
type spec struct {
*specs.Spec
format string
}
var _ Interface = (*spec)(nil)
// New creates a new spec with the specified options.
func New(opts ...Option) (Interface, error) {
return newBuilder(opts...).Build()
}
// Save writes the spec to the specified path and overwrites the file if it exists.
func (s *spec) Save(path string) error {
path, err := s.normalizePath(path)
if err != nil {
return fmt.Errorf("failed to normalize path: %w", err)
}
specDir := filepath.Dir(path)
registry := cdi.GetRegistry(
cdi.WithAutoRefresh(false),
cdi.WithSpecDirs(specDir),
)
return registry.SpecDB().WriteSpec(s.Raw(), filepath.Base(path))
}
// WriteTo writes the spec to the specified writer.
func (s *spec) WriteTo(w io.Writer) (int64, error) {
name, err := cdi.GenerateNameForSpec(s.Raw())
if err != nil {
return 0, err
}
path, err := s.normalizePath(name)
if err != nil {
return 0, fmt.Errorf("failed to normalize path: %w", err)
}
tmpFile, err := os.CreateTemp("", "*"+filepath.Base(path))
if err != nil {
return 0, err
}
defer os.Remove(tmpFile.Name())
if err := s.Save(tmpFile.Name()); err != nil {
return 0, err
}
err = tmpFile.Close()
if err != nil {
return 0, fmt.Errorf("failed to close temporary file: %w", err)
}
r, err := os.Open(tmpFile.Name())
if err != nil {
return 0, fmt.Errorf("failed to open temporary file: %w", err)
}
defer r.Close()
return io.Copy(w, r)
}
// Raw returns a pointer to the raw spec.
func (s *spec) Raw() *specs.Spec {
return s.Spec
}
// normalizePath ensures that the specified path has a supported extension
func (s *spec) normalizePath(path string) (string, error) {
if ext := filepath.Ext(path); ext != ".yaml" && ext != ".json" {
path += s.extension()
}
if filepath.Clean(filepath.Dir(path)) == "." {
pwd, err := os.Getwd()
if err != nil {
return path, fmt.Errorf("failed to get current working directory: %v", err)
}
path = filepath.Join(pwd, path)
}
return path, nil
}
func (s *spec) extension() string {
switch s.format {
case FormatJSON:
return ".json"
case FormatYAML:
return ".yaml"
}
return ".yaml"
}
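
The extension handling in normalizePath can be illustrated in isolation. The sketch below is an approximation under stated assumptions: `withSupportedExtension` is a hypothetical helper, and the plain `"json"` and `"yaml"` format strings are assumed values, since the definitions of FormatJSON and FormatYAML are not shown in this diff.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// withSupportedExtension appends the extension for the requested format unless
// the path already ends in .yaml or .json. The function name and the format
// strings are assumptions made for this sketch.
func withSupportedExtension(path string, format string) string {
	if ext := filepath.Ext(path); ext == ".yaml" || ext == ".json" {
		return path
	}
	if format == "json" {
		return path + ".json"
	}
	// Any other format, including the empty string, falls back to YAML.
	return path + ".yaml"
}

func main() {
	for _, p := range []string{"nvidia", "nvidia.json", "nvidia.txt"} {
		fmt.Println(withSupportedExtension(p, "yaml"))
	}
}
```

Note that an unsupported extension such as `.txt` is not replaced; the supported extension is appended after it.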


@@ -0,0 +1,24 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import "github.com/container-orchestrated-devices/container-device-interface/specs-go"
// Transformer defines the API for applying arbitrary transforms to a spec in-place
type Transformer interface {
Transform(*specs.Spec) error
}


@@ -0,0 +1,151 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
)
type dedupe struct{}
var _ Transformer = (*dedupe)(nil)
// NewDedupe creates a transformer that deduplicates container edits.
func NewDedupe() (Transformer, error) {
return &dedupe{}, nil
}
// Transform removes duplicate entries from devices and common container edits.
func (d dedupe) Transform(spec *specs.Spec) error {
if spec == nil {
return nil
}
if err := d.transformEdits(&spec.ContainerEdits); err != nil {
return err
}
var updatedDevices []specs.Device
for _, device := range spec.Devices {
if err := d.transformEdits(&device.ContainerEdits); err != nil {
return err
}
updatedDevices = append(updatedDevices, device)
}
spec.Devices = updatedDevices
return nil
}
func (d dedupe) transformEdits(edits *specs.ContainerEdits) error {
deviceNodes, err := d.deduplicateDeviceNodes(edits.DeviceNodes)
if err != nil {
return err
}
edits.DeviceNodes = deviceNodes
envs, err := d.deduplicateEnvs(edits.Env)
if err != nil {
return err
}
edits.Env = envs
hooks, err := d.deduplicateHooks(edits.Hooks)
if err != nil {
return err
}
edits.Hooks = hooks
mounts, err := d.deduplicateMounts(edits.Mounts)
if err != nil {
return err
}
edits.Mounts = mounts
return nil
}
func (d dedupe) deduplicateDeviceNodes(entities []*specs.DeviceNode) ([]*specs.DeviceNode, error) {
seen := make(map[string]bool)
var deviceNodes []*specs.DeviceNode
for _, e := range entities {
if e == nil {
continue
}
id, err := deviceNode(*e).id()
if err != nil {
return nil, err
}
if seen[id] {
continue
}
seen[id] = true
deviceNodes = append(deviceNodes, e)
}
return deviceNodes, nil
}
func (d dedupe) deduplicateEnvs(entities []string) ([]string, error) {
seen := make(map[string]bool)
var envs []string
for _, e := range entities {
id := e
if seen[id] {
continue
}
seen[id] = true
envs = append(envs, e)
}
return envs, nil
}
func (d dedupe) deduplicateHooks(entities []*specs.Hook) ([]*specs.Hook, error) {
seen := make(map[string]bool)
var hooks []*specs.Hook
for _, e := range entities {
if e == nil {
continue
}
id, err := hook(*e).id()
if err != nil {
return nil, err
}
if seen[id] {
continue
}
seen[id] = true
hooks = append(hooks, e)
}
return hooks, nil
}
func (d dedupe) deduplicateMounts(entities []*specs.Mount) ([]*specs.Mount, error) {
seen := make(map[string]bool)
var mounts []*specs.Mount
for _, e := range entities {
if e == nil {
continue
}
id, err := mount(*e).id()
if err != nil {
return nil, err
}
if seen[id] {
continue
}
seen[id] = true
mounts = append(mounts, e)
}
return mounts, nil
}
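
The four deduplicate helpers above share one shape: keep the first occurrence of each id and preserve order. A possible generic formulation is sketched below, assuming Go 1.18+ generics; `dedupeBy` and `envID` are hypothetical names and not part of the package.

```go
package main

import "fmt"

// dedupeBy returns the items with duplicates removed, keeping the first
// occurrence of each id and preserving the original order.
func dedupeBy[T any](items []T, id func(T) (string, error)) ([]T, error) {
	seen := make(map[string]bool)
	var unique []T
	for _, item := range items {
		key, err := id(item)
		if err != nil {
			return nil, err
		}
		if seen[key] {
			continue
		}
		seen[key] = true
		unique = append(unique, item)
	}
	return unique, nil
}

// envID uses the full NAME=VALUE string as the id, so two entries with the
// same name but different values are both kept.
func envID(e string) (string, error) {
	return e, nil
}

func main() {
	envs, err := dedupeBy([]string{"ENV1=VAL1", "ENV1=VAL1", "ENV2=ONE_VALUE", "ENV2=ANOTHER_VALUE"}, envID)
	if err != nil {
		panic(err)
	}
	fmt.Println(envs)
}
```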


@@ -0,0 +1,250 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"testing"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"github.com/stretchr/testify/require"
)
func TestDeduplicate(t *testing.T) {
testCases := []struct {
description string
spec *specs.Spec
expectedError error
expectedSpec *specs.Spec
}{
{
description: "nil spec",
},
{
description: "duplicate deviceNode is removed",
spec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
DeviceNodes: []*specs.DeviceNode{
{
Path: "/dev/gpu0",
},
{
Path: "/dev/gpu0",
},
},
},
},
expectedSpec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
DeviceNodes: []*specs.DeviceNode{
{
Path: "/dev/gpu0",
},
},
},
},
},
{
description: "duplicate deviceNode is removed from device edits",
spec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
DeviceNodes: []*specs.DeviceNode{
{
Path: "/dev/gpu0",
},
{
Path: "/dev/gpu0",
},
},
},
},
},
},
expectedSpec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
DeviceNodes: []*specs.DeviceNode{
{
Path: "/dev/gpu0",
},
},
},
},
},
},
},
{
description: "duplicate hook is removed",
spec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
Hooks: []*specs.Hook{
{
HookName: "createContainer",
Path: "/usr/bin/nvidia-ctk",
Args: []string{"nvidia-ctk", "hook", "chmod", "--mode", "755", "--path", "/dev/dri"},
},
{
HookName: "createContainer",
Path: "/usr/bin/nvidia-ctk",
Args: []string{"nvidia-ctk", "hook", "chmod", "--mode", "755", "--path", "/dev/dri"},
},
},
},
},
expectedSpec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
Hooks: []*specs.Hook{
{
HookName: "createContainer",
Path: "/usr/bin/nvidia-ctk",
Args: []string{"nvidia-ctk", "hook", "chmod", "--mode", "755", "--path", "/dev/dri"},
},
},
},
},
},
{
description: "duplicate mount is removed",
spec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
Mounts: []*specs.Mount{
{
HostPath: "/host/mount2",
ContainerPath: "/mount2",
},
{
HostPath: "/host/mount2",
ContainerPath: "/mount2",
},
{
HostPath: "/host/mount1",
ContainerPath: "/mount1",
},
},
},
},
},
ContainerEdits: specs.ContainerEdits{
Mounts: []*specs.Mount{
{
HostPath: "/host/mount1",
ContainerPath: "/mount1",
Options: []string{"bind", "ro"},
Type: "tmpfs",
},
{
HostPath: "/host/mount1",
ContainerPath: "/mount1",
Options: []string{"bind", "ro"},
Type: "tmpfs",
},
{
HostPath: "/host/mount1",
ContainerPath: "/mount1",
Options: []string{"bind", "ro"},
},
},
},
},
expectedSpec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
Mounts: []*specs.Mount{
{
HostPath: "/host/mount2",
ContainerPath: "/mount2",
},
{
HostPath: "/host/mount1",
ContainerPath: "/mount1",
},
},
},
},
},
ContainerEdits: specs.ContainerEdits{
Mounts: []*specs.Mount{
{
HostPath: "/host/mount1",
ContainerPath: "/mount1",
Options: []string{"bind", "ro"},
Type: "tmpfs",
},
{
HostPath: "/host/mount1",
ContainerPath: "/mount1",
Options: []string{"bind", "ro"},
},
},
},
},
},
{
description: "duplicate env is removed",
spec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
Env: []string{"ENV1=VAL1", "ENV1=VAL1", "ENV2=ONE_VALUE", "ENV2=ANOTHER_VALUE"},
},
},
},
ContainerEdits: specs.ContainerEdits{
Env: []string{"ENV1=VAL1", "ENV1=VAL1", "ENV2=ONE_VALUE", "ENV2=ANOTHER_VALUE"},
},
},
expectedSpec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
Env: []string{"ENV1=VAL1", "ENV2=ONE_VALUE", "ENV2=ANOTHER_VALUE"},
},
},
},
ContainerEdits: specs.ContainerEdits{
Env: []string{"ENV1=VAL1", "ENV2=ONE_VALUE", "ENV2=ANOTHER_VALUE"},
},
},
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
d := dedupe{}
err := d.Transform(tc.spec)
if tc.expectedError != nil {
require.Error(t, err)
return
}
require.NoError(t, err)
require.EqualValues(t, tc.expectedSpec, tc.spec)
})
}
}


@@ -0,0 +1,166 @@
/*
*
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"encoding/json"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
)
type containerEdits specs.ContainerEdits
// IsEmpty returns true if the edits are empty.
func (e containerEdits) IsEmpty() bool {
// Devices with empty edits are invalid
if len(e.DeviceNodes) > 0 {
return false
}
if len(e.Env) > 0 {
return false
}
if len(e.Hooks) > 0 {
return false
}
if len(e.Mounts) > 0 {
return false
}
return true
}
func (e *containerEdits) getEntityIds() ([]string, error) {
if e == nil {
return nil, nil
}
uniqueIDs := make(map[string]bool)
deviceNodes, err := e.getDeviceNodeIDs()
if err != nil {
return nil, err
}
for k := range deviceNodes {
uniqueIDs[k] = true
}
envs, err := e.getEnvIDs()
if err != nil {
return nil, err
}
for k := range envs {
uniqueIDs[k] = true
}
hooks, err := e.getHookIDs()
if err != nil {
return nil, err
}
for k := range hooks {
uniqueIDs[k] = true
}
mounts, err := e.getMountIDs()
if err != nil {
return nil, err
}
for k := range mounts {
uniqueIDs[k] = true
}
var ids []string
for k := range uniqueIDs {
ids = append(ids, k)
}
return ids, nil
}
func (e *containerEdits) getDeviceNodeIDs() (map[string]bool, error) {
deviceIDs := make(map[string]bool)
for _, entity := range e.DeviceNodes {
id, err := deviceNode(*entity).id()
if err != nil {
return nil, err
}
deviceIDs[id] = true
}
return deviceIDs, nil
}
func (e *containerEdits) getEnvIDs() (map[string]bool, error) {
envIDs := make(map[string]bool)
for _, entity := range e.Env {
id, err := env(entity).id()
if err != nil {
return nil, err
}
envIDs[id] = true
}
return envIDs, nil
}
func (e *containerEdits) getHookIDs() (map[string]bool, error) {
hookIDs := make(map[string]bool)
for _, entity := range e.Hooks {
id, err := hook(*entity).id()
if err != nil {
return nil, err
}
hookIDs[id] = true
}
return hookIDs, nil
}
func (e *containerEdits) getMountIDs() (map[string]bool, error) {
mountIDs := make(map[string]bool)
for _, entity := range e.Mounts {
id, err := mount(*entity).id()
if err != nil {
return nil, err
}
mountIDs[id] = true
}
return mountIDs, nil
}
type deviceNode specs.DeviceNode
func (dn deviceNode) id() (string, error) {
b, err := json.Marshal(dn)
return string(b), err
}
type env string
func (e env) id() (string, error) {
return string(e), nil
}
type mount specs.Mount
func (m mount) id() (string, error) {
b, err := json.Marshal(m)
return string(b), err
}
type hook specs.Hook
func (h hook) id() (string, error) {
b, err := json.Marshal(h)
return string(b), err
}
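
The id() helpers above use the JSON encoding of an entity as its identity, so two entities are treated as the same only if every serialized field matches. The sketch below illustrates this with a hypothetical local `mount` type; its field names and JSON tags are assumptions made for the example and are not copied from the CDI specs-go package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mount is a hypothetical local type used only for this sketch.
type mount struct {
	HostPath      string   `json:"hostPath"`
	ContainerPath string   `json:"containerPath"`
	Options       []string `json:"options,omitempty"`
	Type          string   `json:"type,omitempty"`
}

// id returns the JSON encoding of the mount. encoding/json writes struct
// fields in declaration order, so equal values give equal ids.
func (m mount) id() (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}

func main() {
	a := mount{HostPath: "/host/mount1", ContainerPath: "/mount1", Options: []string{"bind", "ro"}}
	b := mount{HostPath: "/host/mount1", ContainerPath: "/mount1", Options: []string{"bind", "ro"}, Type: "tmpfs"}

	idA, _ := a.id()
	idB, _ := b.id()
	// The two mounts differ only in Type, which is enough to make them distinct.
	fmt.Println(idA == idB)
}
```

This matches the behaviour exercised in the mount test case, where entries that differ only in Type are both retained.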


@@ -0,0 +1,35 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
)
type noop struct{}
var _ Transformer = (*noop)(nil)
// NewNoopTransformer returns a no-op transformer
func NewNoopTransformer() Transformer {
return noop{}
}
// Transform is a no-op
func (n noop) Transform(spec *specs.Spec) error {
return nil
}


@@ -0,0 +1,105 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"fmt"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
)
type remove map[string]bool
func newRemover(ids ...string) Transformer {
r := make(remove)
for _, id := range ids {
r[id] = true
}
return r
}
// Transform removes the specified entities from the spec.
func (r remove) Transform(spec *specs.Spec) error {
if spec == nil {
return nil
}
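for i := range spec.Devices {
// Index into the slice so that the filtered edits are written back to
// the device itself rather than to a copy created by range.
device := &spec.Devices[i]
if err := r.transformEdits(&device.ContainerEdits); err != nil {
return fmt.Errorf("failed to remove edits from device %q: %w", device.Name, err)
}
}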
return r.transformEdits(&spec.ContainerEdits)
}
func (r remove) transformEdits(edits *specs.ContainerEdits) error {
if edits == nil {
return nil
}
var deviceNodes []*specs.DeviceNode
for _, entity := range edits.DeviceNodes {
id, err := deviceNode(*entity).id()
if err != nil {
return err
}
if r[id] {
continue
}
deviceNodes = append(deviceNodes, entity)
}
edits.DeviceNodes = deviceNodes
var envs []string
for _, entity := range edits.Env {
id := entity
if r[id] {
continue
}
envs = append(envs, entity)
}
edits.Env = envs
var hooks []*specs.Hook
for _, entity := range edits.Hooks {
id, err := hook(*entity).id()
if err != nil {
return err
}
if r[id] {
continue
}
hooks = append(hooks, entity)
}
edits.Hooks = hooks
var mounts []*specs.Mount
for _, entity := range edits.Mounts {
id, err := mount(*entity).id()
if err != nil {
return err
}
if r[id] {
continue
}
mounts = append(mounts, entity)
}
edits.Mounts = mounts
return nil
}
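
When per-device edits are filtered by assigning new slices to struct fields, it matters whether the loop over devices iterates by value or by index: a range variable is a copy, so a field assignment made through it does not reach the slice element. The sketch below demonstrates the difference with hypothetical `edits` and `device` types that are not part of the package.

```go
package main

import "fmt"

// edits and device are hypothetical stand-ins used only for this sketch.
type edits struct {
	Env []string
}

type device struct {
	Name  string
	Edits edits
}

// removeEnv replaces e.Env with a new slice that omits the listed entries.
func removeEnv(e *edits, remove map[string]bool) {
	var kept []string
	for _, v := range e.Env {
		if !remove[v] {
			kept = append(kept, v)
		}
	}
	e.Env = kept
}

// removeByValue filters a copy of each device; the slice is left unchanged.
func removeByValue(devices []device, remove map[string]bool) {
	for _, d := range devices {
		removeEnv(&d.Edits, remove)
	}
}

// removeByIndex filters each device in place.
func removeByIndex(devices []device, remove map[string]bool) {
	for i := range devices {
		removeEnv(&devices[i].Edits, remove)
	}
}

func main() {
	remove := map[string]bool{"FOO=BAR": true}

	byValue := []device{{Name: "device0", Edits: edits{Env: []string{"FOO=BAR"}}}}
	removeByValue(byValue, remove)
	fmt.Println(len(byValue[0].Edits.Env))

	byIndex := []device{{Name: "device0", Edits: edits{Env: []string{"FOO=BAR"}}}}
	removeByIndex(byIndex, remove)
	fmt.Println(len(byIndex[0].Edits.Env))
}
```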

113
pkg/nvcdi/transform/root.go Normal file

@@ -0,0 +1,113 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"fmt"
"path/filepath"
"strings"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
)
type rootTransformer struct {
root string
targetRoot string
}
var _ Transformer = (*rootTransformer)(nil)
// NewRootTransformer creates a new transformer for modifying
// the root for paths in a CDI spec. If both roots are identical,
// this transformer is a no-op.
func NewRootTransformer(root string, targetRoot string) Transformer {
if root == targetRoot {
return NewNoopTransformer()
}
t := rootTransformer{
root: root,
targetRoot: targetRoot,
}
return t
}
// Transform replaces the root in a spec with a new root.
// It walks the spec and replaces all host paths that start with root with the target root.
func (t rootTransformer) Transform(spec *specs.Spec) error {
if spec == nil {
return nil
}
for _, d := range spec.Devices {
if err := t.applyToEdits(&d.ContainerEdits); err != nil {
return fmt.Errorf("failed to apply root transform to device %s: %w", d.Name, err)
}
}
if err := t.applyToEdits(&spec.ContainerEdits); err != nil {
return fmt.Errorf("failed to apply root transform to spec: %w", err)
}
return nil
}
func (t rootTransformer) applyToEdits(edits *specs.ContainerEdits) error {
for i, dn := range edits.DeviceNodes {
dn.HostPath = t.transformPath(dn.HostPath)
edits.DeviceNodes[i] = dn
}
for i, hook := range edits.Hooks {
hook.Path = t.transformPath(hook.Path)
var args []string
for _, arg := range hook.Args {
if !strings.Contains(arg, "::") {
args = append(args, t.transformPath(arg))
continue
}
// For the 'create-symlinks' hook, special care is taken for the
// '--link' flag argument which takes the form <target>::<link>.
// Both paths, the target and link paths, are transformed.
split := strings.Split(arg, "::")
if len(split) != 2 {
return fmt.Errorf("unexpected number of '::' separators in hook argument")
}
split[0] = t.transformPath(split[0])
split[1] = t.transformPath(split[1])
args = append(args, strings.Join(split, "::"))
}
hook.Args = args
edits.Hooks[i] = hook
}
for i, mount := range edits.Mounts {
mount.HostPath = t.transformPath(mount.HostPath)
edits.Mounts[i] = mount
}
return nil
}
func (t rootTransformer) transformPath(path string) string {
if !strings.HasPrefix(path, t.root) {
return path
}
return filepath.Join(t.targetRoot, strings.TrimPrefix(path, t.root))
}
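
The path rewriting and the special handling of `<target>::<link>` arguments can be sketched on their own. The helpers below, `rebase` and `rebaseArg`, are hypothetical names written for illustration; they mirror the logic shown above but are not part of the package.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// rebase replaces a leading root in path with targetRoot. Paths that do not
// start with root are returned unchanged.
func rebase(path, root, targetRoot string) string {
	if !strings.HasPrefix(path, root) {
		return path
	}
	return filepath.Join(targetRoot, strings.TrimPrefix(path, root))
}

// rebaseArg rebases a hook argument. Arguments of the form <target>::<link>
// have both halves rebased; any other argument is treated as a single path.
func rebaseArg(arg, root, targetRoot string) (string, error) {
	if !strings.Contains(arg, "::") {
		return rebase(arg, root, targetRoot), nil
	}
	parts := strings.Split(arg, "::")
	if len(parts) != 2 {
		return "", fmt.Errorf("unexpected number of '::' separators in %q", arg)
	}
	return rebase(parts[0], root, targetRoot) + "::" + rebase(parts[1], root, targetRoot), nil
}

func main() {
	fmt.Println(rebase("/root/dev/nvidia0", "/root", "/target-root"))
	arg, err := rebaseArg("/root/path/to/target::/root/path/to/link", "/root", "/target-root")
	if err != nil {
		panic(err)
	}
	fmt.Println(arg)
}
```

Because the check is a plain string-prefix match, a sibling directory such as `/rootfs/lib` also matches a root of `/root`. Callers who need to avoid this may want to compare on path-element boundaries instead.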


@@ -0,0 +1,162 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"testing"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"github.com/stretchr/testify/require"
)
func TestRootTransformer(t *testing.T) {
testCases := []struct {
description string
root string
targetRoot string
spec *specs.Spec
expectedSpec *specs.Spec
}{
{
description: "nil spec",
root: "/root",
targetRoot: "/target-root",
spec: nil,
expectedSpec: nil,
},
{
description: "empty spec",
root: "/root",
targetRoot: "/target-root",
spec: &specs.Spec{},
expectedSpec: &specs.Spec{},
},
{
description: "device nodes",
root: "/root",
targetRoot: "/target-root",
spec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
DeviceNodes: []*specs.DeviceNode{
{HostPath: "/root/dev/nvidia0", Path: "/root/dev/nvidia0"},
{HostPath: "/target-root/dev/nvidia1", Path: "/target-root/dev/nvidia1"},
{HostPath: "/different-root/dev/nvidia2", Path: "/different-root/dev/nvidia2"},
},
},
},
expectedSpec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
DeviceNodes: []*specs.DeviceNode{
{HostPath: "/target-root/dev/nvidia0", Path: "/root/dev/nvidia0"},
{HostPath: "/target-root/dev/nvidia1", Path: "/target-root/dev/nvidia1"},
{HostPath: "/different-root/dev/nvidia2", Path: "/different-root/dev/nvidia2"},
},
},
},
},
{
description: "mounts",
root: "/root",
targetRoot: "/target-root",
spec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
Mounts: []*specs.Mount{
{HostPath: "/root/lib/lib1.so", ContainerPath: "/root/lib/lib1.so"},
{HostPath: "/target-root/lib/lib2.so", ContainerPath: "/target-root/lib/lib2.so"},
{HostPath: "/different-root/lib/lib3.so", ContainerPath: "/different-root/lib/lib3.so"},
},
},
},
expectedSpec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
Mounts: []*specs.Mount{
{HostPath: "/target-root/lib/lib1.so", ContainerPath: "/root/lib/lib1.so"},
{HostPath: "/target-root/lib/lib2.so", ContainerPath: "/target-root/lib/lib2.so"},
{HostPath: "/different-root/lib/lib3.so", ContainerPath: "/different-root/lib/lib3.so"},
},
},
},
},
{
description: "hooks",
root: "/root",
targetRoot: "/target-root",
spec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
Hooks: []*specs.Hook{
{
Path: "/root/usr/bin/nvidia-ctk",
Args: []string{
"--link",
"/root/path/to/target::/root/path/to/link",
},
},
{
Path: "/target-root/usr/bin/nvidia-ctk",
Args: []string{
"--link",
"/target-root/path/to/target::/target-root/path/to/link",
},
},
{
Path: "/different-root/usr/bin/nvidia-ctk",
Args: []string{
"--link",
"/different-root/path/to/target::/different-root/path/to/link",
},
},
},
},
},
expectedSpec: &specs.Spec{
ContainerEdits: specs.ContainerEdits{
Hooks: []*specs.Hook{
{
Path: "/target-root/usr/bin/nvidia-ctk",
Args: []string{
"--link",
"/target-root/path/to/target::/target-root/path/to/link",
},
},
{
Path: "/target-root/usr/bin/nvidia-ctk",
Args: []string{
"--link",
"/target-root/path/to/target::/target-root/path/to/link",
},
},
{
Path: "/different-root/usr/bin/nvidia-ctk",
Args: []string{
"--link",
"/different-root/path/to/target::/different-root/path/to/link",
},
},
},
},
},
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
err := NewRootTransformer(tc.root, tc.targetRoot).Transform(tc.spec)
require.NoError(t, err)
require.Equal(t, tc.expectedSpec, tc.spec)
})
}
}


@@ -0,0 +1,74 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"fmt"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
)
type simplify struct{}
var _ Transformer = (*simplify)(nil)
// NewSimplifier creates a simplifier transformer.
// This transformer ensures that entities in the spec are deduplicated and that common edits are removed from device-specific edits.
func NewSimplifier() Transformer {
return &simplify{}
}
// Transform simplifies the supplied spec.
// Edits that are present in the common edits are removed from device-specific edits.
func (s simplify) Transform(spec *specs.Spec) error {
if spec == nil {
return nil
}
dedupe := dedupe{}
if err := dedupe.Transform(spec); err != nil {
return err
}
commonEntityIDs, err := (*containerEdits)(&spec.ContainerEdits).getEntityIds()
if err != nil {
return err
}
toRemove := newRemover(commonEntityIDs...)
var updatedDevices []specs.Device
for _, device := range spec.Devices {
deviceAsSpec := specs.Spec{
ContainerEdits: device.ContainerEdits,
}
err := toRemove.Transform(&deviceAsSpec)
if err != nil {
return fmt.Errorf("failed to transform device edits: %w", err)
}
if !(containerEdits)(deviceAsSpec.ContainerEdits).IsEmpty() {
// Devices with empty edits are invalid.
// We only update the container edits for the device if this would
// result in a valid device.
device.ContainerEdits = deviceAsSpec.ContainerEdits
}
updatedDevices = append(updatedDevices, device)
}
spec.Devices = updatedDevices
return nil
}
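
The rule applied by the simplifier, removing common edits from each device unless that would leave the device empty, can be shown with environment variables alone. This is a reduced sketch with a hypothetical `device` type and `simplifyEnv` function; the real transformer also handles device nodes, hooks, and mounts.

```go
package main

import "fmt"

// device is a hypothetical stand-in that only carries environment edits.
type device struct {
	Name string
	Env  []string
}

// simplifyEnv removes entries that appear in common from each device, but
// leaves a device untouched if removing them would leave it with no edits.
func simplifyEnv(common []string, devices []device) []device {
	isCommon := make(map[string]bool)
	for _, e := range common {
		isCommon[e] = true
	}
	var updated []device
	for _, d := range devices {
		var kept []string
		for _, e := range d.Env {
			if !isCommon[e] {
				kept = append(kept, e)
			}
		}
		// A device with no edits would be invalid, so the original edits
		// are kept when filtering would remove everything.
		if len(kept) > 0 {
			d.Env = kept
		}
		updated = append(updated, d)
	}
	return updated
}

func main() {
	devices := []device{
		{Name: "device0", Env: []string{"FOO=BAR"}},
		{Name: "device1", Env: []string{"FOO=BAR", "ONLY=device1"}},
	}
	for _, d := range simplifyEnv([]string{"FOO=BAR"}, devices) {
		fmt.Println(d.Name, d.Env)
	}
}
```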


@@ -0,0 +1,125 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package transform
import (
"testing"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"github.com/stretchr/testify/require"
)
func TestSimplify(t *testing.T) {
testCases := []struct {
description string
spec *specs.Spec
expectedError error
expectedSpec *specs.Spec
}{
{
description: "nil spec is a no-op",
},
{
description: "empty spec is simplified",
spec: &specs.Spec{},
expectedSpec: &specs.Spec{},
},
{
description: "simplify does not allow empty device",
spec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
Env: []string{"FOO=BAR"},
},
},
},
ContainerEdits: specs.ContainerEdits{
Env: []string{"FOO=BAR"},
},
},
expectedSpec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
Env: []string{"FOO=BAR"},
},
},
},
ContainerEdits: specs.ContainerEdits{
Env: []string{"FOO=BAR"},
},
},
},
{
description: "simplify removes common entities",
spec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
Env: []string{"FOO=BAR"},
DeviceNodes: []*specs.DeviceNode{
{
Path: "/dev/gpu0",
},
},
},
},
},
ContainerEdits: specs.ContainerEdits{
Env: []string{"FOO=BAR"},
},
},
expectedSpec: &specs.Spec{
Devices: []specs.Device{
{
Name: "device0",
ContainerEdits: specs.ContainerEdits{
DeviceNodes: []*specs.DeviceNode{
{
Path: "/dev/gpu0",
},
},
},
},
},
ContainerEdits: specs.ContainerEdits{
Env: []string{"FOO=BAR"},
},
},
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
s := simplify{}
err := s.Transform(tc.spec)
if tc.expectedError != nil {
require.Error(t, err)
return
}
require.NoError(t, err)
require.EqualValues(t, tc.expectedSpec, tc.spec)
})
}
}


@@ -14,16 +14,13 @@
# limitations under the License.
**/
package generate
package nvcdi
import (
"fmt"
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover"
"github.com/NVIDIA/nvidia-container-toolkit/internal/edits"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
"github.com/container-orchestrated-devices/container-device-interface/specs-go"
"github.com/sirupsen/logrus"
)
@@ -31,60 +28,26 @@ type deviceFolderPermissions struct {
logger *logrus.Logger
driverRoot string
nvidiaCTKPath string
folders []string
devices discover.Discover
}
var _ discover.Discover = (*deviceFolderPermissions)(nil)
// GetDeviceFolderPermissionHookEdits gets the edits required for device folder permissions discoverer
func GetDeviceFolderPermissionHookEdits(logger *logrus.Logger, driverRoot string, nvidiaCTKPath string, deviceSpecs []specs.Device) (*cdi.ContainerEdits, error) {
deviceFolderPermissionHooks, err := NewDeviceFolderPermissionHookDiscoverer(logger, driverRoot, nvidiaCTKPath, deviceSpecs)
if err != nil {
return nil, fmt.Errorf("failed to generated permission hooks for device nodes: %v", err)
}
return edits.FromDiscoverer(deviceFolderPermissionHooks)
}
// NewDeviceFolderPermissionHookDiscoverer creates a discoverer that can be used to update the permissions for the parent folders of nested device nodes from the specified set of device specs.
// newDeviceFolderPermissionHookDiscoverer creates a discoverer that can be used to update the permissions for the parent folders of nested device nodes from the specified set of device specs.
// This works around an issue with rootless podman when using crun as a low-level runtime.
// See https://github.com/containers/crun/issues/1047
// The nested devices that are applicable to the NVIDIA GPU devices are:
// - DRM devices at /dev/dri/*
// - NVIDIA Caps devices at /dev/nvidia-caps/*
func NewDeviceFolderPermissionHookDiscoverer(logger *logrus.Logger, driverRoot string, nvidiaCTKPath string, deviceSpecs []specs.Device) (discover.Discover, error) {
var folders []string
seen := make(map[string]bool)
for _, device := range deviceSpecs {
for _, dn := range device.ContainerEdits.DeviceNodes {
df := filepath.Dir(dn.Path)
if seen[df] {
continue
}
// We only consider the special case paths
if df != "/dev/dri" && df != "/dev/nvidia-caps" {
continue
}
folders = append(folders, df)
seen[df] = true
}
if len(folders) == 2 {
break
}
}
if len(folders) == 0 {
return discover.None{}, nil
}
func newDeviceFolderPermissionHookDiscoverer(logger *logrus.Logger, driverRoot string, nvidiaCTKPath string, devices discover.Discover) discover.Discover {
d := &deviceFolderPermissions{
logger: logger,
driverRoot: driverRoot,
nvidiaCTKPath: nvidiaCTKPath,
folders: folders,
devices: devices,
}
return d, nil
return d
}
// Devices are empty for this discoverer
@@ -94,12 +57,16 @@ func (d *deviceFolderPermissions) Devices() ([]discover.Device, error) {
// Hooks returns a set of hooks that sets the file mode to 755 of parent folders for nested device nodes.
func (d *deviceFolderPermissions) Hooks() ([]discover.Hook, error) {
if len(d.folders) == 0 {
folders, err := d.getDeviceSubfolders()
if err != nil {
return nil, fmt.Errorf("failed to get device subfolders: %v", err)
}
if len(folders) == 0 {
return nil, nil
}
args := []string{"--mode", "755"}
for _, folder := range d.folders {
for _, folder := range folders {
args = append(args, "--path", folder)
}
@@ -112,6 +79,39 @@ func (d *deviceFolderPermissions) Hooks() ([]discover.Hook, error) {
return []discover.Hook{hook}, nil
}
func (d *deviceFolderPermissions) getDeviceSubfolders() ([]string, error) {
// For now we only consider the following special case paths
allowedPaths := map[string]bool{
"/dev/dri": true,
"/dev/nvidia-caps": true,
}
devices, err := d.devices.Devices()
if err != nil {
return nil, fmt.Errorf("failed to get devices: %v", err)
}
var folders []string
seen := make(map[string]bool)
for _, device := range devices {
df := filepath.Dir(device.Path)
if seen[df] {
continue
}
// We only consider the special case paths
if !allowedPaths[df] {
continue
}
folders = append(folders, df)
seen[df] = true
if len(folders) == len(allowedPaths) {
break
}
}
return folders, nil
}
// Mounts are empty for this discoverer
func (d *deviceFolderPermissions) Mounts() ([]discover.Mount, error) {
return nil, nil
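The folder lookup that this hunk moves into `getDeviceSubfolders` can be tried in isolation. The following is a minimal sketch, assuming device paths arrive as plain strings rather than through the `discover.Discover` interface; the name `deviceSubfolders` and the sample paths are hypothetical and not part of the toolkit.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// allowedFolders mirrors the two special-case paths named in the diff.
var allowedFolders = map[string]bool{
	"/dev/dri":         true,
	"/dev/nvidia-caps": true,
}

// deviceSubfolders (hypothetical helper) returns the unique parent folders of
// the given device paths, restricted to allowedFolders and kept in the order
// in which they are first seen.
func deviceSubfolders(devicePaths []string) []string {
	var folders []string
	seen := make(map[string]bool)
	for _, p := range devicePaths {
		df := filepath.Dir(p)
		if seen[df] || !allowedFolders[df] {
			continue
		}
		folders = append(folders, df)
		seen[df] = true
		// Stop early once every allowed folder has been found.
		if len(folders) == len(allowedFolders) {
			break
		}
	}
	return folders
}

func main() {
	// Sample device nodes chosen for illustration only.
	paths := []string{
		"/dev/nvidia0",
		"/dev/dri/card0",
		"/dev/dri/renderD128",
		"/dev/nvidia-caps/nvidia-cap1",
	}
	fmt.Println(deviceSubfolders(paths))
}
```

Because the early exit compares against `len(allowedFolders)`, adding a third special-case path to the map needs no other change, which appears to be the motivation for replacing the hard-coded `2` in the old code.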

View File

@@ -29,18 +29,12 @@ SCRIPTS_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )"/../scripts && pwd )"
all=(
amazonlinux2-aarch64
amazonlinux2-x86_64
centos7-ppc64le
centos7-x86_64
centos8-aarch64
centos8-ppc64le
centos8-x86_64
debian10-amd64
debian9-amd64
fedora35-aarch64
fedora35-x86_64
opensuse-leap15.1-x86_64
ubuntu16.04-amd64
ubuntu16.04-ppc64le
ubuntu18.04-amd64
ubuntu18.04-arm64
ubuntu18.04-ppc64le

View File

@@ -62,8 +62,10 @@ echo "LIBNVIDIA_CONTAINER_PACKAGE_VERSION=${libnvidia_container_version_tag//\~/
echo "NVIDIA_CONTAINER_TOOLKIT_VERSION=${nvidia_container_toolkit_version}"
echo "NVIDIA_CONTAINER_TOOLKIT_TAG=${nvidia_container_toolkit_tag}"
echo "NVIDIA_CONTAINER_TOOLKIT_PACKAGE_VERSION=${nvidia_container_toolkit_version_tag//\~/-}"
if [[ "${libnvidia_container_version_tag}" != "${nvidia_container_toolkit_version_tag}" ]]; then
if [[ "${LIBNVIDIA_CONTAINER_PACKAGE_VERSION}" != "${NVIDIA_CONTAINER_TOOLKIT_PACKAGE_VERSION}" ]]; then
>&2 echo "WARNING: The libnvidia-container and nvidia-container-toolkit versions do not match"
>&2 echo "WARNING: lib: ${LIBNVIDIA_CONTAINER_PACKAGE_VERSION}"
>&2 echo "WARNING: toolkit: ${NVIDIA_CONTAINER_TOOLKIT_PACKAGE_VERSION}"
fi
echo "NVIDIA_CONTAINER_RUNTIME_VERSION=${nvidia_container_runtime_version}"
echo "NVIDIA_CONTAINER_RUNTIME_TAG=${nvidia_container_runtime_tag}"

View File

@@ -70,7 +70,7 @@ KITMAKER_SCRATCH="${KITMAKER_DIR}/.scratch"
# extract_info extracts the value of the specified variable from the manifest.txt file.
function extract_info() {
local variable=$1
local value=$(cat "${ARTIFACTS_DIR}/manifest.txt" | grep "#${variable}" | sed -e "s/#${variable}=//" | tr -d '\r')
local value=$(cat "${ARTIFACTS_DIR}/manifest.txt" | grep "#${variable}=" | sed -e "s/#${variable}=//" | tr -d '\r')
echo $value
}
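The change from `"#${variable}"` to `"#${variable}="` matters once the manifest contains both `GIT_COMMIT` and `GIT_COMMIT_SHORT`, since the shorter name is a prefix of the longer one. The sketch below illustrates the same exact-key lookup in Go. It assumes manifest lines of the form `#NAME=value` (inferred from the `grep`/`sed` pattern), anchors the match at the start of the line (stricter than the `grep` in the script), and uses made-up commit values.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// extractInfo (illustrative counterpart of the shell function) returns the
// value of the first "#<variable>=<value>" line in the manifest text.
// Including "=" in the prefix prevents GIT_COMMIT from matching
// GIT_COMMIT_SHORT.
func extractInfo(manifest, variable string) (string, bool) {
	prefix := "#" + variable + "="
	scanner := bufio.NewScanner(strings.NewReader(manifest))
	for scanner.Scan() {
		// Strip a trailing carriage return, as the script does with tr -d '\r'.
		line := strings.TrimRight(scanner.Text(), "\r")
		if strings.HasPrefix(line, prefix) {
			return strings.TrimPrefix(line, prefix), true
		}
	}
	return "", false
}

func main() {
	// Hypothetical manifest contents.
	manifest := "#GIT_COMMIT_SHORT=abc1234\r\n#GIT_COMMIT=abc1234def5678\r\n"
	value, ok := extractInfo(manifest, "GIT_COMMIT")
	fmt.Println(value, ok)
}
```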
@@ -78,6 +78,7 @@ IMAGE_EPOCH=$(extract_info "IMAGE_EPOCH")
# Note we use the main branch for the kitmaker archive.
GIT_BRANCH=main
GIT_COMMIT=$(extract_info "GIT_COMMIT")
GIT_COMMIT_SHORT=$(extract_info "GIT_COMMIT_SHORT")
VERSION=$(extract_info "PACKAGE_VERSION")
@@ -92,7 +93,7 @@ function add_distro() {
local name="${component}-${os}-${arch}"
local scratch_dir="${KITMAKER_SCRATCH}/${name}"
local scratch_dir="${KITMAKER_SCRATCH}/${name}/${component}"
local packages_dir="${scratch_dir}/.packages"
mkdir -p "${packages_dir}"
@@ -112,10 +113,10 @@ function create_archive() {
local name="${component}-${os}-${arch}"
local archive="${KITMAKER_DIR}/${name}-${version}.tar.gz"
local scratch_dir="${KITMAKER_SCRATCH}/${name}"
local scratch_dir="${KITMAKER_SCRATCH}/${name}/${component}"
local packages_dir="${scratch_dir}/.packages/"
tar zcvf "${archive}" -C "${scratch_dir}/.." "${name}"
tar zcvf "${archive}" -C "${scratch_dir}/.." "${component}"
echo "Created: ${archive}"
ls -l "${archive}"
echo "With contents:"
@@ -155,7 +156,7 @@ function upload_archive() {
fi
local sha1_checksum=$(sha1sum -b "${archive}" | awk '{ print $1 }')
local upload_url="${KITMAKER_ARTIFACTORY_REPO}/${component}-${GIT_BRANCH}/default/$(basename ${archive})"
local upload_url="${KITMAKER_ARTIFACTORY_REPO}/${GIT_BRANCH}/${component}/${os}-${arch}/${version}/$(basename ${archive})"
local props=()
# Required KITMAKER properties:
@@ -164,12 +165,13 @@ function upload_archive() {
props+=("os=${os}")
props+=("arch=${arch}")
props+=("platform=${os}-${arch}")
# TODO: extract the GIT commit from the packaging image
props+=("changelist=${GIT_COMMIT}")
props+=("changelist=${GIT_COMMIT_SHORT}")
props+=("branch=${GIT_BRANCH}")
props+=("source=https://gitlab.com/nvidia/container-toolkit/container-toolkit")
# Package properties:
props+=("package.epoch=${IMAGE_EPOCH}")
props+=("package.version=${VERSION}")
props+=("package.commit=${GIT_COMMIT}")
optionally_add_property "package.builds" "${package_builds}"
for var in "CI_PROJECT_ID" "CI_PIPELINE_ID" "CI_JOB_ID" "CI_JOB_URL" "CI_PROJECT_PATH"; do
@@ -179,7 +181,11 @@ function upload_archive() {
done
local PROPS=$(join_by ";" "${props[@]}")
echo "Uploading ${upload_url} from ${file}"
echo "Uploading ${upload_url} from ${archive}"
echo -H "X-JFrog-Art-Api: REDACTED" \
-H "X-Checksum-Sha1: ${sha1_checksum}" \
${archive:+-T ${archive}} -X PUT \
"${upload_url};${PROPS}"
if ! ${CURL} -f \
-H "X-JFrog-Art-Api: ${ARTIFACTORY_TOKEN}" \
-H "X-Checksum-Sha1: ${sha1_checksum}" \
@@ -192,9 +198,9 @@ function upload_archive() {
}
component="nvidia_container_toolkit"
version="${VERSION%-rc.*}"
version="${VERSION%~rc.*}"
version_suffix=$(date -r "${IMAGE_EPOCH}" '+%Y.%m.%d.%s' || date -d @"${IMAGE_EPOCH}" '+%Y.%m.%d.%s')
kitmaker_version="${VERSION%-rc.*}.${version_suffix}"
kitmaker_version="${VERSION%~rc.*}.${version_suffix}"
kitmaker_os="linux"
# create_and_upload creates a kitmaker archive for the specified component, os, and arch and uploads it.
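The version handling in this hunk can be sketched in Go as follows. The expansion `${VERSION%~rc.*}` removes the shortest trailing match of `~rc.*`, which corresponds to cutting at the last occurrence of `~rc.`. The function names, the sample version string, and the use of UTC (the `date` call in the script uses the local time zone) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// stripRCSuffix drops a trailing "~rc.N" marker, mirroring ${VERSION%~rc.*}.
// Versions without the marker are returned unchanged.
func stripRCSuffix(version string) string {
	if i := strings.LastIndex(version, "~rc."); i >= 0 {
		return version[:i]
	}
	return version
}

// kitmakerVersion appends a "YYYY.MM.DD.<epoch>" suffix derived from the
// image epoch, mirroring date '+%Y.%m.%d.%s'. UTC is assumed here so that
// the result does not depend on the machine's time zone.
func kitmakerVersion(version string, epoch int64) string {
	t := time.Unix(epoch, 0).UTC()
	return fmt.Sprintf("%s.%s.%d", stripRCSuffix(version), t.Format("2006.01.02"), epoch)
}

func main() {
	// Hypothetical inputs.
	fmt.Println(stripRCSuffix("1.13.0~rc.2"))
	fmt.Println(kitmakerVersion("1.13.0~rc.2", 1680000000))
}
```

With the old `-rc.*` pattern a version written with a tilde separator would pass through unstripped, which is presumably what the change to `~rc.*` addresses.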

View File

@@ -158,18 +158,12 @@ function sync() {
all=(
amazonlinux2-aarch64
amazonlinux2-x86_64
centos7-ppc64le
centos7-x86_64
centos8-aarch64
centos8-ppc64le
centos8-x86_64
debian10-amd64
debian9-amd64
fedora35-aarch64
fedora35-x86_64
opensuse-leap15.1-x86_64
ubuntu16.04-amd64
ubuntu16.04-ppc64le
ubuntu18.04-amd64
ubuntu18.04-arm64
ubuntu18.04-ppc64le

View File

@@ -52,8 +52,8 @@ testing::containerd::toolkit::run() {
--volumes-from "${containerd_dind_ctr}" \
-v "${shared_dir}/etc/containerd/config_${version}.toml:${containerd_dind_containerd_dir}/containerd.toml" \
--pid "container:${containerd_dind_ctr}" \
-e "RUNTIME=containerd" \
-e "RUNTIME_ARGS=--config=${containerd_dind_containerd_dir}/containerd.toml --socket=${containerd_dind_containerd_dir}/containerd.sock" \
-e RUNTIME="containerd" \
-e RUNTIME_ARGS="--config=${containerd_dind_containerd_dir}/containerd.toml --socket=${containerd_dind_containerd_dir}/containerd.sock" \
--name "${containerd_test_ctr}" \
"${toolkit_container_image}" "/usr/local/nvidia" "--no-daemon"

View File

@@ -38,7 +38,7 @@ testing::docker::toolkit::run() {
docker run -d --rm --privileged \
--volumes-from "${docker_dind_ctr}" \
--pid "container:${docker_dind_ctr}" \
-e "RUNTIME_ARGS=--socket ${docker_dind_socket}" \
-e RUNTIME_ARGS="--socket ${docker_dind_socket}" \
--name "${docker_test_ctr}" \
"${toolkit_container_image}" "/usr/local/nvidia" "--no-daemon"

View File

@@ -1 +1 @@
# This is a dummy lib file to test nvidia-runtime-experimental
# This is a dummy lib file to test nvidia-runtime.experimental

View File

@@ -47,11 +47,11 @@ testing::toolkit::install() {
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime.real"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime.experimental"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime-experimental"
test -e "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime.experimental.real"
grep -q -E "nvidia driver modules are not yet loaded, invoking runc directly" "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime-experimental"
grep -q -E "exec runc \".@\"" "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime-experimental"
grep -q -E "LD_LIBRARY_PATH=/run/nvidia/driver/usr/lib64:\\\$LD_LIBRARY_PATH " "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime-experimental"
grep -q -E "nvidia driver modules are not yet loaded, invoking runc directly" "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime.experimental"
grep -q -E "exec runc \".@\"" "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime.experimental"
grep -q -E "LD_LIBRARY_PATH=/run/nvidia/driver/usr/lib64:\\\$LD_LIBRARY_PATH " "${shared_dir}/usr/local/nvidia/toolkit/nvidia-container-runtime.experimental"
test -e "${shared_dir}/usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml"
@@ -61,6 +61,7 @@ testing::toolkit::install() {
grep -q -E "^\s*ldconfig = \"@${nvidia_run_dir}/driver/sbin/ldconfig(.real)?\"" "${shared_dir}/usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml"
grep -q -E "^\s*root = \"${nvidia_run_dir}/driver\"" "${shared_dir}/usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml"
grep -q -E "^\s*path = \"/usr/local/nvidia/toolkit/nvidia-container-cli\"" "${shared_dir}/usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml"
grep -q -E "^\s*path = \"/usr/local/nvidia/toolkit/nvidia-ctk\"" "${shared_dir}/usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml"
}
testing::toolkit::delete() {

View File

@@ -15,7 +15,7 @@ docker setup \
/run/nvidia/toolkit
```
Configure the `nvidia-container-runtime` as a docker runtime named `NAME`. If the `--runtime-name` flag is not specified, this runtime would be called `nvidia`. A runtime named `nvidia-experimental` will also be configured using the `nvidia-container-runtime-experimental` OCI-compliant runtime shim.
Configure the `nvidia-container-runtime` as a docker runtime named `NAME`. If the `--runtime-name` flag is not specified, this runtime would be called `nvidia`. A runtime named `nvidia-experimental` will also be configured using the `nvidia-container-runtime.experimental` OCI-compliant runtime shim.
Since `--set-as-default` is enabled by default, the specified runtime name will also be set as the default docker runtime. This can be disabled by explicitly specifying `--set-as-default=false`.
@@ -48,7 +48,7 @@ containerd setup \
/run/nvidia/toolkit
```
Configure the `nvidia-container-runtime` as a runtime class named `NAME`. If the `--runtime-class` flag is not specified, this runtime would be called `nvidia`. A runtime class named `nvidia-experimental` will also be configured using the `nvidia-container-runtime-experimental` OCI-compliant runtime shim.
Configure the `nvidia-container-runtime` as a runtime class named `NAME`. If the `--runtime-class` flag is not specified, this runtime would be called `nvidia`. A runtime class named `nvidia-experimental` will also be configured using the `nvidia-container-runtime.experimental` OCI-compliant runtime shim.
Adding the `--set-as-default` flag as follows:
```bash

View File

@@ -1,114 +0,0 @@
/**
# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"github.com/pelletier/go-toml"
)
// UpdateReverter defines the interface for applying and reverting configurations
type UpdateReverter interface {
Update(o *options) error
Revert(o *options) error
}
type config struct {
*toml.Tree
version int64
cri string
}
// update adds the specified runtime class to the containerd config.
// If set-as-default is specified, the runtime class is also set as the
// default runtime.
func (config *config) update(runtimeClass string, runtimeType string, runtimeBinary string, setAsDefault bool) {
config.Set("version", config.version)
runcPath := config.runcPath()
runtimeClassPath := config.runtimeClassPath(runtimeClass)
switch runc := config.GetPath(runcPath).(type) {
case *toml.Tree:
runc, _ = toml.Load(runc.String())
config.SetPath(runtimeClassPath, runc)
}
config.initRuntime(runtimeClassPath, runtimeType, "BinaryName", runtimeBinary)
if config.version == 1 {
config.initRuntime(runtimeClassPath, runtimeType, "Runtime", runtimeBinary)
}
if setAsDefault {
defaultRuntimeNamePath := config.defaultRuntimeNamePath()
config.SetPath(defaultRuntimeNamePath, runtimeClass)
}
}
// revert removes the configuration applied in an update call.
func (config *config) revert(runtimeClass string) {
runtimeClassPath := config.runtimeClassPath(runtimeClass)
defaultRuntimeNamePath := config.defaultRuntimeNamePath()
config.DeletePath(runtimeClassPath)
if runtime, ok := config.GetPath(defaultRuntimeNamePath).(string); ok {
if runtimeClass == runtime {
config.DeletePath(defaultRuntimeNamePath)
}
}
for i := 0; i < len(runtimeClassPath); i++ {
if runtimes, ok := config.GetPath(runtimeClassPath[:len(runtimeClassPath)-i]).(*toml.Tree); ok {
if len(runtimes.Keys()) == 0 {
config.DeletePath(runtimeClassPath[:len(runtimeClassPath)-i])
}
}
}
if len(config.Keys()) == 1 && config.Keys()[0] == "version" {
config.Delete("version")
}
}
// initRuntime creates a runtime config if it does not exist and ensures that the
// runtime's binary path is specified.
func (config *config) initRuntime(path []string, runtimeType string, binaryKey string, binary string) {
if config.GetPath(path) == nil {
config.SetPath(append(path, "runtime_type"), runtimeType)
config.SetPath(append(path, "runtime_root"), "")
config.SetPath(append(path, "runtime_engine"), "")
config.SetPath(append(path, "privileged_without_host_devices"), false)
}
binaryPath := append(path, "options", binaryKey)
config.SetPath(binaryPath, binary)
}
func (config config) runcPath() []string {
return config.runtimeClassPath("runc")
}
func (config config) runtimeClassPath(runtimeClass string) []string {
return append(config.containerdPath(), "runtimes", runtimeClass)
}
func (config config) defaultRuntimeNamePath() []string {
return append(config.containerdPath(), "default_runtime_name")
}
func (config config) containerdPath() []string {
return []string{"plugins", config.cri, "containerd"}
}
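The `revert` logic in the removed file deletes a runtime class and then walks back up the path, dropping any tables that were left empty. A minimal sketch of that pruning idea follows, using nested `map[string]interface{}` values in place of `*toml.Tree` so that it runs with the standard library alone; `deletePath` here is a hypothetical helper, not the go-toml method of the same name, and it stops at the first non-empty parent rather than checking every prefix.

```go
package main

import "fmt"

// deletePath (hypothetical helper) removes the key at path and then prunes
// every parent table that the removal left empty. The root table itself is
// never removed.
func deletePath(root map[string]interface{}, path []string) {
	if len(path) == 0 {
		return
	}
	// Walk down the path, remembering each table along the way.
	// tables[i] is the table reached via path[:i].
	tables := []map[string]interface{}{root}
	for _, key := range path[:len(path)-1] {
		next, ok := tables[len(tables)-1][key].(map[string]interface{})
		if !ok {
			return // The path does not exist; nothing to delete.
		}
		tables = append(tables, next)
	}
	delete(tables[len(tables)-1], path[len(path)-1])

	// Walk back up and drop tables that are now empty.
	for i := len(tables) - 1; i > 0; i-- {
		if len(tables[i]) != 0 {
			break
		}
		delete(tables[i-1], path[i-1])
	}
}

func main() {
	config := map[string]interface{}{
		"version": int64(1),
		"plugins": map[string]interface{}{
			"cri": map[string]interface{}{
				"containerd": map[string]interface{}{
					"runtimes": map[string]interface{}{
						"nvidia": map[string]interface{}{"runtime_type": "runtime_type"},
					},
				},
			},
		},
	}
	deletePath(config, []string{"plugins", "cri", "containerd", "runtimes", "nvidia"})
	fmt.Println(config)
}
```

In the original code a final step also removes `version` when it is the only remaining key, so that reverting an otherwise empty config leaves nothing behind; that step is omitted from this sketch.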

View File

@@ -1,134 +0,0 @@
/**
# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"path"
"github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
)
// configV1 represents a V1 containerd config
type configV1 struct {
config
}
func newConfigV1(cfg *toml.Tree) UpdateReverter {
c := configV1{
config: config{
Tree: cfg,
version: 1,
cri: "cri",
},
}
return &c
}
// Update performs an update specific to v1 of the containerd config
func (config *configV1) Update(o *options) error {
// For v1 config, the `default_runtime_name` setting is only supported
// for containerd version at least v1.3
supportsDefaultRuntimeName := !o.useLegacyConfig
defaultRuntime := o.getDefaultRuntime()
for runtimeClass, runtimeBinary := range o.getRuntimeBinaries() {
isDefaultRuntime := runtimeClass == defaultRuntime
config.update(runtimeClass, o.runtimeType, runtimeBinary, isDefaultRuntime && supportsDefaultRuntimeName)
if !isDefaultRuntime {
continue
}
if supportsDefaultRuntimeName {
defaultRuntimePath := append(config.containerdPath(), "default_runtime")
if config.GetPath(defaultRuntimePath) != nil {
log.Warnf("The setting of default_runtime (%v) in containerd is deprecated", defaultRuntimePath)
}
continue
}
log.Warnf("Setting default_runtime is deprecated")
defaultRuntimePath := append(config.containerdPath(), "default_runtime")
config.initRuntime(defaultRuntimePath, o.runtimeType, "Runtime", runtimeBinary)
config.initRuntime(defaultRuntimePath, o.runtimeType, "BinaryName", runtimeBinary)
}
return nil
}
// Revert performs a revert specific to v1 of the containerd config
func (config *configV1) Revert(o *options) error {
defaultRuntimePath := append(config.containerdPath(), "default_runtime")
defaultRuntimeOptionsPath := append(defaultRuntimePath, "options")
if runtime, ok := config.GetPath(append(defaultRuntimeOptionsPath, "Runtime")).(string); ok {
for _, runtimeBinary := range o.getRuntimeBinaries() {
if path.Base(runtimeBinary) == path.Base(runtime) {
config.DeletePath(append(defaultRuntimeOptionsPath, "Runtime"))
break
}
}
}
if runtime, ok := config.GetPath(append(defaultRuntimeOptionsPath, "BinaryName")).(string); ok {
for _, runtimeBinary := range o.getRuntimeBinaries() {
if path.Base(runtimeBinary) == path.Base(runtime) {
config.DeletePath(append(defaultRuntimeOptionsPath, "BinaryName"))
break
}
}
}
if options, ok := config.GetPath(defaultRuntimeOptionsPath).(*toml.Tree); ok {
if len(options.Keys()) == 0 {
config.DeletePath(defaultRuntimeOptionsPath)
}
}
if runtime, ok := config.GetPath(defaultRuntimePath).(*toml.Tree); ok {
fields := []string{"runtime_type", "runtime_root", "runtime_engine", "privileged_without_host_devices"}
if len(runtime.Keys()) <= len(fields) {
matches := []string{}
for _, f := range fields {
e := runtime.Get(f)
if e != nil {
matches = append(matches, f)
}
}
if len(matches) == len(runtime.Keys()) {
for _, m := range matches {
runtime.Delete(m)
}
}
}
}
for i := 0; i < len(defaultRuntimePath); i++ {
if runtimes, ok := config.GetPath(defaultRuntimePath[:len(defaultRuntimePath)-i]).(*toml.Tree); ok {
if len(runtimes.Keys()) == 0 {
config.DeletePath(defaultRuntimePath[:len(defaultRuntimePath)-i])
}
}
}
for runtimeClass := range nvidiaRuntimeBinaries {
config.revert(runtimeClass)
}
return nil
}

View File

@@ -17,8 +17,10 @@
package main
import (
"fmt"
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/containerd"
"github.com/pelletier/go-toml"
"github.com/stretchr/testify/require"
)
@@ -58,7 +60,7 @@ func TestUpdateV1ConfigDefaultRuntime(t *testing.T) {
setAsDefault: true,
runtimeClass: "nvidia-experimental",
expectedDefaultRuntimeName: nil,
expectedDefaultRuntimeBinary: "/test/runtime/dir/nvidia-container-runtime-experimental",
expectedDefaultRuntimeBinary: "/test/runtime/dir/nvidia-container-runtime.experimental",
},
{
legacyConfig: false,
@@ -89,168 +91,512 @@ func TestUpdateV1ConfigDefaultRuntime(t *testing.T) {
}
for i, tc := range testCases {
o := &options{
useLegacyConfig: tc.legacyConfig,
setAsDefault: tc.setAsDefault,
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
o := &options{
useLegacyConfig: tc.legacyConfig,
setAsDefault: tc.setAsDefault,
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err, "%d: %v", i, tc)
err = UpdateV1Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
defaultRuntimeName := config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime_name"})
require.EqualValues(t, tc.expectedDefaultRuntimeName, defaultRuntimeName, "%d: %v", i, tc)
defaultRuntime := config.GetPath([]string{"plugins", "cri", "containerd", "default_runtime"})
if tc.expectedDefaultRuntimeBinary == nil {
require.Nil(t, defaultRuntime, "%d: %v", i, tc)
} else {
expected, err := defaultRuntimeTomlConfigV1(tc.expectedDefaultRuntimeBinary.(string))
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(defaultRuntime.(*toml.Tree))
expectedContents, _ := toml.Marshal(expected)
v1 := &containerd.ConfigV1{
Tree: config,
UseDefaultRuntimeName: !tc.legacyConfig,
RuntimeType: runtimeType,
}
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, tc)
}
err = UpdateConfig(v1, o)
require.NoError(t, err, "%d: %v", i, tc)
defaultRuntimeName := v1.GetPath([]string{"plugins", "cri", "containerd", "default_runtime_name"})
require.EqualValues(t, tc.expectedDefaultRuntimeName, defaultRuntimeName, "%d: %v", i, tc)
defaultRuntime := v1.GetPath([]string{"plugins", "cri", "containerd", "default_runtime"})
if tc.expectedDefaultRuntimeBinary == nil {
require.Nil(t, defaultRuntime, "%d: %v", i, tc)
} else {
require.NotNil(t, defaultRuntime)
expected, err := defaultRuntimeTomlConfigV1(tc.expectedDefaultRuntimeBinary.(string))
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(defaultRuntime.(*toml.Tree))
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, tc)
}
})
}
}
func TestUpdateV1Config(t *testing.T) {
const runtimeDir = "/test/runtime/dir"
const expectedVersion = int64(1)
expectedBinaries := []string{
"/test/runtime/dir/nvidia-container-runtime",
"/test/runtime/dir/nvidia-container-runtime-experimental",
}
testCases := []struct {
runtimeClass string
expectedRuntimes []string
runtimeClass string
expectedConfig map[string]interface{}
}{
{
runtimeClass: "nvidia",
expectedRuntimes: []string{"nvidia", "nvidia-experimental"},
runtimeClass: "nvidia",
expectedConfig: map[string]interface{}{
"version": int64(1),
"plugins": map[string]interface{}{
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
"Runtime": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
{
runtimeClass: "NAME",
expectedRuntimes: []string{"NAME", "nvidia-experimental"},
runtimeClass: "NAME",
expectedConfig: map[string]interface{}{
"version": int64(1),
"plugins": map[string]interface{}{
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"NAME": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
"Runtime": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
{
runtimeClass: "nvidia-experimental",
expectedRuntimes: []string{"nvidia", "nvidia-experimental"},
runtimeClass: "nvidia-experimental",
expectedConfig: map[string]interface{}{
"version": int64(1),
"plugins": map[string]interface{}{
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
"Runtime": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err, "%d: %v", i, tc)
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err)
err = UpdateV1Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
v1 := &containerd.ConfigV1{
Tree: config,
UseDefaultRuntimeName: true,
RuntimeType: runtimeType,
ContainerAnnotations: []string{"cdi.k8s.io/*"},
}
version, ok := config.Get("version").(int64)
require.True(t, ok)
require.EqualValues(t, expectedVersion, version)
err = UpdateConfig(v1, o)
require.NoError(t, err)
runtimes, ok := config.GetPath([]string{"plugins", "cri", "containerd", "runtimes"}).(*toml.Tree)
require.True(t, ok)
expected, err := toml.TreeFromMap(tc.expectedConfig)
require.NoError(t, err)
runtimeClasses := runtimes.Keys()
require.ElementsMatch(t, tc.expectedRuntimes, runtimeClasses, "%d: %v", i, tc)
for i, r := range tc.expectedRuntimes {
runtimeConfig := runtimes.Get(r)
expected, err := runtimeTomlConfigV1(expectedBinaries[i])
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(runtimeConfig)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, r, tc)
}
require.Equal(t, expected.String(), config.String())
})
}
}
func TestUpdateV1ConfigWithRuncPresent(t *testing.T) {
const runcBinary = "/runc-binary"
const runtimeDir = "/test/runtime/dir"
const expectedVersion = int64(1)
expectedBinaries := []string{
runcBinary,
"/test/runtime/dir/nvidia-container-runtime",
"/test/runtime/dir/nvidia-container-runtime-experimental",
}
testCases := []struct {
runtimeClass string
expectedRuntimes []string
runtimeClass string
expectedConfig map[string]interface{}
}{
{
runtimeClass: "nvidia",
expectedRuntimes: []string{"runc", "nvidia", "nvidia-experimental"},
runtimeClass: "nvidia",
expectedConfig: map[string]interface{}{
"version": int64(1),
"plugins": map[string]interface{}{
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/runc-binary",
},
},
"nvidia": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
"Runtime": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
{
runtimeClass: "NAME",
expectedRuntimes: []string{"runc", "NAME", "nvidia-experimental"},
runtimeClass: "NAME",
expectedConfig: map[string]interface{}{
"version": int64(1),
"plugins": map[string]interface{}{
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/runc-binary",
},
},
"NAME": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
"Runtime": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
{
runtimeClass: "nvidia-experimental",
expectedRuntimes: []string{"runc", "nvidia", "nvidia-experimental"},
runtimeClass: "nvidia-experimental",
expectedConfig: map[string]interface{}{
"version": int64(1),
"plugins": map[string]interface{}{
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/runc-binary",
},
},
"nvidia": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
"Runtime": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
"Runtime": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(runcConfigMapV1("/runc-binary"))
require.NoError(t, err, "%d: %v", i, tc)
config, err := toml.TreeFromMap(runcConfigMapV1("/runc-binary"))
require.NoError(t, err)
err = UpdateV1Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
v1 := &containerd.ConfigV1{
Tree: config,
UseDefaultRuntimeName: true,
RuntimeType: runtimeType,
ContainerAnnotations: []string{"cdi.k8s.io/*"},
}
version, ok := config.Get("version").(int64)
require.True(t, ok)
require.EqualValues(t, expectedVersion, version)
err = UpdateConfig(v1, o)
require.NoError(t, err)
runtimes, ok := config.GetPath([]string{"plugins", "cri", "containerd", "runtimes"}).(*toml.Tree)
require.True(t, ok)
expected, err := toml.TreeFromMap(tc.expectedConfig)
require.NoError(t, err)
runtimeClasses := runtimes.Keys()
require.ElementsMatch(t, tc.expectedRuntimes, runtimeClasses, "%d: %v", i, tc)
for i, r := range tc.expectedRuntimes {
runtimeConfig := runtimes.Get(r)
expected, err := toml.TreeFromMap(runcRuntimeConfigMapV1(expectedBinaries[i]))
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(runtimeConfig)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, r, tc)
}
require.Equal(t, expected.String(), config.String())
})
}
}
@@ -274,7 +620,9 @@ func TestRevertV1Config(t *testing.T) {
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime"),
"nvidia-experimental": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime-experimental"),
"nvidia-experimental": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime.experimental"),
"nvidia-cdi": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime.cdi"),
"nvidia-legacy": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime.legacy"),
},
},
},
@@ -289,7 +637,9 @@ func TestRevertV1Config(t *testing.T) {
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime"),
"nvidia-experimental": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime-experimental"),
"nvidia-experimental": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime.experimental"),
"nvidia-cdi": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime.cdi"),
"nvidia-legacy": runtimeMapV1("/test/runtime/dir/nvidia-container-runtime.legacy"),
},
"default_runtime": defaultRuntimeV1("/test/runtime/dir/nvidia-container-runtime"),
"default_runtime_name": "nvidia",
@@ -301,30 +651,34 @@ func TestRevertV1Config(t *testing.T) {
}
for i, tc := range testCases {
o := &options{
runtimeClass: "nvidia",
}
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
o := &options{
runtimeClass: "nvidia",
}
config, err := toml.TreeFromMap(tc.config)
require.NoError(t, err, "%d: %v", i, tc)
config, err := toml.TreeFromMap(tc.config)
require.NoError(t, err, "%d: %v", i, tc)
expected, err := toml.TreeFromMap(tc.expected)
require.NoError(t, err, "%d: %v", i, tc)
expected, err := toml.TreeFromMap(tc.expected)
require.NoError(t, err, "%d: %v", i, tc)
err = RevertV1Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
v1 := &containerd.ConfigV1{
Tree: config,
UseDefaultRuntimeName: true,
RuntimeType: runtimeType,
}
configContents, _ := toml.Marshal(config)
expectedContents, _ := toml.Marshal(expected)
err = RevertConfig(v1, o)
require.NoError(t, err, "%d: %v", i, tc)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v", i, tc)
configContents, _ := toml.Marshal(config)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v", i, tc)
})
}
}
func runtimeTomlConfigV1(binary string) (*toml.Tree, error) {
return toml.TreeFromMap(runtimeMapV1(binary))
}
func defaultRuntimeTomlConfigV1(binary string) (*toml.Tree, error) {
return toml.TreeFromMap(defaultRuntimeV1(binary))
}
@@ -361,24 +715,19 @@ func runcConfigMapV1(binary string) map[string]interface{} {
"cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": runcRuntimeConfigMapV1(binary),
"runc": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": binary,
},
},
},
},
},
},
}
}
func runcRuntimeConfigMapV1(binary string) map[string]interface{} {
return map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": binary,
"Runtime": binary,
},
}
}


@@ -1,58 +0,0 @@
/**
# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"github.com/pelletier/go-toml"
)
// configV2 represents a V2 containerd config
type configV2 struct {
config
}
func newConfigV2(cfg *toml.Tree) UpdateReverter {
c := configV2{
config: config{
Tree: cfg,
version: 2,
cri: "io.containerd.grpc.v1.cri",
},
}
return &c
}
// Update performs an update specific to v2 of the containerd config
func (config *configV2) Update(o *options) error {
defaultRuntime := o.getDefaultRuntime()
for runtimeClass, runtimeBinary := range o.getRuntimeBinaries() {
setAsDefault := defaultRuntime == runtimeClass
config.update(runtimeClass, o.runtimeType, runtimeBinary, setAsDefault)
}
return nil
}
// Revert performs a revert specific to v2 of the containerd config
func (config *configV2) Revert(o *options) error {
for runtimeClass := range o.getRuntimeBinaries() {
config.revert(runtimeClass)
}
return nil
}


@@ -17,8 +17,10 @@
package main
import (
"fmt"
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/containerd"
"github.com/pelletier/go-toml"
"github.com/stretchr/testify/require"
)
@@ -69,20 +71,27 @@ func TestUpdateV2ConfigDefaultRuntime(t *testing.T) {
}
for i, tc := range testCases {
o := &options{
setAsDefault: tc.setAsDefault,
runtimeClass: tc.runtimeClass,
runtimeDir: runtimeDir,
}
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
o := &options{
setAsDefault: tc.setAsDefault,
runtimeClass: tc.runtimeClass,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err, "%d: %v", i, tc)
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err)
err = UpdateV2Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
v2 := &containerd.Config{
Tree: config,
RuntimeType: runtimeType,
}
defaultRuntimeName := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"})
require.EqualValues(t, tc.expectedDefaultRuntimeName, defaultRuntimeName, "%d: %v", i, tc)
err = UpdateConfig(v2, o)
require.NoError(t, err)
defaultRuntimeName := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "default_runtime_name"})
require.EqualValues(t, tc.expectedDefaultRuntimeName, defaultRuntimeName)
})
}
}
@@ -90,132 +99,441 @@ func TestUpdateV2Config(t *testing.T) {
const runtimeDir = "/test/runtime/dir"
const expectedVersion = int64(2)
expectedBinaries := []string{
"/test/runtime/dir/nvidia-container-runtime",
"/test/runtime/dir/nvidia-container-runtime-experimental",
}
testCases := []struct {
runtimeClass string
expectedRuntimes []string
runtimeClass string
expectedConfig map[string]interface{}
}{
{
runtimeClass: "nvidia",
expectedRuntimes: []string{"nvidia", "nvidia-experimental"},
runtimeClass: "nvidia",
expectedConfig: map[string]interface{}{
"version": int64(2),
"plugins": map[string]interface{}{
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
{
runtimeClass: "NAME",
expectedRuntimes: []string{"NAME", "nvidia-experimental"},
runtimeClass: "NAME",
expectedConfig: map[string]interface{}{
"version": int64(2),
"plugins": map[string]interface{}{
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"NAME": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
{
runtimeClass: "nvidia-experimental",
expectedRuntimes: []string{"nvidia", "nvidia-experimental"},
runtimeClass: "nvidia-experimental",
expectedConfig: map[string]interface{}{
"version": int64(2),
"plugins": map[string]interface{}{
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runtime_type",
"runtime_root": "",
"runtime_engine": "",
"privileged_without_host_devices": false,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err, "%d: %v", i, tc)
config, err := toml.TreeFromMap(map[string]interface{}{})
require.NoError(t, err)
err = UpdateV2Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
v2 := &containerd.Config{
Tree: config,
RuntimeType: runtimeType,
ContainerAnnotations: []string{"cdi.k8s.io/*"},
}
version, ok := config.Get("version").(int64)
require.True(t, ok)
require.EqualValues(t, expectedVersion, version, "%d: %v", i, tc)
err = UpdateConfig(v2, o)
require.NoError(t, err)
runtimes, ok := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes"}).(*toml.Tree)
require.True(t, ok)
expected, err := toml.TreeFromMap(tc.expectedConfig)
require.NoError(t, err)
runtimeClasses := runtimes.Keys()
require.ElementsMatch(t, tc.expectedRuntimes, runtimeClasses, "%d: %v", i, tc)
for i, r := range tc.expectedRuntimes {
runtimeConfig := runtimes.Get(r)
expected, err := runtimeTomlConfigV2(expectedBinaries[i])
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(runtimeConfig)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, r, tc)
}
require.Equal(t, expected.String(), config.String())
})
}
}
func TestUpdateV2ConfigWithRuncPresent(t *testing.T) {
const runcBinary = "/runc-binary"
const runtimeDir = "/test/runtime/dir"
const expectedVersion = int64(2)
expectedBinaries := []string{
runcBinary,
"/test/runtime/dir/nvidia-container-runtime",
"/test/runtime/dir/nvidia-container-runtime-experimental",
}
testCases := []struct {
runtimeClass string
expectedRuntimes []string
runtimeClass string
expectedConfig map[string]interface{}
}{
{
runtimeClass: "nvidia",
expectedRuntimes: []string{"runc", "nvidia", "nvidia-experimental"},
runtimeClass: "nvidia",
expectedConfig: map[string]interface{}{
"version": int64(2),
"plugins": map[string]interface{}{
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/runc-binary",
},
},
"nvidia": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
{
runtimeClass: "NAME",
expectedRuntimes: []string{"runc", "NAME", "nvidia-experimental"},
runtimeClass: "NAME",
expectedConfig: map[string]interface{}{
"version": int64(2),
"plugins": map[string]interface{}{
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/runc-binary",
},
},
"NAME": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
{
runtimeClass: "nvidia-experimental",
expectedRuntimes: []string{"runc", "nvidia", "nvidia-experimental"},
runtimeClass: "nvidia-experimental",
expectedConfig: map[string]interface{}{
"version": int64(2),
"plugins": map[string]interface{}{
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/runc-binary",
},
},
"nvidia": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime",
},
},
"nvidia-experimental": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.experimental",
},
},
"nvidia-cdi": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.cdi",
},
},
"nvidia-legacy": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"container_annotations": []string{"cdi.k8s.io/*"},
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": "/test/runtime/dir/nvidia-container-runtime.legacy",
},
},
},
},
},
},
},
},
}
for i, tc := range testCases {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
o := &options{
runtimeClass: tc.runtimeClass,
runtimeType: runtimeType,
runtimeDir: runtimeDir,
}
config, err := toml.TreeFromMap(runcConfigMapV2("/runc-binary"))
require.NoError(t, err, "%d: %v", i, tc)
config, err := toml.TreeFromMap(runcConfigMapV2("/runc-binary"))
require.NoError(t, err)
err = UpdateV2Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
v2 := &containerd.Config{
Tree: config,
RuntimeType: runtimeType,
ContainerAnnotations: []string{"cdi.k8s.io/*"},
}
version, ok := config.Get("version").(int64)
require.True(t, ok)
require.EqualValues(t, expectedVersion, version)
err = UpdateConfig(v2, o)
require.NoError(t, err)
runtimes, ok := config.GetPath([]string{"plugins", "io.containerd.grpc.v1.cri", "containerd", "runtimes"}).(*toml.Tree)
require.True(t, ok, "%d: %v", i, tc)
expected, err := toml.TreeFromMap(tc.expectedConfig)
require.NoError(t, err)
runtimeClasses := runtimes.Keys()
require.ElementsMatch(t, tc.expectedRuntimes, runtimeClasses, "%d: %v", i, tc)
for i, r := range tc.expectedRuntimes {
runtimeConfig := runtimes.Get(r)
expected, err := toml.TreeFromMap(runcRuntimeConfigMapV2(expectedBinaries[i]))
require.NoError(t, err, "%d: %v", i, tc)
configContents, _ := toml.Marshal(runtimeConfig)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v: %v", i, r, tc)
}
require.Equal(t, expected.String(), config.String())
})
}
}
@@ -239,7 +557,7 @@ func TestRevertV2Config(t *testing.T) {
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime"),
"nvidia-experimental": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime-experimental"),
"nvidia-experimental": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime.experimental"),
},
},
},
@@ -254,7 +572,7 @@ func TestRevertV2Config(t *testing.T) {
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime"),
"nvidia-experimental": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime-experimental"),
"nvidia-experimental": runtimeMapV2("/test/runtime/dir/nvidia-container-runtime.experimental"),
},
"default_runtime_name": "nvidia",
},
@@ -265,30 +583,33 @@ func TestRevertV2Config(t *testing.T) {
}
for i, tc := range testCases {
o := &options{
runtimeClass: "nvidia",
}
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
o := &options{
runtimeClass: "nvidia",
}
config, err := toml.TreeFromMap(tc.config)
require.NoError(t, err, "%d: %v", i, tc)
config, err := toml.TreeFromMap(tc.config)
require.NoError(t, err)
expected, err := toml.TreeFromMap(tc.expected)
require.NoError(t, err, "%d: %v", i, tc)
expected, err := toml.TreeFromMap(tc.expected)
require.NoError(t, err)
err = RevertV2Config(config, o)
require.NoError(t, err, "%d: %v", i, tc)
v2 := &containerd.Config{
Tree: config,
RuntimeType: runtimeType,
}
configContents, _ := toml.Marshal(config)
expectedContents, _ := toml.Marshal(expected)
err = RevertConfig(v2, o)
require.NoError(t, err)
require.Equal(t, string(expectedContents), string(configContents), "%d: %v", i, tc)
configContents, _ := toml.Marshal(config)
expectedContents, _ := toml.Marshal(expected)
require.Equal(t, string(expectedContents), string(configContents))
})
}
}
func runtimeTomlConfigV2(binary string) (*toml.Tree, error) {
return toml.TreeFromMap(runtimeMapV2(binary))
}
func runtimeMapV2(binary string) map[string]interface{} {
return map[string]interface{}{
"runtime_type": runtimeType,
@@ -307,23 +628,19 @@ func runcConfigMapV2(binary string) map[string]interface{} {
"io.containerd.grpc.v1.cri": map[string]interface{}{
"containerd": map[string]interface{}{
"runtimes": map[string]interface{}{
"runc": runcRuntimeConfigMapV2(binary),
"runc": map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": binary,
},
},
},
},
},
},
}
}
func runcRuntimeConfigMapV2(binary string) map[string]interface{} {
return map[string]interface{}{
"runtime_type": "runc_runtime_type",
"runtime_root": "runc_runtime_root",
"runtime_engine": "runc_runtime_engine",
"privileged_without_host_devices": true,
"options": map[string]interface{}{
"runc-option": "value",
"BinaryName": binary,
},
}
}


@@ -21,11 +21,12 @@ import (
"net"
"os"
"os/exec"
"path/filepath"
"syscall"
"time"
toml "github.com/pelletier/go-toml"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/containerd"
"github.com/NVIDIA/nvidia-container-toolkit/tools/container/operator"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
)
@@ -38,7 +39,7 @@ const (
nvidiaRuntimeName = "nvidia"
nvidiaRuntimeBinary = "nvidia-container-runtime"
nvidiaExperimentalRuntimeName = "nvidia-experimental"
nvidiaExperimentalRuntimeBinary = "nvidia-container-runtime-experimental"
nvidiaExperimentalRuntimeBinary = "nvidia-container-runtime.experimental"
defaultConfig = "/etc/containerd/config.toml"
defaultSocket = "/run/containerd/containerd.sock"
@@ -71,6 +72,8 @@ type options struct {
hostRootMount string
runtimeDir string
useLegacyConfig bool
ContainerRuntimeModesCDIAnnotationPrefixes cli.StringSlice
}
func main() {
@@ -172,6 +175,11 @@ func main() {
Destination: &options.useLegacyConfig,
EnvVars: []string{"CONTAINERD_USE_LEGACY_CONFIG"},
},
&cli.StringSliceFlag{
Name: "nvidia-container-runtime-modes.cdi.annotation-prefixes",
Destination: &options.ContainerRuntimeModesCDIAnnotationPrefixes,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_MODES_CDI_ANNOTATION_PREFIXES"},
},
}
// Update the subcommand flags with the common subcommand flags
@@ -194,25 +202,29 @@ func Setup(c *cli.Context, o *options) error {
}
o.runtimeDir = runtimeDir
cfg, err := LoadConfig(o.config)
cfg, err := containerd.New(
containerd.WithPath(o.config),
containerd.WithRuntimeType(o.runtimeType),
containerd.WithUseLegacyConfig(o.useLegacyConfig),
containerd.WithContainerAnnotations(o.containerAnnotationsFromCDIPrefixes()...),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
version, err := ParseVersion(cfg, o.useLegacyConfig)
if err != nil {
return fmt.Errorf("unable to parse version: %v", err)
}
err = UpdateConfig(cfg, o, version)
err = UpdateConfig(cfg, o)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
err = FlushConfig(o.config, cfg)
log.Infof("Flushing containerd config to %v", o.config)
n, err := cfg.Save(o.config)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
if n == 0 {
log.Infof("Config file is empty, removed")
}
err = RestartContainerd(o)
if err != nil {
@@ -233,25 +245,29 @@ func Cleanup(c *cli.Context, o *options) error {
return fmt.Errorf("unable to parse args: %v", err)
}
cfg, err := LoadConfig(o.config)
cfg, err := containerd.New(
containerd.WithPath(o.config),
containerd.WithRuntimeType(o.runtimeType),
containerd.WithUseLegacyConfig(o.useLegacyConfig),
containerd.WithContainerAnnotations(o.containerAnnotationsFromCDIPrefixes()...),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
version, err := ParseVersion(cfg, o.useLegacyConfig)
if err != nil {
return fmt.Errorf("unable to parse version: %v", err)
}
err = RevertConfig(cfg, o, version)
err = RevertConfig(cfg, o)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
err = FlushConfig(o.config, cfg)
log.Infof("Flushing containerd config to %v", o.config)
n, err := cfg.Save(o.config)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
if n == 0 {
log.Infof("Config file is empty, removed")
}
err = RestartContainerd(o)
if err != nil {
@@ -277,160 +293,36 @@ func ParseArgs(c *cli.Context) (string, error) {
return runtimeDir, nil
}
// LoadConfig loads the containerd config from disk
func LoadConfig(config string) (*toml.Tree, error) {
log.Infof("Loading config: %v", config)
info, err := os.Stat(config)
if os.IsExist(err) && info.IsDir() {
return nil, fmt.Errorf("config file is a directory")
}
configFile := config
if os.IsNotExist(err) {
configFile = "/dev/null"
log.Infof("Config file does not exist, creating new one")
}
cfg, err := toml.LoadFile(configFile)
if err != nil {
return nil, err
}
log.Infof("Successfully loaded config")
return cfg, nil
}
// ParseVersion parses the version field out of the containerd config
func ParseVersion(config *toml.Tree, useLegacyConfig bool) (int, error) {
var defaultVersion int
if !useLegacyConfig {
defaultVersion = 2
} else {
defaultVersion = 1
}
var version int
switch v := config.Get("version").(type) {
case nil:
switch len(config.Keys()) {
case 0: // No config exists, or the config file is empty, use version inferred from containerd
version = defaultVersion
default: // A config file exists, has content, and no version is set
version = 1
}
case int64:
version = int(v)
default:
return -1, fmt.Errorf("unsupported type for version field: %v", v)
}
log.Infof("Config version: %v", version)
if version == 1 {
log.Warnf("Support for containerd config version 1 is deprecated")
}
return version, nil
}
// UpdateConfig updates the containerd config to include the nvidia-container-runtime
func UpdateConfig(config *toml.Tree, o *options, version int) error {
var err error
log.Infof("Updating config")
switch version {
case 1:
err = UpdateV1Config(config, o)
case 2:
err = UpdateV2Config(config, o)
default:
err = fmt.Errorf("unsupported containerd config version: %v", version)
func UpdateConfig(cfg engine.Interface, o *options) error {
runtimes := operator.GetRuntimes(
operator.WithNvidiaRuntimeName(o.runtimeClass),
operator.WithSetAsDefault(o.setAsDefault),
operator.WithRoot(o.runtimeDir),
)
for class, runtime := range runtimes {
err := cfg.AddRuntime(class, runtime.Path, runtime.SetAsDefault)
if err != nil {
return fmt.Errorf("unable to update config for runtime class '%v': %v", class, err)
}
}
if err != nil {
return err
}
log.Infof("Successfully updated config")
return nil
}
// RevertConfig reverts the containerd config to remove the nvidia-container-runtime
func RevertConfig(config *toml.Tree, o *options, version int) error {
var err error
log.Infof("Reverting config")
switch version {
case 1:
err = RevertV1Config(config, o)
case 2:
err = RevertV2Config(config, o)
default:
err = fmt.Errorf("unsupported containerd config version: %v", version)
}
if err != nil {
return err
}
log.Infof("Successfully reverted config")
return nil
}
// UpdateV1Config performs an update specific to v1 of the containerd config
func UpdateV1Config(config *toml.Tree, o *options) error {
c := newConfigV1(config)
return c.Update(o)
}
// RevertV1Config performs a revert specific to v1 of the containerd config
func RevertV1Config(config *toml.Tree, o *options) error {
c := newConfigV1(config)
return c.Revert(o)
}
// UpdateV2Config performs an update specific to v2 of the containerd config
func UpdateV2Config(config *toml.Tree, o *options) error {
c := newConfigV2(config)
return c.Update(o)
}
// RevertV2Config performs a revert specific to v2 of the containerd config
func RevertV2Config(config *toml.Tree, o *options) error {
c := newConfigV2(config)
return c.Revert(o)
}
// FlushConfig flushes the updated/reverted config out to disk
func FlushConfig(config string, cfg *toml.Tree) error {
log.Infof("Flushing config")
output, err := cfg.ToTomlString()
if err != nil {
return fmt.Errorf("unable to convert to TOML: %v", err)
}
switch len(output) {
case 0:
err := os.Remove(config)
func RevertConfig(cfg engine.Interface, o *options) error {
runtimes := operator.GetRuntimes(
operator.WithNvidiaRuntimeName(o.runtimeClass),
operator.WithSetAsDefault(o.setAsDefault),
operator.WithRoot(o.runtimeDir),
)
for class := range runtimes {
err := cfg.RemoveRuntime(class)
if err != nil {
return fmt.Errorf("unable to remove empty file: %v", err)
}
log.Infof("Config empty, removing file")
default:
f, err := os.Create(config)
if err != nil {
return fmt.Errorf("unable to open '%v' for writing: %v", config, err)
}
defer f.Close()
_, err = f.WriteString(output)
if err != nil {
return fmt.Errorf("unable to write output: %v", err)
return fmt.Errorf("unable to revert config for runtime class '%v': %v", class, err)
}
}
log.Infof("Successfully flushed config")
return nil
}
@@ -552,35 +444,12 @@ func RestartContainerdSystemd(hostRootMount string) error {
return nil
}
// getDefaultRuntime returns the default runtime for the configured options.
// If the configuration is invalid or the default runtimes should not be set
// the empty string is returned.
func (o options) getDefaultRuntime() string {
if o.setAsDefault {
if o.runtimeClass == nvidiaExperimentalRuntimeName {
return nvidiaExperimentalRuntimeName
}
if o.runtimeClass == "" {
return defaultRuntimeClass
}
return o.runtimeClass
}
return ""
}
// getRuntimeBinaries returns a map of runtime names to binary paths. This includes the
// renaming of the `nvidia` runtime as per the --runtime-class command line flag.
func (o options) getRuntimeBinaries() map[string]string {
runtimeBinaries := make(map[string]string)
for rt, bin := range nvidiaRuntimeBinaries {
runtime := rt
if o.runtimeClass != "" && o.runtimeClass != nvidiaExperimentalRuntimeName && runtime == defaultRuntimeClass {
runtime = o.runtimeClass
}
runtimeBinaries[runtime] = filepath.Join(o.runtimeDir, bin)
// containerAnnotationsFromCDIPrefixes returns the container annotations to set for the given CDI prefixes.
func (o *options) containerAnnotationsFromCDIPrefixes() []string {
var annotations []string
for _, prefix := range o.ContainerRuntimeModesCDIAnnotationPrefixes.Value() {
annotations = append(annotations, prefix+"*")
}
return runtimeBinaries
return annotations
}
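
The new `containerAnnotationsFromCDIPrefixes` helper turns each configured CDI annotation prefix into a wildcard pattern by appending `*`. A minimal standalone sketch of that transformation follows; the function name `annotationsFromPrefixes` and the sample prefix `cdi.k8s.io/` are illustrative assumptions and are not taken from the diff.

```go
package main

import "fmt"

// annotationsFromPrefixes mirrors the loop shown in the diff: every prefix
// becomes a wildcard pattern by appending "*". The name is hypothetical.
func annotationsFromPrefixes(prefixes []string) []string {
	var annotations []string
	for _, prefix := range prefixes {
		annotations = append(annotations, prefix+"*")
	}
	return annotations
}

func main() {
	// "cdi.k8s.io/" is only an example value for the prefix option.
	fmt.Println(annotationsFromPrefixes([]string{"cdi.k8s.io/"}))

	// With no prefixes configured the result stays nil, so no annotations
	// would be passed on.
	fmt.Println(annotationsFromPrefixes(nil) == nil)
}
```

Because the slice is only appended to inside the loop, an unset option leaves it nil, which is presumably what allows the variadic `WithContainerAnnotations(...)` call in the diff to be a no-op in that case.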


@@ -1,106 +0,0 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package main
import (
"testing"
"github.com/stretchr/testify/require"
)
func TestOptions(t *testing.T) {
testCases := []struct {
options options
expectedDefaultRuntime string
expectedRuntimeBinaries map[string]string
}{
{
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: true,
},
expectedDefaultRuntime: "nvidia",
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: true,
runtimeClass: "nvidia",
},
expectedDefaultRuntime: "nvidia",
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: true,
runtimeClass: "NAME",
},
expectedDefaultRuntime: "NAME",
expectedRuntimeBinaries: map[string]string{
"NAME": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: false,
runtimeClass: "NAME",
},
expectedRuntimeBinaries: map[string]string{
"NAME": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: true,
runtimeClass: "nvidia-experimental",
},
expectedDefaultRuntime: "nvidia-experimental",
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
{
options: options{
setAsDefault: false,
runtimeClass: "nvidia-experimental",
},
expectedRuntimeBinaries: map[string]string{
"nvidia": "nvidia-container-runtime",
"nvidia-experimental": "nvidia-container-runtime-experimental",
},
},
}
for i, tc := range testCases {
require.Equal(t, tc.expectedDefaultRuntime, tc.options.getDefaultRuntime(), "%d: %v", i, tc)
require.EqualValues(t, tc.expectedRuntimeBinaries, tc.options.getRuntimeBinaries(), "%d: %v", i, tc)
}
}


@@ -24,8 +24,9 @@ import (
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/crio"
"github.com/pelletier/go-toml"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/crio"
"github.com/NVIDIA/nvidia-container-toolkit/tools/container/operator"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
)
@@ -213,7 +214,9 @@ func setupHook(o *options) error {
func setupConfig(o *options) error {
log.Infof("Updating config file")
cfg, err := crio.LoadConfig(o.config)
cfg, err := crio.New(
crio.WithPath(o.config),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
@@ -223,10 +226,14 @@ func setupConfig(o *options) error {
return fmt.Errorf("unable to update config: %v", err)
}
err = crio.FlushConfig(o.config, cfg)
log.Infof("Flushing cri-o config to %v", o.config)
n, err := cfg.Save(o.config)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
if n == 0 {
log.Infof("Config file is empty, removed")
}
err = RestartCrio(o)
if err != nil {
@@ -267,7 +274,9 @@ func cleanupHook(o *options) error {
func cleanupConfig(o *options) error {
log.Infof("Reverting config file modifications")
cfg, err := crio.LoadConfig(o.config)
cfg, err := crio.New(
crio.WithPath(o.config),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
@@ -277,10 +286,14 @@ func cleanupConfig(o *options) error {
return fmt.Errorf("unable to update config: %v", err)
}
err = crio.FlushConfig(o.config, cfg)
log.Infof("Flushing cri-o config to %v", o.config)
n, err := cfg.Save(o.config)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
if n == 0 {
log.Infof("Config file is empty, removed")
}
err = RestartCrio(o)
if err != nil {
@@ -345,14 +358,36 @@ func generateOciHook(toolkitDir string) podmanHook {
}
// UpdateConfig updates the cri-o config to include the NVIDIA Container Runtime
func UpdateConfig(config *toml.Tree, o *options) error {
runtimePath := filepath.Join(o.runtimeDir, "nvidia-container-runtime")
return crio.UpdateConfig(config, o.runtimeClass, runtimePath, o.setAsDefault)
func UpdateConfig(cfg engine.Interface, o *options) error {
runtimes := operator.GetRuntimes(
operator.WithNvidiaRuntimeName(o.runtimeClass),
operator.WithSetAsDefault(o.setAsDefault),
operator.WithRoot(o.runtimeDir),
)
for class, runtime := range runtimes {
err := cfg.AddRuntime(class, runtime.Path, runtime.SetAsDefault)
if err != nil {
return fmt.Errorf("unable to update config for runtime class '%v': %v", class, err)
}
}
return nil
}
// RevertConfig reverts the cri-o config to remove the NVIDIA Container Runtime
func RevertConfig(config *toml.Tree, o *options) error {
return crio.RevertConfig(config, o.runtimeClass)
func RevertConfig(cfg engine.Interface, o *options) error {
runtimes := operator.GetRuntimes(
operator.WithNvidiaRuntimeName(o.runtimeClass),
operator.WithSetAsDefault(o.setAsDefault),
operator.WithRoot(o.runtimeDir),
)
for class := range runtimes {
err := cfg.RemoveRuntime(class)
if err != nil {
return fmt.Errorf("unable to revert config for runtime class '%v': %v", class, err)
}
}
return nil
}
// RestartCrio restarts crio depending on the value of restartModeFlag
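
The containerd, cri-o and docker tools now share the same `UpdateConfig`/`RevertConfig` shape: iterate over a set of runtimes and call `AddRuntime` or `RemoveRuntime` on an `engine.Interface`. The definition of `engine.Interface` is not part of this section, so the sketch below uses a hypothetical `runtimeConfigurer` interface inferred only from those call sites, backed by a toy map-based config. The `mapConfig` type, the `addAll`/`removeAll` helpers and the revert semantics (dropping `default-runtime` rather than resetting it) are assumptions for illustration.

```go
package main

import "fmt"

// runtimeConfigurer is a hypothetical stand-in for engine.Interface,
// inferred from the AddRuntime/RemoveRuntime calls in the diff.
type runtimeConfigurer interface {
	AddRuntime(name, path string, setAsDefault bool) error
	RemoveRuntime(name string) error
}

// mapConfig is a toy in-memory config, loosely shaped like a docker daemon.json.
type mapConfig map[string]interface{}

// AddRuntime registers a runtime and optionally marks it as the default.
func (c *mapConfig) AddRuntime(name, path string, setAsDefault bool) error {
	if c == nil {
		return fmt.Errorf("config is nil")
	}
	if *c == nil {
		*c = make(mapConfig)
	}
	runtimes, _ := (*c)["runtimes"].(map[string]interface{})
	if runtimes == nil {
		runtimes = make(map[string]interface{})
	}
	runtimes[name] = map[string]interface{}{"path": path, "args": []string{}}
	(*c)["runtimes"] = runtimes
	if setAsDefault {
		(*c)["default-runtime"] = name
	}
	return nil
}

// RemoveRuntime deletes a runtime and prunes entries that become empty.
func (c *mapConfig) RemoveRuntime(name string) error {
	if c == nil || *c == nil {
		return nil
	}
	if def, ok := (*c)["default-runtime"].(string); ok && def == name {
		delete(*c, "default-runtime")
	}
	if runtimes, ok := (*c)["runtimes"].(map[string]interface{}); ok {
		delete(runtimes, name)
		if len(runtimes) == 0 {
			delete(*c, "runtimes")
		}
	}
	return nil
}

// addAll follows the loop structure of the new UpdateConfig functions.
func addAll(cfg runtimeConfigurer, runtimes map[string]string, defaultName string) error {
	for name, path := range runtimes {
		if err := cfg.AddRuntime(name, path, name == defaultName); err != nil {
			return fmt.Errorf("unable to update config for runtime %q: %v", name, err)
		}
	}
	return nil
}

// removeAll follows the loop structure of the new RevertConfig functions.
func removeAll(cfg runtimeConfigurer, runtimes map[string]string) error {
	for name := range runtimes {
		if err := cfg.RemoveRuntime(name); err != nil {
			return fmt.Errorf("unable to revert config for runtime %q: %v", name, err)
		}
	}
	return nil
}

func main() {
	cfg := mapConfig{}
	runtimes := map[string]string{
		"nvidia":     "/usr/bin/nvidia-container-runtime",
		"nvidia-cdi": "/usr/bin/nvidia-container-runtime.cdi",
	}
	if err := addAll(&cfg, runtimes, "nvidia"); err != nil {
		panic(err)
	}
	fmt.Println("keys after update:", len(cfg))
	if err := removeAll(&cfg, runtimes); err != nil {
		panic(err)
	}
	fmt.Println("keys after revert:", len(cfg))
}
```

Keeping the per-engine details behind one small interface is what lets the three tools in this change drop their own load, version-parse and flush helpers; the pointer receiver on the map type matches how the updated docker tests pass `&config` to `UpdateConfig`.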


@@ -20,11 +20,12 @@ import (
"fmt"
"net"
"os"
"path/filepath"
"syscall"
"time"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/docker"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/docker"
"github.com/NVIDIA/nvidia-container-toolkit/tools/container/operator"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
)
@@ -36,7 +37,7 @@ const (
nvidiaRuntimeName = "nvidia"
nvidiaRuntimeBinary = "nvidia-container-runtime"
nvidiaExperimentalRuntimeName = "nvidia-experimental"
nvidiaExperimentalRuntimeBinary = "nvidia-container-runtime-experimental"
nvidiaExperimentalRuntimeBinary = "nvidia-container-runtime.experimental"
defaultConfig = "/etc/docker/daemon.json"
defaultSocket = "/var/run/docker.sock"
@@ -170,7 +171,9 @@ func Setup(c *cli.Context, o *options) error {
}
o.runtimeDir = runtimeDir
cfg, err := LoadConfig(o.config)
cfg, err := docker.New(
docker.WithPath(o.config),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
@@ -180,7 +183,8 @@ func Setup(c *cli.Context, o *options) error {
return fmt.Errorf("unable to update config: %v", err)
}
err = FlushConfig(cfg, o.config)
log.Infof("Flushing docker config to %v", o.config)
_, err = cfg.Save(o.config)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
@@ -204,20 +208,26 @@ func Cleanup(c *cli.Context, o *options) error {
return fmt.Errorf("unable to parse args: %v", err)
}
cfg, err := LoadConfig(o.config)
cfg, err := docker.New(
docker.WithPath(o.config),
)
if err != nil {
return fmt.Errorf("unable to load config: %v", err)
}
err = RevertConfig(cfg)
err = RevertConfig(cfg, o)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
err = FlushConfig(cfg, o.config)
log.Infof("Flushing docker config to %v", o.config)
n, err := cfg.Save(o.config)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
if n == 0 {
log.Infof("Config file is empty, removed")
}
err = RestartDocker(o)
if err != nil {
@@ -243,52 +253,40 @@ func ParseArgs(c *cli.Context) (string, error) {
return runtimeDir, nil
}
// LoadConfig loads the docker config from disk
func LoadConfig(config string) (map[string]interface{}, error) {
return docker.LoadConfig(config)
}
// UpdateConfig updates the docker config to include the nvidia runtimes
func UpdateConfig(config map[string]interface{}, o *options) error {
for runtimeName, runtimePath := range o.getRuntimeBinaries() {
setAsDefault := runtimeName == o.getDefaultRuntime()
err := docker.UpdateConfig(config, runtimeName, runtimePath, setAsDefault)
func UpdateConfig(cfg engine.Interface, o *options) error {
runtimes := operator.GetRuntimes(
operator.WithNvidiaRuntimeName(o.runtimeName),
operator.WithSetAsDefault(o.setAsDefault),
operator.WithRoot(o.runtimeDir),
)
for name, runtime := range runtimes {
err := cfg.AddRuntime(name, runtime.Path, runtime.SetAsDefault)
if err != nil {
return fmt.Errorf("failed to update runtime %q: %v", runtimeName, err)
return fmt.Errorf("failed to update runtime %q: %v", name, err)
}
}
return nil
}
//RevertConfig reverts the docker config to remove the nvidia runtime
func RevertConfig(config map[string]interface{}) error {
if _, exists := config["default-runtime"]; exists {
defaultRuntime := config["default-runtime"].(string)
if _, exists := nvidiaRuntimeBinaries[defaultRuntime]; exists {
config["default-runtime"] = defaultDockerRuntime
// RevertConfig reverts the docker config to remove the nvidia runtime
func RevertConfig(cfg engine.Interface, o *options) error {
runtimes := operator.GetRuntimes(
operator.WithNvidiaRuntimeName(o.runtimeName),
operator.WithSetAsDefault(o.setAsDefault),
operator.WithRoot(o.runtimeDir),
)
for name := range runtimes {
err := cfg.RemoveRuntime(name)
if err != nil {
return fmt.Errorf("failed to remove runtime %q: %v", name, err)
}
}
if _, exists := config["runtimes"]; exists {
runtimes := config["runtimes"].(map[string]interface{})
for name := range nvidiaRuntimeBinaries {
delete(runtimes, name)
}
if len(runtimes) == 0 {
delete(config, "runtimes")
}
}
return nil
}
// FlushConfig flushes the updated/reverted config out to disk
func FlushConfig(cfg map[string]interface{}, config string) error {
return docker.FlushConfig(cfg, config)
}
// RestartDocker restarts docker depending on the value of restartModeFlag
func RestartDocker(o *options) error {
switch o.restartMode {
@@ -385,31 +383,3 @@ func SignalDocker(socket string) error {
return nil
}
// getDefaultRuntime returns the default runtime for the configured options.
// If the configuration is invalid or the default runtimes should not be set
// the empty string is returned.
func (o options) getDefaultRuntime() string {
if o.setAsDefault == false {
return ""
}
return o.runtimeName
}
// getRuntimeBinaries returns a map of runtime names to binary paths. This includes the
// renaming of the `nvidia` runtime as per the --runtime-class command line flag.
func (o options) getRuntimeBinaries() map[string]string {
runtimeBinaries := make(map[string]string)
for rt, bin := range nvidiaRuntimeBinaries {
runtime := rt
if o.runtimeName != "" && o.runtimeName != nvidiaExperimentalRuntimeName && runtime == defaultRuntimeName {
runtime = o.runtimeName
}
runtimeBinaries[runtime] = filepath.Join(o.runtimeDir, bin)
}
return runtimeBinaries
}


@@ -20,6 +20,7 @@ import (
"encoding/json"
"testing"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/engine/docker"
"github.com/stretchr/testify/require"
)
@@ -60,9 +61,9 @@ func TestUpdateConfigDefaultRuntime(t *testing.T) {
runtimeDir: runtimeDir,
}
config := map[string]interface{}{}
config := docker.Config(map[string]interface{}{})
err := UpdateConfig(config, o)
err := UpdateConfig(&config, o)
require.NoError(t, err, "%d: %v", i, tc)
defaultRuntimeName := config["default-runtime"]
@@ -74,7 +75,7 @@ func TestUpdateConfig(t *testing.T) {
const runtimeDir = "/test/runtime/dir"
testCases := []struct {
config map[string]interface{}
config docker.Config
setAsDefault bool
runtimeName string
expectedConfig map[string]interface{}
@@ -89,7 +90,15 @@ func TestUpdateConfig(t *testing.T) {
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
"nvidia-cdi": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.cdi",
"args": []string{},
},
"nvidia-legacy": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.legacy",
"args": []string{},
},
},
@@ -106,7 +115,15 @@ func TestUpdateConfig(t *testing.T) {
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
"nvidia-cdi": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.cdi",
"args": []string{},
},
"nvidia-legacy": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.legacy",
"args": []string{},
},
},
@@ -123,7 +140,15 @@ func TestUpdateConfig(t *testing.T) {
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
"nvidia-cdi": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.cdi",
"args": []string{},
},
"nvidia-legacy": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.legacy",
"args": []string{},
},
},
@@ -146,7 +171,15 @@ func TestUpdateConfig(t *testing.T) {
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
"nvidia-cdi": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.cdi",
"args": []string{},
},
"nvidia-legacy": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.legacy",
"args": []string{},
},
},
@@ -172,7 +205,15 @@ func TestUpdateConfig(t *testing.T) {
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
"nvidia-cdi": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.cdi",
"args": []string{},
},
"nvidia-legacy": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.legacy",
"args": []string{},
},
},
@@ -192,7 +233,15 @@ func TestUpdateConfig(t *testing.T) {
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
"nvidia-cdi": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.cdi",
"args": []string{},
},
"nvidia-legacy": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.legacy",
"args": []string{},
},
},
@@ -212,7 +261,15 @@ func TestUpdateConfig(t *testing.T) {
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
"nvidia-cdi": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.cdi",
"args": []string{},
},
"nvidia-legacy": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.legacy",
"args": []string{},
},
},
@@ -240,7 +297,15 @@ func TestUpdateConfig(t *testing.T) {
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
"nvidia-cdi": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.cdi",
"args": []string{},
},
"nvidia-legacy": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.legacy",
"args": []string{},
},
},
@@ -254,7 +319,8 @@ func TestUpdateConfig(t *testing.T) {
runtimeName: tc.runtimeName,
runtimeDir: runtimeDir,
}
err := UpdateConfig(tc.config, options)
err := UpdateConfig(&tc.config, options)
require.NoError(t, err, "%d: %v", i, tc)
configContent, err := json.MarshalIndent(tc.config, "", " ")
@@ -269,7 +335,7 @@ func TestUpdateConfig(t *testing.T) {
func TestRevertConfig(t *testing.T) {
testCases := []struct {
config map[string]interface{}
config docker.Config
expectedConfig map[string]interface{}
}{
{
@@ -306,7 +372,30 @@ func TestRevertConfig(t *testing.T) {
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime-experimental",
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
},
},
expectedConfig: map[string]interface{}{},
},
{
config: map[string]interface{}{
"runtimes": map[string]interface{}{
"nvidia": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime",
"args": []string{},
},
"nvidia-experimental": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.experimental",
"args": []string{},
},
"nvidia-cdi": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.cdi",
"args": []string{},
},
"nvidia-legacy": map[string]interface{}{
"path": "/test/runtime/dir/nvidia-container-runtime.legacy",
"args": []string{},
},
},
@@ -368,7 +457,7 @@ func TestRevertConfig(t *testing.T) {
}
for i, tc := range testCases {
err := RevertConfig(tc.config)
err := RevertConfig(&tc.config, &options{})
require.NoError(t, err, "%d: %v", i, tc)
@@ -381,43 +470,3 @@ func TestRevertConfig(t *testing.T) {
require.EqualValues(t, string(expectedContent), string(configContent), "%d: %v", i, tc)
}
}
func TestFlagsDefaultRuntime(t *testing.T) {
testCases := []struct {
setAsDefault bool
runtimeName string
expected string
}{
{
expected: "",
},
{
runtimeName: "not-bool",
expected: "",
},
{
setAsDefault: false,
runtimeName: "nvidia",
expected: "",
},
{
setAsDefault: true,
runtimeName: "nvidia",
expected: "nvidia",
},
{
setAsDefault: true,
runtimeName: "nvidia-experimental",
expected: "nvidia-experimental",
},
}
for i, tc := range testCases {
f := options{
setAsDefault: tc.setAsDefault,
runtimeName: tc.runtimeName,
}
require.Equal(t, tc.expected, f.getDefaultRuntime(), "%d: %v", i, tc)
}
}


@@ -0,0 +1,135 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package operator
import "path/filepath"
const (
defaultRuntimeName = "nvidia"
experimentalRuntimeName = "nvidia-experimental"
defaultRoot = "/usr/bin"
)
// Runtime defines a runtime to be configured.
// The path and whether the runtime is the default runtime can be specified
type Runtime struct {
name string
Path string
SetAsDefault bool
}
// Runtimes defines a set of runtimes to be configured for use in the GPU Operator
type Runtimes map[string]Runtime
type config struct {
root string
nvidiaRuntimeName string
setAsDefault bool
}
// GetRuntimes returns the set of runtimes to be configured for use with the GPU Operator.
func GetRuntimes(opts ...Option) Runtimes {
c := &config{}
for _, opt := range opts {
opt(c)
}
if c.root == "" {
c.root = defaultRoot
}
if c.nvidiaRuntimeName == "" {
c.nvidiaRuntimeName = defaultRuntimeName
}
runtimes := make(Runtimes)
runtimes.add(c.nvidiaRuntime())
modes := []string{"experimental", "cdi", "legacy"}
for _, mode := range modes {
runtimes.add(c.modeRuntime(mode))
}
return runtimes
}
// DefaultRuntimeName returns the name of the default runtime.
func (r Runtimes) DefaultRuntimeName() string {
for _, runtime := range r {
if runtime.SetAsDefault {
return runtime.name
}
}
return ""
}
// Add a runtime to the set of runtimes.
func (r *Runtimes) add(runtime Runtime) {
(*r)[runtime.name] = runtime
}
// nvidiaRuntime creates a runtime that corresponds to the nvidia runtime.
// If name is equal to one of the predefined runtimes, `nvidia` is used as the runtime name instead.
func (c config) nvidiaRuntime() Runtime {
predefinedRuntimes := map[string]struct{}{
"nvidia-experimental": {},
"nvidia-cdi": {},
"nvidia-legacy": {},
}
name := c.nvidiaRuntimeName
if _, isPredefinedRuntime := predefinedRuntimes[name]; isPredefinedRuntime {
name = defaultRuntimeName
}
return c.newRuntime(name, "nvidia-container-runtime")
}
// modeRuntime creates a runtime for the specified mode.
func (c config) modeRuntime(mode string) Runtime {
return c.newRuntime("nvidia-"+mode, "nvidia-container-runtime."+mode)
}
// newRuntime creates a runtime based on the configuration
func (c config) newRuntime(name string, binary string) Runtime {
return Runtime{
name: name,
Path: filepath.Join(c.root, binary),
SetAsDefault: c.setAsDefault && name == c.nvidiaRuntimeName,
}
}
// Option is a functional option for configuring a set of runtimes.
type Option func(*config)
// WithRoot sets the root directory for the runtime binaries.
func WithRoot(root string) Option {
return func(c *config) {
c.root = root
}
}
// WithNvidiaRuntimeName sets the name of the nvidia runtime.
func WithNvidiaRuntimeName(name string) Option {
return func(c *config) {
c.nvidiaRuntimeName = name
}
}
// WithSetAsDefault sets the default runtime to the nvidia runtime.
func WithSetAsDefault(set bool) Option {
return func(c *config) {
c.setAsDefault = set
}
}
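
The naming rules in `GetRuntimes` are easy to misread: the primary runtime takes the requested name unless that name collides with one of the mode runtimes, and the default flag follows the requested name rather than the primary runtime. The sketch below condenses that logic into a single standalone function so the outcome can be inspected directly. The `getRuntimes` signature and the lowercase `runtime` type are simplifications made for this example and are not the package's API.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// runtime is a simplified stand-in for operator.Runtime.
type runtime struct {
	Path         string
	SetAsDefault bool
}

// getRuntimes condenses the GetRuntimes logic from the diff into one function.
// Positional arguments replace the functional options for brevity.
func getRuntimes(root, nvidiaName string, setAsDefault bool) map[string]runtime {
	if root == "" {
		root = "/usr/bin"
	}
	if nvidiaName == "" {
		nvidiaName = "nvidia"
	}
	modes := []string{"experimental", "cdi", "legacy"}

	// A requested name that collides with a mode runtime falls back to "nvidia"
	// for the primary runtime.
	primary := nvidiaName
	for _, mode := range modes {
		if nvidiaName == "nvidia-"+mode {
			primary = "nvidia"
		}
	}

	runtimes := make(map[string]runtime)
	add := func(name, binary string) {
		runtimes[name] = runtime{
			Path: filepath.Join(root, binary),
			// The default flag is tied to the requested name.
			SetAsDefault: setAsDefault && name == nvidiaName,
		}
	}
	add(primary, "nvidia-container-runtime")
	for _, mode := range modes {
		add("nvidia-"+mode, "nvidia-container-runtime."+mode)
	}
	return runtimes
}

func main() {
	runtimes := getRuntimes("/test/runtime/dir", "NAME", true)

	// Sort the names so the listing is stable across runs.
	names := make([]string, 0, len(runtimes))
	for name := range runtimes {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		r := runtimes[name]
		fmt.Printf("%s path=%s default=%v\n", name, r.Path, r.SetAsDefault)
	}
}
```

One consequence worth noting: requesting `nvidia-experimental` with the default flag set keeps the primary runtime named `nvidia` but marks `nvidia-experimental` as the default, which matches the expectation in the test file that this change deletes.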


@@ -0,0 +1,207 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package operator
import (
"fmt"
"testing"
"github.com/stretchr/testify/require"
)
func TestOptions(t *testing.T) {
testCases := []struct {
setAsDefault bool
nvidiaRuntimeName string
root string
expectedDefaultRuntime string
expectedRuntimes Runtimes
}{
{
expectedRuntimes: Runtimes{
"nvidia": Runtime{
name: "nvidia",
Path: "/usr/bin/nvidia-container-runtime",
},
"nvidia-experimental": Runtime{
name: "nvidia-experimental",
Path: "/usr/bin/nvidia-container-runtime.experimental",
},
"nvidia-cdi": Runtime{
name: "nvidia-cdi",
Path: "/usr/bin/nvidia-container-runtime.cdi",
},
"nvidia-legacy": Runtime{
name: "nvidia-legacy",
Path: "/usr/bin/nvidia-container-runtime.legacy",
},
},
},
{
setAsDefault: true,
expectedDefaultRuntime: "nvidia",
expectedRuntimes: Runtimes{
"nvidia": Runtime{
name: "nvidia",
Path: "/usr/bin/nvidia-container-runtime",
SetAsDefault: true,
},
"nvidia-experimental": Runtime{
name: "nvidia-experimental",
Path: "/usr/bin/nvidia-container-runtime.experimental",
},
"nvidia-cdi": Runtime{
name: "nvidia-cdi",
Path: "/usr/bin/nvidia-container-runtime.cdi",
},
"nvidia-legacy": Runtime{
name: "nvidia-legacy",
Path: "/usr/bin/nvidia-container-runtime.legacy",
},
},
},
{
setAsDefault: true,
nvidiaRuntimeName: "nvidia",
expectedDefaultRuntime: "nvidia",
expectedRuntimes: Runtimes{
"nvidia": Runtime{
name: "nvidia",
Path: "/usr/bin/nvidia-container-runtime",
SetAsDefault: true,
},
"nvidia-experimental": Runtime{
name: "nvidia-experimental",
Path: "/usr/bin/nvidia-container-runtime.experimental",
},
"nvidia-cdi": Runtime{
name: "nvidia-cdi",
Path: "/usr/bin/nvidia-container-runtime.cdi",
},
"nvidia-legacy": Runtime{
name: "nvidia-legacy",
Path: "/usr/bin/nvidia-container-runtime.legacy",
},
},
},
{
setAsDefault: true,
nvidiaRuntimeName: "NAME",
expectedDefaultRuntime: "NAME",
expectedRuntimes: Runtimes{
"NAME": Runtime{
name: "NAME",
Path: "/usr/bin/nvidia-container-runtime",
SetAsDefault: true,
},
"nvidia-experimental": Runtime{
name: "nvidia-experimental",
Path: "/usr/bin/nvidia-container-runtime.experimental",
},
"nvidia-cdi": Runtime{
name: "nvidia-cdi",
Path: "/usr/bin/nvidia-container-runtime.cdi",
},
"nvidia-legacy": Runtime{
name: "nvidia-legacy",
Path: "/usr/bin/nvidia-container-runtime.legacy",
},
},
},
{
setAsDefault: false,
nvidiaRuntimeName: "NAME",
expectedRuntimes: Runtimes{
"NAME": Runtime{
name: "NAME",
Path: "/usr/bin/nvidia-container-runtime",
},
"nvidia-experimental": Runtime{
name: "nvidia-experimental",
Path: "/usr/bin/nvidia-container-runtime.experimental",
},
"nvidia-cdi": Runtime{
name: "nvidia-cdi",
Path: "/usr/bin/nvidia-container-runtime.cdi",
},
"nvidia-legacy": Runtime{
name: "nvidia-legacy",
Path: "/usr/bin/nvidia-container-runtime.legacy",
},
},
},
{
setAsDefault: true,
nvidiaRuntimeName: "nvidia-experimental",
expectedDefaultRuntime: "nvidia-experimental",
expectedRuntimes: Runtimes{
"nvidia": Runtime{
name: "nvidia",
Path: "/usr/bin/nvidia-container-runtime",
},
"nvidia-experimental": Runtime{
name: "nvidia-experimental",
Path: "/usr/bin/nvidia-container-runtime.experimental",
SetAsDefault: true,
},
"nvidia-cdi": Runtime{
name: "nvidia-cdi",
Path: "/usr/bin/nvidia-container-runtime.cdi",
},
"nvidia-legacy": Runtime{
name: "nvidia-legacy",
Path: "/usr/bin/nvidia-container-runtime.legacy",
},
},
},
{
setAsDefault: false,
nvidiaRuntimeName: "nvidia-experimental",
expectedRuntimes: Runtimes{
"nvidia": Runtime{
name: "nvidia",
Path: "/usr/bin/nvidia-container-runtime",
},
"nvidia-experimental": Runtime{
name: "nvidia-experimental",
Path: "/usr/bin/nvidia-container-runtime.experimental",
},
"nvidia-cdi": Runtime{
name: "nvidia-cdi",
Path: "/usr/bin/nvidia-container-runtime.cdi",
},
"nvidia-legacy": Runtime{
name: "nvidia-legacy",
Path: "/usr/bin/nvidia-container-runtime.legacy",
},
},
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
runtimes := GetRuntimes(
WithNvidiaRuntimeName(tc.nvidiaRuntimeName),
WithSetAsDefault(tc.setAsDefault),
WithRoot(tc.root),
)
require.EqualValues(t, tc.expectedRuntimes, runtimes)
require.Equal(t, tc.expectedDefaultRuntime, runtimes.DefaultRuntimeName())
})
}
}


@@ -21,31 +21,34 @@ import (
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/tools/container/operator"
log "github.com/sirupsen/logrus"
)
const (
nvidiaContainerRuntimeSource = "/usr/bin/nvidia-container-runtime"
nvidiaContainerRuntimeTarget = "nvidia-container-runtime.real"
nvidiaContainerRuntimeWrapper = "nvidia-container-runtime"
nvidiaContainerRuntimeSource = "/usr/bin/nvidia-container-runtime"
nvidiaExperimentalContainerRuntimeSource = "nvidia-container-runtime.experimental"
nvidiaExperimentalContainerRuntimeTarget = nvidiaExperimentalContainerRuntimeSource
nvidiaExperimentalContainerRuntimeWrapper = "nvidia-container-runtime-experimental"
nvidiaExperimentalContainerRuntimeSource = "nvidia-container-runtime.experimental"
)
// installContainerRuntimes sets up the NVIDIA container runtimes, copying the executables
// and implementing the required wrapper
func installContainerRuntimes(toolkitDir string, driverRoot string) error {
r := newNvidiaContainerRuntimeInstaller()
runtimes := operator.GetRuntimes()
for _, runtime := range runtimes {
if filepath.Base(runtime.Path) == nvidiaExperimentalContainerRuntimeSource {
continue
}
r := newNvidiaContainerRuntimeInstaller(runtime.Path)
_, err := r.install(toolkitDir)
if err != nil {
return fmt.Errorf("error installing NVIDIA container runtime: %v", err)
_, err := r.install(toolkitDir)
if err != nil {
return fmt.Errorf("error installing NVIDIA container runtime: %v", err)
}
}
// Install the experimental runtime and treat failures as non-fatal.
err = installExperimentalRuntime(toolkitDir, driverRoot)
err := installExperimentalRuntime(toolkitDir, driverRoot)
if err != nil {
log.Warnf("Could not install experimental runtime: %v", err)
}
@@ -70,25 +73,34 @@ func installExperimentalRuntime(toolkitDir string, driverRoot string) error {
return nil
}
func newNvidiaContainerRuntimeInstaller() *executable {
// newNvidiaContainerRuntimeInstaller returns a new executable installer for the NVIDIA container runtime.
// This installer copies the specified source executable to the toolkit directory.
// The executable is copied to a file with the same name as the source plus a ".real" suffix, and a wrapper
// with the original name is created to allow the runtime environment to be configured.
func newNvidiaContainerRuntimeInstaller(source string) *executable {
wrapperName := filepath.Base(source)
dotfileName := wrapperName + ".real"
target := executableTarget{
dotfileName: nvidiaContainerRuntimeTarget,
wrapperName: nvidiaContainerRuntimeWrapper,
dotfileName: dotfileName,
wrapperName: wrapperName,
}
return newRuntimeInstaller(nvidiaContainerRuntimeSource, target, nil)
return newRuntimeInstaller(source, target, nil)
}
func newNvidiaContainerRuntimeExperimentalInstaller(libraryRoot string) *executable {
source := nvidiaExperimentalContainerRuntimeSource
wrapperName := filepath.Base(source)
dotfileName := wrapperName + ".real"
target := executableTarget{
dotfileName: nvidiaExperimentalContainerRuntimeTarget,
wrapperName: nvidiaExperimentalContainerRuntimeWrapper,
dotfileName: dotfileName,
wrapperName: wrapperName,
}
env := make(map[string]string)
if libraryRoot != "" {
env["LD_LIBRARY_PATH"] = strings.Join([]string{libraryRoot, "$LD_LIBRARY_PATH"}, ":")
}
return newRuntimeInstaller(nvidiaExperimentalContainerRuntimeSource, target, env)
return newRuntimeInstaller(source, target, env)
}
func newRuntimeInstaller(source string, target executableTarget, env map[string]string) *executable {
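
Both installers above now derive the wrapper and `.real` file names from the source path instead of using dedicated constants. The naming rule can be sketched in isolation as follows. `installNames` is a hypothetical helper introduced here for illustration and does not appear in the change.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// installNames is a hypothetical helper that applies the naming rule from the
// diff: the wrapper keeps the base name of the source, and the copied binary
// gets an additional ".real" suffix.
func installNames(source string) (string, string) {
	wrapperName := filepath.Base(source)
	dotfileName := wrapperName + ".real"
	return wrapperName, dotfileName
}

func main() {
	for _, source := range []string{
		"/usr/bin/nvidia-container-runtime",
		"/usr/bin/nvidia-container-runtime.cdi",
	} {
		wrapper, dotfile := installNames(source)
		fmt.Printf("wrapper=%s dotfile=%s\n", wrapper, dotfile)
	}
}
```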


@@ -25,7 +25,7 @@ import (
)
func TestNvidiaContainerRuntimeInstallerWrapper(t *testing.T) {
r := newNvidiaContainerRuntimeInstaller()
r := newNvidiaContainerRuntimeInstaller(nvidiaContainerRuntimeSource)
const shebang = "#! /bin/sh"
const destFolder = "/dest/folder"


@@ -23,6 +23,10 @@ import (
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/system"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/transform"
"github.com/container-orchestrated-devices/container-device-interface/pkg/cdi"
toml "github.com/pelletier/go-toml"
log "github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
@@ -40,15 +44,33 @@ const (
)
type options struct {
DriverRoot string
DriverRoot string
DriverRootCtrPath string
ContainerRuntimeMode string
ContainerRuntimeDebug string
ContainerRuntimeLogLevel string
ContainerCLIDebug string
toolkitRoot string
ContainerRuntimeModesCdiDefaultKind string
ContainerRuntimeModesCDIAnnotationPrefixes cli.StringSlice
ContainerRuntimeRuntimes cli.StringSlice
ContainerRuntimeHookSkipModeDetection bool
ContainerCLIDebug string
toolkitRoot string
cdiEnabled bool
cdiOutputDir string
cdiKind string
cdiVendor string
cdiClass string
acceptNVIDIAVisibleDevicesWhenUnprivileged bool
acceptNVIDIAVisibleDevicesAsVolumeMounts bool
ignoreErrors bool
}
func main() {
@@ -99,23 +121,54 @@ func main() {
EnvVars: []string{"NVIDIA_DRIVER_ROOT"},
},
&cli.StringFlag{
Name: "nvidia-container-runtime-debug",
Name: "driver-root-ctr-path",
Value: DefaultNvidiaDriverRoot,
Destination: &opts.DriverRootCtrPath,
EnvVars: []string{"DRIVER_ROOT_CTR_PATH"},
},
&cli.StringFlag{
Name: "nvidia-container-runtime.debug",
Aliases: []string{"nvidia-container-runtime-debug"},
Usage: "Specify the location of the debug log file for the NVIDIA Container Runtime",
Destination: &opts.ContainerRuntimeDebug,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_DEBUG"},
},
&cli.StringFlag{
Name: "nvidia-container-runtime-debug-log-level",
Name: "nvidia-container-runtime.log-level",
Aliases: []string{"nvidia-container-runtime-debug-log-level"},
Destination: &opts.ContainerRuntimeLogLevel,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_LOG_LEVEL"},
},
&cli.StringFlag{
Name: "nvidia-container-runtime-mode",
Name: "nvidia-container-runtime.mode",
Aliases: []string{"nvidia-container-runtime-mode"},
Destination: &opts.ContainerRuntimeMode,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_MODE"},
},
&cli.StringFlag{
Name: "nvidia-container-cli-debug",
Name: "nvidia-container-runtime.modes.cdi.default-kind",
Destination: &opts.ContainerRuntimeModesCdiDefaultKind,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_MODES_CDI_DEFAULT_KIND"},
},
&cli.StringSliceFlag{
Name: "nvidia-container-runtime.modes.cdi.annotation-prefixes",
Destination: &opts.ContainerRuntimeModesCDIAnnotationPrefixes,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_MODES_CDI_ANNOTATION_PREFIXES"},
},
&cli.StringSliceFlag{
Name: "nvidia-container-runtime.runtimes",
Destination: &opts.ContainerRuntimeRuntimes,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_RUNTIMES"},
},
&cli.BoolFlag{
Name: "nvidia-container-runtime-hook.skip-mode-detection",
Value: true,
Destination: &opts.ContainerRuntimeHookSkipModeDetection,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_HOOK_SKIP_MODE_DETECTION"},
},
&cli.StringFlag{
Name: "nvidia-container-cli.debug",
Aliases: []string{"nvidia-container-cli-debug"},
Usage: "Specify the location of the debug log file for the NVIDIA Container CLI",
Destination: &opts.ContainerCLIDebug,
EnvVars: []string{"NVIDIA_CONTAINER_CLI_DEBUG"},
@@ -140,6 +193,33 @@ func main() {
Destination: &opts.toolkitRoot,
EnvVars: []string{"TOOLKIT_ROOT"},
},
&cli.BoolFlag{
Name: "cdi-enabled",
Aliases: []string{"enable-cdi"},
Usage: "enable the generation of a CDI specification",
Destination: &opts.cdiEnabled,
EnvVars: []string{"CDI_ENABLED", "ENABLE_CDI"},
},
&cli.StringFlag{
Name: "cdi-output-dir",
Usage: "the directory where the CDI output files are to be written. If this is set to '', no CDI specification is generated.",
Value: "/var/run/cdi",
Destination: &opts.cdiOutputDir,
EnvVars: []string{"CDI_OUTPUT_DIR"},
},
&cli.StringFlag{
Name: "cdi-kind",
Usage: "the kind (in the form vendor/class) to use for the generated CDI specification",
Value: "management.nvidia.com/gpu",
Destination: &opts.cdiKind,
EnvVars: []string{"CDI_KIND"},
},
&cli.BoolFlag{
Name: "ignore-errors",
Usage: "ignore errors when installing the NVIDIA Container Toolkit. This is used for testing purposes only.",
Hidden: true,
Destination: &opts.ignoreErrors,
},
}
// Update the subcommand flags with the common subcommand flags
@@ -158,6 +238,16 @@ func validateOptions(c *cli.Context, opts *options) error {
return fmt.Errorf("invalid --toolkit-root option: %v", opts.toolkitRoot)
}
vendor, class := cdi.ParseQualifier(opts.cdiKind)
if err := cdi.ValidateVendorName(vendor); err != nil {
return fmt.Errorf("invalid CDI vendor name: %v", err)
}
if err := cdi.ValidateClassName(class); err != nil {
return fmt.Errorf("invalid CDI class name: %v", err)
}
opts.cdiVendor = vendor
opts.cdiClass = class
return nil
}
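
The validation above splits the `cdi-kind` value into a vendor and a class before checking each part. A simplified sketch of that split, using only the standard library, is shown below. `splitKind` is a hypothetical stand-in: it does not reproduce the checks in `cdi.ValidateVendorName` or `cdi.ValidateClassName`, and rejecting empty parts is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// splitKind is a hypothetical helper that splits a kind such as
// "management.nvidia.com/gpu" at the first "/" into vendor and class.
// It returns an error if either part is missing (an assumption of this sketch).
func splitKind(kind string) (string, string, error) {
	vendor, class, found := strings.Cut(kind, "/")
	if !found || vendor == "" || class == "" {
		return "", "", fmt.Errorf("invalid kind %q: expected vendor/class", kind)
	}
	return vendor, class, nil
}

func main() {
	vendor, class, err := splitKind("management.nvidia.com/gpu")
	fmt.Println(vendor, class, err)

	_, _, err = splitKind("gpu")
	fmt.Println(err)
}
```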
@@ -178,44 +268,65 @@ func Install(cli *cli.Context, opts *options) error {
log.Infof("Removing existing NVIDIA container toolkit installation")
err := os.RemoveAll(opts.toolkitRoot)
if err != nil {
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error removing toolkit directory: %v", err)
} else if err != nil {
log.Errorf("Ignoring error: %v", fmt.Errorf("error removing toolkit directory: %v", err))
}
toolkitConfigDir := filepath.Join(opts.toolkitRoot, ".config", "nvidia-container-runtime")
toolkitConfigPath := filepath.Join(toolkitConfigDir, configFilename)
err = createDirectories(opts.toolkitRoot, toolkitConfigDir)
if err != nil {
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("could not create required directories: %v", err)
} else if err != nil {
log.Errorf("Ignoring error: %v", fmt.Errorf("could not create required directories: %v", err))
}
err = installContainerLibraries(opts.toolkitRoot)
if err != nil {
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container library: %v", err)
} else if err != nil {
log.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA container library: %v", err))
}
err = installContainerRuntimes(opts.toolkitRoot, opts.DriverRoot)
if err != nil {
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container runtime: %v", err)
} else if err != nil {
log.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA container runtime: %v", err))
}
nvidiaContainerCliExecutable, err := installContainerCLI(opts.toolkitRoot)
if err != nil {
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container CLI: %v", err)
} else if err != nil {
log.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA container CLI: %v", err))
}
_, err = installRuntimeHook(opts.toolkitRoot, toolkitConfigPath)
if err != nil {
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container runtime hook: %v", err)
} else if err != nil {
log.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA container runtime hook: %v", err))
}
err = installToolkitConfig(toolkitConfigPath, nvidiaContainerCliExecutable, opts)
if err != nil {
nvidiaCTKPath, err := installContainerToolkitCLI(opts.toolkitRoot)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA Container Toolkit CLI: %v", err)
} else if err != nil {
log.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA Container Toolkit CLI: %v", err))
}
err = installToolkitConfig(cli, toolkitConfigPath, nvidiaContainerCliExecutable, nvidiaCTKPath, opts)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container toolkit config: %v", err)
} else if err != nil {
log.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA container toolkit config: %v", err))
}
return nil
return generateCDISpec(opts, nvidiaCTKPath)
}
// installContainerLibraries locates and installs the libraries that are part of
@@ -268,10 +379,10 @@ func installLibrary(libName string, toolkitRoot string) error {
// installToolkitConfig installs the config file for the NVIDIA container toolkit ensuring
// that the settings are updated to match the desired install and nvidia driver directories.
func installToolkitConfig(toolkitConfigPath string, nvidiaContainerCliExecutablePath string, opts *options) error {
func installToolkitConfig(c *cli.Context, toolkitConfigPath string, nvidiaContainerCliExecutablePath string, nvidiaCTKPath string, opts *options) error {
log.Infof("Installing NVIDIA container toolkit config '%v'", toolkitConfigPath)
config, err := toml.LoadFile(nvidiaContainerToolkitConfigSource)
config, err := loadConfig(nvidiaContainerToolkitConfigSource)
if err != nil {
return fmt.Errorf("could not open source config file: %v", err)
}
@@ -282,39 +393,63 @@ func installToolkitConfig(toolkitConfigPath string, nvidiaContainerCliExecutable
}
defer targetConfig.Close()
// Set the options in the root toml table
config.Set("accept-nvidia-visible-devices-envvar-when-unprivileged", opts.acceptNVIDIAVisibleDevicesWhenUnprivileged)
config.Set("accept-nvidia-visible-devices-as-volume-mounts", opts.acceptNVIDIAVisibleDevicesAsVolumeMounts)
nvidiaContainerCliKey := func(p string) []string {
return []string{"nvidia-container-cli", p}
}
// Read the ldconfig path from the config as this may differ per platform
// On Ubuntu-based systems this ends in `.real`
ldconfigPath := fmt.Sprintf("%s", config.GetPath(nvidiaContainerCliKey("ldconfig")))
ldconfigPath := fmt.Sprintf("%s", config.GetDefault("nvidia-container-cli.ldconfig", "/sbin/ldconfig"))
// Use the driver run root as the root:
driverLdconfigPath := "@" + filepath.Join(opts.DriverRoot, strings.TrimPrefix(ldconfigPath, "@/"))
config.SetPath(nvidiaContainerCliKey("root"), opts.DriverRoot)
config.SetPath(nvidiaContainerCliKey("path"), nvidiaContainerCliExecutablePath)
config.SetPath(nvidiaContainerCliKey("ldconfig"), driverLdconfigPath)
// Set the debug options if selected
debugOptions := map[string]string{
"nvidia-container-runtime.debug": opts.ContainerRuntimeDebug,
"nvidia-container-runtime.log-level": opts.ContainerRuntimeLogLevel,
"nvidia-container-runtime.mode": opts.ContainerRuntimeMode,
"nvidia-container-cli.debug": opts.ContainerCLIDebug,
configValues := map[string]interface{}{
// Set the options in the root toml table
"accept-nvidia-visible-devices-envvar-when-unprivileged": opts.acceptNVIDIAVisibleDevicesWhenUnprivileged,
"accept-nvidia-visible-devices-as-volume-mounts": opts.acceptNVIDIAVisibleDevicesAsVolumeMounts,
// Set the nvidia-container-cli options
"nvidia-container-cli.root": opts.DriverRoot,
"nvidia-container-cli.path": nvidiaContainerCliExecutablePath,
"nvidia-container-cli.ldconfig": driverLdconfigPath,
// Set nvidia-ctk options
"nvidia-ctk.path": nvidiaCTKPath,
// Set the nvidia-container-runtime-hook options
"nvidia-container-runtime-hook.skip-mode-detection": opts.ContainerRuntimeHookSkipModeDetection,
}
for key, value := range debugOptions {
if value == "" {
for key, value := range configValues {
config.Set(key, value)
}
// Set the optional config options
optionalConfigValues := map[string]interface{}{
"nvidia-container-runtime.debug": opts.ContainerRuntimeDebug,
"nvidia-container-runtime.log-level": opts.ContainerRuntimeLogLevel,
"nvidia-container-runtime.mode": opts.ContainerRuntimeMode,
"nvidia-container-runtime.modes.cdi.annotation-prefixes": opts.ContainerRuntimeModesCDIAnnotationPrefixes,
"nvidia-container-runtime.modes.cdi.default-kind": opts.ContainerRuntimeModesCdiDefaultKind,
"nvidia-container-runtime.runtimes": opts.ContainerRuntimeRuntimes,
"nvidia-container-cli.debug": opts.ContainerCLIDebug,
}
for key, value := range optionalConfigValues {
if !c.IsSet(key) {
log.Infof("Skipping unset option: %v", key)
continue
}
if config.Get(key) != nil {
if value == nil {
log.Infof("Skipping option with nil value: %v", key)
continue
}
switch v := value.(type) {
case string:
if v == "" {
continue
}
case cli.StringSlice:
if len(v.Value()) == 0 {
continue
}
value = v.Value()
default:
log.Warnf("Unexpected type for option %v=%v: %T", key, value, v)
}
config.Set(key, value)
}
@@ -329,6 +464,29 @@ func installToolkitConfig(toolkitConfigPath string, nvidiaContainerCliExecutable
return nil
}
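
The loop over `optionalConfigValues` only writes a key when its value is non-empty, with the emptiness check depending on the value's type. A reduced sketch of that type switch is given below. `nonEmptyOptions` is a hypothetical helper, and plain `[]string` stands in for `cli.StringSlice` so the example needs no third-party imports; the `c.IsSet` check from the diff is omitted.

```go
package main

import "fmt"

// nonEmptyOptions is a hypothetical helper that keeps only the entries that
// carry a value: nil values, empty strings and empty slices are skipped.
func nonEmptyOptions(values map[string]interface{}) map[string]interface{} {
	result := make(map[string]interface{})
	for key, value := range values {
		switch v := value.(type) {
		case nil:
			continue
		case string:
			if v == "" {
				continue
			}
		case []string:
			if len(v) == 0 {
				continue
			}
		}
		result[key] = value
	}
	return result
}

func main() {
	options := map[string]interface{}{
		"nvidia-container-runtime.debug":     "",
		"nvidia-container-runtime.log-level": "debug",
		"nvidia-container-runtime.runtimes":  []string{"crun"},
	}
	for key, value := range nonEmptyOptions(options) {
		fmt.Printf("%s = %v\n", key, value)
	}
}
```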
func loadConfig(path string) (*toml.Tree, error) {
_, err := os.Stat(path)
if err == nil {
return toml.LoadFile(path)
} else if os.IsNotExist(err) {
return toml.TreeFromMap(nil)
}
return nil, err
}
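
`loadConfig` falls back to an empty config when the source file does not exist and reports any other error. The same load-or-empty shape is sketched below with only the standard library. `readOrEmpty` is a hypothetical helper that returns raw bytes rather than a TOML tree.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readOrEmpty is a hypothetical helper: it returns the file contents, an
// empty slice if the file does not exist, or the error for any other failure.
func readOrEmpty(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return []byte{}, nil
	}
	if err != nil {
		return nil, err
	}
	return data, nil
}

func main() {
	data, err := readOrEmpty("/nonexistent/config.toml")
	fmt.Println(len(data), err)
}
```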
// installContainerToolkitCLI installs the nvidia-ctk CLI executable and wrapper.
func installContainerToolkitCLI(toolkitDir string) (string, error) {
e := executable{
source: "/usr/bin/nvidia-ctk",
target: executableTarget{
dotfileName: "nvidia-ctk.real",
wrapperName: "nvidia-ctk",
},
}
return e.install(toolkitDir)
}
// installContainerCLI sets up the NVIDIA container CLI executable, copying the executable
// and implementing the required wrapper
func installContainerCLI(toolkitRoot string) (string, error) {
@@ -512,3 +670,58 @@ func createDirectories(dir ...string) error {
}
return nil
}
// generateCDISpec generates a CDI spec for use in management containers
func generateCDISpec(opts *options, nvidiaCTKPath string) error {
if !opts.cdiEnabled {
return nil
}
if opts.cdiOutputDir == "" {
log.Info("Skipping CDI spec generation (no output directory specified)")
return nil
}
log.Infof("Creating control device nodes at %v", opts.DriverRootCtrPath)
s, err := system.New()
if err != nil {
return fmt.Errorf("failed to create library: %v", err)
}
if err := s.CreateNVIDIAControlDeviceNodesAt(opts.DriverRootCtrPath); err != nil {
return fmt.Errorf("failed to create control device nodes: %v", err)
}
log.Info("Generating CDI spec for management containers")
cdilib, err := nvcdi.New(
nvcdi.WithMode(nvcdi.ModeManagement),
nvcdi.WithDriverRoot(opts.DriverRootCtrPath),
nvcdi.WithNVIDIACTKPath(nvidiaCTKPath),
nvcdi.WithVendor(opts.cdiVendor),
nvcdi.WithClass(opts.cdiClass),
)
if err != nil {
return fmt.Errorf("failed to create CDI library for management containers: %v", err)
}
spec, err := cdilib.GetSpec()
if err != nil {
return fmt.Errorf("failed to generate CDI spec for management containers: %v", err)
}
err = transform.NewRootTransformer(
opts.DriverRootCtrPath,
opts.DriverRoot,
).Transform(spec.Raw())
if err != nil {
return fmt.Errorf("failed to transform driver root in CDI spec: %v", err)
}
name, err := cdi.GenerateNameForSpec(spec.Raw())
if err != nil {
return fmt.Errorf("failed to generate CDI name for management containers: %v", err)
}
err = spec.Save(filepath.Join(opts.cdiOutputDir, name))
if err != nil {
return fmt.Errorf("failed to save CDI spec for management containers: %v", err)
}
return nil
}
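
`generateCDISpec` builds the spec against the driver root as seen from inside the container (`DriverRootCtrPath`) and then transforms it to the driver root on the host (`DriverRoot`). The idea of such a root transform on a single path can be sketched as below. `replaceRoot` is a hypothetical function; how `transform.NewRootTransformer` actually rewrites the spec is not shown in this diff, so this is only an illustration of the concept.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// replaceRoot is a hypothetical helper that rewrites path from the root
// `from` to the root `to`. Paths that are not under `from` are returned
// unchanged.
func replaceRoot(path, from, to string) string {
	from = filepath.Clean(from)
	// Require a full path-component match so that "/driver-rootX" is not
	// treated as being under "/driver-root".
	if path != from && !strings.HasPrefix(path, strings.TrimSuffix(from, "/")+"/") {
		return path
	}
	return filepath.Join(to, strings.TrimPrefix(path, from))
}

func main() {
	// The roots used here are example values only.
	fmt.Println(replaceRoot("/driver-root/usr/lib/libcuda.so", "/driver-root", "/run/nvidia/driver"))
	fmt.Println(replaceRoot("/dev/nvidiactl", "/driver-root", "/run/nvidia/driver"))
}
```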


@@ -14,7 +14,7 @@
LIB_NAME := nvidia-container-toolkit
LIB_VERSION := 1.13.0
LIB_TAG := rc.1
LIB_TAG := rc.3
# The package version is the combination of the library version and tag.
# If the tag is specified the two components are joined with a tilde (~).
@@ -30,9 +30,10 @@ NVIDIA_CONTAINER_RUNTIME_VERSION := 3.12.0
# Specify the expected libnvidia-container0 version for arm64-based ubuntu builds.
LIBNVIDIA_CONTAINER0_VERSION := 0.10.0+jetpack
CUDA_VERSION := 12.0.1
CUDA_VERSION := 12.1.0
GOLANG_VERSION := 1.18.8
GIT_COMMIT ?= $(shell git describe --match="" --dirty --long --always --abbrev=40 2> /dev/null || echo "")
GIT_COMMIT_SHORT ?= $(shell git rev-parse --short HEAD 2> /dev/null || echo "")
GIT_BRANCH ?= $(shell git rev-parse --abbrev-ref HEAD 2> /dev/null || echo "${GIT_COMMIT}")
SOURCE_DATE_EPOCH ?= $(shell git log -1 --format=%ct 2> /dev/null || echo "")