Compare commits

...

262 Commits

Author SHA1 Message Date
Evan Lezar
ced79e51ed Merge pull request #1131 from elezar/bump-release-v1.18.0-rc.1
Bump release v1.18.0 rc.1
2025-06-25 12:04:28 +02:00
Evan Lezar
d1e25abd6c [no-relnote] Add v1.18.0-rc.1 changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-25 12:02:19 +02:00
Evan Lezar
7d3defccd2 Bump version for v1.18.0-rc.1 release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 19:59:15 +02:00
Evan Lezar
c95d36db52 Merge pull request #947 from elezar/ensure-libcuda.so-in-ldcache
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Ensure that libcuda.so is in the ldcache
2025-06-24 19:55:51 +02:00
Evan Lezar
36950ba03f Merge pull request #1100 from elezar/pin-libnvidia-container-tools-version
Require matching version of libnvidia-container-tools
2025-06-24 19:55:10 +02:00
Evan Lezar
d1286bceed Merge pull request #763 from elezar/allow-config-override-by-envvar
Add support for specifying the config file path in an environment variable
2025-06-24 19:54:36 +02:00
Evan Lezar
2f204147f9 Merge pull request #1030 from elezar/add-driver-lib-root-envvar
Add envvar for libcuda.so parent dir to CDI spec
2025-06-24 14:00:46 +02:00
Evan Lezar
39975fc77b [no-relnote] Refactor ldconfig hooks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:53:20 +02:00
Evan Lezar
4bf7421a80 Add create-soname-symlinks hook
This change adds a create-soname-symlinks hook that can be used to ensure
that the soname symlinks for injected libraries exist in a container.

This is done by calling ldconfig -n -N for the directories containing the injected
libraries.

This also ensures that libcuda.so is present in the ldcache when the update-ldcache
hook is run.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:49:24 +02:00
Evan Lezar
f0ea60a28f Require matching version of libnvidia-container-tools
Since the nvidia-container-tools and libnvidia-container* packages
are now released at the same time with the same version, a more
restrictive version makes sense. Here we specifically require an
equal version for the nvidia-container-toolkit* packages and the
libnvidia-container* packages.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:26:31 +02:00
Evan Lezar
4bab94baa6 Add envvar for libcuda.so parent dir to CDI spec
This change adds an NVIDIA_CTK_LIBCUDA_DIR envvar to
a generated CDI specification. This reports where the `libcuda.so.*`
libraries will be injected into the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:24:50 +02:00
Evan Lezar
f642825ad4 Add EnvVar to Discover interface
This change adds environment variables to the Discover interface.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:24:50 +02:00
Evan Lezar
5bc2f50299 Merge pull request #1154 from elezar/switch-to-distroless
Switch to distroless Base image
2025-06-24 11:05:35 +02:00
Evan Lezar
60706815a5 Create /work/nvidia-toolkit symlink
This change ensures that a symlink from /work/nvidia-toolkit to
/work/nvidia-ctk-installer exists to allow GPU Operator versions
that override the entrypoint and assume nvidia-toolkit as the
original entrypoint.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-19 10:23:09 +02:00
Evan Lezar
69b0f0ba61 [no-relnote] Update release scripts for distroless
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-19 10:20:59 +02:00
Evan Lezar
7abf5fa6a4 Use Apache license for images
This change removes the NGC-DL-CONTAINER-LICENSE (since this
is not available in the distroless images) and includes the
repo's Apache LICENSE file in the image.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-19 10:20:59 +02:00
Evan Lezar
0dddd5cfd8 Merge pull request #910 from elezar/default-to-cdi
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Use just-in-time CDI spec generation by default in the NVIDIA Container Runtime
2025-06-18 23:40:44 +02:00
Evan Lezar
28ddc1454c Switch to golang distroless image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:39:26 +02:00
Evan Lezar
17c5d1dc87 Resolve to legacy by default in nvidia-container-runtime-hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:36:12 +02:00
Evan Lezar
6149592bf6 Default to jit-cdi mode in the nvidia runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:36:12 +02:00
Evan Lezar
d3ece78bc9 [no-relnote] Add RuntimeMode type
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:36:12 +02:00
Evan Lezar
980ca5d1bc Use functional options to construct runtime mode resolver
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:36:12 +02:00
Evan Lezar
76b71a5498 Merge pull request #1083 from elezar/bump-image-in-tests
[no-relnote] Use cuda 12.9.0 image in tests
2025-06-18 23:33:54 +02:00
Evan Lezar
5a1b4e7c1e [no-relnote] Fix tests for compat mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:22:08 +02:00
Evan Lezar
39fd15d273 [no-relnote] Use 550 driver in tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:18:21 +02:00
Evan Lezar
da6b849cf6 [no-relnote] Use cuda 12.9.0 image in tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:18:21 +02:00
Evan Lezar
849691d290 [no-relnote] Use NVIDIA_CTK_CONFIG_FILE_PATH in toolkit install
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:06:00 +02:00
Evan Lezar
f672d38aa5 [no-relnote] Rename config constants
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:02:35 +02:00
Evan Lezar
eb39b972a5 Add NVIDIA_CTK_CONFIG_FILE_PATH envvar
This change adds support for explicitly specifying the path
to the config file through an environment variable.
The NVIDIA_CTK_CONFIG_FILE_PATH variable is used for both the nvidia-container-runtime
and nvidia-container-runtime-hook.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:01:46 +02:00
Evan Lezar
d9c7ec9714 [no-relnote] Don't refer to target image distribution
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 14:32:54 +02:00
Evan Lezar
614e469dac Merge pull request #1153 from elezar/switch-to-ubi9
Switch to cuda ubi9 base image
2025-06-18 13:10:32 +02:00
Evan Lezar
aafe4d7ad0 Switch to cuda ubi9 base image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 11:58:23 +02:00
Evan Lezar
1f7c7ffec2 Merge pull request #1152 from elezar/remove-dist-tag
Update image tags for single image
2025-06-18 11:56:14 +02:00
Evan Lezar
7e1beb7aa6 [no-relnote] Update gitlab CI for new image names
We now release all images with vX.Y.Z and vX.Y.Z-packaging tags.

This change updates the gitlab CI to allow for this.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 11:51:52 +02:00
Evan Lezar
d560888f1f Use single version tag for image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-17 19:46:42 +02:00
Evan Lezar
81fb7bb9c1 Merge pull request #602 from elezar/use-ubi8-image
Use single image for deb and rpm-based systems
2025-06-17 15:09:47 +02:00
Evan Lezar
208896d87d Merge pull request #1130 from ArangoGutierrez/fix/1049
BUGFIX: modifier: respect GPU volume-mount device requests
2025-06-17 15:04:22 +02:00
Evan Lezar
82b62898bf [no-relnote] Make devicesFromEnvvars private
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-17 11:55:00 +02:00
Carlos Eduardo Arango Gutierrez
d03a06029a BUGFIX: modifier: respect GPU volume-mount device requests
The gated modifiers used to add support for GDS, Mofed, and CUDA Forward Comatibility
only check the NVIDIA_VISIBLE_DEVICES envvar to determine whether GPUs are requested
and modifications should be made. This means that use cases where volume mounts are
used to request devices are not supported.

This change ensures that device extraction is consistent for all use cases.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-17 11:55:00 +02:00
Evan Lezar
f4f7da65f1 Ensure consistent sorting of annotation devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-17 11:55:00 +02:00
Carlos Eduardo Arango Gutierrez
5fe7b06514 [no-relnote] new helper func visibleEnvVars
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-17 11:54:53 +02:00
Evan Lezar
5606caa5af Extract deb and rpm packages to single image
This change swithces to using a single image for the NVIDIA Container Toolkit contianer.
Here the contents of the architecture-specific deb and rpm packages are extracted
to a known root. These contents can then be installed using the updated installation
mechanism which has been updated to detect the source root based on the packaging type.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-16 19:15:56 +02:00
Evan Lezar
8149be09ac Merge pull request #1149 from NVIDIA/dependabot/go_modules/main/github.com/urfave/cli/v2-2.27.7
Bump github.com/urfave/cli/v2 from 2.27.6 to 2.27.7
2025-06-16 16:28:29 +02:00
Evan Lezar
d935648722 Merge pull request #1140 from NVIDIA/dependabot/go_modules/main/github.com/NVIDIA/go-nvlib-0.7.3
Bump github.com/NVIDIA/go-nvlib from 0.7.2 to 0.7.3
2025-06-16 16:26:57 +02:00
dependabot[bot]
1f43b71dd8 Bump github.com/urfave/cli/v2 from 2.27.6 to 2.27.7
Bumps [github.com/urfave/cli/v2](https://github.com/urfave/cli) from 2.27.6 to 2.27.7.
- [Release notes](https://github.com/urfave/cli/releases)
- [Changelog](https://github.com/urfave/cli/blob/main/docs/CHANGELOG.md)
- [Commits](https://github.com/urfave/cli/compare/v2.27.6...v2.27.7)

---
updated-dependencies:
- dependency-name: github.com/urfave/cli/v2
  dependency-version: 2.27.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-16 08:56:44 +00:00
Evan Lezar
b33d475ff3 Merge pull request #1145 from elezar/remove-docker-runc
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Remove docker-run as default runtime candidate
2025-06-13 19:27:08 +02:00
Evan Lezar
6359cc9919 Remove docker-run as default runtime candidate
This change removes docker-runc as the highest priority
default candidate for the low-level runtimes supported by
the nvidia-container-runtime.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-13 17:26:19 +02:00
Evan Lezar
4f9c860a37 Merge pull request #927 from elezar/disable-device-node-creation
Disable device node creation in CDI mode
2025-06-13 16:44:18 +02:00
Evan Lezar
bab9fdf607 Merge pull request #1144 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-710a0f1
Bump third_party/libnvidia-container from `6eda4d7` to `710a0f1`
2025-06-13 16:42:53 +02:00
Evan Lezar
cc7812470f Merge pull request #1143 from elezar/add-device-ids-to-getspec
Add device IDs to nvcdi.GetSpec API
2025-06-13 16:41:43 +02:00
Evan Lezar
bdcdcb7449 Merge pull request #1132 from elezar/make-cdi-device-extraction-consistent
Make CDI device requests consistent with other methods
2025-06-13 16:05:44 +02:00
Evan Lezar
8be03cfc41 [no-relnote] Ignore annotation devices for non-CDI modes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-13 15:26:48 +02:00
Evan Lezar
8650ca6533 [no-relnote] Move hookCreator initialisation for readability
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-13 15:00:51 +02:00
Evan Lezar
1bc2a9fee3 Return annotation devices from VisibleDevices
This change includes annotation devices in CUDA.VisibleDevices
with the highest priority. This allows for the CDI device
request extraction to be consistent across all request mechanisms.

Note that this does change behaviour in the following ways:
1. Annotations are considered when resolving the runtime mode.
2. Incorrectly formed device names in annotations are no longer treated as an error.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-13 14:56:08 +02:00
Evan Lezar
dc87dcf786 Make CDI device requests consistent with other methods
Following the refactoring of device request extraction, we can
now make CDI device requests consistent with other methods.

This change moves to using image.VisibleDevices instead of
separate calls to CDIDevicesFromMounts and VisibleDevicesFromEnvVar.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-13 14:34:02 +02:00
Evan Lezar
f17d424248 Construct container info once
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-13 14:05:51 +02:00
Evan Lezar
426186c992 Add logic to extract annotation device requests to image type
This change updates the image.CUDA type to also extract CDI
device requests. These are only relevant IF CDI prefixes are
specifically set.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-13 14:05:51 +02:00
Evan Lezar
6849ebd621 Add IsPrivileged function to CUDA container type
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-13 14:05:51 +02:00
dependabot[bot]
4a6685d3a8 Bump third_party/libnvidia-container from 6eda4d7 to 710a0f1
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `6eda4d7` to `710a0f1`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](6eda4d76c8...710a0f1304)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: 710a0f1304dacb9de06716a4e24160698952aa81
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-13 08:14:38 +00:00
Evan Lezar
2ccf67c40f [no-relnote] Remove test output file
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-12 16:34:08 +02:00
Evan Lezar
0134ba4250 Add device IDs to nvcdi.GetSpec API
This change allows device IDs to the specified in the GetSpec API.
This simplifies cases where CDI specs are being generated for specific
devices by ID.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-12 16:23:11 +02:00
dependabot[bot]
eab9cdf1c2 Bump github.com/NVIDIA/go-nvlib from 0.7.2 to 0.7.3
Bumps [github.com/NVIDIA/go-nvlib](https://github.com/NVIDIA/go-nvlib) from 0.7.2 to 0.7.3.
- [Release notes](https://github.com/NVIDIA/go-nvlib/releases)
- [Commits](https://github.com/NVIDIA/go-nvlib/compare/v0.7.2...v0.7.3)

---
updated-dependencies:
- dependency-name: github.com/NVIDIA/go-nvlib
  dependency-version: 0.7.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-11 12:58:33 +00:00
Evan Lezar
dba15acdcc Merge pull request #1135 from NVIDIA/dependabot/go_modules/tests/main/golang.org/x/crypto-0.39.0
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump golang.org/x/crypto from 0.38.0 to 0.39.0 in /tests
2025-06-11 14:57:38 +02:00
Evan Lezar
8339fb1ec3 Merge pull request #1139 from NVIDIA/dependabot/go_modules/main/github.com/NVIDIA/go-nvml-0.12.9-0
Bump github.com/NVIDIA/go-nvml from 0.12.4-1 to 0.12.9-0
2025-06-11 14:57:18 +02:00
Evan Lezar
b9d646c80d Merge pull request #1136 from NVIDIA/dependabot/docker/deployments/devel/main/golang-1.24.4
Bump golang from 1.24.3 to 1.24.4 in /deployments/devel
2025-06-11 14:53:57 +02:00
Evan Lezar
7380cff645 Merge pull request #1134 from NVIDIA/dependabot/go_modules/main/golang.org/x/mod-0.25.0
Bump golang.org/x/mod from 0.24.0 to 0.25.0
2025-06-11 14:53:34 +02:00
dependabot[bot]
f91736d832 Bump github.com/NVIDIA/go-nvml from 0.12.4-1 to 0.12.9-0
Bumps [github.com/NVIDIA/go-nvml](https://github.com/NVIDIA/go-nvml) from 0.12.4-1 to 0.12.9-0.
- [Release notes](https://github.com/NVIDIA/go-nvml/releases)
- [Commits](https://github.com/NVIDIA/go-nvml/compare/v0.12.4-1...v0.12.9-0)

---
updated-dependencies:
- dependency-name: github.com/NVIDIA/go-nvml
  dependency-version: 0.12.9-0
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-11 09:03:11 +00:00
dependabot[bot]
5ccac4da5a Bump golang from 1.24.3 to 1.24.4 in /deployments/devel
Bumps golang from 1.24.3 to 1.24.4.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-06 09:01:57 +00:00
dependabot[bot]
1aee45be2d Bump golang.org/x/crypto from 0.38.0 to 0.39.0 in /tests
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.38.0 to 0.39.0.
- [Commits](https://github.com/golang/crypto/compare/v0.38.0...v0.39.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.39.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-06 08:43:50 +00:00
dependabot[bot]
0d4a7f1d5a Bump golang.org/x/mod from 0.24.0 to 0.25.0
Bumps [golang.org/x/mod](https://github.com/golang/mod) from 0.24.0 to 0.25.0.
- [Commits](https://github.com/golang/mod/compare/v0.24.0...v0.25.0)

---
updated-dependencies:
- dependency-name: golang.org/x/mod
  dependency-version: 0.25.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-06 08:43:41 +00:00
Evan Lezar
27f5ec83de Merge pull request #1125 from elezar/vulkan-target-cpu
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Add discovery of arch-specific vulkan ICD
2025-06-05 14:09:50 +02:00
Carlos Eduardo Arango Gutierrez
0a3146f74f Merge pull request #1076 from ArangoGutierrez/refresh_cdi
Add nvidia-cdi-refresh service
2025-06-05 13:08:57 +02:00
Carlos Eduardo Arango Gutierrez
28add0a532 Merge pull request #1123 from ArangoGutierrez/cdi_generate_env_cli
Add EnvVars option for all nvidia-ctk cdi commands
2025-06-05 13:05:46 +02:00
Evan Lezar
b55255e31f Merge pull request #1110 from ArangoGutierrez/i/1049
Refactor extracting requested devices from the container image
2025-06-05 13:00:32 +02:00
Carlos Eduardo Arango Gutierrez
dede03f322 Refactor extracting requested devices from the container image
This change consolidates the logic for determining requested devices
from the container image. The logic for this has been integrated into
the image.CUDA type so that multiple implementations are not required.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Evan Lezar <elezar@nvidia.com>
2025-06-05 12:38:45 +02:00
Carlos Eduardo Arango Gutierrez
ce3e2c1ed5 Add EnvVars option for all nvidia-ctk cdi commands
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-06-05 11:22:35 +02:00
Carlos Eduardo Arango Gutierrez
a537d0323d Add nvidia-cdi-refresh service
Automatic regeneration of /var/run/cdi/nvidia.yaml
New units:
	•	nvidia-cdi-refresh.service – one-shot wrapper for
			nvidia-ctk cdi generate (adds sleep + required caps).
	•	nvidia-cdi-refresh.path   – fires on driver install/upgrade via
			modules.dep.bin changes.
Packaging
	•	RPM %post reloads systemd and enables the path unit on fresh
			installs.
	•	DEB postinst does the same (configure, skip on upgrade).

Result: CDI spec is always up to date

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-06-05 10:54:15 +02:00
Evan Lezar
2de997e25b Add discovery of arch-specific vulkan ICD
On some RPM-based platforms, the path of the Vulkan ICD file
include an architecture-specific infix to distinguish it from
other architectures. (Most notably x86_64 vs i686). This change
attempts to discover the arch-specific ICD file in addition to
the standard nvidia_icd.json.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-04 23:06:16 +02:00
Evan Lezar
e046d6ae79 Add disabled-device-node-modification hook to CDI spec
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
This hook is not added to management specs.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-04 19:08:48 +02:00
Evan Lezar
0c8723a93a Add a hook to disable device node creation in a container
If required, this hook creates a modified params file (with ModifyDeviceFiles: 0) in a tmpfs
and mounts this over /proc/driver/nvidia/params.

This prevents device node creation when running tools such as nvidia-smi. In general the
creation of these devices is cosmetic as a container does not have the required cgroup
access for the devices.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-04 19:08:48 +02:00
Evan Lezar
fdcd250362 Merge pull request #1129 from elezar/fix-deduplicate-driver-store-wsl
Minor cleanup of WSL2 CDI spec generation
2025-06-04 10:16:36 +02:00
Evan Lezar
b66d37bedb [no-relnote] Minor code cleanup in WSL2 discoverer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-03 23:43:17 +02:00
Evan Lezar
0c905d0de2 [no-relnote] Remove unneeded indirection
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-03 23:43:17 +02:00
Evan Lezar
0d0b56816e Remove redundant deduplication of search paths for WSL
The GetDriverStorePaths function is implemented so as to
remove duplicate driver store paths which means that the
additional deduplication (which had a bug) can be removed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-03 23:43:10 +02:00
Evan Lezar
d59fd3da11 Merge pull request #1077 from ArangoGutierrez/1074
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Add the ability to disable specific (or all) CDI hooks when generating a CDI specification
2025-06-03 16:03:18 +02:00
Carlos Eduardo Arango Gutierrez
6cf0248321 Added ability to disable specific (or all) CDI hooks
This change adds the ability to disabled specific (or all) CDI hooks to
both the nvidia-ctk cdi generate command and the nvcdi API.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-03 16:01:20 +02:00
Carlos Eduardo Arango Gutierrez
b4787511d2 Consolidate HookName functionality on internal/discover pkg
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-06-03 15:24:43 +02:00
Evan Lezar
890db82b46 Merge pull request #1127 from ArangoGutierrez/e2e/internal_runner
[no-relnote] E2E GitHub action to run with internal runner
2025-06-03 15:18:42 +02:00
Carlos Eduardo Arango Gutierrez
5915328be5 [no-relnote] E2E GitHub action to run with internal runner
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-06-03 12:47:35 +02:00
Evan Lezar
bb3a54f7f4 Merge pull request #1126 from elezar/bump-holodeck-0.2.12
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Bump NVIDIA/holodeck from 0.2.7 to 0.2.12
2025-06-03 10:55:50 +02:00
dependabot[bot]
a909914cd6 Bump NVIDIA/holodeck from 0.2.7 to 0.2.12
Bumps [NVIDIA/holodeck](https://github.com/nvidia/holodeck) from 0.2.7 to 0.2.12.
- [Release notes](https://github.com/nvidia/holodeck/releases)
- [Commits](https://github.com/nvidia/holodeck/compare/v0.2.7...v0.2.12)

---
updated-dependencies:
- dependency-name: NVIDIA/holodeck
  dependency-version: 0.2.12
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-03 10:11:47 +02:00
Evan Lezar
f973271da1 Merge pull request #1119 from elezar/use-public-runners
[no-relnote] Switch to public runners
2025-06-02 21:53:02 +02:00
Evan Lezar
535e023828 Merge pull request #1120 from elezar/remove-release-archive
[no-relnote] Remove release:archive CI step
2025-06-02 13:44:33 +02:00
Evan Lezar
f2cf3e8deb [no-relnote] Remove release:archive CI step
Remove unneeded release:archive internal CI step.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-30 17:10:03 +02:00
Evan Lezar
03e8b9e0f5 [no-relnote] Switch to public runners
Since we now use pr-copy-bot, we are not running image builds
from forks and as such the required secrets for pushing to the
ghcr are present.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-30 16:40:29 +02:00
Evan Lezar
450f73a046 Merge commit from fork
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Add `NVIDIA_CTK_DEBUG=false` to hook envs
2025-05-30 15:31:26 +02:00
Evan Lezar
479df7134a Add envvar to control debug logging in CDI hooks
This change allows hooks to be configured with debug logging. This
is currently only enabled for the hooks generated from the runtime.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-30 15:27:52 +02:00
Evan Lezar
19a83e3542 Merge pull request #1112 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-6eda4d7
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump third_party/libnvidia-container from `caf057b` to `6eda4d7`
2025-05-28 10:33:44 +02:00
dependabot[bot]
d2344cba34 Bump third_party/libnvidia-container from caf057b to 6eda4d7
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `caf057b` to `6eda4d7`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](caf057b009...6eda4d76c8)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: 6eda4d76c8c5f8fc174e4abca83e513fb4dd63b0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-28 08:16:30 +00:00
Evan Lezar
c8c22162b7 Merge pull request #958 from ArangoGutierrez/codecov
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
[no-relnote] Enable Coveralls for code coverage
2025-05-27 15:48:54 +02:00
Evan Lezar
ea9b8721c0 Merge pull request #1108 from NVIDIA/dependabot/github_actions/main/NVIDIA/holodeck-0.2.9
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Bump NVIDIA/holodeck from 0.2.7 to 0.2.9
2025-05-27 10:21:15 +02:00
dependabot[bot]
eaaa8536e4 Bump NVIDIA/holodeck from 0.2.7 to 0.2.9
Bumps [NVIDIA/holodeck](https://github.com/nvidia/holodeck) from 0.2.7 to 0.2.9.
- [Release notes](https://github.com/nvidia/holodeck/releases)
- [Commits](https://github.com/nvidia/holodeck/compare/v0.2.7...v0.2.9)

---
updated-dependencies:
- dependency-name: NVIDIA/holodeck
  dependency-version: 0.2.9
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-27 08:19:31 +00:00
Carlos Eduardo Arango Gutierrez
e955f65d8f [no-relnote] Enable Coveralls
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-23 13:53:45 +02:00
Evan Lezar
b934c68bef Merge pull request #1103 from elezar/reenable-nvsandboxutils
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Reenable nvsandboxutils for driver discovery
2025-05-23 11:38:46 +02:00
Evan Lezar
7bd65da91e Add FeatureFlags to the nvcdi API
This change adds support for feature flags to the nvcdi API.

A feature flag to disable nvsandboxutils is also added to allow
more flexibility in cases where this library causes issue.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-22 17:26:41 +02:00
Evan Lezar
872aa2fe1c Reenable nvsandboxutils for driver discovery
This change reenables nvsandboxutils for driver discovery. This
was disabled due to an error in a specific driver version (v565)
so as to not block the release of the DRA driver for ComputeDomains.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-22 17:26:40 +02:00
Carlos Eduardo Arango Gutierrez
be6a36c023 Merge pull request #1102 from ArangoGutierrez/i/1094
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Edit discover.mounts to have a deterministic output
2025-05-22 15:53:22 +02:00
Carlos Eduardo Arango Gutierrez
aaaa3c6275 Edit discover.mounts to have a deterministic output
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-22 15:25:03 +02:00
Evan Lezar
f93d96a0de Merge pull request #1090 from ArangoGutierrez/hookcreator
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Refactor the way we handle Hook Creation
2025-05-21 14:23:51 +02:00
Carlos Eduardo Arango Gutierrez
2a4cf4c0a0 [no-relnote] Update gitignore
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-21 10:19:52 +02:00
Carlos Eduardo Arango Gutierrez
cf3b9317ef Refactor the way we create CDI Hooks
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-21 10:19:47 +02:00
Evan Lezar
6ba25e7288 Merge pull request #1095 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-caf057b
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Bump third_party/libnvidia-container from `51a7f20` to `caf057b`
2025-05-20 20:31:49 +02:00
dependabot[bot]
296633d148 Bump third_party/libnvidia-container from 51a7f20 to caf057b
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `51a7f20` to `caf057b`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](51a7f20088...caf057b009)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: caf057b00987a6d874d519cc80b742c43faa859a
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-20 13:20:23 +00:00
Evan Lezar
ac8f190c99 Merge commit from fork
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Run update-ldcache in isolated namespaces
2025-05-16 15:15:21 +02:00
Evan Lezar
3c1f1a6519 Merge pull request #1086 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-51a7f20
Bump third_party/libnvidia-container from `d26524a` to `51a7f20`
2025-05-16 14:18:17 +02:00
dependabot[bot]
3ee5ff0aa2 Bump third_party/libnvidia-container from d26524a to 51a7f20
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `d26524a` to `51a7f20`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](d26524ab5d...51a7f20088)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: 51a7f20088dc0c3e7ddbb67629bf8e63b9130339
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-16 08:27:41 +00:00
Evan Lezar
6dfd63f4a8 Merge pull request #980 from elezar/add-rprivate-to-mount-options
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Add rprivate to CDI mount options
2025-05-16 07:53:39 +02:00
Evan Lezar
35e583b623 Merge pull request #1000 from elezar/ignore-unknown-hooks
Issue warning on unsupported CDI hook
2025-05-16 07:52:25 +02:00
Evan Lezar
7d71932d2a Merge pull request #1085 from elezar/add-security-md
[no-relnote] Add SECURITY.md to repo
2025-05-16 07:51:27 +02:00
Evan Lezar
d3ea72c440 [no-relnote] Add SECURITY.md to repo
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-15 16:38:43 +02:00
Evan Lezar
c0dda358a3 Issue warning on unsupported CDI hook
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
To allow for CDI hooks to be added gradually we provide a generic no-op hook
for unrecognised subcommands. This will log a warning instead of erroring out.

An unsupported hook could be the result of a CDI specification referring to a
new hook that is not yet supported by an older NVIDIA Container Toolkit
version or a hook that has been removed in newer version.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-15 14:05:19 +02:00
Evan Lezar
ec29b602c3 Run update-ldcache in isolated namespaces
This change uses the reexec package to run the update of the
ldcache in a container in a process with isolated namespaces.
Since the hook is invoked as a createContainer hook, these
namespaces are cloned from the container's namespaces.

In the reexec handler, we further isolate the proc filesystem,
mount the host ldconfig to a tmpfs, and pivot into the containers
root.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-15 12:45:49 +02:00
Carlos Eduardo Arango Gutierrez
241881f12f Merge pull request #1048 from ArangoGutierrez/updated_e2e
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
[no-relnote] Update E2E test suite
2025-05-14 12:27:01 +02:00
Carlos Eduardo Arango Gutierrez
eb40f240ac [no-relnote] Update E2E suite
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-14 11:22:14 +02:00
Evan Lezar
72b2ee9ce0 Merge pull request #1055 from elezar/add-cuda-compat-mode
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Add nvidia-container-cli.compat-mode config option
2025-05-13 21:56:18 +02:00
Evan Lezar
f4981f0876 Add cuda-compat-mode config option
This change adds an nvidia-container-runtime.modes.legacy.cuda-compat-mode
config option. This can be set to one of four values:

* ldconfig (default): the --cuda-compat-mode=ldconfig flag is passed to the nvidia-container-cli
* mount: the --cuda-compat-mode=mount flag is passed to the nvidia-conainer-cli
* disabled: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli
* hook: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli AND the
  enable-cuda-compat hook is used to provide forward compatibility.

Note that the disable-cuda-compat-lib-hook feature flag will prevent the enable-cuda-compat
hook from being used. This change also means that the allow-cuda-compat-libs-from-container
feature flag no longer has any effect.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-13 21:49:53 +02:00
Evan Lezar
2ec67033c0 Merge pull request #1081 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-d26524a
Bump third_party/libnvidia-container from `a198166` to `d26524a`
2025-05-13 21:49:21 +02:00
dependabot[bot]
f8eda79aaf Bump third_party/libnvidia-container from a198166 to d26524a
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `a198166` to `d26524a`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](a198166e1c...d26524ab5d)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: d26524ab5db96a55ae86033f53de50d3794fb547
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-13 19:48:20 +00:00
Evan Lezar
51504097d8 Merge pull request #1078 from elezar/add-thor-support
Fix mode detection on Thor-based systems
2025-05-13 21:33:25 +02:00
Evan Lezar
a4dc28bb3f Fix mode detection on Thor-based systems
This change updates github.com/NVIDIA/go-nvlib from v0.7.1 to v0.7.2
to allow Thor systems to be detected as Tegra-based. This allows fixes
automatic mode detection to work on these systems.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-13 21:25:11 +02:00
Evan Lezar
d0103aa6a3 Add rprivate to CDI mount options
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
This ensures that mount propagation is set to rprivate for
mounts from the host into the container. This aligns with the
default in docker.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-09 15:16:13 +02:00
Evan Lezar
adb5e6719d Merge pull request #1046 from elezar/resolve-ldcache-libs-on-arm64
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Fix resolution of libs in LDCache on ARM
2025-05-09 15:04:00 +02:00
Evan Lezar
0c254711e7 Merge pull request #1066 from NVIDIA/dependabot/docker/deployments/container/main/nvidia/cuda-12.9.0-base-ubuntu20.04
Bump nvidia/cuda from 12.8.1-base-ubuntu20.04 to 12.9.0-base-ubuntu20.04 in /deployments/container
2025-05-09 13:51:10 +02:00
Evan Lezar
27adebaa44 Merge pull request #1065 from elezar/skip-nill-discoverers
Skip nil discoverers in merge
2025-05-09 13:50:44 +02:00
dependabot[bot]
496cdb5463 Bump nvidia/cuda in /deployments/container
Bumps nvidia/cuda from 12.8.1-base-ubuntu20.04 to 12.9.0-base-ubuntu20.04.

---
updated-dependencies:
- dependency-name: nvidia/cuda
  dependency-version: 12.9.0-base-ubuntu20.04
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-08 08:38:12 +00:00
Evan Lezar
132c9afb6c Merge pull request #1063 from NVIDIA/dependabot/github_actions/main/slackapi/slack-github-action-2.1.0
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump slackapi/slack-github-action from 2.0.0 to 2.1.0
2025-05-07 17:19:08 +02:00
Evan Lezar
c879fb59c1 Merge pull request #1058 from NVIDIA/dependabot/github_actions/main/golangci/golangci-lint-action-8
Bump golangci/golangci-lint-action from 7 to 8
2025-05-07 16:54:45 +02:00
Evan Lezar
fbff2c4943 Merge pull request #1064 from NVIDIA/dependabot/docker/deployments/devel/main/golang-1.24.3
Bump golang from 1.24.2 to 1.24.3 in /deployments/devel
2025-05-07 16:54:18 +02:00
Evan Lezar
0c765c6536 Skip nil discoverers in merge
When constructing a list of discoverers using discover.Merge we
explicitly skip `nil` discoverers to simplify usage as we don't
have to explicitly check validity when processing the discoverers
in the list.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-07 12:51:38 +02:00
dependabot[bot]
0863749de3 Bump golang from 1.24.2 to 1.24.3 in /deployments/devel
Bumps golang from 1.24.2 to 1.24.3.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-07 08:53:04 +00:00
dependabot[bot]
a8ca8e91f2 Bump slackapi/slack-github-action from 2.0.0 to 2.1.0
Bumps [slackapi/slack-github-action](https://github.com/slackapi/slack-github-action) from 2.0.0 to 2.1.0.
- [Release notes](https://github.com/slackapi/slack-github-action/releases)
- [Commits](https://github.com/slackapi/slack-github-action/compare/v2.0.0...v2.1.0)

---
updated-dependencies:
- dependency-name: slackapi/slack-github-action
  dependency-version: 2.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-07 08:29:10 +00:00
Evan Lezar
cf395e765a Merge pull request #1061 from NVIDIA/dependabot/go_modules/main/golang.org/x/sys-0.33.0
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump golang.org/x/sys from 0.32.0 to 0.33.0
2025-05-06 14:32:35 +02:00
dependabot[bot]
f859c9a671 Bump golang.org/x/sys from 0.32.0 to 0.33.0
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.32.0 to 0.33.0.
- [Commits](https://github.com/golang/sys/compare/v0.32.0...v0.33.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.33.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-06 09:31:22 +00:00
Carlos Eduardo Arango Gutierrez
f50e815837 Merge pull request #1062 from NVIDIA/dependabot/go_modules/tests/main/golang.org/x/crypto-0.38.0
Bump golang.org/x/crypto from 0.37.0 to 0.38.0 in /tests
2025-05-06 11:30:11 +02:00
dependabot[bot]
ffcef4f9a8 Bump golang.org/x/crypto from 0.37.0 to 0.38.0 in /tests
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.37.0 to 0.38.0.
- [Commits](https://github.com/golang/crypto/compare/v0.37.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.38.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-06 08:42:33 +00:00
dependabot[bot]
194a1663ab Bump golangci/golangci-lint-action from 7 to 8
Bumps [golangci/golangci-lint-action](https://github.com/golangci/golangci-lint-action) from 7 to 8.
- [Release notes](https://github.com/golangci/golangci-lint-action/releases)
- [Commits](https://github.com/golangci/golangci-lint-action/compare/v7...v8)

---
updated-dependencies:
- dependency-name: golangci/golangci-lint-action
  dependency-version: '8'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-05 09:11:37 +00:00
Evan Lezar
51d603aec6 Merge pull request #1024 from NVIDIA/dependabot/go_modules/tests/main/golang.org/x/crypto-0.37.0
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump golang.org/x/crypto from 0.36.0 to 0.37.0 in /tests
2025-05-02 13:14:34 +02:00
Evan Lezar
3f9359eba2 Merge pull request #1056 from tariq1890/bump-runc-dep
Bump github.com/opencontainers/runc from v1.2.6 to v1.3.0
2025-05-02 13:14:13 +02:00
dependabot[bot]
574d204953 Bump golang.org/x/crypto from 0.36.0 to 0.37.0 in /tests
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.36.0 to 0.37.0.
- [Commits](https://github.com/golang/crypto/compare/v0.36.0...v0.37.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.37.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-30 13:52:05 +00:00
Evan Lezar
ca061bb4f0 Merge pull request #1026 from NVIDIA/dependabot/go_modules/tests/main/github.com/onsi/ginkgo/v2-2.23.4
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump github.com/onsi/ginkgo/v2 from 2.23.3 to 2.23.4 in /tests
2025-04-30 15:50:42 +02:00
Tariq Ibrahim
f7a415f480 bump runc go dep to v1.3.0
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
2025-04-29 19:01:38 -07:00
Evan Lezar
e6cd7a3b53 Fix resolution of libs in LDCache on ARM
Since we explicitly check for the architecture of the
libraries in the ldcache, we need to also check the architecture
flag against the ARM constants.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-23 14:28:28 +02:00
dependabot[bot]
9f6b45817b Bump github.com/onsi/ginkgo/v2 from 2.23.3 to 2.23.4 in /tests
Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.23.3 to 2.23.4.
- [Release notes](https://github.com/onsi/ginkgo/releases)
- [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/ginkgo/compare/v2.23.3...v2.23.4)

---
updated-dependencies:
- dependency-name: github.com/onsi/ginkgo/v2
  dependency-version: 2.23.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-22 08:27:11 +00:00
Evan Lezar
de3d736663 Merge pull request #1035 from NVIDIA/dependabot/go_modules/tests/golang.org/x/net-0.38.0
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump golang.org/x/net from 0.37.0 to 0.38.0 in /tests
2025-04-22 10:25:58 +02:00
Evan Lezar
e4e7c5d857 Merge pull request #1039 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-a198166
Bump third_party/libnvidia-container from `95d3e86` to `a198166`
2025-04-22 09:34:29 +02:00
dependabot[bot]
0620dfa6f9 Bump third_party/libnvidia-container from 95d3e86 to a198166
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `95d3e86` to `a198166`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](95d3e86522...a198166e1c)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: a198166e1c1166f4847598438115ea97dacc7a92
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-18 08:16:29 +00:00
Evan Lezar
6394e9e9e7 Merge pull request #1033 from JunAr7112/migrate_ngc_changes
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Updated .release:staging to stage images in nvstaging
2025-04-17 15:13:43 +02:00
Evan Lezar
a2e2a44516 Merge pull request #990 from elezar/refactor-toolkit-installer
Refactor toolkit installer
2025-04-17 15:08:39 +02:00
Arjun
6605bfb5fa Updated .release:staging to stage images in nvstaging
Signed-off-by: Arjun <arjun.gadiyar@gmail.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-17 14:02:33 +02:00
Evan Lezar
14806f019b Refactor toolkit installer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-17 13:56:58 +02:00
Evan Lezar
2437630421 [no-relnote] Explicitly use blank config
Since this is running in a contianer the contents of the
/etc/nvidia-container-runtime/config.toml file is equivalent to the
default config. This change makes it explicit.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-17 13:56:58 +02:00
Evan Lezar
cdad158f0f [no-relnote] Move toolkit installer package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-17 13:56:56 +02:00
Evan Lezar
baa4f907ab [no-relnote] Allow relative paths from operator structs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-17 13:56:14 +02:00
Evan Lezar
afc05f6713 Merge pull request #1004 from dschervov/main
Add support for building ubuntu22.04 on amd64
2025-04-17 13:54:08 +02:00
dependabot[bot]
abea8d375c Bump golang.org/x/net from 0.37.0 to 0.38.0 in /tests
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.37.0 to 0.38.0.
- [Commits](https://github.com/golang/net/compare/v0.37.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-version: 0.38.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-16 23:29:16 +00:00
Evan Lezar
af985c22ea Merge pull request #1025 from NVIDIA/dependabot/go_modules/main/golang.org/x/sys-0.32.0
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump golang.org/x/sys from 0.31.0 to 0.32.0
2025-04-09 15:47:53 +02:00
Evan Lezar
b4edc3e730 Merge pull request #1016 from elezar/allow-runtime-path
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Allow container runtime executable path to be specified
2025-04-08 17:35:19 +02:00
Evan Lezar
c57bdf3391 Allow container runtime executable path to be specified
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
This change adds support for specifying the container runtime
executable path. This can be used if, for example, there are
two containerd or crio executables and a specific one must be used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-07 17:28:11 +02:00
dependabot[bot]
4c241f22ef Bump golang.org/x/sys from 0.31.0 to 0.32.0
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.31.0 to 0.32.0.
- [Commits](https://github.com/golang/sys/compare/v0.31.0...v0.32.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.32.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-07 09:35:18 +00:00
Dmitrii Chervov
cd3fdd9f0f Add support for building ubuntu22.04 on arm64
Signed-off-by: Dmitrii Chervov <dschervov1@yandex.ru>
2025-04-06 18:06:36 +03:00
Evan Lezar
cb7605e132 [no-relnote] Remove unused runtimeConfigOverideJSON variable
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-04 10:57:12 +02:00
Evan Lezar
a2caaec99d Merge pull request #1014 from NVIDIA/dependabot/docker/deployments/devel/main/golang-1.24.2
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump golang from 1.24.1 to 1.24.2 in /deployments/devel
2025-04-03 15:52:17 +02:00
Carlos Eduardo Arango Gutierrez
7dc93d6c9a Merge pull request #1017 from NVIDIA/dependabot/go_modules/tests/main/github.com/onsi/gomega-1.37.0
Bump github.com/onsi/gomega from 1.36.2 to 1.37.0 in /tests
2025-04-03 14:25:08 +02:00
dependabot[bot]
aacfaed40b Bump github.com/onsi/gomega from 1.36.2 to 1.37.0 in /tests
Bumps [github.com/onsi/gomega](https://github.com/onsi/gomega) from 1.36.2 to 1.37.0.
- [Release notes](https://github.com/onsi/gomega/releases)
- [Changelog](https://github.com/onsi/gomega/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/gomega/compare/v1.36.2...v1.37.0)

---
updated-dependencies:
- dependency-name: github.com/onsi/gomega
  dependency-version: 1.37.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-03 08:30:59 +00:00
Evan Lezar
7833723be1 Merge pull request #737 from elezar/use-with-cache-for-mounts
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Fix race condition in mounts cache
2025-04-02 16:30:05 +02:00
Evan Lezar
986f3db971 Fix race condition in mounts cache
This change switches to using the WithCache decorator for
mounts instead of keeping track of a cache locally.

This addresses a race condition when using the mounts structure.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-02 16:29:49 +02:00
Evan Lezar
98deb7e4bc Merge pull request #1015 from NVIDIA/dependabot/github_actions/main/NVIDIA/holodeck-0.2.7
Bump NVIDIA/holodeck from 0.2.6 to 0.2.7
2025-04-02 16:28:42 +02:00
dependabot[bot]
3fbb6a8dc6 Bump NVIDIA/holodeck from 0.2.6 to 0.2.7
Bumps [NVIDIA/holodeck](https://github.com/nvidia/holodeck) from 0.2.6 to 0.2.7.
- [Release notes](https://github.com/nvidia/holodeck/releases)
- [Commits](https://github.com/nvidia/holodeck/compare/v0.2.6...v0.2.7)

---
updated-dependencies:
- dependency-name: NVIDIA/holodeck
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-02 12:25:55 +00:00
Evan Lezar
50a7e2fd15 Merge pull request #1009 from NVIDIA/dependabot/github_actions/main/golangci/golangci-lint-action-7
Bump golangci/golangci-lint-action from 6 to 7
2025-04-02 14:24:43 +02:00
dependabot[bot]
8ab6ab984e Bump golang from 1.24.1 to 1.24.2 in /deployments/devel
Bumps golang from 1.24.1 to 1.24.2.

---
updated-dependencies:
- dependency-name: golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-02 12:18:33 +00:00
Evan Lezar
0fb3eec1bb [no-relnote] Fix ST1005: error strings should not be capitalized lint errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-02 14:18:32 +02:00
Evan Lezar
3913a6392b [no-relnote] Fix QF1012: Use fmt.Fprintf(...) instead of WriteString(fmt.Sprintf(...)) lint errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-02 14:18:32 +02:00
Evan Lezar
80fb4dc0e9 [no-relnote] Fix ST1019: package tags.cncf.io/container-device-interface/specs-go is being imported more than once lint errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-02 14:18:32 +02:00
Evan Lezar
be9d7b6db1 [no-relnote] Fix QF1002: could use tagged switch on info.SubType lint errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-02 14:18:32 +02:00
Evan Lezar
c1c6534b1f [no-relnote] Fix QF1008: could remove embedded field lint errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-02 14:18:32 +02:00
Evan Lezar
bee1969bbf [no-relnote] Migrate golangci-lint config to v2
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-02 14:18:32 +02:00
dependabot[bot]
47fc33ea9b Bump golangci/golangci-lint-action from 6 to 7
Bumps [golangci/golangci-lint-action](https://github.com/golangci/golangci-lint-action) from 6 to 7.
- [Release notes](https://github.com/golangci/golangci-lint-action/releases)
- [Commits](https://github.com/golangci/golangci-lint-action/compare/v6...v7)

---
updated-dependencies:
- dependency-name: golangci/golangci-lint-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-02 14:18:32 +02:00
Evan Lezar
c1cf52d1f7 Merge pull request #1008 from NVIDIA/dependabot/go_modules/main/tags.cncf.io/container-device-interface-1.0.1
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Bump tags.cncf.io/container-device-interface from 1.0.0 to 1.0.1
2025-04-02 12:11:02 +02:00
Evan Lezar
1c845ea2d5 [no-relnote] Update CDI tests for new whitespace formatting
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-04-02 11:38:46 +02:00
dependabot[bot]
5549824559 Bump tags.cncf.io/container-device-interface from 1.0.0 to 1.0.1
Bumps [tags.cncf.io/container-device-interface](https://github.com/cncf-tags/container-device-interface) from 1.0.0 to 1.0.1.
- [Release notes](https://github.com/cncf-tags/container-device-interface/releases)
- [Changelog](https://github.com/cncf-tags/container-device-interface/blob/main/RELEASE.md)
- [Commits](https://github.com/cncf-tags/container-device-interface/compare/v1.0.0...v1.0.1)

---
updated-dependencies:
- dependency-name: tags.cncf.io/container-device-interface
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-24 09:08:27 +00:00
Dmitrii Chervov
35b471d833 Add support for building ubuntu22.04 on amd64
Signed-off-by: Dmitrii Chervov <dschervov1@yandex.ru>
2025-03-21 19:34:07 +03:00
Evan Lezar
e33f15a128 Merge pull request #1002 from elezar/improve-update-ldcache-args
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Fix update-ldcache arguments
2025-03-20 13:19:25 +02:00
Evan Lezar
77bf9157ab Merge pull request #989 from NVIDIA/dependabot/go_modules/tests/golang.org/x/net-0.36.0
Bump golang.org/x/net from 0.35.0 to 0.36.0 in /tests
2025-03-20 13:01:20 +02:00
Evan Lezar
995e56306d Fix update-ldcache arguments
This change updates how the lconfig arguments are constructed. This
makes the update-ldcache more robust and ensures that folders are
specified last if at al.

Checks are also included for empty container roots at the start of the
hook to simplify later checks.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-20 12:51:31 +02:00
dependabot[bot]
3f58c67fed Bump golang.org/x/net from 0.35.0 to 0.36.0 in /tests
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.35.0 to 0.36.0.
- [Commits](https://github.com/golang/net/compare/v0.35.0...v0.36.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-18 13:26:57 +02:00
Evan Lezar
e7a0067aae Merge pull request #998 from NVIDIA/dependabot/go_modules/main/github.com/opencontainers/runc-1.2.6
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump github.com/opencontainers/runc from 1.2.5 to 1.2.6
2025-03-18 13:26:18 +02:00
dependabot[bot]
002197d5b1 Bump github.com/opencontainers/runc from 1.2.5 to 1.2.6
Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/opencontainers/runc/releases)
- [Changelog](https://github.com/opencontainers/runc/blob/v1.2.6/CHANGELOG.md)
- [Commits](https://github.com/opencontainers/runc/compare/v1.2.5...v1.2.6)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/runc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-18 08:39:18 +00:00
Evan Lezar
a72050442e Merge pull request #992 from elezar/cleanup-nvidia-ctk-installer
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Cleanup nvidia ctk installer
2025-03-17 12:41:04 +02:00
Evan Lezar
75a30af36a Remove positional arguments from nvidia-ctk-installer
Parsing positional arguments require additional processing
instead of relying on named flags. This change switches to
using a named flag for specifying the toolkit installation directory.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-14 14:57:24 +02:00
Evan Lezar
89b99ff786 Remove deprecated --runtime-args from nvidia-ctk-installer
The GPU Operator no longer sets the RUNTIME_ARGS environment variable
to control the runtime and instead sets general environment variables.

This change removes support for setting RUNTIME_ARGS in the CLI as these
settings have no effect.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-14 14:57:24 +02:00
Evan Lezar
025407543f Merge pull request #993 from NVIDIA/dependabot/docker/deployments/container/main/nvidia/cuda-12.8.1-base-ubuntu20.04
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump nvidia/cuda from 12.8.0-base-ubuntu20.04 to 12.8.1-base-ubuntu20.04 in /deployments/container
2025-03-14 11:24:11 +02:00
dependabot[bot]
41fd83ab2b Bump nvidia/cuda in /deployments/container
Bumps nvidia/cuda from 12.8.0-base-ubuntu20.04 to 12.8.1-base-ubuntu20.04.

---
updated-dependencies:
- dependency-name: nvidia/cuda
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-14 08:20:16 +00:00
Evan Lezar
dd55eeecc9 Merge pull request #991 from elezar/rename-app-name
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Minor updates to nvidia-ctk-installer
2025-03-13 16:17:54 +02:00
Evan Lezar
62497870fa Add version info to nvidia-ctk-installer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-13 16:15:29 +02:00
Evan Lezar
eb932bef8a Update nvidia-ctk-installer app name to match binary name
This change updates the nvidia-ctk-installer app name to match the binary name.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-13 16:12:13 +02:00
Evan Lezar
e4547bdda6 Merge pull request #969 from elezar/allow-comma-separated-config-values
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Allow nvidia-ctk config --set to accept comma-separated lists
2025-03-12 16:21:52 +02:00
Evan Lezar
d32449b2d2 Allow nvidia-ctk config --set to accept comma-separated lists
The urfave update to v2.27.6 fixes the behaviour when disabling a separator
for repeated StringSliceFlags. This change updates the nvidia-ctk config
command to allow list options to be specified as comma-separated values.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-12 15:47:32 +02:00
Evan Lezar
63ed478ce9 Merge pull request #959 from NVIDIA/dependabot/docker/deployments/devel/main/golang-1.24.1
Bump golang from 1.24.0 to 1.24.1 in /deployments/devel
2025-03-12 14:05:19 +02:00
Evan Lezar
047225a9ae Merge pull request #987 from elezar/fix-mock-generation
[no-relnotes] Update moq to use rm and goimports
2025-03-12 13:44:02 +02:00
Evan Lezar
2a9bae8e80 [no-relnotes] Update moq to use rm and goimports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-12 13:12:13 +02:00
Evan Lezar
57f077fce7 Merge pull request #985 from elezar/fix-signing-container
[no-relnote] Use centos:stream9 for signing container
2025-03-12 12:45:57 +02:00
Evan Lezar
26101ea023 [no-relnote] Use centos:stream9 for signing container
The signing container need not be based on a legacy centos version.
This change updates the signing contianer to be centos:stream9 based.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-12 12:44:35 +02:00
Evan Lezar
64dd017801 Merge pull request #984 from NVIDIA/dependabot/go_modules/deployments/devel/main/github.com/golangci/golangci-lint-1.64.7
Bump github.com/golangci/golangci-lint from 1.64.6 to 1.64.7 in /deployments/devel
2025-03-12 11:49:00 +02:00
dependabot[bot]
37ba7a2801 Bump github.com/golangci/golangci-lint in /deployments/devel
Bumps [github.com/golangci/golangci-lint](https://github.com/golangci/golangci-lint) from 1.64.6 to 1.64.7.
- [Release notes](https://github.com/golangci/golangci-lint/releases)
- [Changelog](https://github.com/golangci/golangci-lint/blob/master/CHANGELOG.md)
- [Commits](https://github.com/golangci/golangci-lint/compare/v1.64.6...v1.64.7)

---
updated-dependencies:
- dependency-name: github.com/golangci/golangci-lint
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-12 08:20:59 +00:00
Evan Lezar
3df59b955a Merge pull request #978 from NVIDIA/dependabot/go_modules/main/tags.cncf.io/container-device-interface-1.0.0
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump tags.cncf.io/container-device-interface from 0.8.1 to 1.0.0
2025-03-10 10:45:38 +02:00
Evan Lezar
33280cd2b2 [no-relnote] Address stricter validation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-10 10:41:40 +02:00
Evan Lezar
3306d5081e [no-relnote] Output sorted specs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-10 10:41:40 +02:00
Evan Lezar
7c3ab75d08 [no-relnote] Update cdi.CurrentVersion reference
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-10 10:41:40 +02:00
dependabot[bot]
71985df972 Bump tags.cncf.io/container-device-interface from 0.8.1 to 1.0.0
Bumps [tags.cncf.io/container-device-interface](https://github.com/cncf-tags/container-device-interface) from 0.8.1 to 1.0.0.
- [Release notes](https://github.com/cncf-tags/container-device-interface/releases)
- [Changelog](https://github.com/cncf-tags/container-device-interface/blob/main/RELEASE.md)
- [Commits](https://github.com/cncf-tags/container-device-interface/compare/v0.8.1...v1.0.0)

---
updated-dependencies:
- dependency-name: tags.cncf.io/container-device-interface
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-10 10:41:40 +02:00
Evan Lezar
4255d73d89 Merge pull request #981 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-95d3e86
Bump third_party/libnvidia-container from `f23e5e5` to `95d3e86`
2025-03-10 10:40:48 +02:00
dependabot[bot]
9bdb74aec2 Bump third_party/libnvidia-container from f23e5e5 to 95d3e86
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `f23e5e5` to `95d3e86`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](f23e5e55ea...95d3e86522)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-10 08:34:52 +00:00
Evan Lezar
e436533a6f Merge pull request #968 from elezar/allow-hooks-disable
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Allow enable-cuda-compat hook to be disabled in CDI spec generation
2025-03-07 16:54:34 +02:00
Evan Lezar
0f299c3431 Disable enable-cuda-compat hook for management containers
Management containers don't generally need forward compatibility.
We disable the enable-cuda-compat hook to not include this in the
generated CDI specifications.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-07 16:54:09 +02:00
Evan Lezar
f852043078 Allow enable-cuda-compat hook to be disabled in CDI spec generation
This change adds support to the nvcdi package to opt out of specific hooks.

Currently only the `enable-cuda-compat` hook is supported. This allows clients to
generate a CDI spec that is compatible with older nvidia-cdi-hook CLIs.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-07 16:54:09 +02:00
Carlos Eduardo Arango Gutierrez
ef0b16bc24 Merge pull request #966 from NVIDIA/dependabot/go_modules/tests/main/golang.org/x/crypto-0.36.0
Bump golang.org/x/crypto from 0.35.0 to 0.36.0 in /tests
2025-03-07 15:40:44 +01:00
dependabot[bot]
225dfec83f Bump golang.org/x/crypto from 0.35.0 to 0.36.0 in /tests
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.35.0 to 0.36.0.
- [Commits](https://github.com/golang/crypto/compare/v0.35.0...v0.36.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-07 14:32:55 +00:00
Carlos Eduardo Arango Gutierrez
03c48a6824 Merge pull request #956 from NVIDIA/dependabot/go_modules/deployments/devel/main/github.com/golangci/golangci-lint-1.64.6
Bump github.com/golangci/golangci-lint from 1.64.5 to 1.64.6 in /deployments/devel
2025-03-07 15:16:06 +01:00
dependabot[bot]
6530826293 Bump github.com/golangci/golangci-lint in /deployments/devel
Bumps [github.com/golangci/golangci-lint](https://github.com/golangci/golangci-lint) from 1.64.5 to 1.64.6.
- [Release notes](https://github.com/golangci/golangci-lint/releases)
- [Changelog](https://github.com/golangci/golangci-lint/blob/master/CHANGELOG.md)
- [Commits](https://github.com/golangci/golangci-lint/compare/v1.64.5...v1.64.6)

---
updated-dependencies:
- dependency-name: github.com/golangci/golangci-lint
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-07 14:05:34 +00:00
Carlos Eduardo Arango Gutierrez
971fd195b3 Merge pull request #964 from NVIDIA/dependabot/go_modules/tests/main/github.com/onsi/ginkgo/v2-2.23.0
Bump github.com/onsi/ginkgo/v2 from 2.22.2 to 2.23.0 in /tests
2025-03-07 15:04:17 +01:00
dependabot[bot]
3b10afd0fe Bump github.com/onsi/ginkgo/v2 from 2.22.2 to 2.23.0 in /tests
Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.22.2 to 2.23.0.
- [Release notes](https://github.com/onsi/ginkgo/releases)
- [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/ginkgo/compare/v2.22.2...v2.23.0)

---
updated-dependencies:
- dependency-name: github.com/onsi/ginkgo/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-07 13:44:49 +00:00
Evan Lezar
6b7ed26fba Merge pull request #963 from NVIDIA/dependabot/go_modules/main/github.com/urfave/cli/v2-2.27.6
Bump github.com/urfave/cli/v2 from 2.27.5 to 2.27.6
2025-03-07 15:43:32 +02:00
dependabot[bot]
8d5f1e2427 Bump github.com/urfave/cli/v2 from 2.27.5 to 2.27.6
Bumps [github.com/urfave/cli/v2](https://github.com/urfave/cli) from 2.27.5 to 2.27.6.
- [Release notes](https://github.com/urfave/cli/releases)
- [Changelog](https://github.com/urfave/cli/blob/main/docs/CHANGELOG.md)
- [Commits](https://github.com/urfave/cli/compare/v2.27.5...v2.27.6)

---
updated-dependencies:
- dependency-name: github.com/urfave/cli/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-06 10:48:36 +00:00
Evan Lezar
d82a9ccd89 Merge pull request #961 from NVIDIA/dependabot/go_modules/main/golang.org/x/sys-0.31.0
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Bump golang.org/x/sys from 0.30.0 to 0.31.0
2025-03-06 12:47:14 +02:00
dependabot[bot]
8ac213e3e6 Bump golang.org/x/sys from 0.30.0 to 0.31.0
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.30.0 to 0.31.0.
- [Commits](https://github.com/golang/sys/compare/v0.30.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-06 09:39:17 +00:00
Evan Lezar
0128762832 Merge pull request #962 from NVIDIA/dependabot/go_modules/main/golang.org/x/mod-0.24.0
Bump golang.org/x/mod from 0.23.0 to 0.24.0
2025-03-06 11:38:07 +02:00
dependabot[bot]
d7b150a2e6 Bump golang.org/x/mod from 0.23.0 to 0.24.0
Bumps [golang.org/x/mod](https://github.com/golang/mod) from 0.23.0 to 0.24.0.
- [Commits](https://github.com/golang/mod/compare/v0.23.0...v0.24.0)

---
updated-dependencies:
- dependency-name: golang.org/x/mod
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-06 11:34:42 +02:00
Evan Lezar
57c917e3b1 [no-relnote] Use --exit-code instead of --quiet for mod check
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-03-06 11:33:48 +02:00
dependabot[bot]
ed8faa2d2e Bump golang from 1.24.0 to 1.24.1 in /deployments/devel
Bumps golang from 1.24.0 to 1.24.1.

---
updated-dependencies:
- dependency-name: golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-05 08:17:42 +00:00
Evan Lezar
bc9ec77fdd Merge pull request #943 from elezar/add-disable-imex-channels-feature
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Add ignore-imex-channel-requests feature flag
2025-02-28 17:53:28 +02:00
Evan Lezar
82f2eb7b73 Merge pull request #949 from NVIDIA/dependabot/go_modules/main/github.com/opencontainers/runtime-spec-1.2.1
Bump github.com/opencontainers/runtime-spec from 1.2.0 to 1.2.1
2025-02-28 14:49:09 +02:00
dependabot[bot]
712d829018 Bump github.com/opencontainers/runtime-spec from 1.2.0 to 1.2.1
Bumps [github.com/opencontainers/runtime-spec](https://github.com/opencontainers/runtime-spec) from 1.2.0 to 1.2.1.
- [Release notes](https://github.com/opencontainers/runtime-spec/releases)
- [Changelog](https://github.com/opencontainers/runtime-spec/blob/main/ChangeLog)
- [Commits](https://github.com/opencontainers/runtime-spec/compare/v1.2.0...v1.2.1)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/runtime-spec
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-02-28 12:01:30 +00:00
Evan Lezar
598b9740fc Merge pull request #941 from elezar/seal-ldconfig
Use memfd when running ldconfig
2025-02-28 14:00:23 +02:00
Evan Lezar
968e2ccca4 Merge pull request #906 from elezar/add-compat-lib-hook
Add CUDA forward compatibility hook
2025-02-27 17:25:19 +02:00
Evan Lezar
aff9301f2e Add disable-cuda-compat-lib-hook feature flag
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
011fb72330 Add basic integration tests for forward compat
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
2adef9903e Ensure that mode hook is executed last
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
70b1f5af98 Add enable-cuda-compat hook to CDI spec generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
c9422f12b3 [no-relnote] Add basic CDI generate test
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
b7fbd56f7e Add ldconfig hook in legacy mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
bd87c009ba Add enable-cuda-compat hook if required
This change adds the enable-cuda-compat hook to the incomming OCI runtime spec
if the allow-cuda-compat-libs-from-container feature flag is not enabled.

An update-ldcache hook is also injected to ensure that the required folders
are processed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
fc65d3a784 Add enable-cuda-compat hook to allow compat libs to be discovered
This change adds an nvidia-cdi-hook enable-cuda-compat hook that checks the
container for cuda compat libs and updates /etc/ld.so.conf.d to include their
parent folder if their driver major version is sufficient.

This allows CUDA Forward Compatibility to be used when this is not available
through the libnvidia-container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
52b9631333 Use libcontainer execseal to run ldconfig
This change copies ldconfig into a memfd before executing it from
the createContainer hook.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 13:52:27 +02:00
Evan Lezar
9429fbac5f [no-relnote] Move root to separate file
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 13:48:29 +02:00
Evan Lezar
04e9bf4ac1 Merge pull request #937 from NVIDIA/dependabot/go_modules/main/tags.cncf.io/container-device-interface-0.8.1
Bump tags.cncf.io/container-device-interface from 0.8.0 to 0.8.1
2025-02-27 11:22:09 +02:00
Evan Lezar
3ceaf1f85c Merge pull request #938 from NVIDIA/dependabot/go_modules/tests/main/golang.org/x/crypto-0.35.0
Bump golang.org/x/crypto from 0.33.0 to 0.35.0 in /tests
2025-02-27 11:12:45 +02:00
Evan Lezar
9f0c1042c4 Merge pull request #935 from elezar/disable-nvsandboxutils
Disable nvsandboxutils in nvcdi API
2025-02-27 11:07:42 +02:00
Evan Lezar
352b55c8ce Add ignore-imex-channel-requests feature flag
This allows the NVIDIA Container Toolkit to ignore IMEX channel requests
through the NVIDIA_IMEX_CHANNELS envvar or volume mounts and ensures that
the NVIDIA Container Toolkit cannot be used to provide out-of-band access
to an IMEX channel by simply specifying an environment variable, possibly
bypassing other checks by an orchestration system such as kubernetes.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-26 17:46:36 +02:00
Evan Lezar
b13139793b Disable nvsandboxutils in nvcdi API
Repeated calls to nvsandboxutils.Init and Shutdown are causing
segmentation violations. Here we disabled nvsandbox utils unless explicitly
specified.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-26 14:45:22 +02:00
dependabot[bot]
05f44b7752 Bump golang.org/x/crypto from 0.33.0 to 0.35.0 in /tests
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.33.0 to 0.35.0.
- [Commits](https://github.com/golang/crypto/compare/v0.33.0...v0.35.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-02-25 09:12:04 +00:00
dependabot[bot]
a109f28cb6 Bump tags.cncf.io/container-device-interface from 0.8.0 to 0.8.1
Bumps [tags.cncf.io/container-device-interface](https://github.com/cncf-tags/container-device-interface) from 0.8.0 to 0.8.1.
- [Release notes](https://github.com/cncf-tags/container-device-interface/releases)
- [Commits](https://github.com/cncf-tags/container-device-interface/compare/v0.8.0...v0.8.1)

---
updated-dependencies:
- dependency-name: tags.cncf.io/container-device-interface
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-02-25 09:11:59 +00:00
Evan Lezar
65b575fa96 Merge pull request #933 from elezar/move-wrapper
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
[no-relnote] Move nvcdi wrapper to separate file
2025-02-21 22:43:45 +02:00
Evan Lezar
6e413d8445 [no-relnote] Move nvcdi wrapper to separate file
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-20 23:23:22 +02:00
481 changed files with 27998 additions and 8102 deletions

View File

@@ -22,15 +22,7 @@ variables:
BUILD_MULTI_ARCH_IMAGES: "true"
stages:
- trigger
- image
- lint
- go-checks
- go-build
- unit-tests
- package-build
- image-build
- test
- pull
- scan
- release
- sign
@@ -53,108 +45,6 @@ workflow:
# We then add all the regular triggers
- !reference [.pipeline-trigger-rules, rules]
# The main or manual job is used to filter out distributions or architectures that are not required on
# every build.
.main-or-manual:
rules:
- !reference [.pipeline-trigger-rules, rules]
- if: $CI_PIPELINE_SOURCE == "schedule"
when: manual
# The trigger-pipeline job adds a manualy triggered job to the pipeline on merge requests.
trigger-pipeline:
stage: trigger
script:
- echo "starting pipeline"
rules:
- !reference [.main-or-manual, rules]
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
when: manual
allow_failure: false
- when: always
# Define the distribution targets
.dist-centos7:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: centos7
.dist-centos8:
variables:
DIST: centos8
.dist-ubi8:
rules:
- !reference [.main-or-manual, rules]
variables:
DIST: ubi8
.dist-ubuntu18.04:
variables:
DIST: ubuntu18.04
.dist-ubuntu20.04:
variables:
DIST: ubuntu20.04
.dist-packaging:
variables:
DIST: packaging
# Define architecture targets
.arch-aarch64:
variables:
ARCH: aarch64
.arch-amd64:
variables:
ARCH: amd64
.arch-arm64:
variables:
ARCH: arm64
.arch-ppc64le:
rules:
- !reference [.main-or-manual, rules]
variables:
ARCH: ppc64le
.arch-x86_64:
variables:
ARCH: x86_64
# Define the platform targets
.platform-amd64:
variables:
PLATFORM: linux/amd64
.platform-arm64:
variables:
PLATFORM: linux/arm64
# Define test helpers
.integration:
stage: test
variables:
IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
VERSION: "${CI_COMMIT_SHORT_SHA}"
before_script:
- apk add --no-cache make bash jq
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
- docker pull "${IMAGE_NAME}:${VERSION}-${DIST}"
script:
- make -f deployments/container/Makefile test-${DIST}
# Define the test targets
test-packaging:
extends:
- .integration
- .dist-packaging
needs:
- image-packaging
# Download the regctl binary for use in the release steps
.regctl-setup:
before_script:
@@ -164,87 +54,3 @@ test-packaging:
- curl -sSLo bin/regctl https://github.com/regclient/regclient/releases/download/${REGCTL_VERSION}/regctl-linux-amd64
- chmod a+x bin/regctl
- export PATH=$(pwd)/bin:${PATH}
# .release forms the base of the deployment jobs which push images to the CI registry.
# This is extended with the version to be deployed (e.g. the SHA or TAG) and the
# target os.
.release:
stage: release
variables:
# Define the source image for the release
IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
VERSION: "${CI_COMMIT_SHORT_SHA}"
# OUT_IMAGE_VERSION is overridden for external releases
OUT_IMAGE_VERSION: "${CI_COMMIT_SHORT_SHA}"
before_script:
- !reference [.regctl-setup, before_script]
# We ensure that the OUT_IMAGE_VERSION is set
- 'echo Version: ${OUT_IMAGE_VERSION} ; [[ -n "${OUT_IMAGE_VERSION}" ]] || exit 1'
# In the case where we are deploying a different version to the CI_COMMIT_SHA, we
# need to tag the image.
# Note: a leading 'v' is stripped from the version if present
- apk add --no-cache make bash
script:
# Log in to the "output" registry, tag the image and push the image
- 'echo "Logging in to CI registry ${CI_REGISTRY}"'
- regctl registry login "${CI_REGISTRY}" -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}"
- '[ ${CI_REGISTRY} = ${OUT_REGISTRY} ] || echo "Logging in to output registry ${OUT_REGISTRY}"'
- '[ ${CI_REGISTRY} = ${OUT_REGISTRY} ] || regctl registry login "${OUT_REGISTRY}" -u "${OUT_REGISTRY_USER}" -p "${OUT_REGISTRY_TOKEN}"'
# Since OUT_IMAGE_NAME and OUT_IMAGE_VERSION are set, this will push the CI image to the
# Target
- make -f deployments/container/Makefile push-${DIST}
# Define a staging release step that pushes an image to an internal "staging" repository
# This is triggered for all pipelines (i.e. not only tags) to test the pipeline steps
# outside of the release process.
.release:staging:
extends:
- .release
variables:
OUT_REGISTRY_USER: "${CI_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${CI_REGISTRY_PASSWORD}"
OUT_REGISTRY: "${CI_REGISTRY}"
OUT_IMAGE_NAME: "${CI_REGISTRY_IMAGE}/staging/container-toolkit"
# Define an external release step that pushes an image to an external repository.
# This includes a devlopment image off main.
.release:external:
extends:
- .release
variables:
FORCE_PUBLISH_IMAGES: "yes"
rules:
- if: $CI_COMMIT_TAG
variables:
OUT_IMAGE_VERSION: "${CI_COMMIT_TAG}"
- if: $CI_COMMIT_BRANCH == $RELEASE_DEVEL_BRANCH
variables:
OUT_IMAGE_VERSION: "${DEVEL_RELEASE_IMAGE_VERSION}"
# Define the release jobs
release:staging-ubi8:
extends:
- .release:staging
- .dist-ubi8
needs:
- image-ubi8
release:staging-ubuntu20.04:
extends:
- .release:staging
- .dist-ubuntu20.04
needs:
- test-toolkit-ubuntu20.04
- test-containerd-ubuntu20.04
- test-crio-ubuntu20.04
- test-docker-ubuntu20.04
release:staging-packaging:
extends:
- .release:staging
- .dist-packaging
needs:
- test-packaging

View File

@@ -55,7 +55,7 @@ jobs:
go-version: ${{ env.GOLANG_VERSION }}
- name: Set up Holodeck
uses: NVIDIA/holodeck@v0.2.6
uses: NVIDIA/holodeck@v0.2.12
with:
aws_access_key_id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
@@ -70,23 +70,28 @@ jobs:
- name: Run e2e tests
env:
IMAGE_NAME: ghcr.io/nvidia/container-toolkit
VERSION: ${{ inputs.version }}
SSH_KEY: ${{ secrets.AWS_SSH_KEY }}
E2E_INSTALL_CTK: "true"
E2E_IMAGE_NAME: ghcr.io/nvidia/container-toolkit
E2E_IMAGE_TAG: ${{ inputs.version }}
E2E_SSH_USER: ${{ secrets.E2E_SSH_USER }}
E2E_SSH_HOST: ${{ steps.holodeck_public_dns_name.outputs.result }}
E2E_INSTALL_CTK: "true"
run: |
e2e_ssh_key=$(mktemp)
echo "$SSH_KEY" > "$e2e_ssh_key"
echo "${{ secrets.AWS_SSH_KEY }}" > "$e2e_ssh_key"
chmod 600 "$e2e_ssh_key"
export E2E_SSH_KEY="$e2e_ssh_key"
make -f tests/e2e/Makefile test
- name: Archive Ginkgo logs
uses: actions/upload-artifact@v4
with:
name: ginkgo-logs
path: ginkgo.json
retention-days: 15
- name: Send Slack alert notification
if: ${{ failure() }}
uses: slackapi/slack-github-action@v2.0.0
uses: slackapi/slack-github-action@v2.1.0
with:
method: chat.postMessage
token: ${{ secrets.SLACK_BOT_TOKEN }}
@@ -94,5 +99,5 @@ jobs:
channel: ${{ secrets.SLACK_CHANNEL_ID }}
text: |
:x: On repository ${{ github.repository }}, the Workflow *${{ github.workflow }}* has failed.
Details: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}

View File

@@ -30,54 +30,73 @@ jobs:
steps:
- uses: actions/checkout@v4
name: Checkout code
- name: Get Golang version
id: vars
run: |
GOLANG_VERSION=$(./hack/golang-version.sh)
echo "GOLANG_VERSION=${GOLANG_VERSION##GOLANG_VERSION := }" >> $GITHUB_ENV
- name: Install Go
uses: actions/setup-go@v5
with:
go-version: ${{ env.GOLANG_VERSION }}
- name: Lint
uses: golangci/golangci-lint-action@v6
uses: golangci/golangci-lint-action@v8
with:
version: latest
args: -v --timeout 5m
skip-cache: true
- name: Check golang modules
run: |
make check-vendor
make -C deployments/devel check-modules
test:
name: Unit test
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Get Golang version
id: vars
run: |
GOLANG_VERSION=$(./hack/golang-version.sh)
echo "GOLANG_VERSION=${GOLANG_VERSION##GOLANG_VERSION := }" >> $GITHUB_ENV
- name: Install Go
uses: actions/setup-go@v5
with:
go-version: ${{ env.GOLANG_VERSION }}
- run: make test
- name: Run unit tests and generate coverage report
run: make coverage
- name: Upload to Coveralls
uses: coverallsapp/github-action@v2
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
file: coverage.out
build:
name: Build
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Get Golang version
id: vars
run: |
GOLANG_VERSION=$(./hack/golang-version.sh)
echo "GOLANG_VERSION=${GOLANG_VERSION##GOLANG_VERSION ?= }" >> $GITHUB_ENV
- name: Install Go
uses: actions/setup-go@v5
with:
go-version: ${{ env.GOLANG_VERSION }}
- run: make build

View File

@@ -27,7 +27,7 @@ on:
jobs:
packages:
runs-on: linux-amd64-cpu4
runs-on: ubuntu-latest
strategy:
matrix:
target:
@@ -49,7 +49,7 @@ jobs:
- ispr: true
target: centos8-ppc64le
fail-fast: false
steps:
- uses: actions/checkout@v4
name: Check out code
@@ -76,18 +76,12 @@ jobs:
path: ${{ github.workspace }}/dist/*
image:
runs-on: linux-amd64-cpu4
runs-on: ubuntu-latest
strategy:
matrix:
dist:
- ubuntu20.04
- ubi8
target:
- application
- packaging
ispr:
- ${{ github.ref_name != 'main' && !startsWith( github.ref_name, 'release-' ) }}
exclude:
- ispr: true
dist: ubi8
needs: packages
steps:
- uses: actions/checkout@v4
@@ -123,4 +117,4 @@ jobs:
BUILD_MULTI_ARCH_IMAGES: ${{ inputs.build_multi_arch_images }}
run: |
echo "${VERSION}"
make -f deployments/container/Makefile build-${{ matrix.dist }}
make -f deployments/container/Makefile build-${{ matrix.target }}

8
.gitignore vendored
View File

@@ -4,10 +4,8 @@
*.swo
/coverage.out*
/tests/output/
/nvidia-container-runtime
/nvidia-container-runtime.*
/nvidia-container-runtime-hook
/nvidia-container-toolkit
/nvidia-ctk
/nvidia-*
/shared-*
/release-*
/bin
/toolkit-test

View File

@@ -1,228 +0,0 @@
# Copyright (c) 2019-2022, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
include:
- .common-ci.yml
# Define the package build helpers
.multi-arch-build:
before_script:
- apk add --no-cache coreutils build-base sed git bash make
- '[[ -n "${SKIP_QEMU_SETUP}" ]] || docker run --rm --privileged multiarch/qemu-user-static --reset -p yes -c yes'
.package-artifacts:
variables:
ARTIFACTS_NAME: "toolkit-container-${CI_PIPELINE_ID}"
ARTIFACTS_ROOT: "toolkit-container-${CI_PIPELINE_ID}"
DIST_DIR: ${CI_PROJECT_DIR}/${ARTIFACTS_ROOT}
.package-build:
extends:
- .multi-arch-build
- .package-artifacts
stage: package-build
timeout: 3h
script:
- ./scripts/build-packages.sh ${DIST}-${ARCH}
artifacts:
name: ${ARTIFACTS_NAME}
paths:
- ${ARTIFACTS_ROOT}
needs:
- job: package-meta-packages
artifacts: true
# Define the package build targets
package-meta-packages:
extends:
- .package-artifacts
stage: package-build
variables:
SKIP_LIBNVIDIA_CONTAINER: "yes"
SKIP_NVIDIA_CONTAINER_TOOLKIT: "yes"
parallel:
matrix:
- PACKAGING: [deb, rpm]
before_script:
- apk add --no-cache coreutils build-base sed git bash make
script:
- ./scripts/build-packages.sh ${PACKAGING}
artifacts:
name: ${ARTIFACTS_NAME}
paths:
- ${ARTIFACTS_ROOT}
package-centos7-aarch64:
extends:
- .package-build
- .dist-centos7
- .arch-aarch64
package-centos7-x86_64:
extends:
- .package-build
- .dist-centos7
- .arch-x86_64
package-centos8-ppc64le:
extends:
- .package-build
- .dist-centos8
- .arch-ppc64le
package-ubuntu18.04-amd64:
extends:
- .package-build
- .dist-ubuntu18.04
- .arch-amd64
package-ubuntu18.04-arm64:
extends:
- .package-build
- .dist-ubuntu18.04
- .arch-arm64
package-ubuntu18.04-ppc64le:
extends:
- .package-build
- .dist-ubuntu18.04
- .arch-ppc64le
.buildx-setup:
before_script:
- export BUILDX_VERSION=v0.6.3
- apk add --no-cache curl
- mkdir -p ~/.docker/cli-plugins
- curl -sSLo ~/.docker/cli-plugins/docker-buildx "https://github.com/docker/buildx/releases/download/${BUILDX_VERSION}/buildx-${BUILDX_VERSION}.linux-amd64"
- chmod a+x ~/.docker/cli-plugins/docker-buildx
- docker buildx create --use --platform=linux/amd64,linux/arm64
- '[[ -n "${SKIP_QEMU_SETUP}" ]] || docker run --rm --privileged multiarch/qemu-user-static --reset -p yes'
# Define the image build targets
.image-build:
stage: image-build
variables:
IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
VERSION: "${CI_COMMIT_SHORT_SHA}"
PUSH_ON_BUILD: "true"
before_script:
- !reference [.buildx-setup, before_script]
- apk add --no-cache bash make git
- 'echo "Logging in to CI registry ${CI_REGISTRY}"'
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
script:
- make -f deployments/container/Makefile build-${DIST}
image-ubi8:
extends:
- .image-build
- .package-artifacts
- .dist-ubi8
needs:
# Note: The ubi8 image uses the centos7 packages
- package-centos7-aarch64
- package-centos7-x86_64
image-ubuntu20.04:
extends:
- .image-build
- .package-artifacts
- .dist-ubuntu20.04
needs:
- package-ubuntu18.04-amd64
- package-ubuntu18.04-arm64
- job: package-ubuntu18.04-ppc64le
optional: true
# The DIST=packaging target creates an image containing all built packages
image-packaging:
extends:
- .image-build
- .package-artifacts
- .dist-packaging
needs:
- job: package-ubuntu18.04-amd64
- job: package-ubuntu18.04-arm64
- job: package-amazonlinux2-aarch64
optional: true
- job: package-amazonlinux2-x86_64
optional: true
- job: package-centos7-aarch64
optional: true
- job: package-centos7-x86_64
optional: true
- job: package-centos8-ppc64le
optional: true
- job: package-debian10-amd64
optional: true
- job: package-opensuse-leap15.1-x86_64
optional: true
- job: package-ubuntu18.04-ppc64le
optional: true
# Define publish test helpers
.test:docker:
extends:
- .integration
variables:
TEST_CASES: "docker"
.test:containerd:
# TODO: The containerd tests fail due to issues with SIGHUP.
# Until this is resolved with retry up to twice and allow failure here.
retry: 2
allow_failure: true
extends:
- .integration
variables:
TEST_CASES: "containerd"
.test:crio:
extends:
- .integration
variables:
TEST_CASES: "crio"
# Define the test targets
test-toolkit-ubuntu20.04:
extends:
- .test:toolkit
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04
test-containerd-ubuntu20.04:
extends:
- .test:containerd
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04
test-crio-ubuntu20.04:
extends:
- .test:crio
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04
test-docker-ubuntu20.04:
extends:
- .test:docker
- .dist-ubuntu20.04
needs:
- image-ubuntu20.04

View File

@@ -1,43 +1,72 @@
run:
timeout: 10m
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
version: "2"
linters:
enable:
- contextcheck
- gocritic
- gosec
- misspell
- unconvert
exclusions:
generated: lax
presets:
- comments
- common-false-positives
- legacy
- std-error-handling
rules:
# Exclude the gocritic dupSubExpr issue for cgo files.
- linters:
- gocritic
path: internal/dxcore/dxcore.go
text: dupSubExpr
# Exclude the checks for usage of returns to config.Delete(Path) in the
# crio and containerd config packages.
- linters:
- errcheck
path: pkg/config/engine/
text: config.Delete
# RENDERD refers to the Render Device and not the past tense of render.
- linters:
- misspell
path: .*.go
text: '`RENDERD` is a misspelling of `RENDERED`'
# The legacy hook relies on spec.Hooks.Prestart, which is deprecated as of
# the v1.2.0 OCI runtime spec.
- path: (.+)\.go$
text: SA1019:(.+).Prestart is deprecated(.+)
# TODO: We should address each of the following integer overflows.
- path: (.+)\.go$
text: 'G115: integer overflow conversion(.+)'
paths:
- third_party$
- builtin$
- examples$
formatters:
enable:
- gofmt
- goimports
- gosec
- gosimple
- govet
- ineffassign
- misspell
- staticcheck
- unconvert
linters-settings:
goimports:
local-prefixes: github.com/NVIDIA/nvidia-container-toolkit
issues:
exclude:
# The legacy hook relies on spec.Hooks.Prestart, which is deprecated as of the v1.2.0 OCI runtime spec.
- "SA1019:(.+).Prestart is deprecated(.+)"
# TODO: We should address each of the following integer overflows.
- "G115: integer overflow conversion(.+)"
exclude-rules:
# Exclude the gocritic dupSubExpr issue for cgo files.
- path: internal/dxcore/dxcore.go
linters:
- gocritic
text: dupSubExpr
# Exclude the checks for usage of returns to config.Delete(Path) in the crio and containerd config packages.
- path: pkg/config/engine/
linters:
- errcheck
text: config.Delete
# RENDERD refers to the Render Device and not the past tense of render.
- path: .*.go
linters:
- misspell
text: "`RENDERD` is a misspelling of `RENDERED`"
settings:
goimports:
local-prefixes:
- github.com/NVIDIA/nvidia-container-toolkit
exclusions:
generated: lax
paths:
- third_party$
- builtin$
- examples$

View File

@@ -39,19 +39,62 @@ variables:
KITMAKER_RELEASE_FOLDER: "kitmaker"
PACKAGE_ARCHIVE_RELEASE_FOLDER: "releases"
.image-pull:
stage: image-build
# .copy-images copies the required application and packaging images from the
# IN_IMAGE="${IN_IMAGE_NAME}:${IN_IMAGE_TAG}${TAG_SUFFIX}"
# to
# OUT_IMAGE="${OUT_IMAGE_NAME}:${OUT_IMAGE_TAG}${TAG_SUFFIX}"
# The script also logs into IN_REGISTRY and OUT_REGISTRY using the supplied
# username and tokens.
.copy-images:
parallel:
matrix:
- TAG_SUFFIX: ["", "-packaging"]
before_script:
- !reference [.regctl-setup, before_script]
- apk add --no-cache make bash
variables:
REGCTL: regctl
script:
- |
if [ -n ${IN_REGISTRY} ] && [ -n ${IN_REGISTRY_USER} ]; then
echo "Logging in to ${IN_REGISTRY}"
${REGCTL} registry login "${IN_REGISTRY}" -u "${IN_REGISTRY_USER}" -p "${IN_REGISTRY_TOKEN}" || exit 1
fi
if [ -n ${OUT_REGISTRY} ] && [ -n ${OUT_REGISTRY_USER} ] && [ "${IN_REGISTRY}" != "${OUT_REGISTRY}" ]; then
echo "Logging in to ${OUT_REGISTRY}"
${REGCTL} registry login "${OUT_REGISTRY}" -u "${OUT_REGISTRY_USER}" -p "${OUT_REGISTRY_TOKEN}" || exit 1
fi
export IN_IMAGE="${IN_IMAGE_NAME}:${IN_IMAGE_TAG}${TAG_SUFFIX}"
export OUT_IMAGE="${OUT_IMAGE_NAME}:${OUT_IMAGE_TAG}${TAG_SUFFIX}"
echo "Copying ${IN_IMAGE} to ${OUT_IMAGE}"
${REGCTL} image copy ${IN_IMAGE} ${OUT_IMAGE}
# pull-images pulls images from the public CI registry to the internal CI registry.
pull-images:
extends:
- .copy-images
stage: pull
variables:
IN_REGISTRY: "${STAGING_REGISTRY}"
IN_IMAGE_NAME: container-toolkit
IN_VERSION: "${STAGING_VERSION}"
IN_IMAGE_NAME: ${STAGING_REGISTRY}/container-toolkit
IN_IMAGE_TAG: "${STAGING_VERSION}"
OUT_REGISTRY: "${CI_REGISTRY}"
OUT_REGISTRY_USER: "${CI_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${CI_REGISTRY_PASSWORD}"
OUT_REGISTRY: "${CI_REGISTRY}"
OUT_IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
PUSH_MULTIPLE_TAGS: "false"
OUT_IMAGE_TAG: "${CI_COMMIT_SHORT_SHA}"
# We delay the job start to allow the public pipeline to generate the required images.
rules:
# If the pipeline is triggered from a tag or the WEB UI we don't delay the
# start of the pipeline.
- if: $CI_COMMIT_TAG || $CI_PIPELINE_SOURCE == "web"
# If the pipeline is triggered through other means (i.e. a branch or MR)
# we add a 30 minute delay to ensure that the images are available in the
# public CI registry.
- when: delayed
start_in: 30 minutes
timeout: 30 minutes
@@ -60,30 +103,6 @@ variables:
when:
- job_execution_timeout
- stuck_or_timeout_failure
before_script:
- !reference [.regctl-setup, before_script]
- apk add --no-cache make bash
- >
regctl manifest get ${IN_REGISTRY}/${IN_IMAGE_NAME}:${IN_VERSION}-${DIST} --list > /dev/null && echo "${IN_REGISTRY}/${IN_IMAGE_NAME}:${IN_VERSION}-${DIST}" || ( echo "${IN_REGISTRY}/${IN_IMAGE_NAME}:${IN_VERSION}-${DIST} does not exist" && sleep infinity )
script:
- regctl registry login "${OUT_REGISTRY}" -u "${OUT_REGISTRY_USER}" -p "${OUT_REGISTRY_TOKEN}"
- make -f deployments/container/Makefile IMAGE=${IN_REGISTRY}/${IN_IMAGE_NAME}:${IN_VERSION}-${DIST} OUT_IMAGE=${OUT_IMAGE_NAME}:${CI_COMMIT_SHORT_SHA}-${DIST} push-${DIST}
image-ubi8:
extends:
- .dist-ubi8
- .image-pull
image-ubuntu20.04:
extends:
- .dist-ubuntu20.04
- .image-pull
# The DIST=packaging target creates an image containing all built packages
image-packaging:
extends:
- .dist-packaging
- .image-pull
# We skip the integration tests for the internal CI:
.integration:
@@ -95,27 +114,37 @@ image-packaging:
# The .scan step forms the base of the image scan operation performed before releasing
# images.
.scan:
scan-images:
stage: scan
needs:
- pull-images
image: "${PULSE_IMAGE}"
parallel:
matrix:
- TAG_SUFFIX: [""]
PLATFORM: ["linux/amd64", "linux/arm64"]
- TAG_SUFFIX: "-packaging"
PLATFORM: "linux/amd64"
variables:
IMAGE: "${CI_REGISTRY_IMAGE}/container-toolkit:${CI_COMMIT_SHORT_SHA}-${DIST}"
IMAGE_ARCHIVE: "container-toolkit-${DIST}-${ARCH}-${CI_JOB_ID}.tar"
IMAGE: "${CI_REGISTRY_IMAGE}/container-toolkit:${CI_COMMIT_SHORT_SHA}"
IMAGE_ARCHIVE: "container-toolkit-${CI_JOB_ID}.tar"
rules:
- if: $SKIP_SCANS != "yes"
- when: manual
before_script:
- docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
# TODO: We should specify the architecture here and scan all architectures
- docker pull --platform="${PLATFORM}" "${IMAGE}"
- docker save "${IMAGE}" -o "${IMAGE_ARCHIVE}"
- AuthHeader=$(echo -n $SSA_CLIENT_ID:$SSA_CLIENT_SECRET | base64 -w0)
- >
export SSA_TOKEN=$(curl --request POST --header "Authorization: Basic $AuthHeader" --header "Content-Type: application/x-www-form-urlencoded" ${SSA_ISSUER_URL} | jq ".access_token" | tr -d '"')
- if [ -z "$SSA_TOKEN" ]; then exit 1; else echo "SSA_TOKEN set!"; fi
- if: $IGNORE_SCANS == "yes"
allow_failure: true
- when: on_success
script:
- pulse-cli -n $NSPECT_ID --ssa $SSA_TOKEN scan -i $IMAGE_ARCHIVE -p $CONTAINER_POLICY -o
- rm -f "${IMAGE_ARCHIVE}"
- |
docker login -u "${CI_REGISTRY_USER}" -p "${CI_REGISTRY_PASSWORD}" "${CI_REGISTRY}"
export SCAN_IMAGE=${IMAGE}${TAG_SUFFIX}
echo "Scanning image ${SCAN_IMAGE} for ${PLATFORM}"
docker pull --platform="${PLATFORM}" "${SCAN_IMAGE}"
docker save "${SCAN_IMAGE}" -o "${IMAGE_ARCHIVE}"
AuthHeader=$(echo -n $SSA_CLIENT_ID:$SSA_CLIENT_SECRET | base64 -w0)
export SSA_TOKEN=$(curl --request POST --header "Authorization: Basic $AuthHeader" --header "Content-Type: application/x-www-form-urlencoded" ${SSA_ISSUER_URL} | jq ".access_token" | tr -d '"')
if [ -z "$SSA_TOKEN" ]; then exit 1; else echo "SSA_TOKEN set!"; fi
pulse-cli -n $NSPECT_ID --ssa $SSA_TOKEN scan -i $IMAGE_ARCHIVE -p $CONTAINER_POLICY -o
rm -f "${IMAGE_ARCHIVE}"
artifacts:
when: always
expire_in: 1 week
@@ -126,62 +155,10 @@ image-packaging:
- vulns.json
- policy_evaluation.json
# Define the scan targets
scan-ubuntu20.04-amd64:
extends:
- .dist-ubuntu20.04
- .platform-amd64
- .scan
needs:
- image-ubuntu20.04
scan-ubuntu20.04-arm64:
extends:
- .dist-ubuntu20.04
- .platform-arm64
- .scan
needs:
- image-ubuntu20.04
- scan-ubuntu20.04-amd64
scan-ubi8-amd64:
extends:
- .dist-ubi8
- .platform-amd64
- .scan
needs:
- image-ubi8
scan-ubi8-arm64:
extends:
- .dist-ubi8
- .platform-arm64
- .scan
needs:
- image-ubi8
- scan-ubi8-amd64
scan-packaging:
extends:
- .dist-packaging
- .scan
needs:
- image-packaging
# Define external release helpers
.release:ngc:
extends:
- .release:external
variables:
OUT_REGISTRY_USER: "${NGC_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${NGC_REGISTRY_TOKEN}"
OUT_REGISTRY: "${NGC_REGISTRY}"
OUT_IMAGE_NAME: "${NGC_REGISTRY_IMAGE}"
.release:packages:
upload-kitmaker-packages:
stage: release
needs:
- image-packaging
- pull-images
variables:
VERSION: "${CI_COMMIT_SHORT_SHA}"
PACKAGE_REGISTRY: "${CI_REGISTRY}"
@@ -199,51 +176,81 @@ scan-packaging:
- ./scripts/release-kitmaker-artifactory.sh "${KITMAKER_ARTIFACTORY_REPO}"
- rm -rf ${ARTIFACTS_DIR}
# Define the package release targets
release:packages:kitmaker:
push-images-to-staging:
extends:
- .release:packages
release:archive:
extends:
- .release:external
- .copy-images
stage: release
needs:
- image-packaging
- scan-images
variables:
VERSION: "${CI_COMMIT_SHORT_SHA}"
PACKAGE_REGISTRY: "${CI_REGISTRY}"
PACKAGE_REGISTRY_USER: "${CI_REGISTRY_USER}"
PACKAGE_REGISTRY_TOKEN: "${CI_REGISTRY_PASSWORD}"
PACKAGE_IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
PACKAGE_IMAGE_TAG: "${CI_COMMIT_SHORT_SHA}-packaging"
PACKAGE_ARCHIVE_ARTIFACTORY_REPO: "${ARTIFACTORY_REPO_BASE}-generic-local/${PACKAGE_ARCHIVE_RELEASE_FOLDER}"
script:
- apk add --no-cache bash git
- ./scripts/archive-packages.sh "${PACKAGE_ARCHIVE_ARTIFACTORY_REPO}"
IN_REGISTRY: "${CI_REGISTRY}"
IN_REGISTRY_USER: "${CI_REGISTRY_USER}"
IN_REGISTRY_TOKEN: "${CI_REGISTRY_PASSWORD}"
IN_IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
IN_IMAGE_TAG: "${CI_COMMIT_SHORT_SHA}"
release:staging-ubuntu20.04:
OUT_REGISTRY: "${NGC_REGISTRY}"
OUT_REGISTRY_USER: "${NGC_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${NGC_REGISTRY_TOKEN}"
OUT_IMAGE_NAME: "${NGC_STAGING_REGISTRY}/container-toolkit"
OUT_IMAGE_TAG: "${CI_COMMIT_SHORT_SHA}"
.release-images:
extends:
- .release:staging
- .dist-ubuntu20.04
- .copy-images
stage: release
needs:
- image-ubuntu20.04
- scan-images
- push-images-to-staging
variables:
IN_REGISTRY: "${CI_REGISTRY}"
IN_REGISTRY_USER: "${CI_REGISTRY_USER}"
IN_REGISTRY_TOKEN: "${CI_REGISTRY_PASSWORD}"
IN_IMAGE_NAME: "${CI_REGISTRY_IMAGE}/container-toolkit"
IN_IMAGE_TAG: "${CI_COMMIT_SHORT_SHA}"
# Define the external release targets
# Release to NGC
release:ngc-ubuntu20.04:
extends:
- .dist-ubuntu20.04
- .release:ngc
OUT_REGISTRY: "${NGC_REGISTRY}"
OUT_REGISTRY_USER: "${NGC_REGISTRY_USER}"
OUT_REGISTRY_TOKEN: "${NGC_REGISTRY_TOKEN}"
OUT_IMAGE_NAME: "${NGC_REGISTRY_IMAGE}"
OUT_IMAGE_TAG: "${CI_COMMIT_TAG}"
release:ngc-ubi8:
release-images-to-ngc:
extends:
- .dist-ubi8
- .release:ngc
- .release-images
rules:
- if: $CI_COMMIT_TAG
release:ngc-packaging:
release-images-dummy:
extends:
- .dist-packaging
- .release:ngc
- .release-images
variables:
REGCTL: "echo [DUMMY] regctl"
rules:
- if: $CI_COMMIT_TAG == null || $CI_COMMIT_TAG == ""
# .sign-images forms the base of the jobs which sign images in the NGC registry.
.sign-images:
stage: sign
image: ubuntu:latest
parallel:
matrix:
- TAG_SUFFIX: ["", "-packaging"]
variables:
IMAGE_NAME: "${NGC_REGISTRY_IMAGE}"
IMAGE_TAG: "${CI_COMMIT_TAG}"
NGC_CLI: "ngc-cli/ngc"
before_script:
- !reference [.ngccli-setup, before_script]
script:
- |
# We ensure that the IMAGE_NAME and IMAGE_TAG is set
echo Image Name: ${IMAGE_NAME} && [[ -n "${IMAGE_NAME}" ]] || exit 1
echo Image Tag: ${IMAGE_TAG} && [[ -n "${IMAGE_TAG}" ]] || exit 1
export IMAGE=${IMAGE_NAME}:${IMAGE_TAG}${TAG_SUFFIX}
echo "Signing the image ${IMAGE}"
${NGC_CLI} registry image publish --source ${IMAGE} ${IMAGE} --public --discoverable --allow-guest --sign --org nvidia
# Define the external image signing steps for NGC
# Download the ngc cli binary for use in the sign steps
@@ -261,45 +268,24 @@ release:ngc-packaging:
- unzip ngccli_linux.zip
- chmod u+x ngc-cli/ngc
# .sign forms the base of the deployment jobs which signs images in the CI registry.
# This is extended with the image name and version to be deployed.
.sign:ngc:
image: ubuntu:latest
stage: sign
sign-ngc-images:
extends:
- .sign-images
needs:
- release-images-to-ngc
rules:
- if: $CI_COMMIT_TAG
variables:
NGC_CLI_API_KEY: "${NGC_REGISTRY_TOKEN}"
IMAGE_NAME: "${NGC_REGISTRY_IMAGE}"
IMAGE_TAG: "${CI_COMMIT_TAG}-${DIST}"
retry:
max: 2
before_script:
- !reference [.ngccli-setup, before_script]
# We ensure that the IMAGE_NAME and IMAGE_TAG is set
- 'echo Image Name: ${IMAGE_NAME} && [[ -n "${IMAGE_NAME}" ]] || exit 1'
- 'echo Image Tag: ${IMAGE_TAG} && [[ -n "${IMAGE_TAG}" ]] || exit 1'
script:
- 'echo "Signing the image ${IMAGE_NAME}:${IMAGE_TAG}"'
- ngc-cli/ngc registry image publish --source ${IMAGE_NAME}:${IMAGE_TAG} ${IMAGE_NAME}:${IMAGE_TAG} --public --discoverable --allow-guest --sign --org nvidia
sign:ngc-ubuntu20.04:
sign-images-dummy:
extends:
- .dist-ubuntu20.04
- .sign:ngc
- .sign-images
needs:
- release:ngc-ubuntu20.04
sign:ngc-ubi8:
extends:
- .dist-ubi8
- .sign:ngc
needs:
- release:ngc-ubi8
sign:ngc-packaging:
extends:
- .dist-packaging
- .sign:ngc
needs:
- release:ngc-packaging
- release-images-dummy
variables:
NGC_CLI: "echo [DUMMY] ngc-cli/ngc"
rules:
- if: $CI_COMMIT_TAG == null || $CI_COMMIT_TAG == ""

View File

@@ -1,34 +1,139 @@
# NVIDIA Container Toolkit Changelog
## v1.17.4
- Disable mounting of compat libs from container by default
## v1.18.0-rc.1
- Add create-soname-symlinks hook
- Require matching version of libnvidia-container-tools
- Add envvar for libcuda.so parent dir to CDI spec
- Add EnvVar to Discover interface
- Resolve to legacy by default in nvidia-container-runtime-hook
- Default to jit-cdi mode in the nvidia runtime
- Use functional options to construct runtime mode resolver
- Add NVIDIA_CTK_CONFIG_FILE_PATH envvar
- Switch to cuda ubi9 base image
- Use single version tag for image
- BUGFIX: modifier: respect GPU volume-mount device requests
- Ensure consistent sorting of annotation devices
- Extract deb and rpm packages to single image
- Remove docker-run as default runtime candidate
- Return annotation devices from VisibleDevices
- Make CDI device requests consistent with other methods
- Construct container info once
- Add logic to extract annotation device requests to image type
- Add IsPrivileged function to CUDA container type
- Add device IDs to nvcdi.GetSpec API
- Refactor extracting requested devices from the container image
- Add EnvVars option for all nvidia-ctk cdi commands
- Add nvidia-cdi-refresh service
- Add discovery of arch-specific vulkan ICD
- Add disabled-device-node-modification hook to CDI spec
- Add a hook to disable device node creation in a container
- Remove redundant deduplication of search paths for WSL
- Added ability to disable specific (or all) CDI hooks
- Consolidate HookName functionality on internal/discover pkg
- Add envvar to control debug logging in CDI hooks
- Add FeatureFlags to the nvcdi API
- Reenable nvsandboxutils for driver discovery
- Edit discover.mounts to have a deterministic output
- Refactor the way we create CDI Hooks
- Issue warning on unsupported CDI hook
- Run update-ldcache in isolated namespaces
- Add cuda-compat-mode config option
- Fix mode detection on Thor-based systems
- Add rprivate to CDI mount options
- Skip nil discoverers in merge
- bump runc go dep to v1.3.0
- Fix resolution of libs in LDCache on ARM
- Updated .release:staging to stage images in nvstaging
- Refactor toolkit installer
- Allow container runtime executable path to be specified
- Add support for building ubuntu22.04 on arm64
- Fix race condition in mounts cache
- Add support for building ubuntu22.04 on amd64
- Fix update-ldcache arguments
- Remove positional arguments from nvidia-ctk-installer
- Remove deprecated --runtime-args from nvidia-ctk-installer
- Add version info to nvidia-ctk-installer
- Update nvidia-ctk-installer app name to match binary name
- Allow nvidia-ctk config --set to accept comma-separated lists
- Disable enable-cuda-compat hook for management containers
- Allow enable-cuda-compat hook to be disabled in CDI spec generation
- Add disable-cuda-compat-lib-hook feature flag
- Add basic integration tests for forward compat
- Ensure that mode hook is executed last
- Add enable-cuda-compat hook to CDI spec generation
- Add ldconfig hook in legacy mode
- Add enable-cuda-compat hook if required
- Add enable-cuda-compat hook to allow compat libs to be discovered
- Use libcontainer execseal to run ldconfig
- Add ignore-imex-channel-requests feature flag
- Disable nvsandboxutils in nvcdi API
- Allow cdi mode to work with --gpus flag
- Add E2E GitHub Action for Container Toolkit
- Add remote-test option for E2E
- Enable CDI in runtime if CDI_ENABLED is set
- Fix overwriting docker feature flags
- Add option in toolkit container to enable CDI in runtime
- Remove Set from engine config API
- Add EnableCDI() method to engine.Interface
- Add IMEX binaries to CDI discovery
- Rename test folder to tests
- Add allow-cuda-compat-libs-from-container feature flag
- Disable mounting of compat libs from container
- Skip graphics modifier in CSV mode
- Properly pass configSearchPaths to a Driver constructor
- Move nvidia-toolkit to nvidia-ctk-installer
- Automated regression testing for the NVIDIA Container Toolkit
- Add support for containerd version 3 config
- Remove watch option from create-dev-char-symlinks
- Add string TOML source
- Improve the implementation for UseLegacyConfig
- Properly pass configSearchPaths to a Driver constructor
- Fix create-device-node test when devices exist
- Add imex mode to CDI spec generation
- Only allow host-relative LDConfig paths
- Fix NVIDIA_IMEX_CHANNELS handling on legacy images
- Fix bug in default config file path
- Fix fsnotify.Remove logic function.
- Force symlink creation in create-symlink hook
### Changes in the Toolkit Container
- Create /work/nvidia-toolkit symlink
- Use Apache license for images
- Switch to golang distroless image
- Switch to cuda ubi9 base image
- Use single version tag for image
- Extract deb and rpm packages to single image
- Bump nvidia/cuda in /deployments/container
- Bump nvidia/cuda in /deployments/container
- Add E2E GitHub Action for Container Toolkit
- Bump nvidia/cuda in /deployments/container
- Move nvidia-toolkit to nvidia-ctk-installer
- Add support for containerd version 3 config
- Improve the implementation for UseLegacyConfig
- Bump nvidia/cuda in /deployments/container
- Add imex mode to CDI spec generation
- Only allow host-relative LDConfig paths
- Fallback to file for runtime config
### Changes in libnvidia-container
- Fix pointer accessing local variable out of scope
- Require version match between libnvidia-container-tools and libnvidia-container1
- Add libnvidia-gpucomp.so to the list of compute libs
- Use VERSION_ prefix for version parts in makefiles
- Add additional logging
- Do not discard container flags when --cuda-compat-mode is not specified
- Remove unneeded --no-cntlibs argument from list command
- Add cuda-compat-mode flag to configure command
- Skip files when user has insufficient permissions
- Fix building with Go 1.24
- Add no-cntlibs CLI option to nvidia-container-cli
### Changes in the Toolkit Container
- Bump CUDA base image version to 12.6.3
## v1.17.3
- Only allow host-relative LDConfig paths by default.
### Changes in libnvidia-container
- Fix always using fallback
- Add fallback for systems without memfd_create()
- Create virtual copy of host ldconfig binary before calling fexecve()
- Fix some typos in text.
## v1.17.2
- Fixed a bug where legacy images would set imex channels as `all`.
## v1.17.1
- Fixed a bug where specific symlinks existing in a container image could cause a container to fail to start.
- Fixed a bug on Tegra-based systems where a container would fail to start.
- Fixed a bug where the default container runtime config path was not properly set.
### Changes in the Toolkit Container
- Fallback to using a config file if the current runtime config can not be determined from the command line.
## v1.17.0
- Promote v1.17.0-rc.2 to v1.17.0

View File

@@ -115,18 +115,18 @@ mod-verify:
check-vendor: vendor
git diff --quiet HEAD -- go.mod go.sum vendor
git diff --exit-code HEAD -- go.mod go.sum vendor
licenses:
go-licenses csv $(MODULE)/...
COVERAGE_FILE := coverage.out
test: build cmds
go test -coverprofile=$(COVERAGE_FILE) $(MODULE)/...
go test -coverprofile=$(COVERAGE_FILE).with-mocks $(MODULE)/...
coverage: test
cat $(COVERAGE_FILE) | grep -v "_mock.go" > $(COVERAGE_FILE).no-mocks
go tool cover -func=$(COVERAGE_FILE).no-mocks
cat $(COVERAGE_FILE).with-mocks | grep -v "_mock.go" > $(COVERAGE_FILE)
go tool cover -func=$(COVERAGE_FILE)
generate:
go generate $(MODULE)/...

24
SECURITY.md Normal file
View File

@@ -0,0 +1,24 @@
# Security
NVIDIA is dedicated to the security and trust of our software products and services, including all source code repositories managed through our organization.
If you need to report a security issue, please use the appropriate contact points outlined below. **Please do not report security vulnerabilities through GitHub.**
## Reporting Potential Security Vulnerability in an NVIDIA Product
To report a potential security vulnerability in any NVIDIA product:
- Web: [Security Vulnerability Submission Form](https://www.nvidia.com/object/submit-security-vulnerability.html)
- E-Mail: psirt@nvidia.com
- We encourage you to use the following PGP key for secure email communication: [NVIDIA public PGP Key for communication](https://www.nvidia.com/en-us/security/pgp-key)
- Please include the following information:
- Product/Driver name and version/branch that contains the vulnerability
- Type of vulnerability (code execution, denial of service, buffer overflow, etc.)
- Instructions to reproduce the vulnerability
- Proof-of-concept or exploit code
- Potential impact of the vulnerability, including how an attacker could exploit the vulnerability
While NVIDIA currently does not have a bug bounty program, we do offer acknowledgement when an externally reported security issue is addressed under our coordinated vulnerability disclosure policy. Please visit our [Product Security Incident Response Team (PSIRT)](https://www.nvidia.com/en-us/security/psirt-policies/) policies page for more information.
## NVIDIA Product Security
For all security-related concerns, please visit NVIDIA's Product Security portal at https://www.nvidia.com/en-us/security

View File

@@ -20,7 +20,10 @@ import (
"github.com/urfave/cli/v2"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/chmod"
createsonamesymlinks "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/create-soname-symlinks"
symlinks "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/create-symlinks"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/cudacompat"
disabledevicenodemodification "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/disable-device-node-modification"
ldcache "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/update-ldcache"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)
@@ -32,5 +35,21 @@ func New(logger logger.Interface) []*cli.Command {
ldcache.NewCommand(logger),
symlinks.NewCommand(logger),
chmod.NewCommand(logger),
cudacompat.NewCommand(logger),
createsonamesymlinks.NewCommand(logger),
disabledevicenodemodification.NewCommand(logger),
}
}
// IssueUnsupportedHookWarning logs a warning that no hook or an unsupported
// hook has been specified.
// This happens if a subcommand is provided that does not match one of the
// subcommands that has been explicitly specified.
func IssueUnsupportedHookWarning(logger logger.Interface, c *cli.Context) {
args := c.Args().Slice()
if len(args) == 0 {
logger.Warningf("No CDI hook specified")
} else {
logger.Warningf("Unsupported CDI hook: %v", args[0])
}
}

View File

@@ -0,0 +1,166 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package create_soname_symlinks
import (
"errors"
"fmt"
"log"
"os"
"github.com/moby/sys/reexec"
"github.com/urfave/cli/v2"
"github.com/NVIDIA/nvidia-container-toolkit/internal/ldconfig"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
)
const (
reexecUpdateLdCacheCommandName = "reexec-create-soname-symlinks"
)
type command struct {
logger logger.Interface
}
type options struct {
folders cli.StringSlice
ldconfigPath string
containerSpec string
}
func init() {
reexec.Register(reexecUpdateLdCacheCommandName, createSonameSymlinksHandler)
if reexec.Init() {
os.Exit(0)
}
}
// NewCommand constructs an create-soname-symlinks command with the specified logger
func NewCommand(logger logger.Interface) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build the create-soname-symlinks command
func (m command) build() *cli.Command {
cfg := options{}
// Create the 'create-soname-symlinks' command
c := cli.Command{
Name: "create-soname-symlinks",
Usage: "Create soname symlinks libraries in specified directories",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
},
}
c.Flags = []cli.Flag{
&cli.StringSliceFlag{
Name: "folder",
Usage: "Specify a directory to generate soname symlinks in. Can be specified multiple times",
Destination: &cfg.folders,
},
&cli.StringFlag{
Name: "ldconfig-path",
Usage: "Specify the path to ldconfig on the host",
Destination: &cfg.ldconfigPath,
Value: "/sbin/ldconfig",
},
&cli.StringFlag{
Name: "container-spec",
Usage: "Specify the path to the OCI container spec. If empty or '-' the spec will be read from STDIN",
Destination: &cfg.containerSpec,
},
}
return &c
}
func (m command) validateFlags(c *cli.Context, cfg *options) error {
if cfg.ldconfigPath == "" {
return errors.New("ldconfig-path must be specified")
}
return nil
}
func (m command) run(c *cli.Context, cfg *options) error {
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %v", err)
}
containerRootDir, err := s.GetContainerRoot()
if err != nil || containerRootDir == "" || containerRootDir == "/" {
return fmt.Errorf("failed to determined container root: %v", err)
}
cmd, err := ldconfig.NewRunner(
reexecUpdateLdCacheCommandName,
cfg.ldconfigPath,
containerRootDir,
cfg.folders.Value()...,
)
if err != nil {
return err
}
return cmd.Run()
}
// createSonameSymlinksHandler wraps createSonameSymlinks with error handling.
func createSonameSymlinksHandler() {
if err := createSonameSymlinks(os.Args); err != nil {
log.Printf("Error updating ldcache: %v", err)
os.Exit(1)
}
}
// createSonameSymlinks ensures that soname symlinks are created in the
// specified directories.
// It is invoked from a reexec'd handler and provides namespace isolation for
// the operations performed by this hook. At the point where this is invoked,
// we are in a new mount namespace that is cloned from the parent.
//
// args[0] is the reexec initializer function name
// args[1] is the path of the ldconfig binary on the host
// args[2] is the container root directory
// The remaining args are directories where soname symlinks need to be created.
func createSonameSymlinks(args []string) error {
if len(args) < 3 {
return fmt.Errorf("incorrect arguments: %v", args)
}
hostLdconfigPath := args[1]
containerRootDirPath := args[2]
ldconfig, err := ldconfig.New(
hostLdconfigPath,
containerRootDirPath,
)
if err != nil {
return fmt.Errorf("failed to construct ldconfig runner: %w", err)
}
return ldconfig.CreateSonameSymlinks(args[3:]...)
}

View File

@@ -0,0 +1,76 @@
/**
# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package cudacompat
import (
"os"
"path/filepath"
"github.com/moby/sys/symlink"
)
// A containerRoot represents the root filesystem of a container.
type containerRoot string
// hasPath checks whether the specified path exists in the root.
func (r containerRoot) hasPath(path string) bool {
resolved, err := r.resolve(path)
if err != nil {
return false
}
if _, err := os.Stat(resolved); err != nil && os.IsNotExist(err) {
return false
}
return true
}
// globFiles matches the specified pattern in the root.
// The files that match must be regular files.
func (r containerRoot) globFiles(pattern string) ([]string, error) {
patternPath, err := r.resolve(pattern)
if err != nil {
return nil, err
}
matches, err := filepath.Glob(patternPath)
if err != nil {
return nil, err
}
var files []string
for _, match := range matches {
info, err := os.Lstat(match)
if err != nil {
return nil, err
}
// Ignore symlinks.
if info.Mode()&os.ModeSymlink != 0 {
continue
}
// Ignore directories.
if info.IsDir() {
continue
}
files = append(files, match)
}
return files, nil
}
// resolve returns the absolute path including root path.
// Symlinks are resolved, but are guaranteed to resolve in the root.
func (r containerRoot) resolve(path string) (string, error) {
absolute := filepath.Clean(filepath.Join(string(r), path))
return symlink.FollowSymlinkInScope(absolute, string(r))
}

View File

@@ -0,0 +1,221 @@
/**
# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package cudacompat
import (
"fmt"
"os"
"path/filepath"
"strconv"
"strings"
"github.com/urfave/cli/v2"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
)
const (
cudaCompatPath = "/usr/local/cuda/compat"
// cudaCompatLdsoconfdFilenamePattern specifies the pattern for the filename
// in ld.so.conf.d that includes a reference to the CUDA compat path.
// The 00-compat prefix is chosen to ensure that these libraries have a
// higher precedence than other libraries on the system.
cudaCompatLdsoconfdFilenamePattern = "00-compat-*.conf"
)
type command struct {
logger logger.Interface
}
type options struct {
hostDriverVersion string
containerSpec string
}
// NewCommand constructs a cuda-compat command with the specified logger
func NewCommand(logger logger.Interface) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build the enable-cuda-compat command
func (m command) build() *cli.Command {
cfg := options{}
// Create the 'enable-cuda-compat' command
c := cli.Command{
Name: "enable-cuda-compat",
Usage: "This hook ensures that the folder containing the CUDA compat libraries is added to the ldconfig search path if required.",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
},
}
c.Flags = []cli.Flag{
&cli.StringFlag{
Name: "host-driver-version",
Usage: "Specify the host driver version. If the CUDA compat libraries detected in the container do not have a higher MAJOR version, the hook is a no-op.",
Destination: &cfg.hostDriverVersion,
},
&cli.StringFlag{
Name: "container-spec",
Hidden: true,
Category: "testing-only",
Usage: "Specify the path to the OCI container spec. If empty or '-' the spec will be read from STDIN",
Destination: &cfg.containerSpec,
},
}
return &c
}
func (m command) validateFlags(_ *cli.Context, cfg *options) error {
return nil
}
func (m command) run(_ *cli.Context, cfg *options) error {
if cfg.hostDriverVersion == "" {
return nil
}
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %w", err)
}
containerRootDir, err := s.GetContainerRoot()
if err != nil {
return fmt.Errorf("failed to determined container root: %w", err)
}
containerForwardCompatDir, err := m.getContainerForwardCompatDir(containerRoot(containerRootDir), cfg.hostDriverVersion)
if err != nil {
return fmt.Errorf("failed to get container forward compat directory: %w", err)
}
if containerForwardCompatDir == "" {
return nil
}
return m.createLdsoconfdFile(containerRoot(containerRootDir), cudaCompatLdsoconfdFilenamePattern, containerForwardCompatDir)
}
func (m command) getContainerForwardCompatDir(containerRoot containerRoot, hostDriverVersion string) (string, error) {
if hostDriverVersion == "" {
m.logger.Debugf("Host driver version not specified")
return "", nil
}
if !containerRoot.hasPath(cudaCompatPath) {
m.logger.Debugf("No CUDA forward compatibility libraries directory in container")
return "", nil
}
if !containerRoot.hasPath("/etc/ld.so.cache") {
m.logger.Debugf("The container does not have an LDCache")
return "", nil
}
libs, err := containerRoot.globFiles(filepath.Join(cudaCompatPath, "libcuda.so.*.*"))
if err != nil {
m.logger.Warningf("Failed to find CUDA compat library: %w", err)
return "", nil
}
if len(libs) == 0 {
m.logger.Debugf("No CUDA forward compatibility libraries container")
return "", nil
}
if len(libs) != 1 {
m.logger.Warningf("Unexpected number of CUDA compat libraries in container: %v", libs)
return "", nil
}
compatDriverVersion := strings.TrimPrefix(filepath.Base(libs[0]), "libcuda.so.")
compatMajor, err := extractMajorVersion(compatDriverVersion)
if err != nil {
return "", fmt.Errorf("failed to extract major version from %q: %v", compatDriverVersion, err)
}
driverMajor, err := extractMajorVersion(hostDriverVersion)
if err != nil {
return "", fmt.Errorf("failed to extract major version from %q: %v", hostDriverVersion, err)
}
if driverMajor >= compatMajor {
m.logger.Debugf("Compat major version is not greater than the host driver major version (%v >= %v)", hostDriverVersion, compatDriverVersion)
return "", nil
}
resolvedCompatDir := strings.TrimPrefix(filepath.Dir(libs[0]), string(containerRoot))
return resolvedCompatDir, nil
}
// createLdsoconfdFile creates a file at /etc/ld.so.conf.d/ in the specified root.
// The file is created at /etc/ld.so.conf.d/{{ .pattern }} using `CreateTemp` and
// contains the specified directories on each line.
func (m command) createLdsoconfdFile(in containerRoot, pattern string, dirs ...string) error {
if len(dirs) == 0 {
m.logger.Debugf("No directories to add to /etc/ld.so.conf")
return nil
}
ldsoconfdDir, err := in.resolve("/etc/ld.so.conf.d")
if err != nil {
return err
}
if err := os.MkdirAll(ldsoconfdDir, 0755); err != nil {
return fmt.Errorf("failed to create ld.so.conf.d: %w", err)
}
configFile, err := os.CreateTemp(ldsoconfdDir, pattern)
if err != nil {
return fmt.Errorf("failed to create config file: %w", err)
}
defer configFile.Close()
m.logger.Debugf("Adding directories %v to %v", dirs, configFile.Name())
added := make(map[string]bool)
for _, dir := range dirs {
if added[dir] {
continue
}
_, err = fmt.Fprintf(configFile, "%s\n", dir)
if err != nil {
return fmt.Errorf("failed to update config file: %w", err)
}
added[dir] = true
}
// The created file needs to be world readable for the cases where the container is run as a non-root user.
if err := configFile.Chmod(0644); err != nil {
return fmt.Errorf("failed to chmod config file: %w", err)
}
return nil
}
// extractMajorVersion parses a version string and returns the major version as an int.
func extractMajorVersion(version string) (int, error) {
majorString := strings.SplitN(version, ".", 2)[0]
return strconv.Atoi(majorString)
}

View File

@@ -0,0 +1,182 @@
/*
# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package cudacompat
import (
"os"
"path/filepath"
"strings"
"testing"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
)
func TestCompatLibs(t *testing.T) {
logger, _ := testlog.NewNullLogger()
testCases := []struct {
description string
contents map[string]string
hostDriverVersion string
expectedContainerForwardCompatDir string
}{
{
description: "empty root",
hostDriverVersion: "222.55.66",
},
{
description: "compat lib is newer; no ldcache",
contents: map[string]string{
"/usr/local/cuda/compat/libcuda.so.333.88.99": "",
},
hostDriverVersion: "222.55.66",
},
{
description: "compat lib is newer; ldcache",
contents: map[string]string{
"/etc/ld.so.cache": "",
"/usr/local/cuda/compat/libcuda.so.333.88.99": "",
},
hostDriverVersion: "222.55.66",
expectedContainerForwardCompatDir: "/usr/local/cuda/compat",
},
{
description: "compat lib is older; ldcache",
contents: map[string]string{
"/etc/ld.so.cache": "",
"/usr/local/cuda/compat/libcuda.so.111.88.99": "",
},
hostDriverVersion: "222.55.66",
expectedContainerForwardCompatDir: "",
},
{
description: "compat lib has same major version; ldcache",
contents: map[string]string{
"/etc/ld.so.cache": "",
"/usr/local/cuda/compat/libcuda.so.222.88.99": "",
},
hostDriverVersion: "222.55.66",
expectedContainerForwardCompatDir: "",
},
{
description: "numeric comparison is used; ldcache",
contents: map[string]string{
"/etc/ld.so.cache": "",
"/usr/local/cuda/compat/libcuda.so.222.88.99": "",
},
hostDriverVersion: "99.55.66",
expectedContainerForwardCompatDir: "/usr/local/cuda/compat",
},
{
description: "driver version empty; ldcache",
contents: map[string]string{
"/etc/ld.so.cache": "",
"/usr/local/cuda/compat/libcuda.so.222.88.99": "",
},
hostDriverVersion: "",
},
{
description: "symlinks are followed",
contents: map[string]string{
"/etc/ld.so.cache": "",
"/etc/alternatives/cuda/compat/libcuda.so.333.88.99": "",
"/usr/local/cuda": "symlink=/etc/alternatives/cuda",
},
hostDriverVersion: "222.55.66",
expectedContainerForwardCompatDir: "/etc/alternatives/cuda/compat",
},
{
description: "symlinks stay in container",
contents: map[string]string{
"/etc/ld.so.cache": "",
"/compat/libcuda.so.333.88.99": "",
"/usr/local/cuda": "symlink=../../../../../../",
},
hostDriverVersion: "222.55.66",
expectedContainerForwardCompatDir: "/compat",
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
containerRootDir := t.TempDir()
for name, contents := range tc.contents {
target := filepath.Join(containerRootDir, name)
require.NoError(t, os.MkdirAll(filepath.Dir(target), 0755))
if strings.HasPrefix(contents, "symlink=") {
require.NoError(t, os.Symlink(strings.TrimPrefix(contents, "symlink="), target))
continue
}
require.NoError(t, os.WriteFile(target, []byte(contents), 0600))
}
c := command{
logger: logger,
}
containerForwardCompatDir, err := c.getContainerForwardCompatDir(containerRoot(containerRootDir), tc.hostDriverVersion)
require.NoError(t, err)
require.EqualValues(t, tc.expectedContainerForwardCompatDir, containerForwardCompatDir)
})
}
}
func TestUpdateLdconfig(t *testing.T) {
logger, _ := testlog.NewNullLogger()
testCases := []struct {
description string
folders []string
expectedContents string
}{
{
description: "no folders; have no contents",
},
{
description: "single folder is added",
folders: []string{"/usr/local/cuda/compat"},
expectedContents: "/usr/local/cuda/compat\n",
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
containerRootDir := t.TempDir()
c := command{
logger: logger,
}
err := c.createLdsoconfdFile(containerRoot(containerRootDir), cudaCompatLdsoconfdFilenamePattern, tc.folders...)
require.NoError(t, err)
matches, err := filepath.Glob(filepath.Join(containerRootDir, "/etc/ld.so.conf.d/00-compat-*.conf"))
require.NoError(t, err)
if tc.expectedContents == "" {
require.Empty(t, matches)
return
}
require.Len(t, matches, 1)
contents, err := os.ReadFile(matches[0])
require.NoError(t, err)
require.EqualValues(t, tc.expectedContents, string(contents))
})
}
}

View File

@@ -0,0 +1,144 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package disabledevicenodemodification
import (
"bufio"
"bytes"
"errors"
"fmt"
"io"
"os"
"strings"
"github.com/urfave/cli/v2"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
)
const (
nvidiaDriverParamsPath = "/proc/driver/nvidia/params"
)
type options struct {
containerSpec string
}
// NewCommand constructs an disable-device-node-modification subcommand with the specified logger
func NewCommand(logger logger.Interface) *cli.Command {
cfg := options{}
c := cli.Command{
Name: "disable-device-node-modification",
Usage: "Ensure that the /proc/driver/nvidia/params file present in the container does not allow device node modifications.",
Before: func(c *cli.Context) error {
return validateFlags(c, &cfg)
},
Action: func(c *cli.Context) error {
return run(c, &cfg)
},
}
c.Flags = []cli.Flag{
&cli.StringFlag{
Name: "container-spec",
Hidden: true,
Usage: "Specify the path to the OCI container spec. If empty or '-' the spec will be read from STDIN",
Destination: &cfg.containerSpec,
},
}
return &c
}
func validateFlags(c *cli.Context, cfg *options) error {
return nil
}
func run(_ *cli.Context, cfg *options) error {
modifiedParamsFileContents, err := getModifiedNVIDIAParamsContents()
if err != nil {
return fmt.Errorf("failed to get modified params file contents: %w", err)
}
if len(modifiedParamsFileContents) == 0 {
return nil
}
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %w", err)
}
containerRootDirPath, err := s.GetContainerRoot()
if err != nil {
return fmt.Errorf("failed to determined container root: %w", err)
}
return createParamsFileInContainer(containerRootDirPath, modifiedParamsFileContents)
}
func getModifiedNVIDIAParamsContents() ([]byte, error) {
hostNvidiaParamsFile, err := os.Open(nvidiaDriverParamsPath)
if errors.Is(err, os.ErrNotExist) {
return nil, nil
}
if err != nil {
return nil, fmt.Errorf("failed to load params file: %w", err)
}
defer hostNvidiaParamsFile.Close()
modifiedContents, err := getModifiedParamsFileContentsFromReader(hostNvidiaParamsFile)
if err != nil {
return nil, fmt.Errorf("failed to get modfied params file contents: %w", err)
}
return modifiedContents, nil
}
// getModifiedParamsFileContentsFromReader returns the contents of a modified params file from the specified reader.
func getModifiedParamsFileContentsFromReader(r io.Reader) ([]byte, error) {
var modified bytes.Buffer
scanner := bufio.NewScanner(r)
var requiresModification bool
for scanner.Scan() {
line := scanner.Text()
if strings.HasPrefix(line, "ModifyDeviceFiles: ") {
if line == "ModifyDeviceFiles: 0" {
return nil, nil
}
if line == "ModifyDeviceFiles: 1" {
line = "ModifyDeviceFiles: 0"
requiresModification = true
}
}
if _, err := modified.WriteString(line + "\n"); err != nil {
return nil, fmt.Errorf("failed to create output buffer: %w", err)
}
}
if err := scanner.Err(); err != nil {
return nil, fmt.Errorf("failed to read params file: %w", err)
}
if !requiresModification {
return nil, nil
}
return modified.Bytes(), nil
}

View File

@@ -0,0 +1,91 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package disabledevicenodemodification
import (
"bytes"
"testing"
"github.com/stretchr/testify/require"
)
func TestGetModifiedParamsFileContentsFromReader(t *testing.T) {
testCases := map[string]struct {
contents []byte
expectedError error
expectedContents []byte
}{
"no contents": {
contents: nil,
expectedError: nil,
expectedContents: nil,
},
"other contents are ignored": {
contents: []byte(`# Some other content
that we don't care about
`),
expectedError: nil,
expectedContents: nil,
},
"already zero requires no modification": {
contents: []byte("ModifyDeviceFiles: 0"),
expectedError: nil,
expectedContents: nil,
},
"leading spaces require no modification": {
contents: []byte(" ModifyDeviceFiles: 1"),
},
"Trailing spaces require no modification": {
contents: []byte("ModifyDeviceFiles: 1 "),
},
"Not 1 require no modification": {
contents: []byte("ModifyDeviceFiles: 11"),
},
"single line requires modification": {
contents: []byte("ModifyDeviceFiles: 1"),
expectedError: nil,
expectedContents: []byte("ModifyDeviceFiles: 0\n"),
},
"single line with trailing newline requires modification": {
contents: []byte("ModifyDeviceFiles: 1\n"),
expectedError: nil,
expectedContents: []byte("ModifyDeviceFiles: 0\n"),
},
"other content is maintained": {
contents: []byte(`ModifyDeviceFiles: 1
other content
that
is maintained`),
expectedError: nil,
expectedContents: []byte(`ModifyDeviceFiles: 0
other content
that
is maintained
`),
},
}
for description, tc := range testCases {
t.Run(description, func(t *testing.T) {
contents, err := getModifiedParamsFileContentsFromReader(bytes.NewReader(tc.contents))
require.EqualValues(t, tc.expectedError, err)
require.EqualValues(t, string(tc.expectedContents), string(contents))
})
}
}

View File

@@ -0,0 +1,63 @@
//go:build linux
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package disabledevicenodemodification
import (
"fmt"
"os"
"path/filepath"
"github.com/opencontainers/runc/libcontainer/utils"
"golang.org/x/sys/unix"
)
func createParamsFileInContainer(containerRootDirPath string, contents []byte) error {
tmpRoot, err := os.MkdirTemp("", "nvct-empty-dir*")
if err != nil {
return fmt.Errorf("failed to create temp root: %w", err)
}
if err := createTmpFs(tmpRoot, len(contents)); err != nil {
return fmt.Errorf("failed to create tmpfs mount for params file: %w", err)
}
modifiedParamsFile, err := os.OpenFile(filepath.Join(tmpRoot, "nvct-params"), os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0444)
if err != nil {
return fmt.Errorf("failed to open modified params file: %w", err)
}
defer modifiedParamsFile.Close()
if _, err := modifiedParamsFile.Write(contents); err != nil {
return fmt.Errorf("failed to write temporary params file: %w", err)
}
err = utils.WithProcfd(containerRootDirPath, nvidiaDriverParamsPath, func(nvidiaDriverParamsFdPath string) error {
return unix.Mount(modifiedParamsFile.Name(), nvidiaDriverParamsFdPath, "", unix.MS_BIND|unix.MS_RDONLY|unix.MS_NODEV|unix.MS_PRIVATE|unix.MS_NOSYMFOLLOW, "")
})
if err != nil {
return fmt.Errorf("failed to mount modified params file: %w", err)
}
return nil
}
func createTmpFs(target string, size int) error {
return unix.Mount("tmpfs", target, "tmpfs", 0, fmt.Sprintf("size=%d", size))
}

View File

@@ -0,0 +1,27 @@
//go:build !linux
// +build !linux
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package disabledevicenodemodification
import "fmt"
func createParamsFileInContainer(containerRootDirPath string, contents []byte) error {
return fmt.Errorf("not supported")
}

View File

@@ -51,6 +51,18 @@ func main() {
c.Usage = "Command to structure files for usage inside a container, called as hooks from a container runtime, defined in a CDI yaml file"
c.Version = info.GetVersionString()
// We set the default action for the `nvidia-cdi-hook` command to issue a
// warning and exit with no error.
// This means that if an unsupported hook is run, a container will not fail
// to launch. An unsupported hook could be the result of a CDI specification
// referring to a new hook that is not yet supported by an older NVIDIA
// Container Toolkit version or a hook that has been removed in newer
// version.
c.Action = func(ctx *cli.Context) error {
commands.IssueUnsupportedHookWarning(logger, ctx)
return nil
}
// Setup the flags for this command
c.Flags = []cli.Flag{
&cli.BoolFlag{
@@ -58,13 +70,15 @@ func main() {
Aliases: []string{"d"},
Usage: "Enable debug-level logging",
Destination: &opts.Debug,
EnvVars: []string{"NVIDIA_CDI_DEBUG"},
// TODO: Support for NVIDIA_CDI_DEBUG is deprecated and NVIDIA_CTK_DEBUG should be used instead.
EnvVars: []string{"NVIDIA_CTK_DEBUG", "NVIDIA_CDI_DEBUG"},
},
&cli.BoolFlag{
Name: "quiet",
Usage: "Suppress all output except for errors; overrides --debug",
Destination: &opts.Quiet,
EnvVars: []string{"NVIDIA_CDI_QUIET"},
// TODO: Support for NVIDIA_CDI_QUIET is deprecated and NVIDIA_CTK_QUIET should be used instead.
EnvVars: []string{"NVDIA_CTK_QUIET", "NVIDIA_CDI_QUIET"},
},
}

View File

@@ -19,18 +19,21 @@ package ldcache
import (
"errors"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"syscall"
"github.com/moby/sys/reexec"
"github.com/urfave/cli/v2"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
"github.com/NVIDIA/nvidia-container-toolkit/internal/ldconfig"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
)
const (
reexecUpdateLdCacheCommandName = "reexec-update-ldcache"
)
type command struct {
logger logger.Interface
}
@@ -41,6 +44,13 @@ type options struct {
containerSpec string
}
func init() {
reexec.Register(reexecUpdateLdCacheCommandName, updateLdCacheHandler)
if reexec.Init() {
os.Exit(0)
}
}
// NewCommand constructs an update-ldcache command with the specified logger
func NewCommand(logger logger.Interface) *cli.Command {
c := command{
@@ -100,98 +110,55 @@ func (m command) run(c *cli.Context, cfg *options) error {
return fmt.Errorf("failed to load container state: %v", err)
}
containerRoot, err := s.GetContainerRoot()
if err != nil {
containerRootDir, err := s.GetContainerRoot()
if err != nil || containerRootDir == "" || containerRootDir == "/" {
return fmt.Errorf("failed to determined container root: %v", err)
}
ldconfigPath := m.resolveLDConfigPath(cfg.ldconfigPath)
args := []string{filepath.Base(ldconfigPath)}
if containerRoot != "" {
args = append(args, "-r", containerRoot)
}
if root(containerRoot).hasPath("/etc/ld.so.cache") {
args = append(args, "-C", "/etc/ld.so.cache")
} else {
m.logger.Debugf("No ld.so.cache found, skipping update")
args = append(args, "-N")
}
folders := cfg.folders.Value()
if root(containerRoot).hasPath("/etc/ld.so.conf.d") {
err := m.createConfig(containerRoot, folders)
if err != nil {
return fmt.Errorf("failed to update ld.so.conf.d: %v", err)
}
} else {
args = append(args, folders...)
}
// Explicitly specify using /etc/ld.so.conf since the host's ldconfig may
// be configured to use a different config file by default.
args = append(args, "-f", "/etc/ld.so.conf")
//nolint:gosec // TODO: Can we harden this so that there is less risk of command injection
return syscall.Exec(ldconfigPath, args, nil)
}
type root string
func (r root) hasPath(path string) bool {
_, err := os.Stat(filepath.Join(string(r), path))
if err != nil && os.IsNotExist(err) {
return false
}
return true
}
// resolveLDConfigPath determines the LDConfig path to use for the system.
// On systems such as Ubuntu where `/sbin/ldconfig` is a wrapper around
// /sbin/ldconfig.real, the latter is returned.
func (m command) resolveLDConfigPath(path string) string {
return strings.TrimPrefix(config.NormalizeLDConfigPath("@"+path), "@")
}
// createConfig creates (or updates) /etc/ld.so.conf.d/00-nvcr-<RANDOM_STRING>.conf in the container
// to include the required paths.
// Note that the 00-nvcr prefix is chosen to ensure that these libraries have
// a higher precedence than other libraries on the system but are applied AFTER
// 00-cuda-compat.conf.
func (m command) createConfig(root string, folders []string) error {
if len(folders) == 0 {
m.logger.Debugf("No folders to add to /etc/ld.so.conf")
return nil
}
if err := os.MkdirAll(filepath.Join(root, "/etc/ld.so.conf.d"), 0755); err != nil {
return fmt.Errorf("failed to create ld.so.conf.d: %v", err)
}
configFile, err := os.CreateTemp(filepath.Join(root, "/etc/ld.so.conf.d"), "00-nvcr-*.conf")
cmd, err := ldconfig.NewRunner(
reexecUpdateLdCacheCommandName,
cfg.ldconfigPath,
containerRootDir,
cfg.folders.Value()...,
)
if err != nil {
return fmt.Errorf("failed to create config file: %v", err)
return err
}
defer configFile.Close()
m.logger.Debugf("Adding folders %v to %v", folders, configFile.Name())
configured := make(map[string]bool)
for _, folder := range folders {
if configured[folder] {
continue
}
_, err = configFile.WriteString(fmt.Sprintf("%s\n", folder))
if err != nil {
return fmt.Errorf("failed to update ld.so.conf.d: %v", err)
}
configured[folder] = true
}
// The created file needs to be world readable for the cases where the container is run as a non-root user.
if err := os.Chmod(configFile.Name(), 0644); err != nil {
return fmt.Errorf("failed to chmod config file: %v", err)
}
return nil
return cmd.Run()
}
// updateLdCacheHandler wraps updateLdCache with error handling.
func updateLdCacheHandler() {
if err := updateLdCache(os.Args); err != nil {
log.Printf("Error updating ldcache: %v", err)
os.Exit(1)
}
}
// updateLdCache ensures that the ldcache in the container is updated to include
// libraries that are mounted from the host.
// It is invoked from a reexec'd handler and provides namespace isolation for
// the operations performed by this hook. At the point where this is invoked,
// we are in a new mount namespace that is cloned from the parent.
//
// args[0] is the reexec initializer function name
// args[1] is the path of the ldconfig binary on the host
// args[2] is the container root directory
// The remaining args are folders where soname symlinks need to be created.
func updateLdCache(args []string) error {
if len(args) < 3 {
return fmt.Errorf("incorrect arguments: %v", args)
}
hostLdconfigPath := args[1]
containerRootDirPath := args[2]
ldconfig, err := ldconfig.New(
hostLdconfigPath,
containerRootDirPath,
)
if err != nil {
return fmt.Errorf("failed to construct ldconfig runner: %w", err)
}
return ldconfig.UpdateLDCache(args[3:]...)
}

View File

@@ -13,10 +13,6 @@ import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
)
const (
capSysAdmin = "CAP_SYS_ADMIN"
)
type nvidiaConfig struct {
Devices []string
MigConfigDevices string
@@ -103,9 +99,9 @@ func loadSpec(path string) (spec *Spec) {
return
}
func isPrivileged(s *Spec) bool {
if s.Process.Capabilities == nil {
return false
func (s *Spec) GetCapabilities() []string {
if s == nil || s.Process == nil || s.Process.Capabilities == nil {
return nil
}
var caps []string
@@ -118,67 +114,22 @@ func isPrivileged(s *Spec) bool {
if err != nil {
log.Panicln("could not decode Process.Capabilities in OCI spec:", err)
}
for _, c := range caps {
if c == capSysAdmin {
return true
}
}
return false
return caps
}
// Otherwise, parse s.Process.Capabilities as:
// github.com/opencontainers/runtime-spec/blob/v1.0.0/specs-go/config.go#L30-L54
process := specs.Process{
Env: s.Process.Env,
}
err := json.Unmarshal(*s.Process.Capabilities, &process.Capabilities)
capabilities := specs.LinuxCapabilities{}
err := json.Unmarshal(*s.Process.Capabilities, &capabilities)
if err != nil {
log.Panicln("could not decode Process.Capabilities in OCI spec:", err)
}
fullSpec := specs.Spec{
Version: *s.Version,
Process: &process,
}
return image.IsPrivileged(&fullSpec)
return image.OCISpecCapabilities(capabilities).GetCapabilities()
}
func getDevicesFromEnvvar(containerImage image.CUDA, swarmResourceEnvvars []string) []string {
// We check if the image has at least one of the Swarm resource envvars defined and use this
// if specified.
for _, envvar := range swarmResourceEnvvars {
if containerImage.HasEnvvar(envvar) {
return containerImage.DevicesFromEnvvars(swarmResourceEnvvars...).List()
}
}
return containerImage.VisibleDevicesFromEnvVar()
}
func (hookConfig *hookConfig) getDevices(image image.CUDA, privileged bool) []string {
// If enabled, try and get the device list from volume mounts first
if hookConfig.AcceptDeviceListAsVolumeMounts {
devices := image.VisibleDevicesFromMounts()
if len(devices) > 0 {
return devices
}
}
// Fallback to reading from the environment variable if privileges are correct
devices := getDevicesFromEnvvar(image, hookConfig.getSwarmResourceEnvvars())
if len(devices) == 0 {
return nil
}
if privileged || hookConfig.AcceptEnvvarUnprivileged {
return devices
}
configName := hookConfig.getConfigOption("AcceptEnvvarUnprivileged")
log.Printf("Ignoring devices specified in NVIDIA_VISIBLE_DEVICES (privileged=%v, %v=%v) ", privileged, configName, hookConfig.AcceptEnvvarUnprivileged)
return nil
func isPrivileged(s *Spec) bool {
return image.IsPrivileged(s)
}
func getMigConfigDevices(i image.CUDA) *string {
@@ -198,6 +149,10 @@ func getMigDevices(image image.CUDA, envvar string) *string {
}
func (hookConfig *hookConfig) getImexChannels(image image.CUDA, privileged bool) []string {
if hookConfig.Features.IgnoreImexChannelRequests.IsEnabled() {
return nil
}
// If enabled, try and get the device list from volume mounts first
if hookConfig.AcceptDeviceListAsVolumeMounts {
devices := image.ImexChannelsFromMounts()
@@ -221,7 +176,6 @@ func (hookConfig *hookConfig) getDriverCapabilities(cudaImage image.CUDA, legacy
// We use the default driver capabilities by default. This is filtered to only include the
// supported capabilities
supportedDriverCapabilities := image.NewDriverCapabilities(hookConfig.SupportedDriverCapabilities)
capabilities := supportedDriverCapabilities.Intersection(image.DefaultDriverCapabilities)
capsEnvSpecified := cudaImage.HasEnvvar(image.EnvVarNvidiaDriverCapabilities)
@@ -247,7 +201,7 @@ func (hookConfig *hookConfig) getDriverCapabilities(cudaImage image.CUDA, legacy
func (hookConfig *hookConfig) getNvidiaConfig(image image.CUDA, privileged bool) *nvidiaConfig {
legacyImage := image.IsLegacy()
devices := hookConfig.getDevices(image, privileged)
devices := image.VisibleDevices()
if len(devices) == 0 {
// empty devices means this is not a GPU container.
return nil
@@ -288,7 +242,14 @@ func (hookConfig *hookConfig) getNvidiaConfig(image image.CUDA, privileged bool)
}
}
func (hookConfig *hookConfig) getContainerConfig() (config containerConfig) {
func (hookConfig *hookConfig) getContainerConfig() (config *containerConfig) {
hookConfig.Lock()
defer hookConfig.Unlock()
if hookConfig.containerConfig != nil {
return hookConfig.containerConfig
}
var h HookState
d := json.NewDecoder(os.Stdin)
if err := d.Decode(&h); err != nil {
@@ -302,20 +263,28 @@ func (hookConfig *hookConfig) getContainerConfig() (config containerConfig) {
s := loadSpec(path.Join(b, "config.json"))
image, err := image.New(
privileged := isPrivileged(s)
i, err := image.New(
image.WithEnv(s.Process.Env),
image.WithMounts(s.Mounts),
image.WithPrivileged(privileged),
image.WithDisableRequire(hookConfig.DisableRequire),
image.WithAcceptDeviceListAsVolumeMounts(hookConfig.AcceptDeviceListAsVolumeMounts),
image.WithAcceptEnvvarUnprivileged(hookConfig.AcceptEnvvarUnprivileged),
image.WithPreferredVisibleDevicesEnvVars(hookConfig.getSwarmResourceEnvvars()...),
)
if err != nil {
log.Panicln(err)
}
privileged := isPrivileged(s)
return containerConfig{
cc := containerConfig{
Pid: h.Pid,
Rootfs: s.Root.Path,
Image: image,
Nvidia: hookConfig.getNvidiaConfig(image, privileged),
Image: i,
Nvidia: hookConfig.getNvidiaConfig(i, privileged),
}
hookConfig.containerConfig = &cc
return hookConfig.containerConfig
}

View File

@@ -1,10 +1,8 @@
package main
import (
"path/filepath"
"testing"
"github.com/opencontainers/runtime-spec/specs-go"
"github.com/stretchr/testify/require"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
@@ -479,14 +477,17 @@ func TestGetNvidiaConfig(t *testing.T) {
t.Run(tc.description, func(t *testing.T) {
image, _ := image.New(
image.WithEnvMap(tc.env),
image.WithPrivileged(tc.privileged),
image.WithPreferredVisibleDevicesEnvVars(tc.hookConfig.getSwarmResourceEnvvars()...),
)
// Wrap the call to getNvidiaConfig() in a closure.
var cfg *nvidiaConfig
getConfig := func() {
hookCfg := tc.hookConfig
if hookCfg == nil {
defaultConfig, _ := config.GetDefault()
hookCfg = &hookConfig{defaultConfig}
hookCfg = &hookConfig{Config: defaultConfig}
}
cfg = hookCfg.getNvidiaConfig(image, tc.privileged)
}
@@ -518,340 +519,6 @@ func TestGetNvidiaConfig(t *testing.T) {
}
}
func TestDeviceListSourcePriority(t *testing.T) {
var tests = []struct {
description string
mountDevices []specs.Mount
envvarDevices string
privileged bool
acceptUnprivileged bool
acceptMounts bool
expectedDevices []string
}{
{
description: "Mount devices, unprivileged, no accept unprivileged",
mountDevices: []specs.Mount{
{
Source: "/dev/null",
Destination: filepath.Join(image.DeviceListAsVolumeMountsRoot, "GPU0"),
},
{
Source: "/dev/null",
Destination: filepath.Join(image.DeviceListAsVolumeMountsRoot, "GPU1"),
},
},
envvarDevices: "GPU2,GPU3",
privileged: false,
acceptUnprivileged: false,
acceptMounts: true,
expectedDevices: []string{"GPU0", "GPU1"},
},
{
description: "No mount devices, unprivileged, no accept unprivileged",
mountDevices: nil,
envvarDevices: "GPU0,GPU1",
privileged: false,
acceptUnprivileged: false,
acceptMounts: true,
expectedDevices: nil,
},
{
description: "No mount devices, privileged, no accept unprivileged",
mountDevices: nil,
envvarDevices: "GPU0,GPU1",
privileged: true,
acceptUnprivileged: false,
acceptMounts: true,
expectedDevices: []string{"GPU0", "GPU1"},
},
{
description: "No mount devices, unprivileged, accept unprivileged",
mountDevices: nil,
envvarDevices: "GPU0,GPU1",
privileged: false,
acceptUnprivileged: true,
acceptMounts: true,
expectedDevices: []string{"GPU0", "GPU1"},
},
{
description: "Mount devices, unprivileged, accept unprivileged, no accept mounts",
mountDevices: []specs.Mount{
{
Source: "/dev/null",
Destination: filepath.Join(image.DeviceListAsVolumeMountsRoot, "GPU0"),
},
{
Source: "/dev/null",
Destination: filepath.Join(image.DeviceListAsVolumeMountsRoot, "GPU1"),
},
},
envvarDevices: "GPU2,GPU3",
privileged: false,
acceptUnprivileged: true,
acceptMounts: false,
expectedDevices: []string{"GPU2", "GPU3"},
},
{
description: "Mount devices, unprivileged, no accept unprivileged, no accept mounts",
mountDevices: []specs.Mount{
{
Source: "/dev/null",
Destination: filepath.Join(image.DeviceListAsVolumeMountsRoot, "GPU0"),
},
{
Source: "/dev/null",
Destination: filepath.Join(image.DeviceListAsVolumeMountsRoot, "GPU1"),
},
},
envvarDevices: "GPU2,GPU3",
privileged: false,
acceptUnprivileged: false,
acceptMounts: false,
expectedDevices: nil,
},
}
for _, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
// Wrap the call to getDevices() in a closure.
var devices []string
getDevices := func() {
image, _ := image.New(
image.WithEnvMap(
map[string]string{
image.EnvVarNvidiaVisibleDevices: tc.envvarDevices,
},
),
image.WithMounts(tc.mountDevices),
)
defaultConfig, _ := config.GetDefault()
cfg := &hookConfig{defaultConfig}
cfg.AcceptEnvvarUnprivileged = tc.acceptUnprivileged
cfg.AcceptDeviceListAsVolumeMounts = tc.acceptMounts
devices = cfg.getDevices(image, tc.privileged)
}
// For all other tests, just grab the devices and check the results
getDevices()
require.Equal(t, tc.expectedDevices, devices)
})
}
}
func TestGetDevicesFromEnvvar(t *testing.T) {
envDockerResourceGPUs := "DOCKER_RESOURCE_GPUS"
gpuID := "GPU-12345"
anotherGPUID := "GPU-67890"
thirdGPUID := "MIG-12345"
var tests = []struct {
description string
swarmResourceEnvvars []string
env map[string]string
expectedDevices []string
}{
{
description: "empty env returns nil for non-legacy image",
},
{
description: "blank NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: "",
},
},
{
description: "'void' NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: "void",
},
},
{
description: "'none' NVIDIA_VISIBLE_DEVICES returns empty for non-legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: "none",
},
expectedDevices: []string{""},
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for non-legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: gpuID,
},
expectedDevices: []string{gpuID},
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: gpuID,
image.EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{gpuID},
},
{
description: "empty env returns all for legacy image",
env: map[string]string{
image.EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{"all"},
},
// Add the `DOCKER_RESOURCE_GPUS` envvar and ensure that this is ignored when
// not enabled
{
description: "missing NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "blank NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: "",
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "'void' NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: "void",
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "'none' NVIDIA_VISIBLE_DEVICES returns empty for non-legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: "none",
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: []string{""},
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for non-legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: []string{gpuID},
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for legacy image",
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
image.EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{gpuID},
},
{
description: "empty env returns all for legacy image",
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
image.EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{"all"},
},
// Add the `DOCKER_RESOURCE_GPUS` envvar and ensure that this is selected when
// enabled
{
description: "empty env returns nil for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
},
{
description: "blank DOCKER_RESOURCE_GPUS returns nil for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: "",
},
},
{
description: "'void' DOCKER_RESOURCE_GPUS returns nil for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: "void",
},
},
{
description: "'none' DOCKER_RESOURCE_GPUS returns empty for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: "none",
},
expectedDevices: []string{""},
},
{
description: "DOCKER_RESOURCE_GPUS set returns value for non-legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: gpuID,
},
expectedDevices: []string{gpuID},
},
{
description: "DOCKER_RESOURCE_GPUS set returns value for legacy image",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: gpuID,
image.EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{gpuID},
},
{
description: "DOCKER_RESOURCE_GPUS is selected if present",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: []string{anotherGPUID},
},
{
description: "DOCKER_RESOURCE_GPUS overrides NVIDIA_VISIBLE_DEVICES if present",
swarmResourceEnvvars: []string{envDockerResourceGPUs},
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: []string{anotherGPUID},
},
{
description: "DOCKER_RESOURCE_GPUS_ADDITIONAL overrides NVIDIA_VISIBLE_DEVICES if present",
swarmResourceEnvvars: []string{"DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: gpuID,
"DOCKER_RESOURCE_GPUS_ADDITIONAL": anotherGPUID,
},
expectedDevices: []string{anotherGPUID},
},
{
description: "All available swarm resource envvars are selected and override NVIDIA_VISIBLE_DEVICES if present",
swarmResourceEnvvars: []string{"DOCKER_RESOURCE_GPUS", "DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: gpuID,
"DOCKER_RESOURCE_GPUS": thirdGPUID,
"DOCKER_RESOURCE_GPUS_ADDITIONAL": anotherGPUID,
},
expectedDevices: []string{thirdGPUID, anotherGPUID},
},
{
description: "DOCKER_RESOURCE_GPUS_ADDITIONAL or DOCKER_RESOURCE_GPUS override NVIDIA_VISIBLE_DEVICES if present",
swarmResourceEnvvars: []string{"DOCKER_RESOURCE_GPUS", "DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
image.EnvVarNvidiaVisibleDevices: gpuID,
"DOCKER_RESOURCE_GPUS_ADDITIONAL": anotherGPUID,
},
expectedDevices: []string{anotherGPUID},
},
}
for _, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
image, _ := image.New(
image.WithEnvMap(tc.env),
)
devices := getDevicesFromEnvvar(image, tc.swarmResourceEnvvars)
require.EqualValues(t, tc.expectedDevices, devices)
})
}
}
func TestGetDriverCapabilities(t *testing.T) {
supportedCapabilities := "compute,display,utility,video"

View File

@@ -4,50 +4,46 @@ import (
"fmt"
"log"
"os"
"path"
"reflect"
"strings"
"sync"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
)
const (
configPath = "/etc/nvidia-container-runtime/config.toml"
driverPath = "/run/nvidia/driver"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info"
)
// hookConfig wraps the toolkit config.
// This allows for functions to be defined on the local type.
type hookConfig struct {
sync.Mutex
*config.Config
containerConfig *containerConfig
}
// loadConfig loads the required paths for the hook config.
func loadConfig() (*config.Config, error) {
var configPaths []string
var required bool
if len(*configflag) != 0 {
configPaths = append(configPaths, *configflag)
required = true
} else {
configPaths = append(configPaths, path.Join(driverPath, configPath), configPath)
configFilePath, required := getConfigFilePath()
cfg, err := config.New(
config.WithConfigFile(configFilePath),
config.WithRequired(true),
)
if err == nil {
return cfg.Config()
} else if os.IsNotExist(err) && !required {
return config.GetDefault()
}
return nil, fmt.Errorf("couldn't open required configuration file: %v", err)
}
for _, p := range configPaths {
cfg, err := config.New(
config.WithConfigFile(p),
config.WithRequired(true),
)
if err == nil {
return cfg.Config()
} else if os.IsNotExist(err) && !required {
continue
}
return nil, fmt.Errorf("couldn't open required configuration file: %v", err)
func getConfigFilePath() (string, bool) {
if configFromFlag := *configflag; configFromFlag != "" {
return configFromFlag, true
}
return config.GetDefault()
if configFromEnvvar := os.Getenv(config.FilePathOverrideEnvVar); configFromEnvvar != "" {
return configFromEnvvar, true
}
return config.GetConfigFilePath(), false
}
func getHookConfig() (*hookConfig, error) {
@@ -55,7 +51,7 @@ func getHookConfig() (*hookConfig, error) {
if err != nil {
return nil, fmt.Errorf("failed to load config: %v", err)
}
config := &hookConfig{cfg}
config := &hookConfig{Config: cfg}
allSupportedDriverCapabilities := image.SupportedDriverCapabilities
if config.SupportedDriverCapabilities == "all" {
@@ -73,8 +69,8 @@ func getHookConfig() (*hookConfig, error) {
// getConfigOption returns the toml config option associated with the
// specified struct field.
func (c hookConfig) getConfigOption(fieldName string) string {
t := reflect.TypeOf(c)
func (c *hookConfig) getConfigOption(fieldName string) string {
t := reflect.TypeOf(&c)
f, ok := t.FieldByName(fieldName)
if !ok {
return fieldName
@@ -88,7 +84,7 @@ func (c hookConfig) getConfigOption(fieldName string) string {
// getSwarmResourceEnvvars returns the swarm resource envvars for the config.
func (c *hookConfig) getSwarmResourceEnvvars() []string {
if c.SwarmResource == "" {
if c == nil || c.SwarmResource == "" {
return nil
}
@@ -104,3 +100,44 @@ func (c *hookConfig) getSwarmResourceEnvvars() []string {
return envvars
}
// nvidiaContainerCliCUDACompatModeFlags returns required --cuda-compat-mode
// flag(s) depending on the hook and runtime configurations.
func (c *hookConfig) nvidiaContainerCliCUDACompatModeFlags() []string {
var flag string
switch c.NVIDIAContainerRuntimeConfig.Modes.Legacy.CUDACompatMode {
case config.CUDACompatModeLdconfig:
flag = "--cuda-compat-mode=ldconfig"
case config.CUDACompatModeMount:
flag = "--cuda-compat-mode=mount"
case config.CUDACompatModeDisabled, config.CUDACompatModeHook:
flag = "--cuda-compat-mode=disabled"
default:
if !c.Features.AllowCUDACompatLibsFromContainer.IsEnabled() {
flag = "--cuda-compat-mode=disabled"
}
}
if flag == "" {
return nil
}
return []string{flag}
}
func (c *hookConfig) assertModeIsLegacy() error {
if c.NVIDIAContainerRuntimeHookConfig.SkipModeDetection {
return nil
}
mr := info.NewRuntimeModeResolver(
info.WithLogger(&logInterceptor{}),
info.WithImage(&c.containerConfig.Image),
info.WithDefaultMode(info.LegacyRuntimeMode),
)
mode := mr.ResolveRuntimeMode(c.NVIDIAContainerRuntimeConfig.Mode)
if mode == "legacy" {
return nil
}
return fmt.Errorf("invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime (e.g. specify the --runtime=nvidia flag) instead")
}

View File

@@ -85,15 +85,15 @@ func TestGetHookConfig(t *testing.T) {
configflag = &filename
for _, line := range tc.lines {
_, err := configFile.WriteString(fmt.Sprintf("%s\n", line))
_, err := fmt.Fprintf(configFile, "%s\n", line)
require.NoError(t, err)
}
}
var cfg hookConfig
var cfg *hookConfig
getHookConfig := func() {
c, _ := getHookConfig()
cfg = *c
cfg = c
}
if tc.expectedPanic {

View File

@@ -55,7 +55,7 @@ func getCLIPath(config config.ContainerCLIConfig) string {
}
// getRootfsPath returns an absolute path. We don't need to resolve symlinks for now.
func getRootfsPath(config containerConfig) string {
func getRootfsPath(config *containerConfig) string {
rootfs, err := filepath.Abs(config.Rootfs)
if err != nil {
log.Panicln(err)
@@ -82,8 +82,8 @@ func doPrestart() {
return
}
if !hook.NVIDIAContainerRuntimeHookConfig.SkipModeDetection && info.ResolveAutoMode(&logInterceptor{}, hook.NVIDIAContainerRuntimeConfig.Mode, container.Image) != "legacy" {
log.Panicln("invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime (e.g. specify the --runtime=nvidia flag) instead.")
if err := hook.assertModeIsLegacy(); err != nil {
log.Panicf("%v", err)
}
rootfs := getRootfsPath(container)
@@ -114,9 +114,8 @@ func doPrestart() {
}
args = append(args, "configure")
if !hook.Features.AllowCUDACompatLibsFromContainer.IsEnabled() {
args = append(args, "--no-cntlibs")
}
args = append(args, hook.nvidiaContainerCliCUDACompatModeFlags()...)
if ldconfigPath := cli.NormalizeLDConfigPath(); ldconfigPath != "" {
args = append(args, fmt.Sprintf("--ldconfig=%s", ldconfigPath))
}

View File

@@ -21,8 +21,8 @@ The `runtimes` config option allows for the low-level runtime to be specified. T
The default value for this setting is:
```toml
runtimes = [
"docker-runc",
"runc",
"crun",
]
```

View File

@@ -122,11 +122,10 @@ func TestGoodInput(t *testing.T) {
err = cmdCreate.Run()
require.NoError(t, err, "runtime should not return an error")
// Check config.json for NVIDIA prestart hook
// Check config.json to ensure that the NVIDIA prestart was not inserted.
spec, err = cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.NotEmpty(t, spec.Hooks, "there should be hooks in config.json")
require.Equal(t, 1, nvidiaHookCount(spec.Hooks), "exactly one nvidia prestart hook should be inserted correctly into config.json")
require.Empty(t, spec.Hooks, "there should be no hooks in config.json")
}
// NVIDIA prestart hook already present in config file
@@ -168,11 +167,10 @@ func TestDuplicateHook(t *testing.T) {
output, err := cmdCreate.CombinedOutput()
require.NoErrorf(t, err, "runtime should not return an error", "output=%v", string(output))
// Check config.json for NVIDIA prestart hook
// Check config.json to ensure that the NVIDIA prestart hook was removed.
spec, err = cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.NotEmpty(t, spec.Hooks, "there should be hooks in config.json")
require.Equal(t, 1, nvidiaHookCount(spec.Hooks), "exactly one nvidia prestart hook should be inserted correctly into config.json")
require.Empty(t, spec.Hooks, "there should be no hooks in config.json")
}
// addNVIDIAHook is a basic wrapper for an addHookModifier that is used for
@@ -240,18 +238,3 @@ func (c testConfig) generateNewRuntimeSpec() error {
}
return nil
}
// Return number of valid NVIDIA prestart hooks in runtime spec
func nvidiaHookCount(hooks *specs.Hooks) int {
if hooks == nil {
return 0
}
count := 0
for _, hook := range hooks.Prestart {
if strings.Contains(hook.Path, nvidiaHook) {
count++
}
}
return count
}

View File

@@ -38,6 +38,11 @@ const (
type Options struct {
Config string
Socket string
// ExecutablePath specifies the path to the container runtime executable.
// This is used to extract the current config, for example.
// If a HostRootMount is specified, this path is relative to the host root
// mount.
ExecutablePath string
// EnabledCDI indicates whether CDI should be enabled.
EnableCDI bool
RuntimeName string

View File

@@ -20,8 +20,6 @@ import "path/filepath"
const (
defaultRuntimeName = "nvidia"
defaultRoot = "/usr/bin"
)
// Runtime defines a runtime to be configured.
@@ -48,9 +46,6 @@ func GetRuntimes(opts ...Option) Runtimes {
opt(c)
}
if c.root == "" {
c.root = defaultRoot
}
if c.nvidiaRuntimeName == "" {
c.nvidiaRuntimeName = defaultRuntimeName
}

View File

@@ -27,7 +27,6 @@ func TestOptions(t *testing.T) {
testCases := []struct {
setAsDefault bool
nvidiaRuntimeName string
root string
expectedDefaultRuntime string
expectedRuntimes Runtimes
}{
@@ -131,7 +130,7 @@ func TestOptions(t *testing.T) {
runtimes := GetRuntimes(
WithNvidiaRuntimeName(tc.nvidiaRuntimeName),
WithSetAsDefault(tc.setAsDefault),
WithRoot(tc.root),
WithRoot("/usr/bin"),
)
require.EqualValues(t, tc.expectedRuntimes, runtimes)

View File

@@ -173,7 +173,7 @@ func getRuntimeConfig(o *container.Options, co *Options) (engine.Interface, erro
containerd.WithPath(o.Config),
containerd.WithConfigSource(
toml.LoadFirst(
containerd.CommandLineSource(o.HostRootMount),
containerd.CommandLineSource(o.HostRootMount, o.ExecutablePath),
toml.FromFile(o.Config),
),
),

View File

@@ -202,7 +202,7 @@ func getRuntimeConfig(o *container.Options) (engine.Interface, error) {
crio.WithPath(o.Config),
crio.WithConfigSource(
toml.LoadFirst(
crio.CommandLineSource(o.HostRootMount),
crio.CommandLineSource(o.HostRootMount, o.ExecutablePath),
toml.FromFile(o.Config),
),
),

View File

@@ -25,7 +25,8 @@ import (
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/runtime/containerd"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/runtime/crio"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/runtime/docker"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/toolkit"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/toolkit"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)
const (
@@ -53,6 +54,12 @@ func Flags(opts *Options) []cli.Flag {
Destination: &opts.Config,
EnvVars: []string{"RUNTIME_CONFIG", "CONTAINERD_CONFIG", "DOCKER_CONFIG"},
},
&cli.StringFlag{
Name: "executable-path",
Usage: "The path to the runtime executable. This is used to extract the current config",
Destination: &opts.ExecutablePath,
EnvVars: []string{"RUNTIME_EXECUTABLE_PATH"},
},
&cli.StringFlag{
Name: "socket",
Usage: "Path to the runtime socket file",
@@ -104,8 +111,8 @@ func Flags(opts *Options) []cli.Flag {
return flags
}
// ValidateOptions checks whether the specified options are valid
func ValidateOptions(c *cli.Context, opts *Options, runtime string, toolkitRoot string, to *toolkit.Options) error {
// Validate checks whether the specified options are valid
func (opts *Options) Validate(logger logger.Interface, c *cli.Context, runtime string, toolkitRoot string, to *toolkit.Options) error {
// We set this option here to ensure that it is available in future calls.
opts.RuntimeDir = toolkitRoot
@@ -113,6 +120,11 @@ func ValidateOptions(c *cli.Context, opts *Options, runtime string, toolkitRoot
opts.EnableCDI = to.CDI.Enabled
}
if opts.ExecutablePath != "" && opts.RuntimeName == docker.Name {
logger.Warningf("Ignoring executable-path=%q flag for %v", opts.ExecutablePath, opts.RuntimeName)
opts.ExecutablePath = ""
}
// Apply the runtime-specific config changes.
switch runtime {
case containerd.Name:

View File

@@ -1,152 +0,0 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package toolkit
import (
"fmt"
"io"
"os"
"path/filepath"
"sort"
"strings"
)
type executableTarget struct {
dotfileName string
wrapperName string
}
type executable struct {
fileInstaller
source string
target executableTarget
env map[string]string
preLines []string
argLines []string
}
// install installs an executable component of the NVIDIA container toolkit. The source executable
// is copied to a `.real` file and a wapper is created to set up the environment as required.
func (e executable) install(destFolder string) (string, error) {
e.logger.Infof("Installing executable '%v' to %v", e.source, destFolder)
dotfileName := e.dotfileName()
installedDotfileName, err := e.installFileToFolderWithName(destFolder, dotfileName, e.source)
if err != nil {
return "", fmt.Errorf("error installing file '%v' as '%v': %v", e.source, dotfileName, err)
}
e.logger.Infof("Installed '%v'", installedDotfileName)
wrapperFilename, err := e.installWrapper(destFolder, installedDotfileName)
if err != nil {
return "", fmt.Errorf("error wrapping '%v': %v", installedDotfileName, err)
}
e.logger.Infof("Installed wrapper '%v'", wrapperFilename)
return wrapperFilename, nil
}
func (e executable) dotfileName() string {
return e.target.dotfileName
}
func (e executable) wrapperName() string {
return e.target.wrapperName
}
func (e executable) installWrapper(destFolder string, dotfileName string) (string, error) {
wrapperPath := filepath.Join(destFolder, e.wrapperName())
wrapper, err := os.Create(wrapperPath)
if err != nil {
return "", fmt.Errorf("error creating executable wrapper: %v", err)
}
defer wrapper.Close()
err = e.writeWrapperTo(wrapper, destFolder, dotfileName)
if err != nil {
return "", fmt.Errorf("error writing wrapper contents: %v", err)
}
err = ensureExecutable(wrapperPath)
if err != nil {
return "", fmt.Errorf("error making wrapper executable: %v", err)
}
return wrapperPath, nil
}
func (e executable) writeWrapperTo(wrapper io.Writer, destFolder string, dotfileName string) error {
r := newReplacements(destDirPattern, destFolder)
// Add the shebang
fmt.Fprintln(wrapper, "#! /bin/sh")
// Add the preceding lines if any
for _, line := range e.preLines {
fmt.Fprintf(wrapper, "%s\n", r.apply(line))
}
// Update the path to include the destination folder
var env map[string]string
if e.env == nil {
env = make(map[string]string)
} else {
env = e.env
}
path, specified := env["PATH"]
if !specified {
path = "$PATH"
}
env["PATH"] = strings.Join([]string{destFolder, path}, ":")
var sortedEnvvars []string
for e := range env {
sortedEnvvars = append(sortedEnvvars, e)
}
sort.Strings(sortedEnvvars)
for _, e := range sortedEnvvars {
v := env[e]
fmt.Fprintf(wrapper, "%s=%s \\\n", e, r.apply(v))
}
// Add the call to the target executable
fmt.Fprintf(wrapper, "%s \\\n", dotfileName)
// Insert additional lines in the `arg` list
for _, line := range e.argLines {
fmt.Fprintf(wrapper, "\t%s \\\n", r.apply(line))
}
// Add the script arguments "$@"
fmt.Fprintln(wrapper, "\t\"$@\"")
return nil
}
// ensureExecutable is equivalent to running chmod +x on the specified file
func ensureExecutable(path string) error {
info, err := os.Stat(path)
if err != nil {
return fmt.Errorf("error getting file info for '%v': %v", path, err)
}
executableMode := info.Mode() | 0111
err = os.Chmod(path, executableMode)
if err != nil {
return fmt.Errorf("error setting executable mode for '%v': %v", path, err)
}
return nil
}

View File

@@ -1,162 +0,0 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package toolkit
import (
"bytes"
"os"
"path/filepath"
"strings"
"testing"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
)
func TestWrapper(t *testing.T) {
logger, _ := testlog.NewNullLogger()
const shebang = "#! /bin/sh"
const destFolder = "/dest/folder"
const dotfileName = "source.real"
testCases := []struct {
e executable
expectedLines []string
}{
{
e: executable{},
expectedLines: []string{
shebang,
"PATH=/dest/folder:$PATH \\",
"source.real \\",
"\t\"$@\"",
"",
},
},
{
e: executable{
env: map[string]string{
"PATH": "some-path",
},
},
expectedLines: []string{
shebang,
"PATH=/dest/folder:some-path \\",
"source.real \\",
"\t\"$@\"",
"",
},
},
{
e: executable{
preLines: []string{
"preline1",
"preline2",
},
},
expectedLines: []string{
shebang,
"preline1",
"preline2",
"PATH=/dest/folder:$PATH \\",
"source.real \\",
"\t\"$@\"",
"",
},
},
{
e: executable{
argLines: []string{
"argline1",
"argline2",
},
},
expectedLines: []string{
shebang,
"PATH=/dest/folder:$PATH \\",
"source.real \\",
"\targline1 \\",
"\targline2 \\",
"\t\"$@\"",
"",
},
},
}
for i, tc := range testCases {
buf := &bytes.Buffer{}
tc.e.logger = logger
err := tc.e.writeWrapperTo(buf, destFolder, dotfileName)
require.NoError(t, err)
exepectedContents := strings.Join(tc.expectedLines, "\n")
require.Equal(t, exepectedContents, buf.String(), "%v: %v", i, tc)
}
}
func TestInstallExecutable(t *testing.T) {
logger, _ := testlog.NewNullLogger()
inputFolder, err := os.MkdirTemp("", "")
require.NoError(t, err)
defer os.RemoveAll(inputFolder)
// Create the source file
source := filepath.Join(inputFolder, "input")
sourceFile, err := os.Create(source)
base := filepath.Base(source)
require.NoError(t, err)
require.NoError(t, sourceFile.Close())
e := executable{
fileInstaller: fileInstaller{
logger: logger,
},
source: source,
target: executableTarget{
dotfileName: "input.real",
wrapperName: "input",
},
}
destFolder, err := os.MkdirTemp("", "output-*")
require.NoError(t, err)
defer os.RemoveAll(destFolder)
installed, err := e.install(destFolder)
require.NoError(t, err)
require.Equal(t, filepath.Join(destFolder, base), installed)
// Now check the post conditions:
sourceInfo, err := os.Stat(source)
require.NoError(t, err)
destInfo, err := os.Stat(filepath.Join(destFolder, base+".real"))
require.NoError(t, err)
require.Equal(t, sourceInfo.Size(), destInfo.Size())
require.Equal(t, sourceInfo.Mode(), destInfo.Mode())
wrapperInfo, err := os.Stat(installed)
require.NoError(t, err)
require.NotEqual(t, 0, wrapperInfo.Mode()&0111)
}

View File

@@ -1,95 +0,0 @@
/**
# Copyright 2024 NVIDIA CORPORATION
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package toolkit
import (
"fmt"
"io"
"os"
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)
type fileInstaller struct {
logger logger.Interface
// sourceRoot specifies the root that is searched for the components to install.
sourceRoot string
}
// installFileToFolder copies a source file to a destination folder.
// The path of the input file is ignored.
// e.g. installFileToFolder("/some/path/file.txt", "/output/path")
// will result in a file "/output/path/file.txt" being generated
func (t *fileInstaller) installFileToFolder(destFolder string, src string) (string, error) {
name := filepath.Base(src)
return t.installFileToFolderWithName(destFolder, name, src)
}
// cp src destFolder/name
func (t *fileInstaller) installFileToFolderWithName(destFolder string, name, src string) (string, error) {
dest := filepath.Join(destFolder, name)
err := t.installFile(dest, src)
if err != nil {
return "", fmt.Errorf("error copying '%v' to '%v': %v", src, dest, err)
}
return dest, nil
}
// installFile copies a file from src to dest and maintains
// file modes
func (t *fileInstaller) installFile(dest string, src string) error {
src = filepath.Join(t.sourceRoot, src)
t.logger.Infof("Installing '%v' to '%v'", src, dest)
source, err := os.Open(src)
if err != nil {
return fmt.Errorf("error opening source: %v", err)
}
defer source.Close()
destination, err := os.Create(dest)
if err != nil {
return fmt.Errorf("error creating destination: %v", err)
}
defer destination.Close()
_, err = io.Copy(destination, source)
if err != nil {
return fmt.Errorf("error copying file: %v", err)
}
err = applyModeFromSource(dest, src)
if err != nil {
return fmt.Errorf("error setting destination file mode: %v", err)
}
return nil
}
// applyModeFromSource sets the file mode for a destination file
// to match that of a specified source file
func applyModeFromSource(dest string, src string) error {
sourceInfo, err := os.Stat(src)
if err != nil {
return fmt.Errorf("error getting file info for '%v': %v", src, err)
}
err = os.Chmod(dest, sourceInfo.Mode())
if err != nil {
return fmt.Errorf("error setting mode for '%v': %v", dest, err)
}
return nil
}

View File

@@ -1,85 +0,0 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package toolkit
import (
"fmt"
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/operator"
)
const (
nvidiaContainerRuntimeSource = "/usr/bin/nvidia-container-runtime"
)
// installContainerRuntimes sets up the NVIDIA container runtimes, copying the executables
// and implementing the required wrapper
func (t *Installer) installContainerRuntimes(toolkitDir string) error {
runtimes := operator.GetRuntimes()
for _, runtime := range runtimes {
r := t.newNvidiaContainerRuntimeInstaller(runtime.Path)
_, err := r.install(toolkitDir)
if err != nil {
return fmt.Errorf("error installing NVIDIA container runtime: %v", err)
}
}
return nil
}
// newNVidiaContainerRuntimeInstaller returns a new executable installer for the NVIDIA container runtime.
// This installer will copy the specified source executable to the toolkit directory.
// The executable is copied to a file with the same name as the source, but with a ".real" suffix and a wrapper is
// created to allow for the configuration of the runtime environment.
func (t *Installer) newNvidiaContainerRuntimeInstaller(source string) *executable {
wrapperName := filepath.Base(source)
dotfileName := wrapperName + ".real"
target := executableTarget{
dotfileName: dotfileName,
wrapperName: wrapperName,
}
return t.newRuntimeInstaller(source, target, nil)
}
func (t *Installer) newRuntimeInstaller(source string, target executableTarget, env map[string]string) *executable {
preLines := []string{
"",
"cat /proc/modules | grep -e \"^nvidia \" >/dev/null 2>&1",
"if [ \"${?}\" != \"0\" ]; then",
" echo \"nvidia driver modules are not yet loaded, invoking runc directly\"",
" exec runc \"$@\"",
"fi",
"",
}
runtimeEnv := make(map[string]string)
runtimeEnv["XDG_CONFIG_HOME"] = filepath.Join(destDirPattern, ".config")
for k, v := range env {
runtimeEnv[k] = v
}
r := executable{
fileInstaller: t.fileInstaller,
source: source,
target: target,
env: runtimeEnv,
preLines: preLines,
}
return &r
}

View File

@@ -1,64 +0,0 @@
/**
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
*/
package toolkit
import (
"bytes"
"strings"
"testing"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
)
func TestNvidiaContainerRuntimeInstallerWrapper(t *testing.T) {
logger, _ := testlog.NewNullLogger()
i := Installer{
fileInstaller: fileInstaller{
logger: logger,
},
}
r := i.newNvidiaContainerRuntimeInstaller(nvidiaContainerRuntimeSource)
const shebang = "#! /bin/sh"
const destFolder = "/dest/folder"
const dotfileName = "source.real"
buf := &bytes.Buffer{}
err := r.writeWrapperTo(buf, destFolder, dotfileName)
require.NoError(t, err)
expectedLines := []string{
shebang,
"",
"cat /proc/modules | grep -e \"^nvidia \" >/dev/null 2>&1",
"if [ \"${?}\" != \"0\" ]; then",
" echo \"nvidia driver modules are not yet loaded, invoking runc directly\"",
" exec runc \"$@\"",
"fi",
"",
"PATH=/dest/folder:$PATH \\",
"XDG_CONFIG_HOME=/dest/folder/.config \\",
"source.real \\",
"\t\"$@\"",
"",
}
exepectedContents := strings.Join(expectedLines, "\n")
require.Equal(t, exepectedContents, buf.String())
}

View File

@@ -5,67 +5,60 @@ import (
"os"
"os/signal"
"path/filepath"
"strings"
"syscall"
"github.com/urfave/cli/v2"
"golang.org/x/sys/unix"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/runtime"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/toolkit"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/toolkit"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
)
const (
toolkitPidFilename = "toolkit.pid"
defaultPidFile = "/run/nvidia/toolkit/" + toolkitPidFilename
toolkitSubDir = "toolkit"
defaultRuntime = "docker"
defaultRuntimeArgs = ""
defaultToolkitInstallDir = "/usr/local/nvidia"
toolkitSubDir = "toolkit"
defaultRuntime = "docker"
)
var availableRuntimes = map[string]struct{}{"docker": {}, "crio": {}, "containerd": {}}
var defaultLowLevelRuntimes = []string{"docker-runc", "runc", "crun"}
var defaultLowLevelRuntimes = []string{"runc", "crun"}
var waitingForSignal = make(chan bool, 1)
var signalReceived = make(chan bool, 1)
// options stores the command line arguments
type options struct {
toolkitInstallDir string
noDaemon bool
runtime string
runtimeArgs string
root string
pidFile string
sourceRoot string
packageType string
toolkitOptions toolkit.Options
runtimeOptions runtime.Options
}
func (o options) toolkitRoot() string {
return filepath.Join(o.root, toolkitSubDir)
return filepath.Join(o.toolkitInstallDir, toolkitSubDir)
}
// Version defines the CLI version. This is set at build time using LD FLAGS
var Version = "development"
func main() {
logger := logger.New()
remainingArgs, root, err := ParseArgs(logger, os.Args)
if err != nil {
logger.Errorf("Error: unable to parse arguments: %v", err)
os.Exit(1)
}
c := NewApp(logger, root)
c := NewApp(logger)
// Run the CLI
logger.Infof("Starting %v", c.Name)
if err := c.Run(remainingArgs); err != nil {
logger.Errorf("error running nvidia-toolkit: %v", err)
if err := c.Run(os.Args); err != nil {
logger.Errorf("error running %v: %v", c.Name, err)
os.Exit(1)
}
@@ -75,18 +68,14 @@ func main() {
// An app represents the nvidia-ctk-installer.
type app struct {
logger logger.Interface
// defaultRoot stores the root to use if the --root flag is not specified.
defaultRoot string
toolkit *toolkit.Installer
}
// NewApp creates the CLI app fro the specified options.
// defaultRoot is used as the root if not specified via the --root flag.
func NewApp(logger logger.Interface, defaultRoot string) *cli.App {
func NewApp(logger logger.Interface) *cli.App {
a := app{
logger: logger,
defaultRoot: defaultRoot,
logger: logger,
}
return a.build()
}
@@ -97,11 +86,9 @@ func (a app) build() *cli.App {
}
// Create the top-level CLI
c := cli.NewApp()
c.Name = "nvidia-toolkit"
c.Usage = "Install the nvidia-container-toolkit for use by a given runtime"
c.UsageText = "[DESTINATION] [-n | --no-daemon] [-r | --runtime] [-u | --runtime-args]"
c.Description = "DESTINATION points to the host path underneath which the nvidia-container-toolkit should be installed.\nIt will be installed at ${DESTINATION}/toolkit"
c.Version = Version
c.Name = "nvidia-ctk-installer"
c.Usage = "Install the NVIDIA Container Toolkit and configure the specified runtime to use the `nvidia` runtime."
c.Version = info.GetVersionString()
c.Before = func(ctx *cli.Context) error {
return a.Before(ctx, &options)
}
@@ -126,28 +113,29 @@ func (a app) build() *cli.App {
Destination: &options.runtime,
EnvVars: []string{"RUNTIME"},
},
// TODO: Remove runtime-args
&cli.StringFlag{
Name: "runtime-args",
Aliases: []string{"u"},
Usage: "arguments to pass to 'docker', 'crio', or 'containerd' setup command",
Value: defaultRuntimeArgs,
Destination: &options.runtimeArgs,
EnvVars: []string{"RUNTIME_ARGS"},
Name: "toolkit-install-dir",
Aliases: []string{"root"},
Usage: "The directory where the NVIDIA Container Toolkit is to be installed. " +
"The components of the toolkit will be installed to `ROOT`/toolkit. " +
"Note that in the case of a containerized installer, this is the path in the container and it is " +
"recommended that this match the path on the host.",
Value: defaultToolkitInstallDir,
Destination: &options.toolkitInstallDir,
EnvVars: []string{"TOOLKIT_INSTALL_DIR", "ROOT"},
},
&cli.StringFlag{
Name: "root",
Value: a.defaultRoot,
Usage: "the folder where the NVIDIA Container Toolkit is to be installed. It will be installed to `ROOT`/toolkit",
Destination: &options.root,
EnvVars: []string{"ROOT"},
},
&cli.StringFlag{
Name: "source-root",
Value: "/",
Usage: "The folder where the required toolkit artifacts can be found",
Name: "toolkit-source-root",
Usage: "The folder where the required toolkit artifacts can be found. If this is not specified, the path /artifacts/{{ .ToolkitPackageType }} is used where ToolkitPackageType is the resolved package type",
Destination: &options.sourceRoot,
EnvVars: []string{"SOURCE_ROOT"},
EnvVars: []string{"TOOLKIT_SOURCE_ROOT"},
},
&cli.StringFlag{
Name: "toolkit-package-type",
Usage: "specify the package type to use for the toolkit. One of ['deb', 'rpm', 'auto', '']. If 'auto' or '' are used, the type is inferred automatically.",
Value: "auto",
Destination: &options.packageType,
EnvVars: []string{"TOOLKIT_PACKAGE_TYPE"},
},
&cli.StringFlag{
Name: "pid-file",
@@ -165,6 +153,15 @@ func (a app) build() *cli.App {
}
func (a *app) Before(c *cli.Context, o *options) error {
if o.sourceRoot == "" {
sourceRoot, err := a.resolveSourceRoot(o.runtimeOptions.HostRootMount, o.packageType)
if err != nil {
return fmt.Errorf("failed to resolve source root: %v", err)
}
a.logger.Infof("Resolved source root to %v", sourceRoot)
o.sourceRoot = sourceRoot
}
a.toolkit = toolkit.NewInstaller(
toolkit.WithLogger(a.logger),
toolkit.WithSourceRoot(o.sourceRoot),
@@ -174,7 +171,7 @@ func (a *app) Before(c *cli.Context, o *options) error {
}
func (a *app) validateFlags(c *cli.Context, o *options) error {
if o.root == "" {
if o.toolkitInstallDir == "" {
return fmt.Errorf("the install root must be specified")
}
if _, exists := availableRuntimes[o.runtime]; !exists {
@@ -187,7 +184,7 @@ func (a *app) validateFlags(c *cli.Context, o *options) error {
if err := a.toolkit.ValidateOptions(&o.toolkitOptions); err != nil {
return err
}
if err := runtime.ValidateOptions(c, &o.runtimeOptions, o.runtime, o.toolkitRoot(), &o.toolkitOptions); err != nil {
if err := o.runtimeOptions.Validate(a.logger, c, o.runtime, o.toolkitRoot(), &o.toolkitOptions); err != nil {
return err
}
return nil
@@ -238,34 +235,6 @@ func (a *app) Run(c *cli.Context, o *options) error {
return nil
}
// ParseArgs checks if a single positional argument was defined and extracts this the root.
// If no positional arguments are defined, it is assumed that the root is specified as a flag.
func ParseArgs(logger logger.Interface, args []string) ([]string, string, error) {
logger.Infof("Parsing arguments")
if len(args) < 2 {
return args, "", nil
}
var lastPositionalArg int
for i, arg := range args {
if strings.HasPrefix(arg, "-") {
break
}
lastPositionalArg = i
}
if lastPositionalArg == 0 {
return args, "", nil
}
if lastPositionalArg == 1 {
return append([]string{args[0]}, args[2:]...), args[1], nil
}
return nil, "", fmt.Errorf("unexpected positional argument(s) %v", args[2:lastPositionalArg+1])
}
func (a *app) initialize(pidFile string) error {
a.logger.Infof("Initializing")
@@ -288,7 +257,7 @@ func (a *app) initialize(pidFile string) error {
return fmt.Errorf("unable to get flock on pidfile: %v", err)
}
_, err = f.WriteString(fmt.Sprintf("%v\n", os.Getpid()))
_, err = fmt.Fprintf(f, "%v\n", os.Getpid())
if err != nil {
return fmt.Errorf("unable to write PID to pidfile: %v", err)
}
@@ -325,3 +294,35 @@ func (a *app) shutdown(pidFile string) {
a.logger.Warningf("Unable to remove pidfile: %v", err)
}
}
func (a *app) resolveSourceRoot(hostRoot string, packageType string) (string, error) {
resolvedPackageType, err := a.resolvePackageType(hostRoot, packageType)
if err != nil {
return "", err
}
switch resolvedPackageType {
case "deb":
return "/artifacts/deb", nil
case "rpm":
return "/artifacts/rpm", nil
default:
return "", fmt.Errorf("invalid package type: %v", resolvedPackageType)
}
}
func (a *app) resolvePackageType(hostRoot string, packageType string) (rPackageTypes string, rerr error) {
if packageType != "" && packageType != "auto" {
return packageType, nil
}
locator := lookup.NewExecutableLocator(a.logger, hostRoot)
if candidates, err := locator.Locate("/usr/bin/rpm"); err == nil && len(candidates) > 0 {
return "rpm", nil
}
if candidates, err := locator.Locate("/usr/bin/dpkg"); err == nil && len(candidates) > 0 {
return "deb", nil
}
return "deb", nil
}

View File

@@ -17,7 +17,6 @@
package main
import (
"fmt"
"os"
"path/filepath"
"strings"
@@ -29,67 +28,6 @@ import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/test"
)
func TestParseArgs(t *testing.T) {
logger, _ := testlog.NewNullLogger()
testCases := []struct {
args []string
expectedRemaining []string
expectedRoot string
expectedError error
}{
{
args: []string{},
expectedRemaining: []string{},
expectedRoot: "",
expectedError: nil,
},
{
args: []string{"app"},
expectedRemaining: []string{"app"},
},
{
args: []string{"app", "root"},
expectedRemaining: []string{"app"},
expectedRoot: "root",
},
{
args: []string{"app", "--flag"},
expectedRemaining: []string{"app", "--flag"},
},
{
args: []string{"app", "root", "--flag"},
expectedRemaining: []string{"app", "--flag"},
expectedRoot: "root",
},
{
args: []string{"app", "root", "not-root", "--flag"},
expectedError: fmt.Errorf("unexpected positional argument(s) [not-root]"),
},
{
args: []string{"app", "root", "not-root"},
expectedError: fmt.Errorf("unexpected positional argument(s) [not-root]"),
},
{
args: []string{"app", "root", "not-root", "also"},
expectedError: fmt.Errorf("unexpected positional argument(s) [not-root also]"),
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
remaining, root, err := ParseArgs(logger, tc.args)
if tc.expectedError != nil {
require.EqualError(t, err, tc.expectedError.Error())
} else {
require.NoError(t, err)
}
require.ElementsMatch(t, tc.expectedRemaining, remaining)
require.Equal(t, tc.expectedRoot, root)
})
}
}
func TestApp(t *testing.T) {
t.Setenv("__NVCT_TESTING_DEVICES_ARE_FILES", "true")
logger, _ := testlog.NewNullLogger()
@@ -129,7 +67,7 @@ swarm-resource = ""
debug = "/dev/null"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc", "crun"]
runtimes = ["runc", "crun"]
[nvidia-container-runtime.modes]
@@ -141,6 +79,9 @@ swarm-resource = ""
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
[nvidia-container-runtime.modes.legacy]
cuda-compat-mode = "ldconfig"
[nvidia-container-runtime-hook]
path = "{{ .toolkitRoot }}/toolkit/nvidia-container-runtime-hook"
skip-mode-detection = true
@@ -190,7 +131,7 @@ swarm-resource = ""
debug = "/dev/null"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc", "crun"]
runtimes = ["runc", "crun"]
[nvidia-container-runtime.modes]
@@ -202,6 +143,9 @@ swarm-resource = ""
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
[nvidia-container-runtime.modes.legacy]
cuda-compat-mode = "ldconfig"
[nvidia-container-runtime-hook]
path = "{{ .toolkitRoot }}/toolkit/nvidia-container-runtime-hook"
skip-mode-detection = true
@@ -254,7 +198,7 @@ swarm-resource = ""
debug = "/dev/null"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc", "crun"]
runtimes = ["runc", "crun"]
[nvidia-container-runtime.modes]
@@ -266,6 +210,9 @@ swarm-resource = ""
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
[nvidia-container-runtime.modes.legacy]
cuda-compat-mode = "ldconfig"
[nvidia-container-runtime-hook]
path = "{{ .toolkitRoot }}/toolkit/nvidia-container-runtime-hook"
skip-mode-detection = true
@@ -315,7 +262,7 @@ swarm-resource = ""
debug = "/dev/null"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc", "crun"]
runtimes = ["runc", "crun"]
[nvidia-container-runtime.modes]
@@ -327,6 +274,9 @@ swarm-resource = ""
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
[nvidia-container-runtime.modes.legacy]
cuda-compat-mode = "ldconfig"
[nvidia-container-runtime-hook]
path = "{{ .toolkitRoot }}/toolkit/nvidia-container-runtime-hook"
skip-mode-detection = true
@@ -398,7 +348,7 @@ swarm-resource = ""
debug = "/dev/null"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc", "crun"]
runtimes = ["runc", "crun"]
[nvidia-container-runtime.modes]
@@ -410,6 +360,9 @@ swarm-resource = ""
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
[nvidia-container-runtime.modes.legacy]
cuda-compat-mode = "ldconfig"
[nvidia-container-runtime-hook]
path = "{{ .toolkitRoot }}/toolkit/nvidia-container-runtime-hook"
skip-mode-detection = true
@@ -468,10 +421,11 @@ swarm-resource = ""
toolkitRoot := filepath.Join(testRoot, "toolkit-test")
toolkitConfigFile := filepath.Join(toolkitRoot, "toolkit/.config/nvidia-container-runtime/config.toml")
app := NewApp(logger, toolkitRoot)
app := NewApp(logger)
testArgs := []string{
"nvidia-ctk-installer",
"--toolkit-install-dir=" + toolkitRoot,
"--no-daemon",
"--cdi-output-dir=" + cdiOutputDir,
"--config=" + runtimeConfigFile,
@@ -479,7 +433,7 @@ swarm-resource = ""
"--driver-root-ctr-path=" + hostRoot,
"--pid-file=" + filepath.Join(testRoot, "toolkit.pid"),
"--restart-mode=none",
"--source-root=" + filepath.Join(artifactRoot, "deb"),
"--toolkit-source-root=" + filepath.Join(artifactRoot, "deb"),
}
err := app.Run(append(testArgs, tc.args...))

View File

@@ -0,0 +1,85 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package installer
import (
"fmt"
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
)
// An artifactRoot is used as a source for installed artifacts.
// It is refined by a directory path, a library locator, and an executable locator.
type artifactRoot struct {
path string
libraries lookup.Locator
executables lookup.Locator
}
func newArtifactRoot(logger logger.Interface, rootDirectoryPath string) (*artifactRoot, error) {
relativeLibrarySearchPaths := []string{
"/usr/lib64",
"/usr/lib/x86_64-linux-gnu",
"/usr/lib/aarch64-linux-gnu",
}
var librarySearchPaths []string
for _, l := range relativeLibrarySearchPaths {
librarySearchPaths = append(librarySearchPaths, filepath.Join(rootDirectoryPath, l))
}
a := artifactRoot{
path: rootDirectoryPath,
libraries: lookup.NewLibraryLocator(
lookup.WithLogger(logger),
lookup.WithCount(1),
lookup.WithSearchPaths(librarySearchPaths...),
),
executables: lookup.NewExecutableLocator(
logger,
rootDirectoryPath,
),
}
return &a, nil
}
func (r *artifactRoot) findLibrary(name string) (string, error) {
candidates, err := r.libraries.Locate(name)
if err != nil {
return "", fmt.Errorf("error locating library: %w", err)
}
if len(candidates) == 0 {
return "", fmt.Errorf("library %v not found", name)
}
return candidates[0], nil
}
func (r *artifactRoot) findExecutable(name string) (string, error) {
candidates, err := r.executables.Locate(name)
if err != nil {
return "", fmt.Errorf("error locating executable: %w", err)
}
if len(candidates) == 0 {
return "", fmt.Errorf("executable %v not found", name)
}
return candidates[0], nil
}

View File

@@ -0,0 +1,47 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package installer
import (
"fmt"
"os"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)
type createDirectory struct {
logger logger.Interface
}
func (t *ToolkitInstaller) createDirectory() Installer {
return &createDirectory{
logger: t.logger,
}
}
func (d *createDirectory) Install(dir string) error {
if dir == "" {
return nil
}
d.logger.Infof("Creating directory '%v'", dir)
err := os.MkdirAll(dir, 0755)
if err != nil {
return fmt.Errorf("error creating directory: %v", err)
}
return nil
}

View File

@@ -0,0 +1,179 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package installer
import (
"bytes"
"fmt"
"html/template"
"io"
"path/filepath"
"strings"
log "github.com/sirupsen/logrus"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/operator"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
)
type executable struct {
requiresKernelModule bool
path string
symlink string
env map[string]string
}
func (t *ToolkitInstaller) collectExecutables(destDir string) ([]Installer, error) {
configFilePath := t.ConfigFilePath(destDir)
executables := []executable{
{
path: "nvidia-ctk",
},
{
path: "nvidia-cdi-hook",
},
}
for _, runtime := range operator.GetRuntimes() {
e := executable{
path: runtime.Path,
requiresKernelModule: true,
env: map[string]string{
config.FilePathOverrideEnvVar: configFilePath,
},
}
executables = append(executables, e)
}
executables = append(executables,
executable{
path: "nvidia-container-cli",
env: map[string]string{"LD_LIBRARY_PATH": destDir + ":$LD_LIBRARY_PATH"},
},
)
executables = append(executables,
executable{
path: "nvidia-container-runtime-hook",
symlink: "nvidia-container-toolkit",
env: map[string]string{
config.FilePathOverrideEnvVar: configFilePath,
},
},
)
var installers []Installer
for _, executable := range executables {
executablePath, err := t.artifactRoot.findExecutable(executable.path)
if err != nil {
if t.ignoreErrors {
log.Errorf("Ignoring error: %v", err)
continue
}
return nil, err
}
wrappedExecutableFilename := filepath.Base(executablePath)
dotRealFilename := wrappedExecutableFilename + ".real"
w := &wrapper{
Source: executablePath,
WrappedExecutable: dotRealFilename,
CheckModules: executable.requiresKernelModule,
Envvars: map[string]string{
"PATH": strings.Join([]string{destDir, "$PATH"}, ":"),
},
}
for k, v := range executable.env {
w.Envvars[k] = v
}
installers = append(installers, w)
if executable.symlink == "" {
continue
}
link := symlink{
linkname: executable.symlink,
target: filepath.Base(executablePath),
}
installers = append(installers, link)
}
return installers, nil
}
type wrapper struct {
Source string
Envvars map[string]string
WrappedExecutable string
CheckModules bool
}
type render struct {
*wrapper
DestDir string
}
func (w *wrapper) Install(destDir string) error {
// Copy the executable with a .real extension.
mode, err := installFile(w.Source, filepath.Join(destDir, w.WrappedExecutable))
if err != nil {
return err
}
// Create a wrapper file.
r := render{
wrapper: w,
DestDir: destDir,
}
content, err := r.render()
if err != nil {
return fmt.Errorf("failed to render wrapper: %w", err)
}
wrapperFile := filepath.Join(destDir, filepath.Base(w.Source))
return installContent(content, wrapperFile, mode|0111)
}
func (w *render) render() (io.Reader, error) {
wrapperTemplate := `#! /bin/sh
{{- if (.CheckModules) }}
cat /proc/modules | grep -e "^nvidia " >/dev/null 2>&1
if [ "${?}" != "0" ]; then
echo "nvidia driver modules are not yet loaded, invoking runc directly"
exec runc "$@"
fi
{{- end }}
{{- range $key, $value := .Envvars }}
{{$key}}={{$value}} \
{{- end }}
{{ .DestDir }}/{{ .WrappedExecutable }} \
"$@"
`
var content bytes.Buffer
tmpl, err := template.New("wrapper").Parse(wrapperTemplate)
if err != nil {
return nil, err
}
if err := tmpl.Execute(&content, w); err != nil {
return nil, err
}
return &content, nil
}

View File

@@ -0,0 +1,91 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package installer
import (
"bytes"
"testing"
"github.com/stretchr/testify/require"
)
func TestWrapperRender(t *testing.T) {
testCases := []struct {
description string
w *wrapper
expected string
}{
{
description: "executable is added",
w: &wrapper{
WrappedExecutable: "some-runtime",
},
expected: `#! /bin/sh
/dest-dir/some-runtime \
"$@"
`,
},
{
description: "module check is added",
w: &wrapper{
WrappedExecutable: "some-runtime",
CheckModules: true,
},
expected: `#! /bin/sh
cat /proc/modules | grep -e "^nvidia " >/dev/null 2>&1
if [ "${?}" != "0" ]; then
echo "nvidia driver modules are not yet loaded, invoking runc directly"
exec runc "$@"
fi
/dest-dir/some-runtime \
"$@"
`,
},
{
description: "environment is added",
w: &wrapper{
WrappedExecutable: "some-runtime",
Envvars: map[string]string{
"PATH": "/foo/bar/baz",
},
},
expected: `#! /bin/sh
PATH=/foo/bar/baz \
/dest-dir/some-runtime \
"$@"
`,
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
r := render{
wrapper: tc.w,
DestDir: "/dest-dir",
}
reader, err := r.render()
require.NoError(t, err)
var content bytes.Buffer
_, err = content.ReadFrom(reader)
require.NoError(t, err)
require.Equal(t, tc.expected, content.String())
})
}
}

View File

@@ -0,0 +1,188 @@
// Code generated by moq; DO NOT EDIT.
// github.com/matryer/moq
package installer
import (
"io"
"os"
"sync"
)
// Ensure, that fileInstallerMock does implement fileInstaller.
// If this is not the case, regenerate this file with moq.
var _ fileInstaller = &fileInstallerMock{}
// fileInstallerMock is a mock implementation of fileInstaller.
//
// func TestSomethingThatUsesfileInstaller(t *testing.T) {
//
// // make and configure a mocked fileInstaller
// mockedfileInstaller := &fileInstallerMock{
// installContentFunc: func(reader io.Reader, s string, v os.FileMode) error {
// panic("mock out the installContent method")
// },
// installFileFunc: func(s1 string, s2 string) (os.FileMode, error) {
// panic("mock out the installFile method")
// },
// installSymlinkFunc: func(s1 string, s2 string) error {
// panic("mock out the installSymlink method")
// },
// }
//
// // use mockedfileInstaller in code that requires fileInstaller
// // and then make assertions.
//
// }
type fileInstallerMock struct {
// installContentFunc mocks the installContent method.
installContentFunc func(reader io.Reader, s string, v os.FileMode) error
// installFileFunc mocks the installFile method.
installFileFunc func(s1 string, s2 string) (os.FileMode, error)
// installSymlinkFunc mocks the installSymlink method.
installSymlinkFunc func(s1 string, s2 string) error
// calls tracks calls to the methods.
calls struct {
// installContent holds details about calls to the installContent method.
installContent []struct {
// Reader is the reader argument value.
Reader io.Reader
// S is the s argument value.
S string
// V is the v argument value.
V os.FileMode
}
// installFile holds details about calls to the installFile method.
installFile []struct {
// S1 is the s1 argument value.
S1 string
// S2 is the s2 argument value.
S2 string
}
// installSymlink holds details about calls to the installSymlink method.
installSymlink []struct {
// S1 is the s1 argument value.
S1 string
// S2 is the s2 argument value.
S2 string
}
}
lockinstallContent sync.RWMutex
lockinstallFile sync.RWMutex
lockinstallSymlink sync.RWMutex
}
// installContent calls installContentFunc.
func (mock *fileInstallerMock) installContent(reader io.Reader, s string, v os.FileMode) error {
if mock.installContentFunc == nil {
panic("fileInstallerMock.installContentFunc: method is nil but fileInstaller.installContent was just called")
}
callInfo := struct {
Reader io.Reader
S string
V os.FileMode
}{
Reader: reader,
S: s,
V: v,
}
mock.lockinstallContent.Lock()
mock.calls.installContent = append(mock.calls.installContent, callInfo)
mock.lockinstallContent.Unlock()
return mock.installContentFunc(reader, s, v)
}
// installContentCalls gets all the calls that were made to installContent.
// Check the length with:
//
// len(mockedfileInstaller.installContentCalls())
func (mock *fileInstallerMock) installContentCalls() []struct {
Reader io.Reader
S string
V os.FileMode
} {
var calls []struct {
Reader io.Reader
S string
V os.FileMode
}
mock.lockinstallContent.RLock()
calls = mock.calls.installContent
mock.lockinstallContent.RUnlock()
return calls
}
// installFile calls installFileFunc.
func (mock *fileInstallerMock) installFile(s1 string, s2 string) (os.FileMode, error) {
if mock.installFileFunc == nil {
panic("fileInstallerMock.installFileFunc: method is nil but fileInstaller.installFile was just called")
}
callInfo := struct {
S1 string
S2 string
}{
S1: s1,
S2: s2,
}
mock.lockinstallFile.Lock()
mock.calls.installFile = append(mock.calls.installFile, callInfo)
mock.lockinstallFile.Unlock()
return mock.installFileFunc(s1, s2)
}
// installFileCalls gets all the calls that were made to installFile.
// Check the length with:
//
// len(mockedfileInstaller.installFileCalls())
func (mock *fileInstallerMock) installFileCalls() []struct {
S1 string
S2 string
} {
var calls []struct {
S1 string
S2 string
}
mock.lockinstallFile.RLock()
calls = mock.calls.installFile
mock.lockinstallFile.RUnlock()
return calls
}
// installSymlink calls installSymlinkFunc.
func (mock *fileInstallerMock) installSymlink(s1 string, s2 string) error {
if mock.installSymlinkFunc == nil {
panic("fileInstallerMock.installSymlinkFunc: method is nil but fileInstaller.installSymlink was just called")
}
callInfo := struct {
S1 string
S2 string
}{
S1: s1,
S2: s2,
}
mock.lockinstallSymlink.Lock()
mock.calls.installSymlink = append(mock.calls.installSymlink, callInfo)
mock.lockinstallSymlink.Unlock()
return mock.installSymlinkFunc(s1, s2)
}
// installSymlinkCalls gets all the calls that were made to installSymlink.
// Check the length with:
//
// len(mockedfileInstaller.installSymlinkCalls())
func (mock *fileInstallerMock) installSymlinkCalls() []struct {
S1 string
S2 string
} {
var calls []struct {
S1 string
S2 string
}
mock.lockinstallSymlink.RLock()
calls = mock.calls.installSymlink
mock.lockinstallSymlink.RUnlock()
return calls
}

View File

@@ -0,0 +1,172 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package installer
import (
"errors"
"fmt"
"io"
"io/fs"
"os"
"path/filepath"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)
//go:generate moq -rm -fmt=goimports -out installer_mock.go . Installer
type Installer interface {
Install(string) error
}
type ToolkitInstaller struct {
logger logger.Interface
ignoreErrors bool
sourceRoot string
artifactRoot *artifactRoot
ensureTargetDirectory Installer
}
var _ Installer = (*ToolkitInstaller)(nil)
// New creates a toolkit installer with the specified options.
func New(opts ...Option) (*ToolkitInstaller, error) {
t := &ToolkitInstaller{
sourceRoot: "/",
}
for _, opt := range opts {
opt(t)
}
if t.logger == nil {
t.logger = logger.New()
}
if t.artifactRoot == nil {
artifactRoot, err := newArtifactRoot(t.logger, t.sourceRoot)
if err != nil {
return nil, err
}
t.artifactRoot = artifactRoot
}
if t.ensureTargetDirectory == nil {
t.ensureTargetDirectory = t.createDirectory()
}
return t, nil
}
// Install ensures that the required toolkit files are installed in the specified directory.
func (t *ToolkitInstaller) Install(destDir string) error {
var installers []Installer
installers = append(installers, t.ensureTargetDirectory)
libraries, err := t.collectLibraries()
if err != nil {
return fmt.Errorf("failed to collect libraries: %w", err)
}
installers = append(installers, libraries...)
executables, err := t.collectExecutables(destDir)
if err != nil {
return fmt.Errorf("failed to collect executables: %w", err)
}
installers = append(installers, executables...)
var errs error
for _, i := range installers {
errs = errors.Join(errs, i.Install(destDir))
}
return errs
}
func (t *ToolkitInstaller) ConfigFilePath(destDir string) string {
toolkitConfigDir := filepath.Join(destDir, ".config", "nvidia-container-runtime")
return filepath.Join(toolkitConfigDir, "config.toml")
}
type symlink struct {
linkname string
target string
}
func (s symlink) Install(destDir string) error {
symlinkPath := filepath.Join(destDir, s.linkname)
return installSymlink(s.target, symlinkPath)
}
//go:generate moq -rm -fmt=goimports -out file-installer_mock.go . fileInstaller
type fileInstaller interface {
installContent(io.Reader, string, os.FileMode) error
installFile(string, string) (os.FileMode, error)
installSymlink(string, string) error
}
var installSymlink = installSymlinkStub
func installSymlinkStub(target string, link string) error {
err := os.Symlink(target, link)
if err != nil {
return fmt.Errorf("error creating symlink '%v' => '%v': %v", link, target, err)
}
return nil
}
var installFile = installFileStub
func installFileStub(src string, dest string) (os.FileMode, error) {
sourceInfo, err := os.Stat(src)
if err != nil {
return 0, fmt.Errorf("error getting file info for '%v': %v", src, err)
}
source, err := os.Open(src)
if err != nil {
return 0, fmt.Errorf("error opening source: %w", err)
}
defer source.Close()
mode := sourceInfo.Mode()
if err := installContent(source, dest, mode); err != nil {
return 0, err
}
return mode, nil
}
var installContent = installContentStub
func installContentStub(content io.Reader, dest string, mode fs.FileMode) error {
destination, err := os.Create(dest)
if err != nil {
return fmt.Errorf("error creating destination: %w", err)
}
defer destination.Close()
_, err = io.Copy(destination, content)
if err != nil {
return fmt.Errorf("error copying file: %w", err)
}
err = os.Chmod(dest, mode)
if err != nil {
return fmt.Errorf("error setting mode for '%v': %v", dest, err)
}
return nil
}

View File

@@ -0,0 +1,74 @@
// Code generated by moq; DO NOT EDIT.
// github.com/matryer/moq
package installer
import (
"sync"
)
// Ensure, that InstallerMock does implement Installer.
// If this is not the case, regenerate this file with moq.
var _ Installer = &InstallerMock{}
// InstallerMock is a mock implementation of Installer.
//
// func TestSomethingThatUsesInstaller(t *testing.T) {
//
// // make and configure a mocked Installer
// mockedInstaller := &InstallerMock{
// InstallFunc: func(s string) error {
// panic("mock out the Install method")
// },
// }
//
// // use mockedInstaller in code that requires Installer
// // and then make assertions.
//
// }
type InstallerMock struct {
// InstallFunc mocks the Install method.
InstallFunc func(s string) error
// calls tracks calls to the methods.
calls struct {
// Install holds details about calls to the Install method.
Install []struct {
// S is the s argument value.
S string
}
}
lockInstall sync.RWMutex
}
// Install calls InstallFunc.
func (mock *InstallerMock) Install(s string) error {
if mock.InstallFunc == nil {
panic("InstallerMock.InstallFunc: method is nil but Installer.Install was just called")
}
callInfo := struct {
S string
}{
S: s,
}
mock.lockInstall.Lock()
mock.calls.Install = append(mock.calls.Install, callInfo)
mock.lockInstall.Unlock()
return mock.InstallFunc(s)
}
// InstallCalls gets all the calls that were made to Install.
// Check the length with:
//
// len(mockedInstaller.InstallCalls())
func (mock *InstallerMock) InstallCalls() []struct {
S string
} {
var calls []struct {
S string
}
mock.lockInstall.RLock()
calls = mock.calls.Install
mock.lockInstall.RUnlock()
return calls
}

View File

@@ -0,0 +1,251 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package installer
import (
"bytes"
"fmt"
"io"
"io/fs"
"os"
"path/filepath"
"testing"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"
)
func TestToolkitInstaller(t *testing.T) {
logger, _ := testlog.NewNullLogger()
type contentCall struct {
wrapper string
path string
mode fs.FileMode
}
var contentCalls []contentCall
installer := &fileInstallerMock{
installFileFunc: func(s1, s2 string) (os.FileMode, error) {
return 0666, nil
},
installContentFunc: func(reader io.Reader, s string, fileMode fs.FileMode) error {
var b bytes.Buffer
if _, err := b.ReadFrom(reader); err != nil {
return err
}
contents := contentCall{
wrapper: b.String(),
path: s,
mode: fileMode,
}
contentCalls = append(contentCalls, contents)
return nil
},
installSymlinkFunc: func(s1, s2 string) error {
return nil
},
}
installFile = installer.installFile
installContent = installer.installContent
installSymlink = installer.installSymlink
root := "/artifacts/test"
libraries := &lookup.LocatorMock{
LocateFunc: func(s string) ([]string, error) {
switch s {
case "libnvidia-container.so.1":
return []string{filepath.Join(root, "libnvidia-container.so.987.65.43")}, nil
case "libnvidia-container-go.so.1":
return []string{filepath.Join(root, "libnvidia-container-go.so.1.23.4")}, nil
}
return nil, fmt.Errorf("%v not found", s)
},
}
executables := &lookup.LocatorMock{
LocateFunc: func(s string) ([]string, error) {
switch s {
case "nvidia-container-runtime.cdi":
fallthrough
case "nvidia-container-runtime.legacy":
fallthrough
case "nvidia-container-runtime":
fallthrough
case "nvidia-ctk":
fallthrough
case "nvidia-container-cli":
fallthrough
case "nvidia-container-runtime-hook":
fallthrough
case "nvidia-cdi-hook":
return []string{filepath.Join(root, "usr/bin", s)}, nil
}
return nil, fmt.Errorf("%v not found", s)
},
}
r := &artifactRoot{
libraries: libraries,
executables: executables,
}
createDirectory := &InstallerMock{
InstallFunc: func(c string) error {
return nil
},
}
i := ToolkitInstaller{
logger: logger,
artifactRoot: r,
ensureTargetDirectory: createDirectory,
}
err := i.Install("/foo/bar/baz")
require.NoError(t, err)
require.ElementsMatch(t,
[]struct {
S string
}{
{"/foo/bar/baz"},
},
createDirectory.InstallCalls(),
)
require.ElementsMatch(t,
installer.installFileCalls(),
[]struct {
S1 string
S2 string
}{
{"/artifacts/test/libnvidia-container-go.so.1.23.4", "/foo/bar/baz/libnvidia-container-go.so.1.23.4"},
{"/artifacts/test/libnvidia-container.so.987.65.43", "/foo/bar/baz/libnvidia-container.so.987.65.43"},
{"/artifacts/test/usr/bin/nvidia-container-runtime.cdi", "/foo/bar/baz/nvidia-container-runtime.cdi.real"},
{"/artifacts/test/usr/bin/nvidia-container-runtime.legacy", "/foo/bar/baz/nvidia-container-runtime.legacy.real"},
{"/artifacts/test/usr/bin/nvidia-container-runtime", "/foo/bar/baz/nvidia-container-runtime.real"},
{"/artifacts/test/usr/bin/nvidia-ctk", "/foo/bar/baz/nvidia-ctk.real"},
{"/artifacts/test/usr/bin/nvidia-cdi-hook", "/foo/bar/baz/nvidia-cdi-hook.real"},
{"/artifacts/test/usr/bin/nvidia-container-cli", "/foo/bar/baz/nvidia-container-cli.real"},
{"/artifacts/test/usr/bin/nvidia-container-runtime-hook", "/foo/bar/baz/nvidia-container-runtime-hook.real"},
},
)
require.ElementsMatch(t,
installer.installSymlinkCalls(),
[]struct {
S1 string
S2 string
}{
{"libnvidia-container-go.so.1.23.4", "/foo/bar/baz/libnvidia-container-go.so.1"},
{"libnvidia-container.so.987.65.43", "/foo/bar/baz/libnvidia-container.so.1"},
{"nvidia-container-runtime-hook", "/foo/bar/baz/nvidia-container-toolkit"},
},
)
require.ElementsMatch(t,
contentCalls,
[]contentCall{
{
path: "/foo/bar/baz/nvidia-container-runtime",
mode: 0777,
wrapper: `#! /bin/sh
cat /proc/modules | grep -e "^nvidia " >/dev/null 2>&1
if [ "${?}" != "0" ]; then
echo "nvidia driver modules are not yet loaded, invoking runc directly"
exec runc "$@"
fi
NVIDIA_CTK_CONFIG_FILE_PATH=/foo/bar/baz/.config/nvidia-container-runtime/config.toml \
PATH=/foo/bar/baz:$PATH \
/foo/bar/baz/nvidia-container-runtime.real \
"$@"
`,
},
{
path: "/foo/bar/baz/nvidia-container-runtime.cdi",
mode: 0777,
wrapper: `#! /bin/sh
cat /proc/modules | grep -e "^nvidia " >/dev/null 2>&1
if [ "${?}" != "0" ]; then
echo "nvidia driver modules are not yet loaded, invoking runc directly"
exec runc "$@"
fi
NVIDIA_CTK_CONFIG_FILE_PATH=/foo/bar/baz/.config/nvidia-container-runtime/config.toml \
PATH=/foo/bar/baz:$PATH \
/foo/bar/baz/nvidia-container-runtime.cdi.real \
"$@"
`,
},
{
path: "/foo/bar/baz/nvidia-container-runtime.legacy",
mode: 0777,
wrapper: `#! /bin/sh
cat /proc/modules | grep -e "^nvidia " >/dev/null 2>&1
if [ "${?}" != "0" ]; then
echo "nvidia driver modules are not yet loaded, invoking runc directly"
exec runc "$@"
fi
NVIDIA_CTK_CONFIG_FILE_PATH=/foo/bar/baz/.config/nvidia-container-runtime/config.toml \
PATH=/foo/bar/baz:$PATH \
/foo/bar/baz/nvidia-container-runtime.legacy.real \
"$@"
`,
},
{
path: "/foo/bar/baz/nvidia-ctk",
mode: 0777,
wrapper: `#! /bin/sh
PATH=/foo/bar/baz:$PATH \
/foo/bar/baz/nvidia-ctk.real \
"$@"
`,
},
{
path: "/foo/bar/baz/nvidia-cdi-hook",
mode: 0777,
wrapper: `#! /bin/sh
PATH=/foo/bar/baz:$PATH \
/foo/bar/baz/nvidia-cdi-hook.real \
"$@"
`,
},
{
path: "/foo/bar/baz/nvidia-container-cli",
mode: 0777,
wrapper: `#! /bin/sh
LD_LIBRARY_PATH=/foo/bar/baz:$LD_LIBRARY_PATH \
PATH=/foo/bar/baz:$PATH \
/foo/bar/baz/nvidia-container-cli.real \
"$@"
`,
},
{
path: "/foo/bar/baz/nvidia-container-runtime-hook",
mode: 0777,
wrapper: `#! /bin/sh
NVIDIA_CTK_CONFIG_FILE_PATH=/foo/bar/baz/.config/nvidia-container-runtime/config.toml \
PATH=/foo/bar/baz:$PATH \
/foo/bar/baz/nvidia-container-runtime-hook.real \
"$@"
`,
},
},
)
}

View File

@@ -0,0 +1,73 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package installer
import (
"path/filepath"
log "github.com/sirupsen/logrus"
)
// collectLibraries locates and installs the libraries that are part of
// the nvidia-container-toolkit.
// A predefined set of library candidates are considered, with the first one
// resulting in success being installed to the toolkit folder. The install process
// resolves the symlink for the library and copies the versioned library itself.
func (t *ToolkitInstaller) collectLibraries() ([]Installer, error) {
requiredLibraries := []string{
"libnvidia-container.so.1",
"libnvidia-container-go.so.1",
}
var installers []Installer
for _, l := range requiredLibraries {
libraryPath, err := t.artifactRoot.findLibrary(l)
if err != nil {
if t.ignoreErrors {
log.Errorf("Ignoring error: %v", err)
continue
}
return nil, err
}
installers = append(installers, library(libraryPath))
if filepath.Base(libraryPath) == l {
continue
}
link := symlink{
linkname: l,
target: filepath.Base(libraryPath),
}
installers = append(installers, link)
}
return installers, nil
}
type library string
// Install copies the library l to the destination folder.
// The same basename is used in the destination folder.
func (l library) Install(destinationDir string) error {
dest := filepath.Join(destinationDir, filepath.Base(string(l)))
_, err := installFile(string(l), dest)
return err
}

View File

@@ -0,0 +1,47 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package installer
import "github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
type Option func(*ToolkitInstaller)
func WithLogger(logger logger.Interface) Option {
return func(ti *ToolkitInstaller) {
ti.logger = logger
}
}
func WithArtifactRoot(artifactRoot *artifactRoot) Option {
return func(ti *ToolkitInstaller) {
ti.artifactRoot = artifactRoot
}
}
func WithIgnoreErrors(ignoreErrors bool) Option {
return func(ti *ToolkitInstaller) {
ti.ignoreErrors = ignoreErrors
}
}
// WithSourceRoot sets the root directory for locating artifacts to be installed.
func WithSourceRoot(sourceRoot string) Option {
return func(ti *ToolkitInstaller) {
ti.sourceRoot = sourceRoot
}
}

View File

@@ -26,6 +26,7 @@ import (
"tags.cncf.io/container-device-interface/pkg/cdi"
"tags.cncf.io/container-device-interface/pkg/parser"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/toolkit/installer"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/system/nvdevices"
@@ -36,12 +37,6 @@ import (
const (
// DefaultNvidiaDriverRoot specifies the default NVIDIA driver run directory
DefaultNvidiaDriverRoot = "/run/nvidia/driver"
nvidiaContainerCliSource = "/usr/bin/nvidia-container-cli"
nvidiaContainerRuntimeHookSource = "/usr/bin/nvidia-container-runtime-hook"
nvidiaContainerToolkitConfigSource = "/etc/nvidia-container-runtime/config.toml"
configFilename = "config.toml"
)
type cdiOptions struct {
@@ -218,7 +213,9 @@ func Flags(opts *Options) []cli.Flag {
// An Installer is used to install the NVIDIA Container Toolkit from the toolkit container.
type Installer struct {
fileInstaller
logger logger.Interface
sourceRoot string
// toolkitRoot specifies the destination path at which the toolkit is installed.
toolkitRoot string
}
@@ -233,6 +230,7 @@ func NewInstaller(opts ...Option) *Installer {
if i.logger == nil {
i.logger = logger.New()
}
return i
}
@@ -297,59 +295,26 @@ func (t *Installer) Install(cli *cli.Context, opts *Options) error {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("error removing toolkit directory: %v", err))
}
toolkitConfigDir := filepath.Join(t.toolkitRoot, ".config", "nvidia-container-runtime")
toolkitConfigPath := filepath.Join(toolkitConfigDir, configFilename)
err = t.createDirectories(t.toolkitRoot, toolkitConfigDir)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("could not create required directories: %v", err)
} else if err != nil {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("could not create required directories: %v", err))
// Create a toolkit installer to actually install the toolkit components.
toolkit, err := installer.New(
installer.WithLogger(t.logger),
installer.WithSourceRoot(t.sourceRoot),
installer.WithIgnoreErrors(opts.ignoreErrors),
)
if err != nil {
if !opts.ignoreErrors {
return fmt.Errorf("could not create toolkit installer: %w", err)
}
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("could not create toolkit installer: %w", err))
}
if err := toolkit.Install(t.toolkitRoot); err != nil {
if !opts.ignoreErrors {
return fmt.Errorf("could not install toolkit components: %w", err)
}
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("could not install toolkit components: %w", err))
}
err = t.installContainerLibraries(t.toolkitRoot)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container library: %v", err)
} else if err != nil {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA container library: %v", err))
}
err = t.installContainerRuntimes(t.toolkitRoot)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container runtime: %v", err)
} else if err != nil {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA container runtime: %v", err))
}
nvidiaContainerCliExecutable, err := t.installContainerCLI(t.toolkitRoot)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container CLI: %v", err)
} else if err != nil {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA container CLI: %v", err))
}
nvidiaContainerRuntimeHookPath, err := t.installRuntimeHook(t.toolkitRoot, toolkitConfigPath)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container runtime hook: %v", err)
} else if err != nil {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA container runtime hook: %v", err))
}
nvidiaCTKPath, err := t.installContainerToolkitCLI(t.toolkitRoot)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA Container Toolkit CLI: %v", err)
} else if err != nil {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA Container Toolkit CLI: %v", err))
}
nvidiaCDIHookPath, err := t.installContainerCDIHookCLI(t.toolkitRoot)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA Container CDI Hook CLI: %v", err)
} else if err != nil {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("error installing NVIDIA Container CDI Hook CLI: %v", err))
}
err = t.installToolkitConfig(cli, toolkitConfigPath, nvidiaContainerCliExecutable, nvidiaCTKPath, nvidiaContainerRuntimeHookPath, opts)
err = t.installToolkitConfig(cli, opts, toolkit.ConfigFilePath(t.toolkitRoot))
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container toolkit config: %v", err)
} else if err != nil {
@@ -363,6 +328,7 @@ func (t *Installer) Install(cli *cli.Context, opts *Options) error {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("error creating device nodes: %v", err))
}
nvidiaCDIHookPath := filepath.Join(t.toolkitRoot, "nvidia-cdi-hook")
err = t.generateCDISpec(opts, nvidiaCDIHookPath)
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error generating CDI specification: %v", err)
@@ -373,62 +339,23 @@ func (t *Installer) Install(cli *cli.Context, opts *Options) error {
return nil
}
// installContainerLibraries locates and installs the libraries that are part of
// the nvidia-container-toolkit.
// A predefined set of library candidates are considered, with the first one
// resulting in success being installed to the toolkit folder. The install process
// resolves the symlink for the library and copies the versioned library itself.
func (t *Installer) installContainerLibraries(toolkitRoot string) error {
t.logger.Infof("Installing NVIDIA container library to '%v'", toolkitRoot)
libs := []string{
"libnvidia-container.so.1",
"libnvidia-container-go.so.1",
}
for _, l := range libs {
err := t.installLibrary(l, toolkitRoot)
if err != nil {
return fmt.Errorf("failed to install %s: %v", l, err)
}
}
return nil
}
// installLibrary installs the specified library to the toolkit directory.
func (t *Installer) installLibrary(libName string, toolkitRoot string) error {
libraryPath, err := t.findLibrary(libName)
if err != nil {
return fmt.Errorf("error locating NVIDIA container library: %v", err)
}
installedLibPath, err := t.installFileToFolder(toolkitRoot, libraryPath)
if err != nil {
return fmt.Errorf("error installing %v to %v: %v", libraryPath, toolkitRoot, err)
}
t.logger.Infof("Installed '%v' to '%v'", libraryPath, installedLibPath)
if filepath.Base(installedLibPath) == libName {
return nil
}
err = t.installSymlink(toolkitRoot, libName, installedLibPath)
if err != nil {
return fmt.Errorf("error installing symlink for NVIDIA container library: %v", err)
}
return nil
}
// installToolkitConfig installs the config file for the NVIDIA container toolkit ensuring
// that the settings are updated to match the desired install and nvidia driver directories.
func (t *Installer) installToolkitConfig(c *cli.Context, toolkitConfigPath string, nvidiaContainerCliExecutablePath string, nvidiaCTKPath string, nvidaContainerRuntimeHookPath string, opts *Options) error {
func (t *Installer) installToolkitConfig(c *cli.Context, opts *Options, toolkitConfigPath string) error {
t.logger.Infof("Installing NVIDIA container toolkit config '%v'", toolkitConfigPath)
cfg, err := config.New(
config.WithConfigFile(nvidiaContainerToolkitConfigSource),
)
err := t.createDirectories(filepath.Dir(toolkitConfigPath))
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("could not create required directories: %v", err)
} else if err != nil {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("could not create required directories: %v", err))
}
nvidiaContainerCliExecutablePath := filepath.Join(t.toolkitRoot, "nvidia-container-cli")
nvidiaCTKPath := filepath.Join(t.toolkitRoot, "nvidia-ctk")
nvidiaContainerRuntimeHookPath := filepath.Join(t.toolkitRoot, "nvidia-container-runtime-hook")
cfg, err := config.New()
if err != nil {
return fmt.Errorf("could not open source config file: %v", err)
}
@@ -456,7 +383,7 @@ func (t *Installer) installToolkitConfig(c *cli.Context, toolkitConfigPath strin
// Set nvidia-ctk options
"nvidia-ctk.path": nvidiaCTKPath,
// Set the nvidia-container-runtime-hook options
"nvidia-container-runtime-hook.path": nvidaContainerRuntimeHookPath,
"nvidia-container-runtime-hook.path": nvidiaContainerRuntimeHookPath,
"nvidia-container-runtime-hook.skip-mode-detection": opts.ContainerRuntimeHookSkipModeDetection,
}
@@ -523,147 +450,6 @@ func (t *Installer) installToolkitConfig(c *cli.Context, toolkitConfigPath strin
return nil
}
// installContainerToolkitCLI installs the nvidia-ctk CLI executable and wrapper.
func (t *Installer) installContainerToolkitCLI(toolkitDir string) (string, error) {
e := executable{
fileInstaller: t.fileInstaller,
source: "/usr/bin/nvidia-ctk",
target: executableTarget{
dotfileName: "nvidia-ctk.real",
wrapperName: "nvidia-ctk",
},
}
return e.install(toolkitDir)
}
// installContainerCDIHookCLI installs the nvidia-cdi-hook CLI executable and wrapper.
func (t *Installer) installContainerCDIHookCLI(toolkitDir string) (string, error) {
e := executable{
fileInstaller: t.fileInstaller,
source: "/usr/bin/nvidia-cdi-hook",
target: executableTarget{
dotfileName: "nvidia-cdi-hook.real",
wrapperName: "nvidia-cdi-hook",
},
}
return e.install(toolkitDir)
}
// installContainerCLI sets up the NVIDIA container CLI executable, copying the executable
// and implementing the required wrapper
func (t *Installer) installContainerCLI(toolkitRoot string) (string, error) {
t.logger.Infof("Installing NVIDIA container CLI from '%v'", nvidiaContainerCliSource)
env := map[string]string{
"LD_LIBRARY_PATH": toolkitRoot,
}
e := executable{
fileInstaller: t.fileInstaller,
source: nvidiaContainerCliSource,
target: executableTarget{
dotfileName: "nvidia-container-cli.real",
wrapperName: "nvidia-container-cli",
},
env: env,
}
installedPath, err := e.install(toolkitRoot)
if err != nil {
return "", fmt.Errorf("error installing NVIDIA container CLI: %v", err)
}
return installedPath, nil
}
// installRuntimeHook sets up the NVIDIA runtime hook, copying the executable
// and implementing the required wrapper
func (t *Installer) installRuntimeHook(toolkitRoot string, configFilePath string) (string, error) {
t.logger.Infof("Installing NVIDIA container runtime hook from '%v'", nvidiaContainerRuntimeHookSource)
argLines := []string{
fmt.Sprintf("-config \"%s\"", configFilePath),
}
e := executable{
fileInstaller: t.fileInstaller,
source: nvidiaContainerRuntimeHookSource,
target: executableTarget{
dotfileName: "nvidia-container-runtime-hook.real",
wrapperName: "nvidia-container-runtime-hook",
},
argLines: argLines,
}
installedPath, err := e.install(toolkitRoot)
if err != nil {
return "", fmt.Errorf("error installing NVIDIA container runtime hook: %v", err)
}
err = t.installSymlink(toolkitRoot, "nvidia-container-toolkit", installedPath)
if err != nil {
return "", fmt.Errorf("error installing symlink to NVIDIA container runtime hook: %v", err)
}
return installedPath, nil
}
// installSymlink creates a symlink in the toolkitDirectory that points to the specified target.
// Note: The target is assumed to be local to the toolkit directory
func (t *Installer) installSymlink(toolkitRoot string, link string, target string) error {
symlinkPath := filepath.Join(toolkitRoot, link)
targetPath := filepath.Base(target)
t.logger.Infof("Creating symlink '%v' -> '%v'", symlinkPath, targetPath)
err := os.Symlink(targetPath, symlinkPath)
if err != nil {
return fmt.Errorf("error creating symlink '%v' => '%v': %v", symlinkPath, targetPath, err)
}
return nil
}
// findLibrary searches a set of candidate libraries in the specified root for
// a given library name
func (t *Installer) findLibrary(libName string) (string, error) {
t.logger.Infof("Finding library %v (root=%v)", libName)
candidateDirs := []string{
"/usr/lib64",
"/usr/lib/x86_64-linux-gnu",
"/usr/lib/aarch64-linux-gnu",
}
for _, d := range candidateDirs {
l := filepath.Join(t.sourceRoot, d, libName)
t.logger.Infof("Checking library candidate '%v'", l)
libraryCandidate, err := t.resolveLink(l)
if err != nil {
t.logger.Infof("Skipping library candidate '%v': %v", l, err)
continue
}
return strings.TrimPrefix(libraryCandidate, t.sourceRoot), nil
}
return "", fmt.Errorf("error locating library '%v'", libName)
}
// resolveLink finds the target of a symlink or the file itself in the
// case of a regular file.
// This is equivalent to running `readlink -f ${l}`
func (t *Installer) resolveLink(l string) (string, error) {
resolved, err := filepath.EvalSymlinks(l)
if err != nil {
return "", fmt.Errorf("error resolving link '%v': %v", l, err)
}
if l != resolved {
t.logger.Infof("Resolved link: '%v' => '%v'", l, resolved)
}
return resolved, nil
}
func (t *Installer) createDirectories(dir ...string) error {
for _, d := range dir {
t.logger.Infof("Creating directory '%v'", d)

View File

@@ -69,47 +69,62 @@ func TestInstall(t *testing.T) {
cdiEnabled: true,
expectedCdiSpec: `---
cdiVersion: 0.5.0
containerEdits:
env:
- NVIDIA_VISIBLE_DEVICES=void
hooks:
- args:
- nvidia-cdi-hook
- create-symlinks
- --link
- libcuda.so.1::/lib/x86_64-linux-gnu/libcuda.so
hookName: createContainer
path: {{ .toolkitRoot }}/nvidia-cdi-hook
- args:
- nvidia-cdi-hook
- update-ldcache
- --folder
- /lib/x86_64-linux-gnu
hookName: createContainer
path: {{ .toolkitRoot }}/nvidia-cdi-hook
mounts:
- containerPath: /lib/x86_64-linux-gnu/libcuda.so.999.88.77
hostPath: /host/driver/root/lib/x86_64-linux-gnu/libcuda.so.999.88.77
options:
- ro
- nosuid
- nodev
- bind
devices:
- containerEdits:
deviceNodes:
- hostPath: /host/driver/root/dev/nvidia0
path: /dev/nvidia0
- hostPath: /host/driver/root/dev/nvidiactl
path: /dev/nvidiactl
- hostPath: /host/driver/root/dev/nvidia-caps-imex-channels/channel0
path: /dev/nvidia-caps-imex-channels/channel0
- hostPath: /host/driver/root/dev/nvidia-caps-imex-channels/channel1
path: /dev/nvidia-caps-imex-channels/channel1
- hostPath: /host/driver/root/dev/nvidia-caps-imex-channels/channel2047
path: /dev/nvidia-caps-imex-channels/channel2047
name: all
kind: example.com/class
devices:
- name: all
containerEdits:
deviceNodes:
- path: /dev/nvidia0
hostPath: /host/driver/root/dev/nvidia0
- path: /dev/nvidiactl
hostPath: /host/driver/root/dev/nvidiactl
- path: /dev/nvidia-caps-imex-channels/channel0
hostPath: /host/driver/root/dev/nvidia-caps-imex-channels/channel0
- path: /dev/nvidia-caps-imex-channels/channel1
hostPath: /host/driver/root/dev/nvidia-caps-imex-channels/channel1
- path: /dev/nvidia-caps-imex-channels/channel2047
hostPath: /host/driver/root/dev/nvidia-caps-imex-channels/channel2047
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
hooks:
- hookName: createContainer
path: {{ .toolkitRoot }}/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-symlinks
- --link
- libcuda.so.1::/lib/x86_64-linux-gnu/libcuda.so
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: {{ .toolkitRoot }}/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-soname-symlinks
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: {{ .toolkitRoot }}/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- update-ldcache
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
mounts:
- hostPath: /host/driver/root/lib/x86_64-linux-gnu/libcuda.so.999.88.77
containerPath: /lib/x86_64-linux-gnu/libcuda.so.999.88.77
options:
- ro
- nosuid
- nodev
- rbind
- rprivate
`,
},
}

View File

@@ -25,6 +25,8 @@ import (
"github.com/urfave/cli/v2"
cdi "tags.cncf.io/container-device-interface/pkg/parser"
"github.com/NVIDIA/go-nvml/pkg/nvml"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/platform-support/tegra/csv"
@@ -55,11 +57,15 @@ type options struct {
configSearchPaths cli.StringSlice
librarySearchPaths cli.StringSlice
disabledHooks cli.StringSlice
csv struct {
files cli.StringSlice
ignorePatterns cli.StringSlice
}
// the following are used for dependency injection during spec generation.
nvmllib nvml.Interface
}
// NewCommand constructs a generate-cdi command with the specified logger
@@ -91,17 +97,20 @@ func (m command) build() *cli.Command {
Name: "config-search-path",
Usage: "Specify the path to search for config files when discovering the entities that should be included in the CDI specification.",
Destination: &opts.configSearchPaths,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_CONFIG_SEARCH_PATHS"},
},
&cli.StringFlag{
Name: "output",
Usage: "Specify the file to output the generated CDI specification to. If this is '' the specification is output to STDOUT",
Destination: &opts.output,
EnvVars: []string{"NVIDIA_CTK_CDI_OUTPUT_FILE_PATH"},
},
&cli.StringFlag{
Name: "format",
Usage: "The output format for the generated spec [json | yaml]. This overrides the format defined by the output file extension (if specified).",
Value: spec.FormatYAML,
Destination: &opts.format,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_OUTPUT_FORMAT"},
},
&cli.StringFlag{
Name: "mode",
@@ -111,27 +120,32 @@ func (m command) build() *cli.Command {
"If mode is set to 'auto' the mode will be determined based on the system configuration.",
Value: string(nvcdi.ModeAuto),
Destination: &opts.mode,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_MODE"},
},
&cli.StringFlag{
Name: "dev-root",
Usage: "Specify the root where `/dev` is located. If this is not specified, the driver-root is assumed.",
Destination: &opts.devRoot,
EnvVars: []string{"NVIDIA_CTK_DEV_ROOT"},
},
&cli.StringSliceFlag{
Name: "device-name-strategy",
Usage: "Specify the strategy for generating device names. If this is specified multiple times, the devices will be duplicated for each strategy. One of [index | uuid | type-index]",
Value: cli.NewStringSlice(nvcdi.DeviceNameStrategyIndex, nvcdi.DeviceNameStrategyUUID),
Destination: &opts.deviceNameStrategies,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_DEVICE_NAME_STRATEGIES"},
},
&cli.StringFlag{
Name: "driver-root",
Usage: "Specify the NVIDIA GPU driver root to use when discovering the entities that should be included in the CDI specification.",
Destination: &opts.driverRoot,
EnvVars: []string{"NVIDIA_CTK_DRIVER_ROOT"},
},
&cli.StringSliceFlag{
Name: "library-search-path",
Usage: "Specify the path to search for libraries when discovering the entities that should be included in the CDI specification.\n\tNote: This option only applies to CSV mode.",
Destination: &opts.librarySearchPaths,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_LIBRARY_SEARCH_PATHS"},
},
&cli.StringFlag{
Name: "nvidia-cdi-hook-path",
@@ -140,11 +154,13 @@ func (m command) build() *cli.Command {
"If not specified, the PATH will be searched for `nvidia-cdi-hook`. " +
"NOTE: That if this is specified as `nvidia-ctk`, the PATH will be searched for `nvidia-ctk` instead.",
Destination: &opts.nvidiaCDIHookPath,
EnvVars: []string{"NVIDIA_CTK_CDI_HOOK_PATH"},
},
&cli.StringFlag{
Name: "ldconfig-path",
Usage: "Specify the path to use for ldconfig in the generated CDI specification",
Destination: &opts.ldconfigPath,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_LDCONFIG_PATH"},
},
&cli.StringFlag{
Name: "vendor",
@@ -152,6 +168,7 @@ func (m command) build() *cli.Command {
Usage: "the vendor string to use for the generated CDI specification.",
Value: "nvidia.com",
Destination: &opts.vendor,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_VENDOR"},
},
&cli.StringFlag{
Name: "class",
@@ -159,17 +176,30 @@ func (m command) build() *cli.Command {
Usage: "the class string to use for the generated CDI specification.",
Value: "gpu",
Destination: &opts.class,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_CLASS"},
},
&cli.StringSliceFlag{
Name: "csv.file",
Usage: "The path to the list of CSV files to use when generating the CDI specification in CSV mode.",
Value: cli.NewStringSlice(csv.DefaultFileList()...),
Destination: &opts.csv.files,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_CSV_FILES"},
},
&cli.StringSliceFlag{
Name: "csv.ignore-pattern",
Usage: "Specify a pattern the CSV mount specifications.",
Usage: "specify a pattern the CSV mount specifications.",
Destination: &opts.csv.ignorePatterns,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_CSV_IGNORE_PATTERNS"},
},
&cli.StringSliceFlag{
Name: "disable-hook",
Aliases: []string{"disable-hooks"},
Usage: "specify a specific hook to skip when generating CDI " +
"specifications. This can be specified multiple times and the " +
"special hook name 'all' can be used ensure that the generated " +
"CDI specification does not include any hooks.",
Destination: &opts.disabledHooks,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_DISABLED_HOOKS"},
},
}
@@ -257,7 +287,7 @@ func (m command) generateSpec(opts *options) (spec.Interface, error) {
deviceNamers = append(deviceNamers, deviceNamer)
}
cdilib, err := nvcdi.New(
cdiOptions := []nvcdi.Option{
nvcdi.WithLogger(m.logger),
nvcdi.WithDriverRoot(opts.driverRoot),
nvcdi.WithDevRoot(opts.devRoot),
@@ -269,7 +299,15 @@ func (m command) generateSpec(opts *options) (spec.Interface, error) {
nvcdi.WithLibrarySearchPaths(opts.librarySearchPaths.Value()),
nvcdi.WithCSVFiles(opts.csv.files.Value()),
nvcdi.WithCSVIgnorePatterns(opts.csv.ignorePatterns.Value()),
)
// We set the following to allow for dependency injection:
nvcdi.WithNvmlLib(opts.nvmllib),
}
for _, hook := range opts.disabledHooks.Value() {
cdiOptions = append(cdiOptions, nvcdi.WithDisabledHook(hook))
}
cdilib, err := nvcdi.New(cdiOptions...)
if err != nil {
return nil, fmt.Errorf("failed to create CDI library: %v", err)
}

View File

@@ -0,0 +1,402 @@
/**
# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package generate
import (
"bytes"
"path/filepath"
"strings"
"testing"
"github.com/NVIDIA/go-nvml/pkg/nvml"
"github.com/NVIDIA/go-nvml/pkg/nvml/mock/dgxa100"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
"github.com/urfave/cli/v2"
"github.com/NVIDIA/nvidia-container-toolkit/internal/test"
)
func TestGenerateSpec(t *testing.T) {
t.Setenv("__NVCT_TESTING_DEVICES_ARE_FILES", "true")
moduleRoot, err := test.GetModuleRoot()
require.NoError(t, err)
driverRoot := filepath.Join(moduleRoot, "testdata", "lookup", "rootfs-1")
logger, _ := testlog.NewNullLogger()
testCases := []struct {
description string
options options
expectedValidateError error
expectedOptions options
expectedError error
expectedSpec string
}{
{
description: "default",
options: options{
format: "yaml",
mode: "nvml",
vendor: "example.com",
class: "device",
driverRoot: driverRoot,
},
expectedOptions: options{
format: "yaml",
mode: "nvml",
vendor: "example.com",
class: "device",
nvidiaCDIHookPath: "/usr/bin/nvidia-cdi-hook",
driverRoot: driverRoot,
},
expectedSpec: `---
cdiVersion: 0.5.0
kind: example.com/device
devices:
- name: "0"
containerEdits:
deviceNodes:
- path: /dev/nvidia0
hostPath: {{ .driverRoot }}/dev/nvidia0
- name: all
containerEdits:
deviceNodes:
- path: /dev/nvidia0
hostPath: {{ .driverRoot }}/dev/nvidia0
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
deviceNodes:
- path: /dev/nvidiactl
hostPath: {{ .driverRoot }}/dev/nvidiactl
hooks:
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-symlinks
- --link
- libcuda.so.1::/lib/x86_64-linux-gnu/libcuda.so
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- enable-cuda-compat
- --host-driver-version=999.88.77
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-soname-symlinks
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- update-ldcache
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- disable-device-node-modification
env:
- NVIDIA_CTK_DEBUG=false
mounts:
- hostPath: {{ .driverRoot }}/lib/x86_64-linux-gnu/libcuda.so.999.88.77
containerPath: /lib/x86_64-linux-gnu/libcuda.so.999.88.77
options:
- ro
- nosuid
- nodev
- rbind
- rprivate
`,
},
{
description: "disableHooks1",
options: options{
format: "yaml",
mode: "nvml",
vendor: "example.com",
class: "device",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("enable-cuda-compat")),
},
expectedOptions: options{
format: "yaml",
mode: "nvml",
vendor: "example.com",
class: "device",
nvidiaCDIHookPath: "/usr/bin/nvidia-cdi-hook",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("enable-cuda-compat")),
},
expectedSpec: `---
cdiVersion: 0.5.0
kind: example.com/device
devices:
- name: "0"
containerEdits:
deviceNodes:
- path: /dev/nvidia0
hostPath: {{ .driverRoot }}/dev/nvidia0
- name: all
containerEdits:
deviceNodes:
- path: /dev/nvidia0
hostPath: {{ .driverRoot }}/dev/nvidia0
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
deviceNodes:
- path: /dev/nvidiactl
hostPath: {{ .driverRoot }}/dev/nvidiactl
hooks:
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-symlinks
- --link
- libcuda.so.1::/lib/x86_64-linux-gnu/libcuda.so
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-soname-symlinks
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- update-ldcache
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- disable-device-node-modification
env:
- NVIDIA_CTK_DEBUG=false
mounts:
- hostPath: {{ .driverRoot }}/lib/x86_64-linux-gnu/libcuda.so.999.88.77
containerPath: /lib/x86_64-linux-gnu/libcuda.so.999.88.77
options:
- ro
- nosuid
- nodev
- rbind
- rprivate
`,
},
{
description: "disableHooks2",
options: options{
format: "yaml",
mode: "nvml",
vendor: "example.com",
class: "device",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("enable-cuda-compat", "update-ldcache")),
},
expectedOptions: options{
format: "yaml",
mode: "nvml",
vendor: "example.com",
class: "device",
nvidiaCDIHookPath: "/usr/bin/nvidia-cdi-hook",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("enable-cuda-compat", "update-ldcache")),
},
expectedSpec: `---
cdiVersion: 0.5.0
kind: example.com/device
devices:
- name: "0"
containerEdits:
deviceNodes:
- path: /dev/nvidia0
hostPath: {{ .driverRoot }}/dev/nvidia0
- name: all
containerEdits:
deviceNodes:
- path: /dev/nvidia0
hostPath: {{ .driverRoot }}/dev/nvidia0
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
deviceNodes:
- path: /dev/nvidiactl
hostPath: {{ .driverRoot }}/dev/nvidiactl
hooks:
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-symlinks
- --link
- libcuda.so.1::/lib/x86_64-linux-gnu/libcuda.so
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-soname-symlinks
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- disable-device-node-modification
env:
- NVIDIA_CTK_DEBUG=false
mounts:
- hostPath: {{ .driverRoot }}/lib/x86_64-linux-gnu/libcuda.so.999.88.77
containerPath: /lib/x86_64-linux-gnu/libcuda.so.999.88.77
options:
- ro
- nosuid
- nodev
- rbind
- rprivate
`,
},
{
description: "disableHooksAll",
options: options{
format: "yaml",
mode: "nvml",
vendor: "example.com",
class: "device",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("all")),
},
expectedOptions: options{
format: "yaml",
mode: "nvml",
vendor: "example.com",
class: "device",
nvidiaCDIHookPath: "/usr/bin/nvidia-cdi-hook",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("all")),
},
expectedSpec: `---
cdiVersion: 0.5.0
kind: example.com/device
devices:
- name: "0"
containerEdits:
deviceNodes:
- path: /dev/nvidia0
hostPath: {{ .driverRoot }}/dev/nvidia0
- name: all
containerEdits:
deviceNodes:
- path: /dev/nvidia0
hostPath: {{ .driverRoot }}/dev/nvidia0
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
deviceNodes:
- path: /dev/nvidiactl
hostPath: {{ .driverRoot }}/dev/nvidiactl
mounts:
- hostPath: {{ .driverRoot }}/lib/x86_64-linux-gnu/libcuda.so.999.88.77
containerPath: /lib/x86_64-linux-gnu/libcuda.so.999.88.77
options:
- ro
- nosuid
- nodev
- rbind
- rprivate
`,
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
c := command{
logger: logger,
}
err := c.validateFlags(nil, &tc.options)
require.ErrorIs(t, err, tc.expectedValidateError)
require.EqualValues(t, tc.expectedOptions, tc.options)
// Set up a mock server, reusing the DGX A100 mock.
server := dgxa100.New()
// Override the driver version to match the version in our mock filesystem.
server.SystemGetDriverVersionFunc = func() (string, nvml.Return) {
return "999.88.77", nvml.SUCCESS
}
// Set the device count to 1 explicitly since we only have a single device node.
server.DeviceGetCountFunc = func() (int, nvml.Return) {
return 1, nvml.SUCCESS
}
for _, d := range server.Devices {
// TODO: This is not implemented in the mock.
(d.(*dgxa100.Device)).GetMaxMigDeviceCountFunc = func() (int, nvml.Return) {
return 0, nvml.SUCCESS
}
}
tc.options.nvmllib = server
spec, err := c.generateSpec(&tc.options)
require.ErrorIs(t, err, tc.expectedError)
var buf bytes.Buffer
_, err = spec.WriteTo(&buf)
require.NoError(t, err)
require.Equal(t, strings.ReplaceAll(tc.expectedSpec, "{{ .driverRoot }}", driverRoot), buf.String())
})
}
}
// valueOf returns the value of a pointer.
// Note that this does not check for a nil pointer and is only used for testing.
func valueOf[T any](v *T) T {
return *v
}

View File

@@ -64,6 +64,7 @@ func (m command) build() *cli.Command {
Usage: "specify the directories to scan for CDI specifications",
Value: cli.NewStringSlice(cdi.DefaultSpecDirs...),
Destination: &cfg.cdiSpecDirs,
EnvVars: []string{"NVIDIA_CTK_CDI_SPEC_DIRS"},
},
}

View File

@@ -194,7 +194,14 @@ func setFlagToKeyValue(setFlag string, setListSeparator string) (string, interfa
case reflect.String:
return key, value, nil
case reflect.Slice:
valueParts := strings.Split(value, setListSeparator)
valueParts := []string{value}
for _, sep := range []string{setListSeparator, ","} {
if !strings.Contains(value, sep) {
continue
}
valueParts = strings.Split(value, sep)
break
}
switch field.Elem().Kind() {
case reflect.String:
return key, valueParts, nil

View File

@@ -27,7 +27,7 @@ type hookCommand struct {
logger logger.Interface
}
// NewCommand constructs a hook command with the specified logger
// NewCommand constructs CLI subcommand for handling CDI hooks.
func NewCommand(logger logger.Interface) *cli.Command {
c := hookCommand{
logger: logger,
@@ -37,10 +37,21 @@ func NewCommand(logger logger.Interface) *cli.Command {
// build
func (m hookCommand) build() *cli.Command {
// Create the 'hook' command
// Create the 'hook' subcommand
hook := cli.Command{
Name: "hook",
Usage: "A collection of hooks that may be injected into an OCI spec",
// We set the default action for the `hook` subcommand to issue a
// warning and exit with no error.
// This means that if an unsupported hook is run, a container will not fail
// to launch. An unsupported hook could be the result of a CDI specification
// referring to a new hook that is not yet supported by an older NVIDIA
// Container Toolkit version or a hook that has been removed in newer
// version.
Action: func(ctx *cli.Context) error {
commands.IssueUnsupportedHookWarning(m.logger, ctx)
return nil
},
}
hook.Subcommands = commands.New(m.logger)

View File

@@ -49,6 +49,7 @@ func main() {
// Create the top-level CLI
c := cli.NewApp()
c.DisableSliceFlagSeparator = true
c.Name = "NVIDIA Container Toolkit CLI"
c.UseShortOptionHandling = true
c.EnableBashCompletion = true

View File

@@ -68,12 +68,11 @@ type config struct {
dryRun bool
runtime string
configFilePath string
executablePath string
configSource string
mode string
hookFilePath string
runtimeConfigOverrideJSON string
nvidiaRuntime struct {
name string
path string
@@ -120,6 +119,11 @@ func (m command) build() *cli.Command {
Usage: "path to the config file for the target runtime",
Destination: &config.configFilePath,
},
&cli.StringFlag{
Name: "executable-path",
Usage: "The path to the runtime executable. This is used to extract the current config",
Destination: &config.executablePath,
},
&cli.StringFlag{
Name: "config-mode",
Usage: "the config mode for runtimes that support multiple configuration mechanisms",
@@ -208,9 +212,9 @@ func (m command) validateFlags(c *cli.Context, config *config) error {
config.cdi.enabled = false
}
if config.runtimeConfigOverrideJSON != "" && config.runtime != "containerd" {
m.logger.Warningf("Ignoring runtime-config-override flag for %v", config.runtime)
config.runtimeConfigOverrideJSON = ""
if config.executablePath != "" && config.runtime == "docker" {
m.logger.Warningf("Ignoring executable-path=%q flag for %v", config.executablePath, config.runtime)
config.executablePath = ""
}
switch config.configSource {
@@ -330,9 +334,9 @@ func (c *config) resolveConfigSource() (toml.Loader, error) {
func (c *config) getCommandConfigSource() toml.Loader {
switch c.runtime {
case "containerd":
return containerd.CommandLineSource("")
return containerd.CommandLineSource("", c.executablePath)
case "crio":
return crio.CommandLineSource("")
return crio.CommandLineSource("", c.executablePath)
}
return toml.Empty
}

View File

@@ -0,0 +1,171 @@
# SPDX-FileCopyrightText: Copyright (c) 2019 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG GOLANG_VERSION=x.x.x
ARG VERSION="N/A"
FROM nvcr.io/nvidia/cuda:12.9.0-base-ubi9 AS build
RUN dnf install -y \
wget make git gcc \
&& \
rm -rf /var/cache/yum/*
ARG GOLANG_VERSION=x.x.x
RUN set -eux; \
\
arch="$(uname -m)"; \
case "${arch##*-}" in \
x86_64 | amd64) ARCH='amd64' ;; \
ppc64el | ppc64le) ARCH='ppc64le' ;; \
aarch64 | arm64) ARCH='arm64' ;; \
*) echo "unsupported architecture" ; exit 1 ;; \
esac; \
wget -nv -O - https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-${ARCH}.tar.gz \
| tar -C /usr/local -xz
ENV GOPATH=/go
ENV PATH=$GOPATH/bin:/usr/local/go/bin:$PATH
WORKDIR /build
COPY . .
RUN mkdir -p /artifacts/bin
ARG VERSION="N/A"
ARG GIT_COMMIT="unknown"
RUN make PREFIX=/artifacts/bin cmd-nvidia-ctk-installer
# The packaging stage collects the deb and rpm packages built for
# supported architectures.
FROM nvcr.io/nvidia/distroless/go:v3.1.9-dev AS packaging
USER 0:0
SHELL ["/busybox/sh", "-c"]
RUN ln -s /busybox/sh /bin/sh
ARG ARTIFACTS_ROOT
COPY ${ARTIFACTS_ROOT} /artifacts/packages/
WORKDIR /artifacts
# build-args are added to the manifest.txt file below.
ARG PACKAGE_VERSION
ARG GIT_BRANCH
ARG GIT_COMMIT
ARG GIT_COMMIT_SHORT
ARG SOURCE_DATE_EPOCH
ARG VERSION
# Create a manifest.txt file with the absolute paths of all deb and rpm packages in the container
RUN echo "#IMAGE_EPOCH=$(date '+%s')" > /artifacts/manifest.txt && \
env | sed 's/^/#/g' >> /artifacts/manifest.txt && \
find /artifacts/packages -iname '*.deb' -o -iname '*.rpm' >> /artifacts/manifest.txt
LABEL name="NVIDIA Container Toolkit Packages"
LABEL vendor="NVIDIA"
LABEL version="${VERSION}"
LABEL release="N/A"
LABEL summary="deb and rpm packages for the NVIDIA Container Toolkit"
LABEL description="See summary"
COPY LICENSE /licenses/
# The debpackages stage is used to extract the contents of deb packages.
FROM nvcr.io/nvidia/cuda:12.9.0-base-ubuntu20.04 AS debpackages
ARG TARGETARCH
ARG PACKAGE_DIST_DEB=ubuntu18.04
COPY --from=packaging /artifacts/packages/${PACKAGE_DIST_DEB} /deb-packages
RUN mkdir -p /artifacts/deb
RUN set -eux; \
\
case "${TARGETARCH}" in \
x86_64 | amd64) ARCH='amd64' ;; \
ppc64el | ppc64le) ARCH='ppc64le' ;; \
aarch64 | arm64) ARCH='arm64' ;; \
*) echo "unsupported architecture" ; exit 1 ;; \
esac; \
for p in $(ls /deb-packages/${ARCH}/*.deb); do dpkg-deb -xv $p /artifacts/deb/; done
# The rpmpackages stage is used to extract the contents of the rpm packages.
FROM nvcr.io/nvidia/cuda:12.9.0-base-ubi9 AS rpmpackages
RUN dnf install -y cpio
ARG TARGETARCH
ARG PACKAGE_DIST_RPM=centos7
COPY --from=packaging /artifacts/packages/${PACKAGE_DIST_RPM} /rpm-packages
RUN mkdir -p /artifacts/rpm
RUN set -eux; \
\
case "${TARGETARCH}" in \
x86_64 | amd64) ARCH='x86_64' ;; \
ppc64el | ppc64le) ARCH='ppc64le' ;; \
aarch64 | arm64) ARCH='aarch64' ;; \
*) echo "unsupported architecture" ; exit 1 ;; \
esac; \
for p in $(ls /rpm-packages/${ARCH}/*.rpm); do rpm2cpio $p | cpio -idmv -D /artifacts/rpm; done
# The artifacts image serves as an intermediate stage to collect the artifacts
# From the previous stages:
# - The extracted deb packages
# - The extracted rpm packages
# - The nvidia-ctk-installer binary
FROM scratch AS artifacts
COPY --from=rpmpackages /artifacts/rpm /artifacts/rpm
COPY --from=debpackages /artifacts/deb /artifacts/deb
COPY --from=build /artifacts/bin /artifacts/build
# The application stage contains the application used as a GPU Operator
# operand.
FROM nvcr.io/nvidia/distroless/go:v3.1.9-dev AS application
USER 0:0
SHELL ["/busybox/sh", "-c"]
RUN ln -s /busybox/sh /bin/sh
ENV NVIDIA_DISABLE_REQUIRE="true"
ENV NVIDIA_VISIBLE_DEVICES=void
ENV NVIDIA_DRIVER_CAPABILITIES=utility
COPY --from=artifacts /artifacts/rpm /artifacts/rpm
COPY --from=artifacts /artifacts/deb /artifacts/deb
COPY --from=artifacts /artifacts/build /work
WORKDIR /work
ENV PATH=/work:$PATH
ARG VERSION
LABEL io.k8s.display-name="NVIDIA Container Runtime Config"
LABEL name="NVIDIA Container Runtime Config"
LABEL vendor="NVIDIA"
LABEL version="${VERSION}"
LABEL release="N/A"
LABEL summary="Automatically Configure your Container Runtime for GPU support."
LABEL description="See summary"
COPY LICENSE /licenses/
ENTRYPOINT ["/work/nvidia-ctk-installer"]
# The GPU Operator exec's nvidia-toolkit in its entrypoint.
# We create a symlink here to ensure compatibility with older
# GPU Operator versions.
RUN ln -s /work/nvidia-ctk-installer /work/nvidia-toolkit

View File

@@ -1,38 +0,0 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG GOLANG_VERSION=x.x.x
FROM nvcr.io/nvidia/cuda:12.8.0-base-ubuntu20.04
ARG ARTIFACTS_ROOT
COPY ${ARTIFACTS_ROOT} /artifacts/packages/
WORKDIR /artifacts/packages
# build-args are added to the manifest.txt file below.
ARG PACKAGE_DIST
ARG PACKAGE_VERSION
ARG GIT_BRANCH
ARG GIT_COMMIT
ARG GIT_COMMIT_SHORT
ARG SOURCE_DATE_EPOCH
ARG VERSION
# Create a manifest.txt file with the absolute paths of all deb and rpm packages in the container
RUN echo "#IMAGE_EPOCH=$(date '+%s')" > /artifacts/manifest.txt && \
env | sed 's/^/#/g' >> /artifacts/manifest.txt && \
find /artifacts/packages -iname '*.deb' -o -iname '*.rpm' >> /artifacts/manifest.txt
RUN mkdir /licenses && mv /NGC-DL-CONTAINER-LICENSE /licenses/NGC-DL-CONTAINER-LICENSE

View File

@@ -1,90 +0,0 @@
# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG GOLANG_VERSION=x.x.x
ARG VERSION="N/A"
FROM nvcr.io/nvidia/cuda:12.8.0-base-ubi8 AS build
RUN yum install -y \
wget make git gcc \
&& \
rm -rf /var/cache/yum/*
ARG GOLANG_VERSION=x.x.x
RUN set -eux; \
\
arch="$(uname -m)"; \
case "${arch##*-}" in \
x86_64 | amd64) ARCH='amd64' ;; \
ppc64el | ppc64le) ARCH='ppc64le' ;; \
aarch64 | arm64) ARCH='arm64' ;; \
*) echo "unsupported architecture" ; exit 1 ;; \
esac; \
wget -nv -O - https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-${ARCH}.tar.gz \
| tar -C /usr/local -xz
ENV GOPATH=/go
ENV PATH=$GOPATH/bin:/usr/local/go/bin:$PATH
WORKDIR /build
COPY . .
RUN mkdir /artifacts
ARG VERSION="N/A"
ARG GIT_COMMIT="unknown"
RUN make PREFIX=/artifacts cmd-nvidia-ctk-installer
FROM nvcr.io/nvidia/cuda:12.8.0-base-ubi8
ENV NVIDIA_DISABLE_REQUIRE="true"
ENV NVIDIA_VISIBLE_DEVICES=void
ENV NVIDIA_DRIVER_CAPABILITIES=utility
ARG ARTIFACTS_ROOT
ARG PACKAGE_DIST
COPY ${ARTIFACTS_ROOT}/${PACKAGE_DIST} /artifacts/packages/${PACKAGE_DIST}
WORKDIR /artifacts/packages
ARG PACKAGE_VERSION
ARG TARGETARCH
ENV PACKAGE_ARCH=${TARGETARCH}
RUN PACKAGE_ARCH=${PACKAGE_ARCH/amd64/x86_64} && PACKAGE_ARCH=${PACKAGE_ARCH/arm64/aarch64} && \
yum localinstall -y \
${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container1-1.*.rpm \
${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container-tools-1.*.rpm \
${PACKAGE_DIST}/${PACKAGE_ARCH}/nvidia-container-toolkit*-${PACKAGE_VERSION}*.rpm
WORKDIR /work
COPY --from=build /artifacts/nvidia-ctk-installer /work/nvidia-ctk-installer
RUN ln -s nvidia-ctk-installer nvidia-toolkit
ENV PATH=/work:$PATH
ARG VERSION
LABEL io.k8s.display-name="NVIDIA Container Runtime Config"
LABEL name="NVIDIA Container Runtime Config"
LABEL vendor="NVIDIA"
LABEL version="${VERSION}"
LABEL release="N/A"
LABEL summary="Automatically Configure your Container Runtime for GPU support."
LABEL description="See summary"
RUN mkdir /licenses && mv /NGC-DL-CONTAINER-LICENSE /licenses/NGC-DL-CONTAINER-LICENSE
ENTRYPOINT ["/work/nvidia-ctk-installer"]

View File

@@ -1,98 +0,0 @@
# Copyright (c) 2019-2021, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG GOLANG_VERSION=x.x.x
ARG VERSION="N/A"
FROM nvcr.io/nvidia/cuda:12.8.0-base-ubuntu20.04 AS build
RUN apt-get update && \
apt-get install -y wget make git gcc \
&& \
rm -rf /var/lib/apt/lists/*
ARG GOLANG_VERSION=x.x.x
RUN set -eux; \
\
arch="$(uname -m)"; \
case "${arch##*-}" in \
x86_64 | amd64) ARCH='amd64' ;; \
ppc64el | ppc64le) ARCH='ppc64le' ;; \
aarch64 | arm64) ARCH='arm64' ;; \
*) echo "unsupported architecture" ; exit 1 ;; \
esac; \
wget -nv -O - https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-${ARCH}.tar.gz \
| tar -C /usr/local -xz
ENV GOPATH=/go
ENV PATH=$GOPATH/bin:/usr/local/go/bin:$PATH
WORKDIR /build
COPY . .
RUN mkdir /artifacts
ARG VERSION="N/A"
ARG GIT_COMMIT="unknown"
RUN make PREFIX=/artifacts cmd-nvidia-ctk-installer
FROM nvcr.io/nvidia/cuda:12.8.0-base-ubuntu20.04
# Remove the CUDA repository configurations to avoid issues with rotated GPG keys
RUN rm -f /etc/apt/sources.list.d/cuda.list
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
libcap2 \
curl \
&& \
rm -rf /var/lib/apt/lists/*
ENV NVIDIA_DISABLE_REQUIRE="true"
ENV NVIDIA_VISIBLE_DEVICES=void
ENV NVIDIA_DRIVER_CAPABILITIES=utility
ARG ARTIFACTS_ROOT
ARG PACKAGE_DIST
COPY ${ARTIFACTS_ROOT}/${PACKAGE_DIST} /artifacts/packages/${PACKAGE_DIST}
WORKDIR /artifacts/packages
ARG PACKAGE_VERSION
ARG TARGETARCH
ENV PACKAGE_ARCH=${TARGETARCH}
RUN dpkg -i \
${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container1_1.*.deb \
${PACKAGE_DIST}/${PACKAGE_ARCH}/libnvidia-container-tools_1.*.deb \
${PACKAGE_DIST}/${PACKAGE_ARCH}/nvidia-container-toolkit*_${PACKAGE_VERSION}*.deb
WORKDIR /work
COPY --from=build /artifacts/nvidia-ctk-installer /work/nvidia-ctk-installer
RUN ln -s nvidia-ctk-installer nvidia-toolkit
ENV PATH=/work:$PATH
ARG VERSION
LABEL io.k8s.display-name="NVIDIA Container Runtime Config"
LABEL name="NVIDIA Container Runtime Config"
LABEL vendor="NVIDIA"
LABEL version="${VERSION}"
LABEL release="N/A"
LABEL summary="Automatically Configure your Container Runtime for GPU support."
LABEL description="See summary"
RUN mkdir /licenses && mv /NGC-DL-CONTAINER-LICENSE /licenses/NGC-DL-CONTAINER-LICENSE
ENTRYPOINT ["/work/nvidia-ctk-installer"]

View File

@@ -29,17 +29,17 @@ include $(CURDIR)/versions.mk
IMAGE_VERSION := $(VERSION)
IMAGE_TAG ?= $(VERSION)-$(DIST)
IMAGE_TAG ?= $(VERSION)
IMAGE = $(IMAGE_NAME):$(IMAGE_TAG)
OUT_IMAGE_NAME ?= $(IMAGE_NAME)
OUT_IMAGE_VERSION ?= $(IMAGE_VERSION)
OUT_IMAGE_TAG = $(OUT_IMAGE_VERSION)-$(DIST)
OUT_IMAGE_TAG = $(OUT_IMAGE_VERSION)
OUT_IMAGE = $(OUT_IMAGE_NAME):$(OUT_IMAGE_TAG)
##### Public rules #####
DEFAULT_PUSH_TARGET := ubuntu20.04
DISTRIBUTIONS := ubuntu20.04 ubi8
DEFAULT_PUSH_TARGET := application
DISTRIBUTIONS := $(DEFAULT_PUSH_TARGET)
META_TARGETS := packaging
@@ -56,30 +56,16 @@ else
include $(CURDIR)/deployments/container/multi-arch.mk
endif
# For the default push target we also push a short tag equal to the version.
# We skip this for the development release
DEVEL_RELEASE_IMAGE_VERSION ?= devel
PUSH_MULTIPLE_TAGS ?= true
ifeq ($(strip $(OUT_IMAGE_VERSION)),$(DEVEL_RELEASE_IMAGE_VERSION))
PUSH_MULTIPLE_TAGS = false
endif
ifeq ($(PUSH_MULTIPLE_TAGS),true)
push-$(DEFAULT_PUSH_TARGET): push-short
endif
push-%: DIST = $(*)
push-short: DIST = $(DEFAULT_PUSH_TARGET)
# Define the push targets
$(PUSH_TARGETS): push-%:
$(CURDIR)/scripts/publish-image.sh $(IMAGE) $(OUT_IMAGE)
push-short:
$(CURDIR)/scripts/publish-image.sh $(IMAGE) $(OUT_IMAGE)
DOCKERFILE = $(CURDIR)/deployments/container/Dockerfile
build-%: DIST = $(*)
build-%: DOCKERFILE = $(CURDIR)/deployments/container/Dockerfile.$(DOCKERFILE_SUFFIX)
# For packaging targets we set the output image tag to include the -packaging suffix.
%-packaging: INTERMEDIATE_TARGET := --target=packaging
%-packaging: IMAGE_TAG = $(IMAGE_VERSION)-packaging
%-packaging: OUT_IMAGE_TAG = $(IMAGE_VERSION)-packaging
ARTIFACTS_ROOT ?= $(shell realpath --relative-to=$(CURDIR) $(DIST_DIR))
@@ -90,10 +76,12 @@ $(IMAGE_TARGETS): image-%: $(ARTIFACTS_ROOT)
--provenance=false --sbom=false \
$(DOCKER_BUILD_OPTIONS) \
$(DOCKER_BUILD_PLATFORM_OPTIONS) \
$(INTERMEDIATE_TARGET) \
--tag $(IMAGE) \
--build-arg ARTIFACTS_ROOT="$(ARTIFACTS_ROOT)" \
--build-arg GOLANG_VERSION="$(GOLANG_VERSION)" \
--build-arg PACKAGE_DIST="$(PACKAGE_DIST)" \
--build-arg PACKAGE_DIST_DEB="$(PACKAGE_DIST_DEB)" \
--build-arg PACKAGE_DIST_RPM="$(PACKAGE_DIST_RPM)" \
--build-arg PACKAGE_VERSION="$(PACKAGE_VERSION)" \
--build-arg VERSION="$(VERSION)" \
--build-arg GIT_COMMIT="$(GIT_COMMIT)" \
@@ -103,25 +91,17 @@ $(IMAGE_TARGETS): image-%: $(ARTIFACTS_ROOT)
-f $(DOCKERFILE) \
$(CURDIR)
build-ubuntu%: DOCKERFILE_SUFFIX := ubuntu
build-ubuntu%: PACKAGE_DIST = ubuntu18.04
build-ubi8: DOCKERFILE_SUFFIX := ubi8
build-ubi8: PACKAGE_DIST = centos7
build-packaging: DOCKERFILE_SUFFIX := packaging
build-packaging: PACKAGE_ARCH := amd64
build-packaging: PACKAGE_DIST = all
# Test targets
test-%: DIST = $(*)
PACKAGE_DIST_DEB = ubuntu18.04
# TODO: This needs to be set to centos8 for ppc64le builds
PACKAGE_DIST_RPM = centos7
# Handle the default build target.
.PHONY: build
build: $(DEFAULT_PUSH_TARGET)
$(DEFAULT_PUSH_TARGET): build-$(DEFAULT_PUSH_TARGET)
$(DEFAULT_PUSH_TARGET): DIST = $(DEFAULT_PUSH_TARGET)
.PHONY: build push
build: build-$(DEFAULT_PUSH_TARGET)
push: push-$(DEFAULT_PUSH_TARGET)
# Test targets
TEST_CASES ?= docker crio containerd
$(TEST_TARGETS): test-%:
TEST_CASES="$(TEST_CASES)" bash -x $(CURDIR)/test/container/main.sh run \

View File

@@ -23,11 +23,3 @@ $(BUILD_TARGETS): build-%: image-%
else
$(BUILD_TARGETS): build-%: image-%
endif
# For the default distribution we also retag the image.
# Note: This needs to be updated for multi-arch images.
ifeq ($(IMAGE_TAG),$(VERSION)-$(DIST))
$(DEFAULT_PUSH_TARGET):
$(DOCKER) image inspect $(IMAGE) > /dev/null || $(DOCKER) pull $(IMAGE)
$(DOCKER) tag $(IMAGE) $(subst :$(IMAGE_TAG),:$(VERSION),$(IMAGE))
endif

View File

@@ -14,7 +14,7 @@
# This Dockerfile is also used to define the golang version used in this project
# This allows dependabot to manage this version in addition to other images.
FROM golang:1.24.0
FROM golang:1.24.4
WORKDIR /work
COPY * .

View File

@@ -5,14 +5,14 @@ go 1.24
toolchain go1.24.0
require (
github.com/golangci/golangci-lint v1.64.5
github.com/golangci/golangci-lint v1.64.7
github.com/matryer/moq v0.5.3
)
require (
4d63.com/gocheckcompilerdirectives v1.2.1 // indirect
4d63.com/gocheckcompilerdirectives v1.3.0 // indirect
4d63.com/gochecknoglobals v0.2.2 // indirect
github.com/4meepo/tagalign v1.4.1 // indirect
github.com/4meepo/tagalign v1.4.2 // indirect
github.com/Abirdcfly/dupword v0.1.3 // indirect
github.com/Antonboom/errname v1.0.0 // indirect
github.com/Antonboom/nilnil v1.0.1 // indirect
@@ -20,9 +20,9 @@ require (
github.com/BurntSushi/toml v1.4.1-0.20240526193622-a339e1f7089c // indirect
github.com/Crocmagnon/fatcontext v0.7.1 // indirect
github.com/Djarvur/go-err113 v0.0.0-20210108212216-aea10b59be24 // indirect
github.com/GaijinEntertainment/go-exhaustruct/v3 v3.3.0 // indirect
github.com/GaijinEntertainment/go-exhaustruct/v3 v3.3.1 // indirect
github.com/Masterminds/semver/v3 v3.3.0 // indirect
github.com/OpenPeeDeeP/depguard/v2 v2.2.0 // indirect
github.com/OpenPeeDeeP/depguard/v2 v2.2.1 // indirect
github.com/alecthomas/go-check-sumtype v0.3.1 // indirect
github.com/alexkohler/nakedret/v2 v2.0.5 // indirect
github.com/alexkohler/prealloc v1.0.0 // indirect
@@ -38,7 +38,7 @@ require (
github.com/breml/errchkjson v0.4.0 // indirect
github.com/butuzov/ireturn v0.3.1 // indirect
github.com/butuzov/mirror v1.3.0 // indirect
github.com/catenacyber/perfsprint v0.8.1 // indirect
github.com/catenacyber/perfsprint v0.8.2 // indirect
github.com/ccojocar/zxcvbn-go v1.0.2 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/charithe/durationcheck v0.0.10 // indirect
@@ -68,17 +68,17 @@ require (
github.com/gobwas/glob v0.2.3 // indirect
github.com/gofrs/flock v0.12.1 // indirect
github.com/golang/protobuf v1.5.3 // indirect
github.com/golangci/dupl v0.0.0-20180902072040-3e9179ac440a // indirect
github.com/golangci/dupl v0.0.0-20250308024227-f665c8d69b32 // indirect
github.com/golangci/go-printf-func-name v0.1.0 // indirect
github.com/golangci/gofmt v0.0.0-20250106114630-d62b90e6713d // indirect
github.com/golangci/misspell v0.6.0 // indirect
github.com/golangci/plugin-module-register v0.1.1 // indirect
github.com/golangci/revgrep v0.8.0 // indirect
github.com/golangci/unconvert v0.0.0-20240309020433-c5143eacb3ed // indirect
github.com/google/go-cmp v0.6.0 // indirect
github.com/google/go-cmp v0.7.0 // indirect
github.com/gordonklaus/ineffassign v0.1.0 // indirect
github.com/gostaticanalysis/analysisutil v0.7.1 // indirect
github.com/gostaticanalysis/comment v1.4.2 // indirect
github.com/gostaticanalysis/comment v1.5.0 // indirect
github.com/gostaticanalysis/forcetypeassert v0.2.0 // indirect
github.com/gostaticanalysis/nilerr v0.1.1 // indirect
github.com/hashicorp/go-immutable-radix/v2 v2.1.0 // indirect
@@ -92,12 +92,12 @@ require (
github.com/jjti/go-spancheck v0.6.4 // indirect
github.com/julz/importas v0.2.0 // indirect
github.com/karamaru-alpha/copyloopvar v1.2.1 // indirect
github.com/kisielk/errcheck v1.8.0 // indirect
github.com/kkHAIKE/contextcheck v1.1.5 // indirect
github.com/kisielk/errcheck v1.9.0 // indirect
github.com/kkHAIKE/contextcheck v1.1.6 // indirect
github.com/kulti/thelper v0.6.3 // indirect
github.com/kunwardeep/paralleltest v1.0.10 // indirect
github.com/lasiar/canonicalheader v1.1.2 // indirect
github.com/ldez/exptostd v0.4.1 // indirect
github.com/ldez/exptostd v0.4.2 // indirect
github.com/ldez/gomoddirectives v0.6.1 // indirect
github.com/ldez/grignotin v0.9.0 // indirect
github.com/ldez/tagliatelle v0.7.1 // indirect
@@ -112,14 +112,14 @@ require (
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/mattn/go-runewidth v0.0.16 // indirect
github.com/matttproud/golang_protobuf_extensions v1.0.1 // indirect
github.com/mgechev/revive v1.6.1 // indirect
github.com/mgechev/revive v1.7.0 // indirect
github.com/mitchellh/go-homedir v1.1.0 // indirect
github.com/mitchellh/mapstructure v1.5.0 // indirect
github.com/moricho/tparallel v0.3.2 // indirect
github.com/nakabonne/nestif v0.3.1 // indirect
github.com/nishanths/exhaustive v0.12.0 // indirect
github.com/nishanths/predeclared v0.2.2 // indirect
github.com/nunnatsa/ginkgolinter v0.19.0 // indirect
github.com/nunnatsa/ginkgolinter v0.19.1 // indirect
github.com/olekukonko/tablewriter v0.0.5 // indirect
github.com/pelletier/go-toml v1.9.5 // indirect
github.com/pelletier/go-toml/v2 v2.2.3 // indirect
@@ -136,14 +136,14 @@ require (
github.com/quasilyte/stdinfo v0.0.0-20220114132959-f7386bf02567 // indirect
github.com/raeperd/recvcheck v0.2.0 // indirect
github.com/rivo/uniseg v0.4.7 // indirect
github.com/rogpeppe/go-internal v1.13.1 // indirect
github.com/rogpeppe/go-internal v1.14.1 // indirect
github.com/ryancurrah/gomodguard v1.3.5 // indirect
github.com/ryanrolds/sqlclosecheck v0.5.1 // indirect
github.com/sanposhiho/wastedassign/v2 v2.1.0 // indirect
github.com/santhosh-tekuri/jsonschema/v6 v6.0.1 // indirect
github.com/sashamelentyev/interfacebloat v1.1.0 // indirect
github.com/sashamelentyev/usestdlibvars v1.28.0 // indirect
github.com/securego/gosec/v2 v2.22.1 // indirect
github.com/securego/gosec/v2 v2.22.2 // indirect
github.com/sirupsen/logrus v1.9.3 // indirect
github.com/sivchari/containedctx v1.0.3 // indirect
github.com/sivchari/tenv v1.12.1 // indirect
@@ -151,7 +151,7 @@ require (
github.com/sourcegraph/go-diff v0.7.0 // indirect
github.com/spf13/afero v1.12.0 // indirect
github.com/spf13/cast v1.5.0 // indirect
github.com/spf13/cobra v1.8.1 // indirect
github.com/spf13/cobra v1.9.1 // indirect
github.com/spf13/jwalterweatherman v1.1.0 // indirect
github.com/spf13/pflag v1.0.6 // indirect
github.com/spf13/viper v1.12.0 // indirect
@@ -160,8 +160,8 @@ require (
github.com/stretchr/objx v0.5.2 // indirect
github.com/stretchr/testify v1.10.0 // indirect
github.com/subosito/gotenv v1.4.1 // indirect
github.com/tdakkota/asciicheck v0.4.0 // indirect
github.com/tetafro/godot v1.4.20 // indirect
github.com/tdakkota/asciicheck v0.4.1 // indirect
github.com/tetafro/godot v1.5.0 // indirect
github.com/timakin/bodyclose v0.0.0-20241017074812-ed6a65f985e3 // indirect
github.com/timonwong/loggercheck v0.10.1 // indirect
github.com/tomarrell/wrapcheck/v2 v2.10.0 // indirect
@@ -182,16 +182,16 @@ require (
go.uber.org/multierr v1.6.0 // indirect
go.uber.org/zap v1.24.0 // indirect
golang.org/x/exp/typeparams v0.0.0-20250210185358-939b2ce775ac // indirect
golang.org/x/mod v0.23.0 // indirect
golang.org/x/sync v0.11.0 // indirect
golang.org/x/sys v0.30.0 // indirect
golang.org/x/mod v0.24.0 // indirect
golang.org/x/sync v0.12.0 // indirect
golang.org/x/sys v0.31.0 // indirect
golang.org/x/text v0.22.0 // indirect
golang.org/x/tools v0.30.0 // indirect
google.golang.org/protobuf v1.36.4 // indirect
golang.org/x/tools v0.31.0 // indirect
google.golang.org/protobuf v1.36.5 // indirect
gopkg.in/ini.v1 v1.67.0 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
honnef.co/go/tools v0.6.0 // indirect
honnef.co/go/tools v0.6.1 // indirect
mvdan.cc/gofumpt v0.7.0 // indirect
mvdan.cc/unparam v0.0.0-20240528143540-8a5130ca722f // indirect
)

View File

@@ -1,5 +1,5 @@
4d63.com/gocheckcompilerdirectives v1.2.1 h1:AHcMYuw56NPjq/2y615IGg2kYkBdTvOaojYCBcRE7MA=
4d63.com/gocheckcompilerdirectives v1.2.1/go.mod h1:yjDJSxmDTtIHHCqX0ufRYZDL6vQtMG7tJdKVeWwsqvs=
4d63.com/gocheckcompilerdirectives v1.3.0 h1:Ew5y5CtcAAQeTVKUVFrE7EwHMrTO6BggtEj8BZSjZ3A=
4d63.com/gocheckcompilerdirectives v1.3.0/go.mod h1:ofsJ4zx2QAuIP/NO/NAh1ig6R1Fb18/GI7RVMwz7kAY=
4d63.com/gochecknoglobals v0.2.2 h1:H1vdnwnMaZdQW/N+NrkT1SZMTBmcwHe9Vq8lJcYYTtU=
4d63.com/gochecknoglobals v0.2.2/go.mod h1:lLxwTQjL5eIesRbvnzIP3jZtG140FnTdz+AlMa+ogt0=
cloud.google.com/go v0.26.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw=
@@ -35,8 +35,8 @@ cloud.google.com/go/storage v1.6.0/go.mod h1:N7U0C8pVQ/+NIKOBQyamJIeKQKkZ+mxpohl
cloud.google.com/go/storage v1.8.0/go.mod h1:Wv1Oy7z6Yz3DshWRJFhqM/UCfaWIRTdp0RXyy7KQOVs=
cloud.google.com/go/storage v1.10.0/go.mod h1:FLPqc6j+Ki4BU591ie1oL6qBQGu2Bl/tZ9ullr3+Kg0=
dmitri.shuralyov.com/gpu/mtl v0.0.0-20190408044501-666a987793e9/go.mod h1:H6x//7gZCb22OMCxBHrMx7a5I7Hp++hsVxbQ4BYO7hU=
github.com/4meepo/tagalign v1.4.1 h1:GYTu2FaPGOGb/xJalcqHeD4il5BiCywyEYZOA55P6J4=
github.com/4meepo/tagalign v1.4.1/go.mod h1:2H9Yu6sZ67hmuraFgfZkNcg5Py9Ch/Om9l2K/2W1qS4=
github.com/4meepo/tagalign v1.4.2 h1:0hcLHPGMjDyM1gHG58cS73aQF8J4TdVR96TZViorO9E=
github.com/4meepo/tagalign v1.4.2/go.mod h1:+p4aMyFM+ra7nb41CnFG6aSDXqRxU/w1VQqScKqDARI=
github.com/Abirdcfly/dupword v0.1.3 h1:9Pa1NuAsZvpFPi9Pqkd93I7LIYRURj+A//dFd5tgBeE=
github.com/Abirdcfly/dupword v0.1.3/go.mod h1:8VbB2t7e10KRNdwTVoxdBaxla6avbhGzb8sCTygUMhw=
github.com/Antonboom/errname v1.0.0 h1:oJOOWR07vS1kRusl6YRSlat7HFnb3mSfMl6sDMRoTBA=
@@ -53,12 +53,12 @@ github.com/Crocmagnon/fatcontext v0.7.1 h1:SC/VIbRRZQeQWj/TcQBS6JmrXcfA+BU4OGSVU
github.com/Crocmagnon/fatcontext v0.7.1/go.mod h1:1wMvv3NXEBJucFGfwOJBxSVWcoIO6emV215SMkW9MFU=
github.com/Djarvur/go-err113 v0.0.0-20210108212216-aea10b59be24 h1:sHglBQTwgx+rWPdisA5ynNEsoARbiCBOyGcJM4/OzsM=
github.com/Djarvur/go-err113 v0.0.0-20210108212216-aea10b59be24/go.mod h1:4UJr5HIiMZrwgkSPdsjy2uOQExX/WEILpIrO9UPGuXs=
github.com/GaijinEntertainment/go-exhaustruct/v3 v3.3.0 h1:/fTUt5vmbkAcMBt4YQiuC23cV0kEsN1MVMNqeOW43cU=
github.com/GaijinEntertainment/go-exhaustruct/v3 v3.3.0/go.mod h1:ONJg5sxcbsdQQ4pOW8TGdTidT2TMAUy/2Xhr8mrYaao=
github.com/GaijinEntertainment/go-exhaustruct/v3 v3.3.1 h1:Sz1JIXEcSfhz7fUi7xHnhpIE0thVASYjvosApmHuD2k=
github.com/GaijinEntertainment/go-exhaustruct/v3 v3.3.1/go.mod h1:n/LSCXNuIYqVfBlVXyHfMQkZDdp1/mmxfSjADd3z1Zg=
github.com/Masterminds/semver/v3 v3.3.0 h1:B8LGeaivUe71a5qox1ICM/JLl0NqZSW5CHyL+hmvYS0=
github.com/Masterminds/semver/v3 v3.3.0/go.mod h1:4V+yj/TJE1HU9XfppCwVMZq3I84lprf4nC11bSS5beM=
github.com/OpenPeeDeeP/depguard/v2 v2.2.0 h1:vDfG60vDtIuf0MEOhmLlLLSzqaRM8EMcgJPdp74zmpA=
github.com/OpenPeeDeeP/depguard/v2 v2.2.0/go.mod h1:CIzddKRvLBC4Au5aYP/i3nyaWQ+ClszLIuVocRiCYFQ=
github.com/OpenPeeDeeP/depguard/v2 v2.2.1 h1:vckeWVESWp6Qog7UZSARNqfu/cZqvki8zsuj3piCMx4=
github.com/OpenPeeDeeP/depguard/v2 v2.2.1/go.mod h1:q4DKzC4UcVaAvcfd41CZh0PWpGgzrVxUYBlgKNGquUo=
github.com/alecthomas/assert/v2 v2.11.0 h1:2Q9r3ki8+JYXvGsDyBXwH3LcJ+WK5D0gc5E8vS6K3D0=
github.com/alecthomas/assert/v2 v2.11.0/go.mod h1:Bze95FyfUr7x34QZrjL+XP+0qgp/zg8yS+TtBj1WA3k=
github.com/alecthomas/go-check-sumtype v0.3.1 h1:u9aUvbGINJxLVXiFvHUlPEaD7VDULsrxJb4Aq31NLkU=
@@ -102,8 +102,8 @@ github.com/butuzov/ireturn v0.3.1 h1:mFgbEI6m+9W8oP/oDdfA34dLisRFCj2G6o/yiI1yZrY
github.com/butuzov/ireturn v0.3.1/go.mod h1:ZfRp+E7eJLC0NQmk1Nrm1LOrn/gQlOykv+cVPdiXH5M=
github.com/butuzov/mirror v1.3.0 h1:HdWCXzmwlQHdVhwvsfBb2Au0r3HyINry3bDWLYXiKoc=
github.com/butuzov/mirror v1.3.0/go.mod h1:AEij0Z8YMALaq4yQj9CPPVYOyJQyiexpQEQgihajRfI=
github.com/catenacyber/perfsprint v0.8.1 h1:bGOHuzHe0IkoGeY831RW4aSlt1lPRd3WRAScSWOaV7E=
github.com/catenacyber/perfsprint v0.8.1/go.mod h1:/wclWYompEyjUD2FuIIDVKNkqz7IgBIWXIH3V0Zol50=
github.com/catenacyber/perfsprint v0.8.2 h1:+o9zVmCSVa7M4MvabsWvESEhpsMkhfE7k0sHNGL95yw=
github.com/catenacyber/perfsprint v0.8.2/go.mod h1:q//VWC2fWbcdSLEY1R3l8n0zQCDPdE4IjZwyY1HMunM=
github.com/ccojocar/zxcvbn-go v1.0.2 h1:na/czXU8RrhXO4EZme6eQJLR4PzcGsahsBOAwU6I3Vg=
github.com/ccojocar/zxcvbn-go v1.0.2/go.mod h1:g1qkXtUSvHP8lhHp5GrSmTz6uWALGRMQdw6Qnz/hi60=
github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU=
@@ -122,7 +122,7 @@ github.com/ckaznocha/intrange v0.3.0 h1:VqnxtK32pxgkhJgYQEeOArVidIPg+ahLP7WBOXZd
github.com/ckaznocha/intrange v0.3.0/go.mod h1:+I/o2d2A1FBHgGELbGxzIcyd3/9l9DuwjM8FsbSS3Lo=
github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw=
github.com/cncf/udpa/go v0.0.0-20191209042840-269d4d468f6f/go.mod h1:M8M6+tZqaGXZJjfX53e64911xZQV5JYwmTeXPW+k8Sc=
github.com/cpuguy83/go-md2man/v2 v2.0.4/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/cpuguy83/go-md2man/v2 v2.0.6/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g=
github.com/curioswitch/go-reassign v0.3.0 h1:dh3kpQHuADL3cobV/sSGETA8DOv457dwl+fbBAhrQPs=
github.com/curioswitch/go-reassign v0.3.0/go.mod h1:nApPCCTtqLJN/s8HfItCcKV0jIPwluBOvZP+dsJGA88=
github.com/daixiang0/gci v0.13.5 h1:kThgmH1yBmZSBCh1EJVxQ7JsHpm5Oms0AMed/0LaH4c=
@@ -229,14 +229,14 @@ github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaS
github.com/golang/protobuf v1.5.2/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY=
github.com/golang/protobuf v1.5.3 h1:KhyjKVUg7Usr/dYsdSqoFveMYd5ko72D+zANwlG1mmg=
github.com/golang/protobuf v1.5.3/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY=
github.com/golangci/dupl v0.0.0-20180902072040-3e9179ac440a h1:w8hkcTqaFpzKqonE9uMCefW1WDie15eSP/4MssdenaM=
github.com/golangci/dupl v0.0.0-20180902072040-3e9179ac440a/go.mod h1:ryS0uhF+x9jgbj/N71xsEqODy9BN81/GonCZiOzirOk=
github.com/golangci/dupl v0.0.0-20250308024227-f665c8d69b32 h1:WUvBfQL6EW/40l6OmeSBYQJNSif4O11+bmWEz+C7FYw=
github.com/golangci/dupl v0.0.0-20250308024227-f665c8d69b32/go.mod h1:NUw9Zr2Sy7+HxzdjIULge71wI6yEg1lWQr7Evcu8K0E=
github.com/golangci/go-printf-func-name v0.1.0 h1:dVokQP+NMTO7jwO4bwsRwLWeudOVUPPyAKJuzv8pEJU=
github.com/golangci/go-printf-func-name v0.1.0/go.mod h1:wqhWFH5mUdJQhweRnldEywnR5021wTdZSNgwYceV14s=
github.com/golangci/gofmt v0.0.0-20250106114630-d62b90e6713d h1:viFft9sS/dxoYY0aiOTsLKO2aZQAPT4nlQCsimGcSGE=
github.com/golangci/gofmt v0.0.0-20250106114630-d62b90e6713d/go.mod h1:ivJ9QDg0XucIkmwhzCDsqcnxxlDStoTl89jDMIoNxKY=
github.com/golangci/golangci-lint v1.64.5 h1:5omC86XFBKXZgCrVdUWU+WNHKd+CWCxNx717KXnzKZY=
github.com/golangci/golangci-lint v1.64.5/go.mod h1:WZnwq8TF0z61h3jLQ7Sk5trcP7b3kUFxLD6l1ivtdvU=
github.com/golangci/golangci-lint v1.64.7 h1:Xk1EyxoXqZabn5b4vnjNKSjCx1whBK53NP+mzLfX7HA=
github.com/golangci/golangci-lint v1.64.7/go.mod h1:5cEsUQBSr6zi8XI8OjmcY2Xmliqc4iYL7YoPrL+zLJ4=
github.com/golangci/misspell v0.6.0 h1:JCle2HUTNWirNlDIAUO44hUsKhOFqGPoC4LZxlaSXDs=
github.com/golangci/misspell v0.6.0/go.mod h1:keMNyY6R9isGaSAu+4Q8NMBwMPkh15Gtc8UCVoDtAWo=
github.com/golangci/plugin-module-register v0.1.1 h1:TCmesur25LnyJkpsVrupv1Cdzo+2f7zX0H6Jkw1Ol6c=
@@ -259,8 +259,8 @@ github.com/google/go-cmp v0.5.4/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/
github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.6/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.8/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/google/martian v2.1.0+incompatible/go.mod h1:9I4somxYTbIHy5NJKHRl3wXiIaQGbYVAs8BPL6v8lEs=
github.com/google/martian/v3 v3.0.0/go.mod h1:y5Zk1BBys9G+gd6Jrk0W3cC1+ELVxBWuIGO+w/tUAp0=
@@ -281,8 +281,9 @@ github.com/gordonklaus/ineffassign v0.1.0/go.mod h1:Qcp2HIAYhR7mNUVSIxZww3Guk4it
github.com/gostaticanalysis/analysisutil v0.7.1 h1:ZMCjoue3DtDWQ5WyU16YbjbQEQ3VuzwxALrpYd+HeKk=
github.com/gostaticanalysis/analysisutil v0.7.1/go.mod h1:v21E3hY37WKMGSnbsw2S/ojApNWb6C1//mXO48CXbVc=
github.com/gostaticanalysis/comment v1.4.1/go.mod h1:ih6ZxzTHLdadaiSnF5WY3dxUoXfXAlTaRzuaNDlSado=
github.com/gostaticanalysis/comment v1.4.2 h1:hlnx5+S2fY9Zo9ePo4AhgYsYHbM2+eAv8m/s1JiCd6Q=
github.com/gostaticanalysis/comment v1.4.2/go.mod h1:KLUTGDv6HOCotCH8h2erHKmpci2ZoR8VPu34YA2uzdM=
github.com/gostaticanalysis/comment v1.5.0 h1:X82FLl+TswsUMpMh17srGRuKaaXprTaytmEpgnKIDu8=
github.com/gostaticanalysis/comment v1.5.0/go.mod h1:V6eb3gpCv9GNVqb6amXzEUX3jXLVK/AdA+IrAMSqvEc=
github.com/gostaticanalysis/forcetypeassert v0.2.0 h1:uSnWrrUEYDr86OCxWa4/Tp2jeYDlogZiZHzGkWFefTk=
github.com/gostaticanalysis/forcetypeassert v0.2.0/go.mod h1:M5iPavzE9pPqWyeiVXSFghQjljW1+l/Uke3PXHS6ILY=
github.com/gostaticanalysis/nilerr v0.1.1 h1:ThE+hJP0fEp4zWLkWHWcRyI2Od0p7DlgYG3Uqrmrcpk=
@@ -327,11 +328,11 @@ github.com/julz/importas v0.2.0 h1:y+MJN/UdL63QbFJHws9BVC5RpA2iq0kpjrFajTGivjQ=
github.com/julz/importas v0.2.0/go.mod h1:pThlt589EnCYtMnmhmRYY/qn9lCf/frPOK+WMx3xiJY=
github.com/karamaru-alpha/copyloopvar v1.2.1 h1:wmZaZYIjnJ0b5UoKDjUHrikcV0zuPyyxI4SVplLd2CI=
github.com/karamaru-alpha/copyloopvar v1.2.1/go.mod h1:nFmMlFNlClC2BPvNaHMdkirmTJxVCY0lhxBtlfOypMM=
github.com/kisielk/errcheck v1.8.0 h1:ZX/URYa7ilESY19ik/vBmCn6zdGQLxACwjAcWbHlYlg=
github.com/kisielk/errcheck v1.8.0/go.mod h1:1kLL+jV4e+CFfueBmI1dSK2ADDyQnlrnrY/FqKluHJQ=
github.com/kisielk/errcheck v1.9.0 h1:9xt1zI9EBfcYBvdU1nVrzMzzUPUtPKs9bVSIM3TAb3M=
github.com/kisielk/errcheck v1.9.0/go.mod h1:kQxWMMVZgIkDq7U8xtG/n2juOjbLgZtedi0D+/VL/i8=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
github.com/kkHAIKE/contextcheck v1.1.5 h1:CdnJh63tcDe53vG+RebdpdXJTc9atMgGqdx8LXxiilg=
github.com/kkHAIKE/contextcheck v1.1.5/go.mod h1:O930cpht4xb1YQpK+1+AgoM3mFsvxr7uyFptcnWTYUA=
github.com/kkHAIKE/contextcheck v1.1.6 h1:7HIyRcnyzxL9Lz06NGhiKvenXq7Zw6Q0UQu/ttjfJCE=
github.com/kkHAIKE/contextcheck v1.1.6/go.mod h1:3dDbMRNBFaq8HFXWC1JyvDSPm43CmE6IuHam8Wr0rkg=
github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ=
github.com/konsorten/go-windows-terminal-sequences v1.0.3/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ=
github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515/go.mod h1:+0opPa2QZZtGFBFZlji/RkVcI2GknAs/DXo4wKdlNEc=
@@ -348,8 +349,8 @@ github.com/kunwardeep/paralleltest v1.0.10 h1:wrodoaKYzS2mdNVnc4/w31YaXFtsc21PCT
github.com/kunwardeep/paralleltest v1.0.10/go.mod h1:2C7s65hONVqY7Q5Efj5aLzRCNLjw2h4eMc9EcypGjcY=
github.com/lasiar/canonicalheader v1.1.2 h1:vZ5uqwvDbyJCnMhmFYimgMZnJMjwljN5VGY0VKbMXb4=
github.com/lasiar/canonicalheader v1.1.2/go.mod h1:qJCeLFS0G/QlLQ506T+Fk/fWMa2VmBUiEI2cuMK4djI=
github.com/ldez/exptostd v0.4.1 h1:DIollgQ3LWZMp3HJbSXsdE2giJxMfjyHj3eX4oiD6JU=
github.com/ldez/exptostd v0.4.1/go.mod h1:iZBRYaUmcW5jwCR3KROEZ1KivQQp6PHXbDPk9hqJKCQ=
github.com/ldez/exptostd v0.4.2 h1:l5pOzHBz8mFOlbcifTxzfyYbgEmoUqjxLFHZkjlbHXs=
github.com/ldez/exptostd v0.4.2/go.mod h1:iZBRYaUmcW5jwCR3KROEZ1KivQQp6PHXbDPk9hqJKCQ=
github.com/ldez/gomoddirectives v0.6.1 h1:Z+PxGAY+217f/bSGjNZr/b2KTXcyYLgiWI6geMBN2Qc=
github.com/ldez/gomoddirectives v0.6.1/go.mod h1:cVBiu3AHR9V31em9u2kwfMKD43ayN5/XDgr+cdaFaKs=
github.com/ldez/grignotin v0.9.0 h1:MgOEmjZIVNn6p5wPaGp/0OKWyvq42KnzAt/DAb8O4Ow=
@@ -383,8 +384,8 @@ github.com/mattn/go-runewidth v0.0.16 h1:E5ScNMtiwvlvB5paMFdw9p4kSQzbXFikJ5SQO6T
github.com/mattn/go-runewidth v0.0.16/go.mod h1:Jdepj2loyihRzMpdS35Xk/zdY8IAYHsh153qUoGf23w=
github.com/matttproud/golang_protobuf_extensions v1.0.1 h1:4hp9jkHxhMHkqkrB3Ix0jegS5sx/RkqARlsWZ6pIwiU=
github.com/matttproud/golang_protobuf_extensions v1.0.1/go.mod h1:D8He9yQNgCq6Z5Ld7szi9bcBfOoFv/3dc6xSMkL2PC0=
github.com/mgechev/revive v1.6.1 h1:ncK0ZCMWtb8GXwVAmk+IeWF2ULIDsvRxSRfg5sTwQ2w=
github.com/mgechev/revive v1.6.1/go.mod h1:/2tfHWVO8UQi/hqJsIYNEKELi+DJy/e+PQpLgTB1v88=
github.com/mgechev/revive v1.7.0 h1:JyeQ4yO5K8aZhIKf5rec56u0376h8AlKNQEmjfkjKlY=
github.com/mgechev/revive v1.7.0/go.mod h1:qZnwcNhoguE58dfi96IJeSTPeZQejNeoMQLUZGi4SW4=
github.com/mitchellh/go-homedir v1.1.0 h1:lukF9ziXFxDFPkA1vsr5zpc1XuPDn/wFntq5mG+4E0Y=
github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=
github.com/mitchellh/mapstructure v1.5.0 h1:jeMsZIYE/09sWLaz43PL7Gy6RuMjD2eJVyuac5Z2hdY=
@@ -404,8 +405,8 @@ github.com/nishanths/exhaustive v0.12.0 h1:vIY9sALmw6T/yxiASewa4TQcFsVYZQQRUQJhK
github.com/nishanths/exhaustive v0.12.0/go.mod h1:mEZ95wPIZW+x8kC4TgC+9YCUgiST7ecevsVDTgc2obs=
github.com/nishanths/predeclared v0.2.2 h1:V2EPdZPliZymNAn79T8RkNApBjMmVKh5XRpLm/w98Vk=
github.com/nishanths/predeclared v0.2.2/go.mod h1:RROzoN6TnGQupbC+lqggsOlcgysk3LMK/HI84Mp280c=
github.com/nunnatsa/ginkgolinter v0.19.0 h1:CnHRFAeBS3LdLI9h+Jidbcc5KH71GKOmaBZQk8Srnto=
github.com/nunnatsa/ginkgolinter v0.19.0/go.mod h1:jkQ3naZDmxaZMXPWaS9rblH+i+GWXQCaS/JFIWcOH2s=
github.com/nunnatsa/ginkgolinter v0.19.1 h1:mjwbOlDQxZi9Cal+KfbEJTCz327OLNfwNvoZ70NJ+c4=
github.com/nunnatsa/ginkgolinter v0.19.1/go.mod h1:jkQ3naZDmxaZMXPWaS9rblH+i+GWXQCaS/JFIWcOH2s=
github.com/olekukonko/tablewriter v0.0.5 h1:P2Ga83D34wi1o9J6Wh1mRuqd4mF/x/lgBS7N7AbDhec=
github.com/olekukonko/tablewriter v0.0.5/go.mod h1:hPp6KlRPjbx+hW8ykQs1w3UBbZlj6HuIJcUGPhkA7kY=
github.com/onsi/ginkgo/v2 v2.22.2 h1:/3X8Panh8/WwhU/3Ssa6rCKqPLuAkVY2I0RoyDLySlU=
@@ -471,8 +472,8 @@ github.com/rivo/uniseg v0.2.0/go.mod h1:J6wj4VEh+S6ZtnVlnTBMWIodfgj8LQOQFoIToxlJ
github.com/rivo/uniseg v0.4.7 h1:WUdvkW8uEhrYfLC4ZzdpI2ztxP1I582+49Oc5Mq64VQ=
github.com/rivo/uniseg v0.4.7/go.mod h1:FN3SvrM+Zdj16jyLfmOkMNblXMcoc8DfTHruCPUcx88=
github.com/rogpeppe/go-internal v1.3.0/go.mod h1:M8bDsm7K2OlrFYOpmOWEs/qY81heoFRclV5y23lUDJ4=
github.com/rogpeppe/go-internal v1.13.1 h1:KvO1DLK/DRN07sQ1LQKScxyZJuNnedQ5/wKSR38lUII=
github.com/rogpeppe/go-internal v1.13.1/go.mod h1:uMEvuHeurkdAXX61udpOXGD/AzZDWNMNyH2VO9fmH0o=
github.com/rogpeppe/go-internal v1.14.1 h1:UQB4HGPB6osV0SQTLymcB4TgvyWu6ZyliaW0tI/otEQ=
github.com/rogpeppe/go-internal v1.14.1/go.mod h1:MaRKkUm5W0goXpeCfT7UZI6fk/L7L7so1lCWt35ZSgc=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/ryancurrah/gomodguard v1.3.5 h1:cShyguSwUEeC0jS7ylOiG/idnd1TpJ1LfHGpV3oJmPU=
github.com/ryancurrah/gomodguard v1.3.5/go.mod h1:MXlEPQRxgfPQa62O8wzK3Ozbkv9Rkqr+wKjSxTdsNJE=
@@ -486,8 +487,8 @@ github.com/sashamelentyev/interfacebloat v1.1.0 h1:xdRdJp0irL086OyW1H/RTZTr1h/tM
github.com/sashamelentyev/interfacebloat v1.1.0/go.mod h1:+Y9yU5YdTkrNvoX0xHc84dxiN1iBi9+G8zZIhPVoNjQ=
github.com/sashamelentyev/usestdlibvars v1.28.0 h1:jZnudE2zKCtYlGzLVreNp5pmCdOxXUzwsMDBkR21cyQ=
github.com/sashamelentyev/usestdlibvars v1.28.0/go.mod h1:9nl0jgOfHKWNFS43Ojw0i7aRoS4j6EBye3YBhmAIRF8=
github.com/securego/gosec/v2 v2.22.1 h1:IcBt3TpI5Y9VN1YlwjSpM2cHu0i3Iw52QM+PQeg7jN8=
github.com/securego/gosec/v2 v2.22.1/go.mod h1:4bb95X4Jz7VSEPdVjC0hD7C/yR6kdeUBvCPOy9gDQ0g=
github.com/securego/gosec/v2 v2.22.2 h1:IXbuI7cJninj0nRpZSLCUlotsj8jGusohfONMrHoF6g=
github.com/securego/gosec/v2 v2.22.2/go.mod h1:UEBGA+dSKb+VqM6TdehR7lnQtIIMorYJ4/9CW1KVQBE=
github.com/shurcooL/go v0.0.0-20180423040247-9e1955d9fb6e/go.mod h1:TDJrrUr11Vxrven61rcy3hJMUqaf/CLWYhHNPmT14Lk=
github.com/shurcooL/go-goon v0.0.0-20170922171312-37c2f522c041/go.mod h1:N5mDOmsrJOB+vfqUK+7DmDyjhSLIIBnXo9lvZJj3MWQ=
github.com/sirupsen/logrus v1.2.0/go.mod h1:LxeOpSwHxABJmUn/MG1IvRgCAasNZTLOkJPxbbu5VWo=
@@ -507,8 +508,8 @@ github.com/spf13/afero v1.12.0 h1:UcOPyRBYczmFn6yvphxkn9ZEOY65cpwGKb5mL36mrqs=
github.com/spf13/afero v1.12.0/go.mod h1:ZTlWwG4/ahT8W7T0WQ5uYmjI9duaLQGy3Q2OAl4sk/4=
github.com/spf13/cast v1.5.0 h1:rj3WzYc11XZaIZMPKmwP96zkFEnnAmV8s6XbB2aY32w=
github.com/spf13/cast v1.5.0/go.mod h1:SpXXQ5YoyJw6s3/6cMTQuxvgRl3PCJiyaX9p6b155UU=
github.com/spf13/cobra v1.8.1 h1:e5/vxKd/rZsfSJMUX1agtjeTDf+qv1/JdBF8gg5k9ZM=
github.com/spf13/cobra v1.8.1/go.mod h1:wHxEcudfqmLYa8iTfL+OuZPbBZkmvliBWKIezN3kD9Y=
github.com/spf13/cobra v1.9.1 h1:CXSaggrXdbHK9CF+8ywj8Amf7PBRmPCOJugH954Nnlo=
github.com/spf13/cobra v1.9.1/go.mod h1:nDyEzZ8ogv936Cinf6g1RU9MRY64Ir93oCnqb9wxYW0=
github.com/spf13/jwalterweatherman v1.1.0 h1:ue6voC5bR5F8YxI5S67j9i582FU4Qvo2bmqnqMYADFk=
github.com/spf13/jwalterweatherman v1.1.0/go.mod h1:aNWZUN0dPAAO/Ljvb5BEdw96iTZ0EXowPYD95IqWIGo=
github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
@@ -538,14 +539,14 @@ github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOf
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/subosito/gotenv v1.4.1 h1:jyEFiXpy21Wm81FBN71l9VoMMV8H8jG+qIK3GCpY6Qs=
github.com/subosito/gotenv v1.4.1/go.mod h1:ayKnFf/c6rvx/2iiLrJUk1e6plDbT3edrFNGqEflhK0=
github.com/tdakkota/asciicheck v0.4.0 h1:VZ13Itw4k1i7d+dpDSNS8Op645XgGHpkCEh/WHicgWw=
github.com/tdakkota/asciicheck v0.4.0/go.mod h1:0k7M3rCfRXb0Z6bwgvkEIMleKH3kXNz9UqJ9Xuqopr8=
github.com/tdakkota/asciicheck v0.4.1 h1:bm0tbcmi0jezRA2b5kg4ozmMuGAFotKI3RZfrhfovg8=
github.com/tdakkota/asciicheck v0.4.1/go.mod h1:0k7M3rCfRXb0Z6bwgvkEIMleKH3kXNz9UqJ9Xuqopr8=
github.com/tenntenn/modver v1.0.1 h1:2klLppGhDgzJrScMpkj9Ujy3rXPUspSjAcev9tSEBgA=
github.com/tenntenn/modver v1.0.1/go.mod h1:bePIyQPb7UeioSRkw3Q0XeMhYZSMx9B8ePqg6SAMGH0=
github.com/tenntenn/text/transform v0.0.0-20200319021203-7eef512accb3 h1:f+jULpRQGxTSkNYKJ51yaw6ChIqO+Je8UqsTKN/cDag=
github.com/tenntenn/text/transform v0.0.0-20200319021203-7eef512accb3/go.mod h1:ON8b8w4BN/kE1EOhwT0o+d62W65a6aPw1nouo9LMgyY=
github.com/tetafro/godot v1.4.20 h1:z/p8Ek55UdNvzt4TFn2zx2KscpW4rWqcnUrdmvWJj7E=
github.com/tetafro/godot v1.4.20/go.mod h1:2oVxTBSftRTh4+MVfUaUXR6bn2GDXCaMcOG4Dk3rfio=
github.com/tetafro/godot v1.5.0 h1:aNwfVI4I3+gdxjMgYPus9eHmoBeJIbnajOyqZYStzuw=
github.com/tetafro/godot v1.5.0/go.mod h1:2oVxTBSftRTh4+MVfUaUXR6bn2GDXCaMcOG4Dk3rfio=
github.com/timakin/bodyclose v0.0.0-20241017074812-ed6a65f985e3 h1:y4mJRFlM6fUyPhoXuFg/Yu02fg/nIPFMOY8tOqppoFg=
github.com/timakin/bodyclose v0.0.0-20241017074812-ed6a65f985e3/go.mod h1:mkjARE7Yr8qU23YcGMSALbIxTQ9r9QBVahQOBRfU460=
github.com/timonwong/loggercheck v0.10.1 h1:uVZYClxQFpw55eh+PIoqM7uAOHMrhVcDoWDery9R8Lg=
@@ -654,8 +655,8 @@ golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.9.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.12.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.13.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/mod v0.23.0 h1:Zb7khfcRGKk+kqfxFaP5tZqCnDZMjC5VtUBs87Hr6QM=
golang.org/x/mod v0.23.0/go.mod h1:6SkKJ3Xj0I0BrPOZoBy3bdMptDDU9oJrpohJ3eWZ1fY=
golang.org/x/mod v0.24.0 h1:ZfthKaKaT4NrhGVZHO1/WDTwGES4De8KtWO0SIbNJMU=
golang.org/x/mod v0.24.0/go.mod h1:IXM97Txy2VM4PJ3gI61r1YEk/gAj6zAHN3AdZt6S9Ww=
golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181114220301-adae6a3d119a/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
@@ -696,8 +697,8 @@ golang.org/x/net v0.8.0/go.mod h1:QVkue5JL9kW//ek3r6jTKnTFis1tRmNAW2P1shuFdJc=
golang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg=
golang.org/x/net v0.15.0/go.mod h1:idbUs1IY1+zTqbi8yxTbhexhEEk5ur9LInksu6HrEpk=
golang.org/x/net v0.16.0/go.mod h1:NxSsAGuq816PNPmqtQdLE42eU2Fs7NoRIZrHJAlaCOE=
golang.org/x/net v0.35.0 h1:T5GQRQb2y08kTAByq9L4/bz8cipCdA8FbRTXewonqY8=
golang.org/x/net v0.35.0/go.mod h1:EglIi67kWsHKlRzzVMUD93VMSWGFOMSZgxFjparz1Qk=
golang.org/x/net v0.37.0 h1:1zLorHbz+LYj7MQlSf1+2tPIIgibq2eL5xkrGk6f+2c=
golang.org/x/net v0.37.0/go.mod h1:ivrbrMbzFq5J41QOQh0siUuly180yBYtLp+CKbEaFx8=
golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U=
golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
@@ -719,8 +720,8 @@ golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJ
golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.3.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y=
golang.org/x/sync v0.4.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y=
golang.org/x/sync v0.11.0 h1:GGz8+XQP4FvTTrjZPzNKTMFtSXH80RAzG+5ghFPgK9w=
golang.org/x/sync v0.11.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.12.0 h1:MHc5BpPuC30uJk597Ri8TV3CNZcTLu6B6z4lJy+g6Jw=
golang.org/x/sync v0.12.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20181116152217-5ac8a444bdc5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
@@ -773,8 +774,8 @@ golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.13.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.30.0 h1:QjkSwP/36a20jFYWkSue1YwXzLmsV5Gfq7Eiy72C1uc=
golang.org/x/sys v0.30.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.2.0/go.mod h1:TVmDHMZPmdnySmBfhjOoOdhjzdE1h4u1VwSiw2l1Nuc=
@@ -856,8 +857,8 @@ golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU=
golang.org/x/tools v0.7.0/go.mod h1:4pg6aUX35JBAogB10C9AtvVL+qowtN4pT3CGSQex14s=
golang.org/x/tools v0.13.0/go.mod h1:HvlwmtVNQAhOuCjW7xxvovg8wbNq7LwfXh/k7wXUl58=
golang.org/x/tools v0.14.0/go.mod h1:uYBEerGOWcJyEORxN+Ek8+TT266gXkNlHdJBwexUsBg=
golang.org/x/tools v0.30.0 h1:BgcpHewrV5AUp2G9MebG4XPFI1E2W41zU1SaqVA9vJY=
golang.org/x/tools v0.30.0/go.mod h1:c347cR/OJfw5TI+GfX7RUPNMdDRRbjvYTS0jPyvsVtY=
golang.org/x/tools v0.31.0 h1:0EedkvKDbh+qistFTd0Bcwe/YLh4vHwWEkiI0toFIBU=
golang.org/x/tools v0.31.0/go.mod h1:naFTU+Cev749tSJRXJlna0T3WxKvb1kWEx15xA4SdmQ=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
@@ -937,8 +938,8 @@ google.golang.org/protobuf v1.24.0/go.mod h1:r/3tXBNzIEhYS9I1OUVjXDlt8tc493IdKGj
google.golang.org/protobuf v1.25.0/go.mod h1:9JNX74DMeImyA3h4bdi1ymwjUzf21/xIlbajtzgsN7c=
google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw=
google.golang.org/protobuf v1.26.0/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc=
google.golang.org/protobuf v1.36.4 h1:6A3ZDJHn/eNqc1i+IdefRzy/9PokBTPvcqMySR7NNIM=
google.golang.org/protobuf v1.36.4/go.mod h1:9fA7Ob0pmnwhb644+1+CVWFRbNajQ6iRojtC/QF5bRE=
google.golang.org/protobuf v1.36.5 h1:tPhr+woSbjfYvY6/GPufUoYizxw1cF/yFoxJ2fmpwlM=
google.golang.org/protobuf v1.36.5/go.mod h1:9fA7Ob0pmnwhb644+1+CVWFRbNajQ6iRojtC/QF5bRE=
gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
@@ -965,8 +966,8 @@ honnef.co/go/tools v0.0.0-20190523083050-ea95bdfd59fc/go.mod h1:rf3lG4BRIbNafJWh
honnef.co/go/tools v0.0.1-2019.2.3/go.mod h1:a3bituU0lyd329TUQxRnasdCoJDkEUEAqEt0JzvZhAg=
honnef.co/go/tools v0.0.1-2020.1.3/go.mod h1:X/FiERA/W4tHapMX5mGpAtMSVEeEUOyHaw9vFzvIQ3k=
honnef.co/go/tools v0.0.1-2020.1.4/go.mod h1:X/FiERA/W4tHapMX5mGpAtMSVEeEUOyHaw9vFzvIQ3k=
honnef.co/go/tools v0.6.0 h1:TAODvD3knlq75WCp2nyGJtT4LeRV/o7NN9nYPeVJXf8=
honnef.co/go/tools v0.6.0/go.mod h1:3puzxxljPCe8RGJX7BIy1plGbxEOZni5mR2aXe3/uk4=
honnef.co/go/tools v0.6.1 h1:R094WgE8K4JirYjBaOpz/AvTyUu/3wbmAoskKN/pxTI=
honnef.co/go/tools v0.6.1/go.mod h1:3puzxxljPCe8RGJX7BIy1plGbxEOZni5mR2aXe3/uk4=
mvdan.cc/gofumpt v0.7.0 h1:bg91ttqXmi9y2xawvkuMXyvAA/1ZGJqYAEGjXuP0JXU=
mvdan.cc/gofumpt v0.7.0/go.mod h1:txVFJy/Sc/mvaycET54pV8SW8gWxTlUuGHVEcncmNUo=
mvdan.cc/unparam v0.0.0-20240528143540-8a5130ca722f h1:lMpcwN6GxNbWtbpI1+xzFLSW8XzX0u72NttUGVFjO3U=

View File

@@ -0,0 +1,23 @@
# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
[Unit]
Description=Trigger CDI refresh on NVIDIA driver install / uninstall events
[Path]
PathChanged=/lib/modules/%v/modules.dep
PathChanged=/lib/modules/%v/modules.dep.bin
[Install]
WantedBy=multi-user.target

View File

@@ -0,0 +1,28 @@
# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
[Unit]
Description=Refresh NVIDIA CDI specification file
ConditionPathExists=/usr/bin/nvidia-smi
ConditionPathExists=/usr/bin/nvidia-ctk
[Service]
Type=oneshot
EnvironmentFile=-/etc/nvidia-container-toolkit/cdi-refresh.env
ExecCondition=/usr/bin/grep -qE '/nvidia.ko' /lib/modules/%v/modules.dep
ExecStart=/usr/bin/nvidia-ctk cdi generate --output=/var/run/cdi/nvidia.yaml
CapabilityBoundingSet=CAP_SYS_MODULE CAP_SYS_ADMIN CAP_MKNOD
[Install]
WantedBy=multi-user.target

View File

@@ -55,9 +55,7 @@ RUN make PREFIX=${DIST_DIR} cmds
WORKDIR $DIST_DIR
COPY packaging/debian ./debian
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
COPY deployments/systemd/ .
RUN dch --create --package="${PKG_NAME}" \
--newversion "${REVISION}" \
@@ -67,6 +65,6 @@ RUN dch --create --package="${PKG_NAME}" \
if [ "$REVISION" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
CMD export DISTRIB="$(lsb_release -cs)" && \
debuild -eDISTRIB -eSECTION -eLIBNVIDIA_CONTAINER_TOOLS_VERSION -eVERSION="${REVISION}" \
debuild -eDISTRIB -eSECTION -eVERSION="${REVISION}" \
--dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
mv /tmp/*.deb /dist

View File

@@ -46,9 +46,7 @@ RUN make PREFIX=${DIST_DIR} cmds
WORKDIR $DIST_DIR/..
COPY packaging/rpm .
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
COPY deployments/systemd/ .
CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
@@ -56,7 +54,6 @@ CMD arch=$(uname -m) && \
-D "release_date $(date +'%a %b %d %Y')" \
-D "git_commit ${GIT_COMMIT}" \
-D "version ${PKG_VERS}" \
-D "libnvidia_container_tools_version ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}" \
-D "release ${PKG_REV}" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist

View File

@@ -71,9 +71,7 @@ RUN make PREFIX=${DIST_DIR} cmds
WORKDIR $DIST_DIR/..
COPY packaging/rpm .
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
COPY deployments/systemd/ ${DIST_DIR}/
CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
@@ -81,7 +79,6 @@ CMD arch=$(uname -m) && \
-D "release_date $(date +'%a %b %d %Y')" \
-D "git_commit ${GIT_COMMIT}" \
-D "version ${PKG_VERS}" \
-D "libnvidia_container_tools_version ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}" \
-D "release ${PKG_REV}" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist

View File

@@ -53,18 +53,16 @@ RUN make PREFIX=${DIST_DIR} cmds
WORKDIR $DIST_DIR
COPY packaging/debian ./debian
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
COPY deployments/systemd/ .
RUN dch --create --package="${PKG_NAME}" \
--newversion "${REVISION}" \
"See https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/blob/${GIT_COMMIT}/CHANGELOG.md for the changelog" && \
dch --append "Bump libnvidia-container dependency to ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}" && \
dch --append "Bump libnvidia-container dependency to ${REVISION}" && \
dch -r "" && \
if [ "$REVISION" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
CMD export DISTRIB="$(lsb_release -cs)" && \
debuild -eDISTRIB -eSECTION -eLIBNVIDIA_CONTAINER_TOOLS_VERSION -eVERSION="${REVISION}" \
debuild -eDISTRIB -eSECTION -eVERSION="${REVISION}" \
--dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
mv /tmp/*.deb /dist

View File

@@ -13,10 +13,10 @@
# limitations under the License.
# Supported OSs by architecture
AMD64_TARGETS := ubuntu20.04 ubuntu18.04 ubuntu16.04 debian10 debian9
AMD64_TARGETS := ubuntu22.04 ubuntu20.04 ubuntu18.04 ubuntu16.04 debian10 debian9
X86_64_TARGETS := centos7 centos8 rhel7 rhel8 amazonlinux2 opensuse-leap15.1
PPC64LE_TARGETS := ubuntu18.04 ubuntu16.04 centos7 centos8 rhel7 rhel8
ARM64_TARGETS := ubuntu20.04 ubuntu18.04
ARM64_TARGETS := ubuntu22.04 ubuntu20.04 ubuntu18.04
AARCH64_TARGETS := centos7 centos8 rhel8 amazonlinux2
# Define top-level build targets
@@ -85,11 +85,6 @@ docker-all: $(AMD64_TARGETS) $(X86_64_TARGETS) \
--%: docker-build-%
@
LIBNVIDIA_CONTAINER_VERSION ?= $(LIB_VERSION)
LIBNVIDIA_CONTAINER_TAG ?= $(LIB_TAG)
LIBNVIDIA_CONTAINER_TOOLS_VERSION := $(LIBNVIDIA_CONTAINER_VERSION)$(if $(LIBNVIDIA_CONTAINER_TAG),~$(LIBNVIDIA_CONTAINER_TAG))-1
# private ubuntu target
--ubuntu%: OS := ubuntu
@@ -129,7 +124,6 @@ docker-build-%:
--build-arg PKG_NAME="$(LIB_NAME)" \
--build-arg PKG_VERS="$(PACKAGE_VERSION)" \
--build-arg PKG_REV="$(PACKAGE_REVISION)" \
--build-arg LIBNVIDIA_CONTAINER_TOOLS_VERSION="$(LIBNVIDIA_CONTAINER_TOOLS_VERSION)" \
--build-arg GIT_COMMIT="$(GIT_COMMIT)" \
--tag $(BUILDIMAGE) \
--file $(DOCKERFILE) .

28
go.mod
View File

@@ -1,38 +1,40 @@
module github.com/NVIDIA/nvidia-container-toolkit
go 1.22.0
go 1.23.0
require (
github.com/NVIDIA/go-nvlib v0.7.1
github.com/NVIDIA/go-nvml v0.12.4-1
github.com/NVIDIA/go-nvlib v0.7.3
github.com/NVIDIA/go-nvml v0.12.9-0
github.com/cyphar/filepath-securejoin v0.4.1
github.com/moby/sys/reexec v0.1.0
github.com/moby/sys/symlink v0.3.0
github.com/opencontainers/runtime-spec v1.2.0
github.com/opencontainers/runc v1.3.0
github.com/opencontainers/runtime-spec v1.2.1
github.com/pelletier/go-toml v1.9.5
github.com/sirupsen/logrus v1.9.3
github.com/stretchr/testify v1.10.0
github.com/urfave/cli/v2 v2.27.5
golang.org/x/mod v0.23.0
golang.org/x/sys v0.30.0
tags.cncf.io/container-device-interface v0.8.0
tags.cncf.io/container-device-interface/specs-go v0.8.0
github.com/urfave/cli/v2 v2.27.7
golang.org/x/mod v0.25.0
golang.org/x/sys v0.33.0
tags.cncf.io/container-device-interface v1.0.1
tags.cncf.io/container-device-interface/specs-go v1.0.0
)
require (
github.com/cpuguy83/go-md2man/v2 v2.0.5 // indirect
github.com/cpuguy83/go-md2man/v2 v2.0.7 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/fsnotify/fsnotify v1.7.0 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/hashicorp/errwrap v1.1.0 // indirect
github.com/kr/pretty v0.3.1 // indirect
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626 // indirect
github.com/opencontainers/selinux v1.11.0 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/rogpeppe/go-internal v1.11.0 // indirect
github.com/russross/blackfriday/v2 v2.1.0 // indirect
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635 // indirect
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb // indirect
github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 // indirect
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
sigs.k8s.io/yaml v1.3.0 // indirect
sigs.k8s.io/yaml v1.4.0 // indirect
)

56
go.sum
View File

@@ -1,17 +1,21 @@
github.com/NVIDIA/go-nvlib v0.7.1 h1:7HHPZxoCjSLm1NgaRRjuhI8ffMCpc5Vgpg5yxQYUff8=
github.com/NVIDIA/go-nvlib v0.7.1/go.mod h1:2Kh2kYSP5IJ8EKf0/SYDzHiQKb9EJkwOf2LQzu6pXzY=
github.com/NVIDIA/go-nvml v0.12.4-1 h1:WKUvqshhWSNTfm47ETRhv0A0zJyr1ncCuHiXwoTrBEc=
github.com/NVIDIA/go-nvml v0.12.4-1/go.mod h1:8Llmj+1Rr+9VGGwZuRer5N/aCjxGuR5nPb/9ebBiIEQ=
github.com/NVIDIA/go-nvlib v0.7.3 h1:kXc8PkWUlrwedSpM4fR8xT/DAq1NKy8HqhpgteFcGAw=
github.com/NVIDIA/go-nvlib v0.7.3/go.mod h1:i95Je7GinMy/+BDs++DAdbPmT2TubjNP8i8joC7DD7I=
github.com/NVIDIA/go-nvml v0.12.9-0 h1:e344UK8ZkeMeeLkdQtRhmXRxNf+u532LDZPGMtkdus0=
github.com/NVIDIA/go-nvml v0.12.9-0/go.mod h1:+KNA7c7gIBH7SKSJ1ntlwkfN80zdx8ovl4hrK3LmPt4=
github.com/blang/semver/v4 v4.0.0 h1:1PFHFE6yCCTv8C1TeyNNarDzntLi7wMI5i/pzqYIsAM=
github.com/blang/semver/v4 v4.0.0/go.mod h1:IbckMUScFkM3pff0VJDNKRiT6TG/YpiHIM2yvyW5YoQ=
github.com/cpuguy83/go-md2man/v2 v2.0.5 h1:ZtcqGrnekaHpVLArFSe4HK5DoKx1T0rq2DwVB0alcyc=
github.com/cpuguy83/go-md2man/v2 v2.0.5/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/cpuguy83/go-md2man/v2 v2.0.7 h1:zbFlGlXEAKlwXpmvle3d8Oe3YnkKIK4xSRTd3sHPnBo=
github.com/cpuguy83/go-md2man/v2 v2.0.7/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
github.com/cyphar/filepath-securejoin v0.4.1 h1:JyxxyPEaktOD+GAnqIqTf9A8tHyAG22rowi7HkoSU1s=
github.com/cyphar/filepath-securejoin v0.4.1/go.mod h1:Sdj7gXlvMcPZsbhwhQ33GguGLDGQL7h7bg04C/+u9jI=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/fsnotify/fsnotify v1.7.0 h1:8JEhPFa5W2WU7YfeZzPNqzMP6Lwt7L2715Ggo0nosvA=
github.com/fsnotify/fsnotify v1.7.0/go.mod h1:40Bi/Hjc2AVfZrqy+aj+yEI+/bRxZnMJyTJwOpGvigM=
github.com/google/go-cmp v0.5.9 h1:O2Tfq5qg4qc4AmwVlvv0oLiVAGB7enBSJ2x2DqQFi38=
github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/uuid v1.3.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
@@ -28,24 +32,29 @@ github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/mndrix/tap-go v0.0.0-20171203230836-629fa407e90b/go.mod h1:pzzDgJWZ34fGzaAZGFW22KVZDfyrYW+QABMrWnJBnSs=
github.com/moby/sys/reexec v0.1.0 h1:RrBi8e0EBTLEgfruBOFcxtElzRGTEUkeIFaVXgU7wok=
github.com/moby/sys/reexec v0.1.0/go.mod h1:EqjBg8F3X7iZe5pU6nRZnYCMUTXoxsjiIfHup5wYIN8=
github.com/moby/sys/symlink v0.3.0 h1:GZX89mEZ9u53f97npBy4Rc3vJKj7JBDj/PN2I22GrNU=
github.com/moby/sys/symlink v0.3.0/go.mod h1:3eNdhduHmYPcgsJtZXW1W4XUJdZGBIkttZ8xKqPUJq0=
github.com/mrunalp/fileutils v0.5.0/go.mod h1:M1WthSahJixYnrXQl/DFQuteStB1weuxD2QJNHXfbSQ=
github.com/opencontainers/runc v1.3.0 h1:cvP7xbEvD0QQAs0nZKLzkVog2OPZhI/V2w3WmTmUSXI=
github.com/opencontainers/runc v1.3.0/go.mod h1:9wbWt42gV+KRxKRVVugNP6D5+PQciRbenB4fLVsqGPs=
github.com/opencontainers/runtime-spec v1.0.3-0.20220825212826-86290f6a00fb/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/runtime-spec v1.2.0 h1:z97+pHb3uELt/yiAWD691HNHQIF07bE7dzrbT927iTk=
github.com/opencontainers/runtime-spec v1.2.0/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/runtime-spec v1.2.1 h1:S4k4ryNgEpxW1dzyqffOmhI1BHYcjzU8lpJfSlR0xww=
github.com/opencontainers/runtime-spec v1.2.1/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626 h1:DmNGcqH3WDbV5k8OJ+esPWbqUOX5rMLR2PMvziDMJi0=
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626/go.mod h1:BRHJJd0E+cx42OybVYSgUvZmU0B8P9gZuRXlZUP7TKI=
github.com/opencontainers/selinux v1.9.1/go.mod h1:2i0OySw99QjzBBQByd1Gr9gSjvuho1lHsJxIJ3gGbJI=
github.com/opencontainers/selinux v1.11.0 h1:+5Zbo97w3Lbmb3PeqQtpmTkMwsW5nRI3YaLpt7tQ7oU=
github.com/opencontainers/selinux v1.11.0/go.mod h1:E5dMC3VPuVvVHDYmi78qvhJp8+M586T4DlDRYpFkyec=
github.com/opencontainers/selinux v1.11.1 h1:nHFvthhM0qY8/m+vfhJylliSshm8G1jJ2jDMcgULaH8=
github.com/opencontainers/selinux v1.11.1/go.mod h1:E5dMC3VPuVvVHDYmi78qvhJp8+M586T4DlDRYpFkyec=
github.com/pelletier/go-toml v1.9.5 h1:4yBQzkHv+7BHq2PQUZF3Mx0IYxG7LsP222s7Agd3ve8=
github.com/pelletier/go-toml v1.9.5/go.mod h1:u1nR/EPcESfeI/szUZKdtJ0xRNbUoANCkoOuaOx1Y+c=
github.com/pkg/diff v0.0.0-20210226163009-20ebb0f2a09e/go.mod h1:pJLUxLENpZxwdsKMEsNbx1VGcRFpLqf3715MtcvvzbA=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/rogpeppe/go-internal v1.9.0 h1:73kH8U+JUqXU8lRuOHeVHaa/SZPifC7BkcraZVejAe8=
github.com/rogpeppe/go-internal v1.9.0/go.mod h1:WtVeX8xhTBvf0smdhujwtBcq4Qrzq/fJaraNFVN+nFs=
github.com/rogpeppe/go-internal v1.11.0 h1:cWPaGQEPrBb5/AsnsZesgZZ9yb1OQ+GOISoDNXVBh4M=
github.com/rogpeppe/go-internal v1.11.0/go.mod h1:ddIwULY96R17DhadqLgMfk9H9tvdUzkipdSkR5nkCZA=
github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/sirupsen/logrus v1.8.1/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0=
@@ -60,8 +69,8 @@ github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635 h1:kdXcSzyDtseVEc4yCz2qF8ZrQvIDBJLl4S1c3GCXmoI=
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635/go.mod h1:hkRG7XYTFWNJGYcbNJQlaLq0fg1yr4J4t/NcTQtrfww=
github.com/urfave/cli v1.19.1/go.mod h1:70zkFmudgCuE/ngEzBv17Jvp/497gISqfk5gWijbERA=
github.com/urfave/cli/v2 v2.27.5 h1:WoHEJLdsXr6dDWoJgMq/CboDmyY/8HMMH1fTECbih+w=
github.com/urfave/cli/v2 v2.27.5/go.mod h1:3Sevf16NykTbInEnD0yKkjDAeZDS0A6bzhBH5hrMvTQ=
github.com/urfave/cli/v2 v2.27.7 h1:bH59vdhbjLv3LAvIu6gd0usJHgoTTPhCFib8qqOwXYU=
github.com/urfave/cli/v2 v2.27.7/go.mod h1:CyNAG/xg+iAOg0N4MPGZqVmv2rCoP267496AOXUZjA4=
github.com/xeipuuv/gojsonpointer v0.0.0-20180127040702-4e3ac2762d5f/go.mod h1:N2zxlSyiKSe5eX1tZViRH5QA0qijqEDrYZiPEAiq3wU=
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb h1:zGWFAtiMcyryUHoUjUJX0/lt1H2+i2Ka2n+D3DImSNo=
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb/go.mod h1:N2zxlSyiKSe5eX1tZViRH5QA0qijqEDrYZiPEAiq3wU=
@@ -71,24 +80,23 @@ github.com/xeipuuv/gojsonschema v1.2.0 h1:LhYJRs+L4fBtjZUfuSZIKGeVu0QRy8e5Xi7D17
github.com/xeipuuv/gojsonschema v1.2.0/go.mod h1:anYRn/JVcOK2ZgGU+IjEV4nwlhoK5sQluxsYJ78Id3Y=
github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 h1:gEOO8jv9F4OT7lGCjxCBTO/36wtF6j2nSip77qHd4x4=
github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1/go.mod h1:Ohn+xnUBiLI6FVj/9LpzZWtj1/D6lUovWYBkxHVV3aM=
golang.org/x/mod v0.23.0 h1:Zb7khfcRGKk+kqfxFaP5tZqCnDZMjC5VtUBs87Hr6QM=
golang.org/x/mod v0.23.0/go.mod h1:6SkKJ3Xj0I0BrPOZoBy3bdMptDDU9oJrpohJ3eWZ1fY=
golang.org/x/mod v0.25.0 h1:n7a+ZbQKQA/Ysbyb0/6IbB1H/X41mKgbhfv7AfG/44w=
golang.org/x/mod v0.25.0/go.mod h1:IXM97Txy2VM4PJ3gI61r1YEk/gAj6zAHN3AdZt6S9Ww=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191115151921-52ab43148777/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20220715151400-c0bba94af5f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.30.0 h1:QjkSwP/36a20jFYWkSue1YwXzLmsV5Gfq7Eiy72C1uc=
golang.org/x/sys v0.30.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.33.0 h1:q3i8TbbEz+JRD9ywIRlyRAQbM0qF7hu24q3teo2hbuw=
golang.org/x/sys v0.33.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
sigs.k8s.io/yaml v1.3.0 h1:a2VclLzOGrwOHDiV8EfBGhvjHvP46CtW5j6POvhYGGo=
sigs.k8s.io/yaml v1.3.0/go.mod h1:GeOyir5tyXNByN85N/dRIT9es5UQNerPYEKK56eTBm8=
tags.cncf.io/container-device-interface v0.8.0 h1:8bCFo/g9WODjWx3m6EYl3GfUG31eKJbaggyBDxEldRc=
tags.cncf.io/container-device-interface v0.8.0/go.mod h1:Apb7N4VdILW0EVdEMRYXIDVRZfNJZ+kmEUss2kRRQ6Y=
tags.cncf.io/container-device-interface/specs-go v0.8.0 h1:QYGFzGxvYK/ZLMrjhvY0RjpUavIn4KcmRmVP/JjdBTA=
tags.cncf.io/container-device-interface/specs-go v0.8.0/go.mod h1:BhJIkjjPh4qpys+qm4DAYtUyryaTDg9zris+AczXyws=
sigs.k8s.io/yaml v1.4.0 h1:Mk1wCc2gy/F0THH0TAp1QYyJNzRm2KCLy3o5ASXVI5E=
sigs.k8s.io/yaml v1.4.0/go.mod h1:Ejl7/uTz7PSA4eKMyQCUTnhZYNmLIl+5c2lQPGR2BPY=
tags.cncf.io/container-device-interface v1.0.1 h1:KqQDr4vIlxwfYh0Ed/uJGVgX+CHAkahrgabg6Q8GYxc=
tags.cncf.io/container-device-interface v1.0.1/go.mod h1:JojJIOeW3hNbcnOH2q0NrWNha/JuHoDZcmYxAZwb2i0=
tags.cncf.io/container-device-interface/specs-go v1.0.0 h1:8gLw29hH1ZQP9K1YtAzpvkHCjjyIxHZYzBAvlQ+0vD8=
tags.cncf.io/container-device-interface/specs-go v1.0.0/go.mod h1:u86hoFWqnh3hWz3esofRFKbI261bUlvUfLKGrDhJkgQ=

View File

@@ -53,6 +53,6 @@ docker run --rm \
-v $(pwd):$(pwd) \
-w $(pwd) \
-u $(id -u):$(id -g) \
--entrypoint="bash" \
--entrypoint="sh" \
${IMAGE} \
-c "cp --preserve=timestamps -R /artifacts/* ${DIST_DIR}"
-c "cp -p -R /artifacts/* ${DIST_DIR}"

View File

@@ -31,8 +31,10 @@ import (
)
const (
configOverride = "XDG_CONFIG_HOME"
configFilePath = "nvidia-container-runtime/config.toml"
FilePathOverrideEnvVar = "NVIDIA_CTK_CONFIG_FILE_PATH"
RelativeFilePath = "nvidia-container-runtime/config.toml"
configRootOverride = "XDG_CONFIG_HOME"
nvidiaCTKExecutable = "nvidia-ctk"
nvidiaCTKDefaultFilePath = "/usr/bin/nvidia-ctk"
@@ -74,11 +76,15 @@ type Config struct {
// GetConfigFilePath returns the path to the config file for the configured system
func GetConfigFilePath() string {
if XDGConfigDir := os.Getenv(configOverride); len(XDGConfigDir) != 0 {
return filepath.Join(XDGConfigDir, configFilePath)
if configFilePathOverride := os.Getenv(FilePathOverrideEnvVar); configFilePathOverride != "" {
return configFilePathOverride
}
configRoot := "/etc"
if XDGConfigDir := os.Getenv(configRootOverride); len(XDGConfigDir) != 0 {
configRoot = XDGConfigDir
}
return filepath.Join("/etc", configFilePath)
return filepath.Join(configRoot, RelativeFilePath)
}
// GetConfig sets up the config struct. Values are read from a toml file
@@ -110,7 +116,7 @@ func GetDefault() (*Config, error) {
NVIDIAContainerRuntimeConfig: RuntimeConfig{
DebugFilePath: "/dev/null",
LogLevel: "info",
Runtimes: []string{"docker-runc", "runc", "crun"},
Runtimes: []string{"runc", "crun"},
Mode: "auto",
Modes: modesConfig{
CSV: csvModeConfig{
@@ -121,6 +127,9 @@ func GetDefault() (*Config, error) {
AnnotationPrefixes: []string{cdi.AnnotationPrefix},
SpecDirs: cdi.DefaultSpecDirs,
},
Legacy: legacyModeConfig{
CUDACompatMode: defaultCUDACompatMode,
},
},
},
NVIDIAContainerRuntimeHookConfig: RuntimeHookConfig{

View File

@@ -27,9 +27,26 @@ import (
func TestGetConfigWithCustomConfig(t *testing.T) {
testDir := t.TempDir()
t.Setenv(configOverride, testDir)
t.Setenv(configRootOverride, testDir)
filename := filepath.Join(testDir, configFilePath)
filename := filepath.Join(testDir, RelativeFilePath)
// By default debug is disabled
contents := []byte("[nvidia-container-runtime]\ndebug = \"/nvidia-container-toolkit.log\"")
require.NoError(t, os.MkdirAll(filepath.Dir(filename), 0766))
require.NoError(t, os.WriteFile(filename, contents, 0600))
cfg, err := GetConfig()
require.NoError(t, err)
require.Equal(t, "/nvidia-container-toolkit.log", cfg.NVIDIAContainerRuntimeConfig.DebugFilePath)
}
func TestGetConfigWithConfigFilePathOverride(t *testing.T) {
testDir := t.TempDir()
filename := filepath.Join(testDir, RelativeFilePath)
t.Setenv(FilePathOverrideEnvVar, filename)
// By default debug is disabled
contents := []byte("[nvidia-container-runtime]\ndebug = \"/nvidia-container-toolkit.log\"")
@@ -63,7 +80,7 @@ func TestGetConfig(t *testing.T) {
NVIDIAContainerRuntimeConfig: RuntimeConfig{
DebugFilePath: "/dev/null",
LogLevel: "info",
Runtimes: []string{"docker-runc", "runc", "crun"},
Runtimes: []string{"runc", "crun"},
Mode: "auto",
Modes: modesConfig{
CSV: csvModeConfig{
@@ -74,6 +91,9 @@ func TestGetConfig(t *testing.T) {
AnnotationPrefixes: []string{"cdi.k8s.io/"},
SpecDirs: []string{"/etc/cdi", "/var/run/cdi"},
},
Legacy: legacyModeConfig{
CUDACompatMode: "ldconfig",
},
},
},
NVIDIAContainerRuntimeHookConfig: RuntimeHookConfig{
@@ -93,6 +113,7 @@ func TestGetConfig(t *testing.T) {
"nvidia-container-cli.load-kmods = false",
"nvidia-container-cli.ldconfig = \"@/foo/bar/ldconfig\"",
"nvidia-container-cli.user = \"foo:bar\"",
"nvidia-container-cli.cuda-compat-mode = \"mount\"",
"nvidia-container-runtime.debug = \"/foo/bar\"",
"nvidia-container-runtime.discover-mode = \"not-legacy\"",
"nvidia-container-runtime.log-level = \"debug\"",
@@ -102,6 +123,7 @@ func TestGetConfig(t *testing.T) {
"nvidia-container-runtime.modes.cdi.annotation-prefixes = [\"cdi.k8s.io/\", \"example.vendor.com/\",]",
"nvidia-container-runtime.modes.cdi.spec-dirs = [\"/except/etc/cdi\", \"/not/var/run/cdi\",]",
"nvidia-container-runtime.modes.csv.mount-spec-path = \"/not/etc/nvidia-container-runtime/host-files-for-container.d\"",
"nvidia-container-runtime.modes.legacy.cuda-compat-mode = \"mount\"",
"nvidia-container-runtime-hook.path = \"/foo/bar/nvidia-container-runtime-hook\"",
"nvidia-ctk.path = \"/foo/bar/nvidia-ctk\"",
},
@@ -134,6 +156,9 @@ func TestGetConfig(t *testing.T) {
"/not/var/run/cdi",
},
},
Legacy: legacyModeConfig{
CUDACompatMode: "mount",
},
},
},
NVIDIAContainerRuntimeHookConfig: RuntimeHookConfig{
@@ -162,7 +187,7 @@ func TestGetConfig(t *testing.T) {
NVIDIAContainerRuntimeConfig: RuntimeConfig{
DebugFilePath: "/dev/null",
LogLevel: "info",
Runtimes: []string{"docker-runc", "runc", "crun"},
Runtimes: []string{"runc", "crun"},
Mode: "auto",
Modes: modesConfig{
CSV: csvModeConfig{
@@ -178,6 +203,9 @@ func TestGetConfig(t *testing.T) {
"/var/run/cdi",
},
},
Legacy: legacyModeConfig{
CUDACompatMode: "ldconfig",
},
},
},
NVIDIAContainerRuntimeHookConfig: RuntimeHookConfig{
@@ -200,6 +228,7 @@ func TestGetConfig(t *testing.T) {
"root = \"/bar/baz\"",
"load-kmods = false",
"ldconfig = \"@/foo/bar/ldconfig\"",
"cuda-compat-mode = \"mount\"",
"user = \"foo:bar\"",
"[nvidia-container-runtime]",
"debug = \"/foo/bar\"",
@@ -213,6 +242,8 @@ func TestGetConfig(t *testing.T) {
"spec-dirs = [\"/except/etc/cdi\", \"/not/var/run/cdi\",]",
"[nvidia-container-runtime.modes.csv]",
"mount-spec-path = \"/not/etc/nvidia-container-runtime/host-files-for-container.d\"",
"[nvidia-container-runtime.modes.legacy]",
"cuda-compat-mode = \"mount\"",
"[nvidia-container-runtime-hook]",
"path = \"/foo/bar/nvidia-container-runtime-hook\"",
"[nvidia-ctk]",
@@ -247,6 +278,9 @@ func TestGetConfig(t *testing.T) {
"/not/var/run/cdi",
},
},
Legacy: legacyModeConfig{
CUDACompatMode: "mount",
},
},
},
NVIDIAContainerRuntimeHookConfig: RuntimeHookConfig{
@@ -272,7 +306,7 @@ func TestGetConfig(t *testing.T) {
NVIDIAContainerRuntimeConfig: RuntimeConfig{
DebugFilePath: "/dev/null",
LogLevel: "info",
Runtimes: []string{"docker-runc", "runc", "crun"},
Runtimes: []string{"runc", "crun"},
Mode: "auto",
Modes: modesConfig{
CSV: csvModeConfig{
@@ -283,6 +317,9 @@ func TestGetConfig(t *testing.T) {
AnnotationPrefixes: []string{"cdi.k8s.io/"},
SpecDirs: []string{"/etc/cdi", "/var/run/cdi"},
},
Legacy: legacyModeConfig{
CUDACompatMode: "ldconfig",
},
},
},
NVIDIAContainerRuntimeHookConfig: RuntimeHookConfig{
@@ -311,7 +348,7 @@ func TestGetConfig(t *testing.T) {
NVIDIAContainerRuntimeConfig: RuntimeConfig{
DebugFilePath: "/dev/null",
LogLevel: "info",
Runtimes: []string{"docker-runc", "runc", "crun"},
Runtimes: []string{"runc", "crun"},
Mode: "auto",
Modes: modesConfig{
CSV: csvModeConfig{
@@ -322,6 +359,9 @@ func TestGetConfig(t *testing.T) {
AnnotationPrefixes: []string{"cdi.k8s.io/"},
SpecDirs: []string{"/etc/cdi", "/var/run/cdi"},
},
Legacy: legacyModeConfig{
CUDACompatMode: "ldconfig",
},
},
},
NVIDIAContainerRuntimeHookConfig: RuntimeHookConfig{

View File

@@ -25,9 +25,23 @@ type features struct {
// If this feature flag is not set to 'true' only host-rooted config paths
// (i.e. paths starting with an '@' are considered valid)
AllowLDConfigFromContainer *feature `toml:"allow-ldconfig-from-container,omitempty"`
// DisableCUDACompatLibHook, when enabled skips the injection of a specific
// hook to process CUDA compatibility libraries.
//
// Note: Since this mechanism replaces the logic in the `nvidia-container-cli`,
// toggling this feature has no effect if `allow-cuda-compat-libs-from-container` is enabled.
DisableCUDACompatLibHook *feature `toml:"disable-cuda-compat-lib-hook,omitempty"`
// DisableImexChannelCreation ensures that the implicit creation of
// requested IMEX channels is skipped when invoking the nvidia-container-cli.
DisableImexChannelCreation *feature `toml:"disable-imex-channel-creation,omitempty"`
// IgnoreImexChannelRequests configures the NVIDIA Container Toolkit to
// ignore IMEX channel requests through the NVIDIA_IMEX_CHANNELS envvar or
// volume mounts.
// This ensures that the NVIDIA Container Toolkit cannot be used to provide
// access to an IMEX channel by simply specifying an environment variable,
// possibly bypassing other checks by an orchestration system such as
// kubernetes.
IgnoreImexChannelRequests *feature `toml:"ignore-imex-channel-requests,omitempty"`
}
type feature bool

View File

@@ -21,22 +21,35 @@ import (
"strings"
"github.com/opencontainers/runtime-spec/specs-go"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)
type builder struct {
env map[string]string
mounts []specs.Mount
CUDA
disableRequire bool
}
// Option is a functional option for creating a CUDA image.
type Option func(*builder) error
// New creates a new CUDA image from the input options.
func New(opt ...Option) (CUDA, error) {
b := &builder{}
b := &builder{
CUDA: CUDA{
acceptEnvvarUnprivileged: true,
},
}
for _, o := range opt {
if err := o(b); err != nil {
return CUDA{}, err
}
}
if b.logger == nil {
b.logger = logger.New()
}
if b.env == nil {
b.env = make(map[string]string)
}
@@ -50,15 +63,36 @@ func (b builder) build() (CUDA, error) {
b.env[EnvVarNvidiaDisableRequire] = "true"
}
c := CUDA{
env: b.env,
mounts: b.mounts,
}
return c, nil
return b.CUDA, nil
}
// Option is a functional option for creating a CUDA image.
type Option func(*builder) error
func WithAcceptDeviceListAsVolumeMounts(acceptDeviceListAsVolumeMounts bool) Option {
return func(b *builder) error {
b.acceptDeviceListAsVolumeMounts = acceptDeviceListAsVolumeMounts
return nil
}
}
func WithAcceptEnvvarUnprivileged(acceptEnvvarUnprivileged bool) Option {
return func(b *builder) error {
b.acceptEnvvarUnprivileged = acceptEnvvarUnprivileged
return nil
}
}
func WithAnnotations(annotations map[string]string) Option {
return func(b *builder) error {
b.annotations = annotations
return nil
}
}
func WithAnnotationsPrefixes(annotationsPrefixes []string) Option {
return func(b *builder) error {
b.annotationsPrefixes = annotationsPrefixes
return nil
}
}
// WithDisableRequire sets the disable require option.
func WithDisableRequire(disableRequire bool) Option {
@@ -93,6 +127,14 @@ func WithEnvMap(env map[string]string) Option {
}
}
// WithLogger sets the logger to use when creating the CUDA image.
func WithLogger(logger logger.Interface) Option {
return func(b *builder) error {
b.logger = logger
return nil
}
}
// WithMounts sets the mounts associated with the CUDA image.
func WithMounts(mounts []specs.Mount) Option {
return func(b *builder) error {
@@ -100,3 +142,20 @@ func WithMounts(mounts []specs.Mount) Option {
return nil
}
}
// WithPreferredVisibleDevicesEnvVars sets the environment variables that
// should take precedence over the default NVIDIA_VISIBLE_DEVICES.
func WithPreferredVisibleDevicesEnvVars(preferredVisibleDeviceEnvVars ...string) Option {
return func(b *builder) error {
b.preferredVisibleDeviceEnvVars = preferredVisibleDeviceEnvVars
return nil
}
}
// WithPrivileged sets whether an image is privileged or not.
func WithPrivileged(isPrivileged bool) Option {
return func(b *builder) error {
b.isPrivileged = isPrivileged
return nil
}
}

View File

@@ -19,12 +19,15 @@ package image
import (
"fmt"
"path/filepath"
"slices"
"strconv"
"strings"
"github.com/opencontainers/runtime-spec/specs-go"
"golang.org/x/mod/semver"
"tags.cncf.io/container-device-interface/pkg/parser"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)
const (
@@ -38,27 +41,44 @@ const (
// a map of environment variable to values that can be used to perform lookups
// such as requirements.
type CUDA struct {
env map[string]string
mounts []specs.Mount
logger logger.Interface
annotations map[string]string
env map[string]string
isPrivileged bool
mounts []specs.Mount
annotationsPrefixes []string
acceptDeviceListAsVolumeMounts bool
acceptEnvvarUnprivileged bool
preferredVisibleDeviceEnvVars []string
}
// NewCUDAImageFromSpec creates a CUDA image from the input OCI runtime spec.
// The process environment is read (if present) to construc the CUDA Image.
func NewCUDAImageFromSpec(spec *specs.Spec) (CUDA, error) {
func NewCUDAImageFromSpec(spec *specs.Spec, opts ...Option) (CUDA, error) {
if spec == nil {
return New(opts...)
}
var env []string
if spec != nil && spec.Process != nil {
if spec.Process != nil {
env = spec.Process.Env
}
return New(
specOpts := []Option{
WithAnnotations(spec.Annotations),
WithEnv(env),
WithMounts(spec.Mounts),
)
WithPrivileged(IsPrivileged((*OCISpec)(spec))),
}
return New(append(opts, specOpts...)...)
}
// NewCUDAImageFromEnv creates a CUDA image from the input environment. The environment
// newCUDAImageFromEnv creates a CUDA image from the input environment. The environment
// is a list of strings of the form ENVAR=VALUE.
func NewCUDAImageFromEnv(env []string) (CUDA, error) {
func newCUDAImageFromEnv(env []string) (CUDA, error) {
return New(WithEnv(env))
}
@@ -83,6 +103,10 @@ func (i CUDA) IsLegacy() bool {
return len(legacyCudaVersion) > 0 && len(cudaRequire) == 0
}
func (i CUDA) IsPrivileged() bool {
return i.isPrivileged
}
// GetRequirements returns the requirements from all NVIDIA_REQUIRE_ environment
// variables.
func (i CUDA) GetRequirements() ([]string, error) {
@@ -120,8 +144,8 @@ func (i CUDA) HasDisableRequire() bool {
return false
}
// DevicesFromEnvvars returns the devices requested by the image through environment variables
func (i CUDA) DevicesFromEnvvars(envVars ...string) VisibleDevices {
// devicesFromEnvvars returns the devices requested by the image through environment variables
func (i CUDA) devicesFromEnvvars(envVars ...string) []string {
// We concantenate all the devices from the specified env.
var isSet bool
var devices []string
@@ -142,15 +166,15 @@ func (i CUDA) DevicesFromEnvvars(envVars ...string) VisibleDevices {
// Environment variable unset with legacy image: default to "all".
if !isSet && len(devices) == 0 && i.IsLegacy() {
return NewVisibleDevices("all")
devices = []string{"all"}
}
// Environment variable unset or empty or "void": return nil
if len(devices) == 0 || requested["void"] {
return NewVisibleDevices("void")
devices = []string{"void"}
}
return NewVisibleDevices(devices...)
return NewVisibleDevices(devices...).List()
}
// GetDriverCapabilities returns the requested driver capabilities.
@@ -200,46 +224,137 @@ func parseMajorMinorVersion(version string) (string, error) {
// OnlyFullyQualifiedCDIDevices returns true if all devices requested in the image are requested as CDI devices/
func (i CUDA) OnlyFullyQualifiedCDIDevices() bool {
var hasCDIdevice bool
for _, device := range i.VisibleDevicesFromEnvVar() {
for _, device := range i.VisibleDevices() {
if !parser.IsQualifiedName(device) {
return false
}
hasCDIdevice = true
}
for _, device := range i.DevicesFromMounts() {
if !strings.HasPrefix(device, "cdi/") {
return false
}
hasCDIdevice = true
}
return hasCDIdevice
}
// VisibleDevicesFromEnvVar returns the set of visible devices requested through
// the NVIDIA_VISIBLE_DEVICES environment variable.
func (i CUDA) VisibleDevicesFromEnvVar() []string {
return i.DevicesFromEnvvars(EnvVarNvidiaVisibleDevices).List()
}
// VisibleDevicesFromMounts returns the set of visible devices requested as mounts.
func (i CUDA) VisibleDevicesFromMounts() []string {
var devices []string
for _, device := range i.DevicesFromMounts() {
switch {
case strings.HasPrefix(device, volumeMountDevicePrefixCDI):
continue
case strings.HasPrefix(device, volumeMountDevicePrefixImex):
// visibleEnvVars returns the environment variables that are used to determine device visibility.
// It returns the preferred environment variables that are set, or NVIDIA_VISIBLE_DEVICES if none are set.
func (i CUDA) visibleEnvVars() []string {
var envVars []string
for _, envVar := range i.preferredVisibleDeviceEnvVars {
if !i.HasEnvvar(envVar) {
continue
}
devices = append(devices, device)
envVars = append(envVars, envVar)
}
if len(envVars) > 0 {
return envVars
}
return []string{EnvVarNvidiaVisibleDevices}
}
// VisibleDevices returns a list of devices requested in the container image.
// If volume mount requests are enabled these are returned if requested,
// otherwise device requests through environment variables are considered.
// In cases where environment variable requests required privileged containers,
// such devices requests are ignored.
func (i CUDA) VisibleDevices() []string {
// If annotation device requests are present, these are preferred.
annotationDeviceRequests := i.cdiDeviceRequestsFromAnnotations()
if len(annotationDeviceRequests) > 0 {
return annotationDeviceRequests
}
// If enabled, try and get the device list from volume mounts first
if i.acceptDeviceListAsVolumeMounts {
volumeMountDeviceRequests := i.visibleDevicesFromMounts()
if len(volumeMountDeviceRequests) > 0 {
return volumeMountDeviceRequests
}
}
// Get the Fallback to reading from the environment variable if privileges are correct
envVarDeviceRequests := i.visibleDevicesFromEnvVar()
if len(envVarDeviceRequests) == 0 {
return nil
}
// If the container is privileged, or environment variable requests are
// allowed for unprivileged containers, these devices are returned.
if i.isPrivileged || i.acceptEnvvarUnprivileged {
return envVarDeviceRequests
}
// We log a warning if we are ignoring the environment variable requests.
envVars := i.visibleEnvVars()
if len(envVars) > 0 {
i.logger.Warningf("Ignoring devices requested by environment variable(s) in unprivileged container: %v", envVars)
}
return nil
}
// cdiDeviceRequestsFromAnnotations returns a list of devices specified in the
// annotations.
// Keys starting with the specified prefixes are considered and expected to
// contain a comma-separated list of fully-qualified CDI devices names.
// The format of the requested devices is not checked and the list is not
// deduplicated.
func (i CUDA) cdiDeviceRequestsFromAnnotations() []string {
if len(i.annotationsPrefixes) == 0 || len(i.annotations) == 0 {
return nil
}
var annotationKeys []string
for key := range i.annotations {
for _, prefix := range i.annotationsPrefixes {
if strings.HasPrefix(key, prefix) {
annotationKeys = append(annotationKeys, key)
// There is no need to check additional prefixes since we
// typically deduplicate devices in any case.
break
}
}
}
// We sort the annotationKeys for consistent results.
slices.Sort(annotationKeys)
var devices []string
for _, key := range annotationKeys {
devices = append(devices, strings.Split(i.annotations[key], ",")...)
}
return devices
}
// DevicesFromMounts returns a list of device specified as mounts.
// TODO: This should be merged with getDevicesFromMounts used in the NVIDIA Container Runtime
func (i CUDA) DevicesFromMounts() []string {
// visibleDevicesFromEnvVar returns the set of visible devices requested through environment variables.
// If any of the preferredVisibleDeviceEnvVars are present in the image, they
// are used to determine the visible devices. If this is not the case, the
// NVIDIA_VISIBLE_DEVICES environment variable is used.
func (i CUDA) visibleDevicesFromEnvVar() []string {
envVars := i.visibleEnvVars()
return i.devicesFromEnvvars(envVars...)
}
// visibleDevicesFromMounts returns the set of visible devices requested as mounts.
func (i CUDA) visibleDevicesFromMounts() []string {
var devices []string
for _, device := range i.requestsFromMounts() {
switch {
case strings.HasPrefix(device, volumeMountDevicePrefixImex):
continue
case strings.HasPrefix(device, volumeMountDevicePrefixCDI):
name, err := cdiDeviceMountRequest(device).qualifiedName()
if err != nil {
i.logger.Warningf("Ignoring invalid mount request for CDI device %v: %v", device, err)
continue
}
devices = append(devices, name)
default:
devices = append(devices, device)
}
}
return devices
}
// requestsFromMounts returns a list of device specified as mounts.
func (i CUDA) requestsFromMounts() []string {
root := filepath.Clean(DeviceListAsVolumeMountsRoot)
seen := make(map[string]bool)
var devices []string
@@ -271,28 +386,35 @@ func (i CUDA) DevicesFromMounts() []string {
return devices
}
// CDIDevicesFromMounts returns a list of CDI devices specified as mounts on the image.
func (i CUDA) CDIDevicesFromMounts() []string {
var devices []string
for _, mountDevice := range i.DevicesFromMounts() {
if !strings.HasPrefix(mountDevice, volumeMountDevicePrefixCDI) {
continue
}
parts := strings.SplitN(strings.TrimPrefix(mountDevice, volumeMountDevicePrefixCDI), "/", 3)
if len(parts) != 3 {
continue
}
vendor := parts[0]
class := parts[1]
device := parts[2]
devices = append(devices, fmt.Sprintf("%s/%s=%s", vendor, class, device))
// a cdiDeviceMountRequest represents a CDI device requests as a mount.
// Here the host path /dev/null is mounted to a particular path in the container.
// The container path has the form:
// /var/run/nvidia-container-devices/cdi/<vendor>/<class>/<device>
// or
// /var/run/nvidia-container-devices/cdi/<vendor>/<class>=<device>
type cdiDeviceMountRequest string
// qualifiedName returns the fully-qualified name of the CDI device.
func (m cdiDeviceMountRequest) qualifiedName() (string, error) {
if !strings.HasPrefix(string(m), volumeMountDevicePrefixCDI) {
return "", fmt.Errorf("invalid mount CDI device request: %s", m)
}
return devices
requestedDevice := strings.TrimPrefix(string(m), volumeMountDevicePrefixCDI)
if parser.IsQualifiedName(requestedDevice) {
return requestedDevice, nil
}
parts := strings.SplitN(requestedDevice, "/", 3)
if len(parts) != 3 {
return "", fmt.Errorf("invalid mount CDI device request: %s", m)
}
return fmt.Sprintf("%s/%s=%s", parts[0], parts[1], parts[2]), nil
}
// ImexChannelsFromEnvVar returns the list of IMEX channels requested for the image.
func (i CUDA) ImexChannelsFromEnvVar() []string {
imexChannels := i.DevicesFromEnvvars(EnvVarNvidiaImexChannels).List()
imexChannels := i.devicesFromEnvvars(EnvVarNvidiaImexChannels)
if len(imexChannels) == 1 && imexChannels[0] == "all" {
return nil
}
@@ -302,7 +424,7 @@ func (i CUDA) ImexChannelsFromEnvVar() []string {
// ImexChannelsFromMounts returns the list of IMEX channels requested for the image.
func (i CUDA) ImexChannelsFromMounts() []string {
var channels []string
for _, mountDevice := range i.DevicesFromMounts() {
for _, mountDevice := range i.requestsFromMounts() {
if !strings.HasPrefix(mountDevice, volumeMountDevicePrefixImex) {
continue
}

View File

@@ -21,9 +21,91 @@ import (
"testing"
"github.com/opencontainers/runtime-spec/specs-go"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
)
func TestNewCUDAImageFromSpec(t *testing.T) {
logger, _ := testlog.NewNullLogger()
testCases := []struct {
description string
spec *specs.Spec
options []Option
expected CUDA
}{
{
description: "no env vars",
spec: &specs.Spec{
Process: &specs.Process{
Env: []string{},
},
},
expected: CUDA{
logger: logger,
env: map[string]string{},
acceptEnvvarUnprivileged: true,
},
},
{
description: "NVIDIA_VISIBLE_DEVICES=all",
spec: &specs.Spec{
Process: &specs.Process{
Env: []string{"NVIDIA_VISIBLE_DEVICES=all"},
},
},
expected: CUDA{
logger: logger,
env: map[string]string{"NVIDIA_VISIBLE_DEVICES": "all"},
acceptEnvvarUnprivileged: true,
},
},
{
description: "Spec overrides options",
spec: &specs.Spec{
Process: &specs.Process{
Env: []string{"NVIDIA_VISIBLE_DEVICES=all"},
},
Mounts: []specs.Mount{
{
Source: "/spec-source",
Destination: "/spec-destination",
},
},
},
options: []Option{
WithEnvMap(map[string]string{"OTHER": "value"}),
WithMounts([]specs.Mount{
{
Source: "/option-source",
Destination: "/option-destination",
},
}),
},
expected: CUDA{
logger: logger,
env: map[string]string{"NVIDIA_VISIBLE_DEVICES": "all"},
mounts: []specs.Mount{
{
Source: "/spec-source",
Destination: "/spec-destination",
},
},
acceptEnvvarUnprivileged: true,
},
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
options := append([]Option{WithLogger(logger)}, tc.options...)
image, err := NewCUDAImageFromSpec(tc.spec, options...)
require.NoError(t, err)
require.EqualValues(t, tc.expected, image)
})
}
}
func TestParseMajorMinorVersionValid(t *testing.T) {
var tests = []struct {
version string
@@ -122,7 +204,7 @@ func TestGetRequirements(t *testing.T) {
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
image, err := NewCUDAImageFromEnv(tc.env)
image, err := newCUDAImageFromEnv(tc.env)
require.NoError(t, err)
requirements, err := image.GetRequirements()
@@ -133,6 +215,226 @@ func TestGetRequirements(t *testing.T) {
}
}
func TestGetDevicesFromEnvvar(t *testing.T) {
envDockerResourceGPUs := "DOCKER_RESOURCE_GPUS"
gpuID := "GPU-12345"
anotherGPUID := "GPU-67890"
thirdGPUID := "MIG-12345"
var tests = []struct {
description string
preferredVisibleDeviceEnvVars []string
env map[string]string
expectedDevices []string
}{
{
description: "empty env returns nil for non-legacy image",
},
{
description: "blank NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: "",
},
},
{
description: "'void' NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: "void",
},
},
{
description: "'none' NVIDIA_VISIBLE_DEVICES returns empty for non-legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: "none",
},
expectedDevices: []string{""},
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for non-legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: gpuID,
},
expectedDevices: []string{gpuID},
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: gpuID,
EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{gpuID},
},
{
description: "empty env returns all for legacy image",
env: map[string]string{
EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{"all"},
},
// Add the `DOCKER_RESOURCE_GPUS` envvar and ensure that this is ignored when
// not enabled
{
description: "missing NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "blank NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: "",
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "'void' NVIDIA_VISIBLE_DEVICES returns nil for non-legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: "void",
envDockerResourceGPUs: anotherGPUID,
},
},
{
description: "'none' NVIDIA_VISIBLE_DEVICES returns empty for non-legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: "none",
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: []string{""},
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for non-legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: []string{gpuID},
},
{
description: "NVIDIA_VISIBLE_DEVICES set returns value for legacy image",
env: map[string]string{
EnvVarNvidiaVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{gpuID},
},
{
description: "empty env returns all for legacy image",
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{"all"},
},
// Add the `DOCKER_RESOURCE_GPUS` envvar and ensure that this is selected when
// enabled
{
description: "empty env returns nil for non-legacy image",
preferredVisibleDeviceEnvVars: []string{envDockerResourceGPUs},
},
{
description: "blank DOCKER_RESOURCE_GPUS returns nil for non-legacy image",
preferredVisibleDeviceEnvVars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: "",
},
},
{
description: "'void' DOCKER_RESOURCE_GPUS returns nil for non-legacy image",
preferredVisibleDeviceEnvVars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: "void",
},
},
{
description: "'none' DOCKER_RESOURCE_GPUS returns empty for non-legacy image",
preferredVisibleDeviceEnvVars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: "none",
},
expectedDevices: []string{""},
},
{
description: "DOCKER_RESOURCE_GPUS set returns value for non-legacy image",
preferredVisibleDeviceEnvVars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: gpuID,
},
expectedDevices: []string{gpuID},
},
{
description: "DOCKER_RESOURCE_GPUS set returns value for legacy image",
preferredVisibleDeviceEnvVars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: gpuID,
EnvVarCudaVersion: "legacy",
},
expectedDevices: []string{gpuID},
},
{
description: "DOCKER_RESOURCE_GPUS is selected if present",
preferredVisibleDeviceEnvVars: []string{envDockerResourceGPUs},
env: map[string]string{
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: []string{anotherGPUID},
},
{
description: "DOCKER_RESOURCE_GPUS overrides NVIDIA_VISIBLE_DEVICES if present",
preferredVisibleDeviceEnvVars: []string{envDockerResourceGPUs},
env: map[string]string{
EnvVarNvidiaVisibleDevices: gpuID,
envDockerResourceGPUs: anotherGPUID,
},
expectedDevices: []string{anotherGPUID},
},
{
description: "DOCKER_RESOURCE_GPUS_ADDITIONAL overrides NVIDIA_VISIBLE_DEVICES if present",
preferredVisibleDeviceEnvVars: []string{"DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
EnvVarNvidiaVisibleDevices: gpuID,
"DOCKER_RESOURCE_GPUS_ADDITIONAL": anotherGPUID,
},
expectedDevices: []string{anotherGPUID},
},
{
description: "All available swarm resource envvars are selected and override NVIDIA_VISIBLE_DEVICES if present",
preferredVisibleDeviceEnvVars: []string{"DOCKER_RESOURCE_GPUS", "DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
EnvVarNvidiaVisibleDevices: gpuID,
"DOCKER_RESOURCE_GPUS": thirdGPUID,
"DOCKER_RESOURCE_GPUS_ADDITIONAL": anotherGPUID,
},
expectedDevices: []string{thirdGPUID, anotherGPUID},
},
{
description: "DOCKER_RESOURCE_GPUS_ADDITIONAL or DOCKER_RESOURCE_GPUS override NVIDIA_VISIBLE_DEVICES if present",
preferredVisibleDeviceEnvVars: []string{"DOCKER_RESOURCE_GPUS", "DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
EnvVarNvidiaVisibleDevices: gpuID,
"DOCKER_RESOURCE_GPUS_ADDITIONAL": anotherGPUID,
},
expectedDevices: []string{anotherGPUID},
},
}
for _, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
image, err := New(
WithEnvMap(tc.env),
WithPrivileged(true),
WithAcceptDeviceListAsVolumeMounts(false),
WithAcceptEnvvarUnprivileged(false),
WithPreferredVisibleDevicesEnvVars(tc.preferredVisibleDeviceEnvVars...),
)
require.NoError(t, err)
devices := image.visibleDevicesFromEnvVar()
require.EqualValues(t, tc.expectedDevices, devices)
})
}
}
func TestGetVisibleDevicesFromMounts(t *testing.T) {
var tests = []struct {
description string
@@ -185,9 +487,9 @@ func TestGetVisibleDevicesFromMounts(t *testing.T) {
expectedDevices: []string{"GPU0-MIG0/0/1", "GPU1-MIG0/0/1"},
},
{
description: "cdi devices are ignored",
mounts: makeTestMounts("GPU0", "cdi/nvidia.com/gpu=all", "GPU1"),
expectedDevices: []string{"GPU0", "GPU1"},
description: "cdi devices are included",
mounts: makeTestMounts("GPU0", "nvidia.com/gpu=all", "GPU1"),
expectedDevices: []string{"GPU0", "nvidia.com/gpu=all", "GPU1"},
},
{
description: "imex devices are ignored",
@@ -197,8 +499,195 @@ func TestGetVisibleDevicesFromMounts(t *testing.T) {
}
for _, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
image, _ := New(WithMounts(tc.mounts))
require.Equal(t, tc.expectedDevices, image.VisibleDevicesFromMounts())
image, err := New(WithMounts(tc.mounts))
require.NoError(t, err)
require.Equal(t, tc.expectedDevices, image.visibleDevicesFromMounts())
})
}
}
func TestVisibleDevices(t *testing.T) {
var tests = []struct {
description string
mountDevices []specs.Mount
envvarDevices string
privileged bool
acceptUnprivileged bool
acceptMounts bool
preferredVisibleDeviceEnvVars []string
env map[string]string
expectedDevices []string
}{
{
description: "Mount devices, unprivileged, no accept unprivileged",
mountDevices: []specs.Mount{
{
Source: "/dev/null",
Destination: filepath.Join(DeviceListAsVolumeMountsRoot, "GPU0"),
},
{
Source: "/dev/null",
Destination: filepath.Join(DeviceListAsVolumeMountsRoot, "GPU1"),
},
},
envvarDevices: "GPU2,GPU3",
privileged: false,
acceptUnprivileged: false,
acceptMounts: true,
expectedDevices: []string{"GPU0", "GPU1"},
},
{
description: "No mount devices, unprivileged, no accept unprivileged",
mountDevices: nil,
envvarDevices: "GPU0,GPU1",
privileged: false,
acceptUnprivileged: false,
acceptMounts: true,
expectedDevices: nil,
},
{
description: "No mount devices, privileged, no accept unprivileged",
mountDevices: nil,
envvarDevices: "GPU0,GPU1",
privileged: true,
acceptUnprivileged: false,
acceptMounts: true,
expectedDevices: []string{"GPU0", "GPU1"},
},
{
description: "No mount devices, unprivileged, accept unprivileged",
mountDevices: nil,
envvarDevices: "GPU0,GPU1",
privileged: false,
acceptUnprivileged: true,
acceptMounts: true,
expectedDevices: []string{"GPU0", "GPU1"},
},
{
description: "Mount devices, unprivileged, accept unprivileged, no accept mounts",
mountDevices: []specs.Mount{
{
Source: "/dev/null",
Destination: filepath.Join(DeviceListAsVolumeMountsRoot, "GPU0"),
},
{
Source: "/dev/null",
Destination: filepath.Join(DeviceListAsVolumeMountsRoot, "GPU1"),
},
},
envvarDevices: "GPU2,GPU3",
privileged: false,
acceptUnprivileged: true,
acceptMounts: false,
expectedDevices: []string{"GPU2", "GPU3"},
},
{
description: "Mount devices, unprivileged, no accept unprivileged, no accept mounts",
mountDevices: []specs.Mount{
{
Source: "/dev/null",
Destination: filepath.Join(DeviceListAsVolumeMountsRoot, "GPU0"),
},
{
Source: "/dev/null",
Destination: filepath.Join(DeviceListAsVolumeMountsRoot, "GPU1"),
},
},
envvarDevices: "GPU2,GPU3",
privileged: false,
acceptUnprivileged: false,
acceptMounts: false,
expectedDevices: nil,
},
// New test cases for visibleEnvVars functionality
{
description: "preferred env var set and present in env, privileged",
mountDevices: nil,
envvarDevices: "",
privileged: true,
acceptUnprivileged: false,
acceptMounts: true,
preferredVisibleDeviceEnvVars: []string{"DOCKER_RESOURCE_GPUS"},
env: map[string]string{
"DOCKER_RESOURCE_GPUS": "GPU-12345",
},
expectedDevices: []string{"GPU-12345"},
},
{
description: "preferred env var set and present in env, unprivileged but accepted",
mountDevices: nil,
envvarDevices: "",
privileged: false,
acceptUnprivileged: true,
acceptMounts: true,
preferredVisibleDeviceEnvVars: []string{"DOCKER_RESOURCE_GPUS"},
env: map[string]string{
"DOCKER_RESOURCE_GPUS": "GPU-12345",
},
expectedDevices: []string{"GPU-12345"},
},
{
description: "preferred env var set and present in env, unprivileged and not accepted",
mountDevices: nil,
envvarDevices: "",
privileged: false,
acceptUnprivileged: false,
acceptMounts: true,
preferredVisibleDeviceEnvVars: []string{"DOCKER_RESOURCE_GPUS"},
env: map[string]string{
"DOCKER_RESOURCE_GPUS": "GPU-12345",
},
expectedDevices: nil,
},
{
description: "multiple preferred env vars, both present, privileged",
mountDevices: nil,
envvarDevices: "",
privileged: true,
acceptUnprivileged: false,
acceptMounts: true,
preferredVisibleDeviceEnvVars: []string{"DOCKER_RESOURCE_GPUS", "DOCKER_RESOURCE_GPUS_ADDITIONAL"},
env: map[string]string{
"DOCKER_RESOURCE_GPUS": "GPU-12345",
"DOCKER_RESOURCE_GPUS_ADDITIONAL": "GPU-67890",
},
expectedDevices: []string{"GPU-12345", "GPU-67890"},
},
{
description: "preferred env var not present, fallback to NVIDIA_VISIBLE_DEVICES, privileged",
mountDevices: nil,
envvarDevices: "GPU-12345",
privileged: true,
acceptUnprivileged: false,
acceptMounts: true,
preferredVisibleDeviceEnvVars: []string{"DOCKER_RESOURCE_GPUS"},
env: map[string]string{
EnvVarNvidiaVisibleDevices: "GPU-12345",
},
expectedDevices: []string{"GPU-12345"},
},
}
for _, tc := range tests {
t.Run(tc.description, func(t *testing.T) {
// Create env map with both NVIDIA_VISIBLE_DEVICES and any additional env vars
env := make(map[string]string)
if tc.envvarDevices != "" {
env[EnvVarNvidiaVisibleDevices] = tc.envvarDevices
}
for k, v := range tc.env {
env[k] = v
}
image, err := New(
WithEnvMap(env),
WithMounts(tc.mountDevices),
WithPrivileged(tc.privileged),
WithAcceptDeviceListAsVolumeMounts(tc.acceptMounts),
WithAcceptEnvvarUnprivileged(tc.acceptUnprivileged),
WithPreferredVisibleDevicesEnvVars(tc.preferredVisibleDeviceEnvVars...),
)
require.NoError(t, err)
require.Equal(t, tc.expectedDevices, image.VisibleDevices())
})
}
}
@@ -224,7 +713,7 @@ func TestImexChannelsFromEnvVar(t *testing.T) {
for _, tc := range testCases {
for id, baseEnvvars := range map[string][]string{"": nil, "legacy": {"CUDA_VERSION=1.2.3"}} {
t.Run(tc.description+id, func(t *testing.T) {
i, err := NewCUDAImageFromEnv(append(baseEnvvars, tc.env...))
i, err := newCUDAImageFromEnv(append(baseEnvvars, tc.env...))
require.NoError(t, err)
channels := i.ImexChannelsFromEnvVar()
@@ -234,6 +723,73 @@ func TestImexChannelsFromEnvVar(t *testing.T) {
}
}
func TestCDIDeviceRequestsFromAnnotations(t *testing.T) {
testCases := []struct {
description string
prefixes []string
annotations map[string]string
expectedDevices []string
}{
{
description: "no annotations",
},
{
description: "no matching annotations",
prefixes: []string{"not-prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device=bar",
},
},
{
description: "single matching annotation",
prefixes: []string{"prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device=bar",
},
expectedDevices: []string{"example.com/device=bar"},
},
{
description: "multiple matching annotations",
prefixes: []string{"prefix/", "another-prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device=bar",
"another-prefix/bar": "example.com/device=baz",
},
expectedDevices: []string{"example.com/device=bar", "example.com/device=baz"},
},
{
description: "multiple matching annotations with duplicate devices",
prefixes: []string{"prefix/", "another-prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device=bar",
"another-prefix/bar": "example.com/device=bar",
},
expectedDevices: []string{"example.com/device=bar", "example.com/device=bar"},
},
{
description: "invalid devices are returned as is",
prefixes: []string{"prefix/"},
annotations: map[string]string{
"prefix/foo": "example.com/device",
},
expectedDevices: []string{"example.com/device"},
},
}
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
image, err := New(
WithAnnotationsPrefixes(tc.prefixes),
WithAnnotations(tc.annotations),
)
require.NoError(t, err)
devices := image.cdiDeviceRequestsFromAnnotations()
require.ElementsMatch(t, tc.expectedDevices, devices)
})
}
}
func makeTestMounts(paths ...string) []specs.Mount {
var mounts []specs.Mount
for _, path := range paths {

View File

@@ -24,20 +24,39 @@ const (
capSysAdmin = "CAP_SYS_ADMIN"
)
type CapabilitiesGetter interface {
GetCapabilities() []string
}
type OCISpec specs.Spec
type OCISpecCapabilities specs.LinuxCapabilities
// IsPrivileged returns true if the container is a privileged container.
func IsPrivileged(s *specs.Spec) bool {
if s.Process.Capabilities == nil {
func IsPrivileged(s CapabilitiesGetter) bool {
if s == nil {
return false
}
// We only make sure that the bounding capabibility set has
// CAP_SYS_ADMIN. This allows us to make sure that the container was
// actually started as '--privileged', but also allow non-root users to
// access the privileged NVIDIA capabilities.
for _, c := range s.Process.Capabilities.Bounding {
for _, c := range s.GetCapabilities() {
if c == capSysAdmin {
return true
}
}
return false
}
func (s OCISpec) GetCapabilities() []string {
if s.Process == nil || s.Process.Capabilities == nil {
return nil
}
return (*OCISpecCapabilities)(s.Process.Capabilities).GetCapabilities()
}
func (c OCISpecCapabilities) GetCapabilities() []string {
// We only make sure that the bounding capability set has
// CAP_SYS_ADMIN. This allows us to make sure that the container was
// actually started as '--privileged', but also allow non-root users to
// access the privileged NVIDIA capabilities.
return c.Bounding
}

View File

@@ -0,0 +1,57 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package image
import (
"testing"
"github.com/opencontainers/runtime-spec/specs-go"
"github.com/stretchr/testify/require"
)
func TestIsPrivileged(t *testing.T) {
var tests = []struct {
spec specs.Spec
expected bool
}{
{
specs.Spec{
Process: &specs.Process{
Capabilities: &specs.LinuxCapabilities{
Bounding: []string{"CAP_SYS_ADMIN"},
},
},
},
true,
},
{
specs.Spec{
Process: &specs.Process{
Capabilities: &specs.LinuxCapabilities{
Bounding: []string{"CAP_SYS_FOO"},
},
},
},
false,
},
}
for i, tc := range tests {
privileged := IsPrivileged((*OCISpec)(&tc.spec))
require.Equal(t, tc.expected, privileged, "%d: %v", i, tc)
}
}

View File

@@ -29,8 +29,9 @@ type RuntimeConfig struct {
// modesConfig defines (optional) per-mode configs
type modesConfig struct {
CSV csvModeConfig `toml:"csv"`
CDI cdiModeConfig `toml:"cdi"`
CSV csvModeConfig `toml:"csv"`
CDI cdiModeConfig `toml:"cdi"`
Legacy legacyModeConfig `toml:"legacy"`
}
type cdiModeConfig struct {
@@ -45,3 +46,31 @@ type cdiModeConfig struct {
type csvModeConfig struct {
MountSpecPath string `toml:"mount-spec-path"`
}
type legacyModeConfig struct {
// CUDACompatMode sets the mode to be used to make CUDA Forward Compat
// libraries discoverable in the container.
CUDACompatMode cudaCompatMode `toml:"cuda-compat-mode,omitempty"`
}
type cudaCompatMode string
const (
defaultCUDACompatMode = CUDACompatModeLdconfig
// CUDACompatModeDisabled explicitly disables the handling of CUDA Forward
// Compatibility in the NVIDIA Container Runtime and NVIDIA Container
// Runtime Hook.
CUDACompatModeDisabled = cudaCompatMode("disabled")
// CUDACompatModeHook uses a container lifecycle hook to implement CUDA
// Forward Compatibility support. This requires the use of the NVIDIA
// Container Runtime and is not compatible with use cases where only the
// NVIDIA Container Runtime Hook is used (e.g. the Docker --gpus flag).
CUDACompatModeHook = cudaCompatMode("hook")
// CUDACompatModeLdconfig adds the folders containing CUDA Forward Compat
// libraries to the ldconfig command invoked from the NVIDIA Container
// Runtime Hook.
CUDACompatModeLdconfig = cudaCompatMode("ldconfig")
// CUDACompatModeMount mounts CUDA Forward Compat folders from the container
// to the container when using the NVIDIA Container Runtime Hook.
CUDACompatModeMount = cudaCompatMode("mount")
)

View File

@@ -62,7 +62,7 @@ load-kmods = true
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc", "crun"]
runtimes = ["runc", "crun"]
[nvidia-container-runtime.modes]
@@ -74,6 +74,9 @@ spec-dirs = ["/etc/cdi", "/var/run/cdi"]
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
[nvidia-container-runtime.modes.legacy]
cuda-compat-mode = "ldconfig"
[nvidia-container-runtime-hook]
path = "nvidia-container-runtime-hook"
skip-mode-detection = false

View File

@@ -23,6 +23,7 @@ type cache struct {
sync.Mutex
devices []Device
envVars []EnvVar
hooks []Hook
mounts []Mount
}
@@ -51,6 +52,20 @@ func (c *cache) Devices() ([]Device, error) {
return c.devices, nil
}
func (c *cache) EnvVars() ([]EnvVar, error) {
c.Lock()
defer c.Unlock()
if c.envVars == nil {
envVars, err := c.d.EnvVars()
if err != nil {
return nil, err
}
c.envVars = envVars
}
return c.envVars, nil
}
func (c *cache) Hooks() ([]Hook, error) {
c.Lock()
defer c.Unlock()

View File

@@ -0,0 +1,20 @@
package discover
import (
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup/root"
)
// NewCUDACompatHookDiscoverer creates a discoverer for a enable-cuda-compat hook.
// This hook is responsible for setting up CUDA compatibility in the container and depends on the host driver version.
func NewCUDACompatHookDiscoverer(logger logger.Interface, hookCreator HookCreator, driver *root.Driver) Discover {
_, cudaVersionPattern := getCUDALibRootAndVersionPattern(logger, driver)
var args []string
if !strings.Contains(cudaVersionPattern, "*") {
args = append(args, "--host-driver-version="+cudaVersionPattern)
}
return hookCreator.Create("enable-cuda-compat", args...)
}

View File

@@ -22,6 +22,12 @@ type Device struct {
Path string
}
// EnvVar represents a discovered environment variable.
type EnvVar struct {
Name string
Value string
}
// Mount represents a discovered mount.
type Mount struct {
HostPath string
@@ -34,13 +40,15 @@ type Hook struct {
Lifecycle string
Path string
Args []string
Env []string
}
// Discover defines an interface for discovering the devices, mounts, and hooks available on a system
//
//go:generate moq -stub -out discover_mock.go . Discover
//go:generate moq -rm -fmt=goimports -stub -out discover_mock.go . Discover
type Discover interface {
Devices() ([]Device, error)
EnvVars() ([]EnvVar, error)
Mounts() ([]Mount, error)
Hooks() ([]Hook, error)
}

View File

@@ -20,6 +20,9 @@ var _ Discover = &DiscoverMock{}
// DevicesFunc: func() ([]Device, error) {
// panic("mock out the Devices method")
// },
// EnvVarsFunc: func() ([]EnvVar, error) {
// panic("mock out the EnvVars method")
// },
// HooksFunc: func() ([]Hook, error) {
// panic("mock out the Hooks method")
// },
@@ -36,6 +39,9 @@ type DiscoverMock struct {
// DevicesFunc mocks the Devices method.
DevicesFunc func() ([]Device, error)
// EnvVarsFunc mocks the EnvVars method.
EnvVarsFunc func() ([]EnvVar, error)
// HooksFunc mocks the Hooks method.
HooksFunc func() ([]Hook, error)
@@ -47,6 +53,9 @@ type DiscoverMock struct {
// Devices holds details about calls to the Devices method.
Devices []struct {
}
// EnvVars holds details about calls to the EnvVars method.
EnvVars []struct {
}
// Hooks holds details about calls to the Hooks method.
Hooks []struct {
}
@@ -55,6 +64,7 @@ type DiscoverMock struct {
}
}
lockDevices sync.RWMutex
lockEnvVars sync.RWMutex
lockHooks sync.RWMutex
lockMounts sync.RWMutex
}
@@ -90,6 +100,37 @@ func (mock *DiscoverMock) DevicesCalls() []struct {
return calls
}
// EnvVars calls EnvVarsFunc.
func (mock *DiscoverMock) EnvVars() ([]EnvVar, error) {
callInfo := struct {
}{}
mock.lockEnvVars.Lock()
mock.calls.EnvVars = append(mock.calls.EnvVars, callInfo)
mock.lockEnvVars.Unlock()
if mock.EnvVarsFunc == nil {
var (
envVarsOut []EnvVar
errOut error
)
return envVarsOut, errOut
}
return mock.EnvVarsFunc()
}
// EnvVarsCalls gets all the calls that were made to EnvVars.
// Check the length with:
//
// len(mockedDiscover.EnvVarsCalls())
func (mock *DiscoverMock) EnvVarsCalls() []struct {
} {
var calls []struct {
}
mock.lockEnvVars.RLock()
calls = mock.calls.EnvVars
mock.lockEnvVars.RUnlock()
return calls
}
// Hooks calls HooksFunc.
func (mock *DiscoverMock) Hooks() ([]Hook, error) {
callInfo := struct {

View File

@@ -0,0 +1,41 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
var _ Discover = (*EnvVar)(nil)
// Devices returns an empty list of devices for a EnvVar discoverer.
func (e EnvVar) Devices() ([]Device, error) {
return nil, nil
}
// EnvVars returns an empty list of envs for a EnvVar discoverer.
func (e EnvVar) EnvVars() ([]EnvVar, error) {
return []EnvVar{e}, nil
}
// Mounts returns an empty list of mounts for a EnvVar discoverer.
func (e EnvVar) Mounts() ([]Mount, error) {
return nil, nil
}
// Hooks allows the Hook type to also implement the Discoverer interface.
// It returns a single hook
func (e EnvVar) Hooks() ([]Hook, error) {
return nil, nil
}

View File

@@ -45,6 +45,19 @@ func (f firstOf) Devices() ([]Device, error) {
return nil, errs
}
func (f firstOf) EnvVars() ([]EnvVar, error) {
var errs error
for _, d := range f {
envs, err := d.EnvVars()
if err != nil {
errs = errors.Join(errs, err)
continue
}
return envs, nil
}
return nil, errs
}
func (f firstOf) Hooks() ([]Hook, error) {
var errs error
for _, d := range f {

View File

@@ -20,6 +20,7 @@ import (
"fmt"
"os"
"path/filepath"
"runtime"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
@@ -36,21 +37,21 @@ import (
// TODO: The logic for creating DRM devices should be consolidated between this
// and the logic for generating CDI specs for a single device. This is only used
// when applying OCI spec modifications to an incoming spec in "legacy" mode.
func NewDRMNodesDiscoverer(logger logger.Interface, devices image.VisibleDevices, devRoot string, nvidiaCDIHookPath string) (Discover, error) {
func NewDRMNodesDiscoverer(logger logger.Interface, devices image.VisibleDevices, devRoot string, hookCreator HookCreator) (Discover, error) {
drmDeviceNodes, err := newDRMDeviceDiscoverer(logger, devices, devRoot)
if err != nil {
return nil, fmt.Errorf("failed to create DRM device discoverer: %v", err)
}
drmByPathSymlinks := newCreateDRMByPathSymlinks(logger, drmDeviceNodes, devRoot, nvidiaCDIHookPath)
drmByPathSymlinks := newCreateDRMByPathSymlinks(logger, drmDeviceNodes, devRoot, hookCreator)
discover := Merge(drmDeviceNodes, drmByPathSymlinks)
return discover, nil
}
// NewGraphicsMountsDiscoverer creates a discoverer for the mounts required by graphics tools such as vulkan.
func NewGraphicsMountsDiscoverer(logger logger.Interface, driver *root.Driver, nvidiaCDIHookPath string) (Discover, error) {
libraries := newGraphicsLibrariesDiscoverer(logger, driver, nvidiaCDIHookPath)
func NewGraphicsMountsDiscoverer(logger logger.Interface, driver *root.Driver, hookCreator HookCreator) (Discover, error) {
libraries := newGraphicsLibrariesDiscoverer(logger, driver, hookCreator)
configs := NewMounts(
logger,
@@ -81,27 +82,38 @@ func NewGraphicsMountsDiscoverer(logger logger.Interface, driver *root.Driver, n
// vulkan ICD files are at {{ .driverRoot }}/vulkan instead of in /etc/vulkan.
func newVulkanConfigsDiscover(logger logger.Interface, driver *root.Driver) Discover {
locator := lookup.First(driver.Configs(), driver.Files())
required := []string{
"vulkan/icd.d/nvidia_icd.json",
"vulkan/icd.d/nvidia_layers.json",
"vulkan/implicit_layer.d/nvidia_layers.json",
}
// For some RPM-based driver packages, the vulkan ICD files are installed to
// /usr/share/vulkan/icd.d/nvidia_icd.%{_target_cpu}.json
// We also include this in the list of candidates for the ICD file.
switch runtime.GOARCH {
case "amd64":
required = append(required, "vulkan/icd.d/nvidia_icd.x86_64.json")
case "arm64":
required = append(required, "vulkan/icd.d/nvidia_icd.aarch64.json")
}
return &mountsToContainerPath{
logger: logger,
locator: locator,
required: []string{
"vulkan/icd.d/nvidia_icd.json",
"vulkan/icd.d/nvidia_layers.json",
"vulkan/implicit_layer.d/nvidia_layers.json",
},
logger: logger,
locator: locator,
required: required,
containerRoot: "/etc",
}
}
type graphicsDriverLibraries struct {
Discover
logger logger.Interface
nvidiaCDIHookPath string
logger logger.Interface
hookCreator HookCreator
}
var _ Discover = (*graphicsDriverLibraries)(nil)
func newGraphicsLibrariesDiscoverer(logger logger.Interface, driver *root.Driver, nvidiaCDIHookPath string) Discover {
func newGraphicsLibrariesDiscoverer(logger logger.Interface, driver *root.Driver, hookCreator HookCreator) Discover {
cudaLibRoot, cudaVersionPattern := getCUDALibRootAndVersionPattern(logger, driver)
libraries := NewMounts(
@@ -140,9 +152,9 @@ func newGraphicsLibrariesDiscoverer(logger logger.Interface, driver *root.Driver
)
return &graphicsDriverLibraries{
Discover: Merge(libraries, xorgLibraries),
logger: logger,
nvidiaCDIHookPath: nvidiaCDIHookPath,
Discover: Merge(libraries, xorgLibraries),
logger: logger,
hookCreator: hookCreator,
}
}
@@ -203,9 +215,9 @@ func (d graphicsDriverLibraries) Hooks() ([]Hook, error) {
return nil, nil
}
hooks := CreateCreateSymlinkHook(d.nvidiaCDIHookPath, links)
hook := d.hookCreator.Create("create-symlinks", links...)
return hooks.Hooks()
return hook.Hooks()
}
// isDriverLibrary checks whether the specified filename is a specific driver library.
@@ -275,19 +287,19 @@ func buildXOrgSearchPaths(libRoot string) []string {
type drmDevicesByPath struct {
None
logger logger.Interface
nvidiaCDIHookPath string
devRoot string
devicesFrom Discover
logger logger.Interface
hookCreator HookCreator
devRoot string
devicesFrom Discover
}
// newCreateDRMByPathSymlinks creates a discoverer for a hook to create the by-path symlinks for DRM devices discovered by the specified devices discoverer
func newCreateDRMByPathSymlinks(logger logger.Interface, devices Discover, devRoot string, nvidiaCDIHookPath string) Discover {
func newCreateDRMByPathSymlinks(logger logger.Interface, devices Discover, devRoot string, hookCreator HookCreator) Discover {
d := drmDevicesByPath{
logger: logger,
nvidiaCDIHookPath: nvidiaCDIHookPath,
devRoot: devRoot,
devicesFrom: devices,
logger: logger,
hookCreator: hookCreator,
devRoot: devRoot,
devicesFrom: devices,
}
return &d
@@ -315,13 +327,9 @@ func (d drmDevicesByPath) Hooks() ([]Hook, error) {
args = append(args, "--link", l)
}
hook := CreateNvidiaCDIHook(
d.nvidiaCDIHookPath,
"create-symlinks",
args...,
)
hook := d.hookCreator.Create("create-symlinks", args...)
return []Hook{hook}, nil
return hook.Hooks()
}
// getSpecificLinkArgs returns the required specific links that need to be created

View File

@@ -25,6 +25,7 @@ import (
func TestGraphicsLibrariesDiscoverer(t *testing.T) {
logger, _ := testlog.NewNullLogger()
hookCreator := NewHookCreator()
testCases := []struct {
description string
@@ -70,6 +71,7 @@ func TestGraphicsLibrariesDiscoverer(t *testing.T) {
Args: []string{"nvidia-cdi-hook", "create-symlinks",
"--link", "../libnvidia-allocator.so.1::/usr/lib64/gbm/nvidia-drm_gbm.so",
},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
},
},
@@ -97,6 +99,7 @@ func TestGraphicsLibrariesDiscoverer(t *testing.T) {
Args: []string{"nvidia-cdi-hook", "create-symlinks",
"--link", "libnvidia-vulkan-producer.so.123.45.67::/usr/lib64/libnvidia-vulkan-producer.so",
},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
},
},
@@ -128,6 +131,7 @@ func TestGraphicsLibrariesDiscoverer(t *testing.T) {
"--link", "../libnvidia-allocator.so.1::/usr/lib64/gbm/nvidia-drm_gbm.so",
"--link", "libnvidia-vulkan-producer.so.123.45.67::/usr/lib64/libnvidia-vulkan-producer.so",
},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
},
},
@@ -136,9 +140,9 @@ func TestGraphicsLibrariesDiscoverer(t *testing.T) {
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
d := &graphicsDriverLibraries{
Discover: tc.libraries,
logger: logger,
nvidiaCDIHookPath: "/usr/bin/nvidia-cdi-hook",
Discover: tc.libraries,
logger: logger,
hookCreator: hookCreator,
}
devices, err := d.Devices()

View File

@@ -17,64 +17,193 @@
package discover
import (
"fmt"
"path/filepath"
"tags.cncf.io/container-device-interface/pkg/cdi"
)
// A HookName represents a supported CDI hooks.
type HookName string
const (
// AllHooks is a special hook name that allows all hooks to be matched.
AllHooks = HookName("all")
// A ChmodHook is used to set the file mode of the specified paths.
// Deprecated: The chmod hook is deprecated and will be removed in a future release.
ChmodHook = HookName("chmod")
// A CreateSymlinksHook is used to create symlinks in the container.
CreateSymlinksHook = HookName("create-symlinks")
// DisableDeviceNodeModificationHook refers to the hook used to ensure that
// device nodes are not created by libnvidia-ml.so or nvidia-smi in a
// container.
// Added in v1.17.8
DisableDeviceNodeModificationHook = HookName("disable-device-node-modification")
// An EnableCudaCompatHook is used to enabled CUDA Forward Compatibility.
// Added in v1.17.5
EnableCudaCompatHook = HookName("enable-cuda-compat")
// An UpdateLDCacheHook is the hook used to update the ldcache in the
// container. This allows injected libraries to be discoverable.
UpdateLDCacheHook = HookName("update-ldcache")
// A CreateSonameSymlinksHook is the hook used to ensure that soname symlinks
// for injected libraries exist in the container.
CreateSonameSymlinksHook = HookName("create-soname-symlinks")
defaultNvidiaCDIHookPath = "/usr/bin/nvidia-cdi-hook"
)
var _ Discover = (*Hook)(nil)
// Devices returns an empty list of devices for a Hook discoverer.
func (h Hook) Devices() ([]Device, error) {
func (h *Hook) Devices() ([]Device, error) {
return nil, nil
}
// EnvVars returns an empty list of envs for a Hook discoverer.
func (h *Hook) EnvVars() ([]EnvVar, error) {
return nil, nil
}
// Mounts returns an empty list of mounts for a Hook discoverer.
func (h Hook) Mounts() ([]Mount, error) {
func (h *Hook) Mounts() ([]Mount, error) {
return nil, nil
}
// Hooks allows the Hook type to also implement the Discoverer interface.
// It returns a single hook
func (h Hook) Hooks() ([]Hook, error) {
return []Hook{h}, nil
}
// CreateCreateSymlinkHook creates a hook which creates a symlink from link -> target.
func CreateCreateSymlinkHook(nvidiaCDIHookPath string, links []string) Discover {
if len(links) == 0 {
return None{}
func (h *Hook) Hooks() ([]Hook, error) {
if h == nil {
return nil, nil
}
var args []string
for _, link := range links {
args = append(args, "--link", link)
return []Hook{*h}, nil
}
type Option func(*cdiHookCreator)
type cdiHookCreator struct {
nvidiaCDIHookPath string
disabledHooks map[HookName]bool
fixedArgs []string
debugLogging bool
}
// An allDisabledHookCreator is a HookCreator that does not create any hooks.
type allDisabledHookCreator struct{}
// Create returns nil for all hooks for an allDisabledHookCreator.
func (a *allDisabledHookCreator) Create(name HookName, args ...string) *Hook {
return nil
}
// A HookCreator defines an interface for creating discover hooks.
type HookCreator interface {
Create(HookName, ...string) *Hook
}
// WithDisabledHooks sets the set of hooks that are disabled for the CDI hook creator.
// This can be specified multiple times.
func WithDisabledHooks(hooks ...HookName) Option {
return func(c *cdiHookCreator) {
for _, hook := range hooks {
c.disabledHooks[hook] = true
}
}
return CreateNvidiaCDIHook(
nvidiaCDIHookPath,
"create-symlinks",
args...,
)
}
// CreateNvidiaCDIHook creates a hook which invokes the NVIDIA Container CLI hook subcommand.
func CreateNvidiaCDIHook(nvidiaCDIHookPath string, hookName string, additionalArgs ...string) Hook {
return cdiHook(nvidiaCDIHookPath).Create(hookName, additionalArgs...)
// WithNVIDIACDIHookPath sets the path to the nvidia-cdi-hook binary.
func WithNVIDIACDIHookPath(nvidiaCDIHookPath string) Option {
return func(c *cdiHookCreator) {
c.nvidiaCDIHookPath = nvidiaCDIHookPath
}
}
type cdiHook string
func NewHookCreator(opts ...Option) HookCreator {
cdiHookCreator := &cdiHookCreator{
nvidiaCDIHookPath: defaultNvidiaCDIHookPath,
disabledHooks: make(map[HookName]bool),
}
for _, opt := range opts {
opt(cdiHookCreator)
}
func (c cdiHook) Create(name string, args ...string) Hook {
return Hook{
if cdiHookCreator.disabledHooks[AllHooks] {
return &allDisabledHookCreator{}
}
cdiHookCreator.fixedArgs = getFixedArgsForCDIHookCLI(cdiHookCreator.nvidiaCDIHookPath)
return cdiHookCreator
}
// Create creates a new hook with the given name and arguments.
// If a hook is disabled, a nil hook is returned.
func (c cdiHookCreator) Create(name HookName, args ...string) *Hook {
if c.isDisabled(name, args...) {
return nil
}
return &Hook{
Lifecycle: cdi.CreateContainerHook,
Path: string(c),
Args: append(c.requiredArgs(name), args...),
Path: c.nvidiaCDIHookPath,
Args: append(c.requiredArgs(name), c.transformArgs(name, args...)...),
Env: []string{fmt.Sprintf("NVIDIA_CTK_DEBUG=%v", c.debugLogging)},
}
}
func (c cdiHook) requiredArgs(name string) []string {
base := filepath.Base(string(c))
// isDisabled checks if the specified hook name is disabled.
func (c cdiHookCreator) isDisabled(name HookName, args ...string) bool {
if c.disabledHooks[name] {
return true
}
switch name {
case CreateSymlinksHook:
if len(args) == 0 {
return true
}
case ChmodHook:
if len(args) == 0 {
return true
}
}
return false
}
func (c cdiHookCreator) requiredArgs(name HookName) []string {
return append(c.fixedArgs, string(name))
}
func (c cdiHookCreator) transformArgs(name HookName, args ...string) []string {
switch name {
case CreateSymlinksHook:
var transformedArgs []string
for _, arg := range args {
transformedArgs = append(transformedArgs, "--link", arg)
}
return transformedArgs
case ChmodHook:
var transformedArgs = []string{"--mode", "755"}
for _, arg := range args {
transformedArgs = append(transformedArgs, "--path", arg)
}
return transformedArgs
default:
return args
}
}
// getFixedArgsForCDIHookCLI returns the fixed arguments for the hook CLI.
// If the nvidia-ctk binary is used, hooks are implemented under the hook
// subcommand.
// For the nvidia-cdi-hook binary, the hooks are implemented as subcommands of
// the top-level CLI.
func getFixedArgsForCDIHookCLI(nvidiaCDIHookPath string) []string {
base := filepath.Base(nvidiaCDIHookPath)
if base == "nvidia-ctk" {
return []string{base, "hook", name}
return []string{base, "hook"}
}
return []string{base, name}
return []string{base}
}

Some files were not shown because too many files have changed in this diff Show More