nvidia-container-toolkit

mirror of https://github.com/NVIDIA/nvidia-container-toolkit synced 2025-06-16 11:30:20 +00:00

Author	SHA1	Message	Date
Evan Lezar	e5b690e200	Resolve to legacy by default in nvidia-container-runtime-hook Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-16 11:25:16 +02:00
Evan Lezar	a39d147cbb	Default to jit-cdi mode in the nvidia runtime Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-16 11:25:13 +02:00
Evan Lezar	5d4da6200a	[no-relnote] Add RuntimeMode type Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-16 11:05:54 +02:00
Evan Lezar	7f5ad9c5b2	Ensure consistent sorting of annotation devices Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-16 10:52:06 +02:00
Evan Lezar	8bfce9488d	Use functional options to construct runtime mode resolver Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-16 10:52:06 +02:00
Evan Lezar	6359cc9919	Remove docker-runc as default runtime candidate This change removes docker-runc as the highest priority default candidate for the low-level runtimes supported by the nvidia-container-runtime. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-13 17:26:19 +02:00
Evan Lezar	4f9c860a37	Merge pull request #927 from elezar/disable-device-node-creation Disable device node creation in CDI mode	2025-06-13 16:44:18 +02:00
Evan Lezar	cc7812470f	Merge pull request #1143 from elezar/add-device-ids-to-getspec Add device IDs to nvcdi.GetSpec API	2025-06-13 16:41:43 +02:00
Evan Lezar	8be03cfc41	[no-relnote] Ignore annotation devices for non-CDI modes Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-13 15:26:48 +02:00
Evan Lezar	8650ca6533	[no-relnote] Move hookCreator initialisation for readability Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-13 15:00:51 +02:00
Evan Lezar	1bc2a9fee3	Return annotation devices from VisibleDevices This change includes annotation devices in CUDA.VisibleDevices with the highest priority. This allows for the CDI device request extraction to be consistent across all request mechanisms. Note that this does change behaviour in the following ways: 1. Annotations are considered when resolving the runtime mode. 2. Incorrectly formed device names in annotations are no longer treated as an error. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-13 14:56:08 +02:00
Evan Lezar	dc87dcf786	Make CDI device requests consistent with other methods Following the refactoring of device request extraction, we can now make CDI device requests consistent with other methods. This change moves to using image.VisibleDevices instead of separate calls to CDIDevicesFromMounts and VisibleDevicesFromEnvVar. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-13 14:34:02 +02:00
Evan Lezar	f17d424248	Construct container info once Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-13 14:05:51 +02:00
Evan Lezar	426186c992	Add logic to extract annotation device requests to image type This change updates the image.CUDA type to also extract CDI device requests. These are only relevant IF CDI prefixes are specifically set. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-13 14:05:51 +02:00
Evan Lezar	6849ebd621	Add IsPrivileged function to CUDA container type Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-13 14:05:51 +02:00
Evan Lezar	0134ba4250	Add device IDs to nvcdi.GetSpec API This change allows device IDs to be specified in the GetSpec API. This simplifies cases where CDI specs are being generated for specific devices by ID. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-12 16:23:11 +02:00
Evan Lezar	27f5ec83de	Merge pull request #1125 from elezar/vulkan-target-cpu Add discovery of arch-specific vulkan ICD	2025-06-05 14:09:50 +02:00
Carlos Eduardo Arango Gutierrez	dede03f322	Refactor extracting requested devices from the container image This change consolidates the logic for determining requested devices from the container image. The logic for this has been integrated into the image.CUDA type so that multiple implementations are not required. Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Evan Lezar <elezar@nvidia.com>	2025-06-05 12:38:45 +02:00
Evan Lezar	2de997e25b	Add discovery of arch-specific vulkan ICD On some RPM-based platforms, the path of the Vulkan ICD file includes an architecture-specific infix to distinguish it from other architectures (most notably x86_64 vs i686). This change attempts to discover the arch-specific ICD file in addition to the standard nvidia_icd.json. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-04 23:06:16 +02:00
Evan Lezar	e046d6ae79	Add disabled-device-node-modification hook to CDI spec This hook is not added to management specs. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-04 19:08:48 +02:00
Carlos Eduardo Arango Gutierrez	6cf0248321	Added ability to disable specific (or all) CDI hooks This change adds the ability to disable specific (or all) CDI hooks to both the nvidia-ctk cdi generate command and the nvcdi API. Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-06-03 16:01:20 +02:00
Carlos Eduardo Arango Gutierrez	b4787511d2	Consolidate HookName functionality on internal/discover pkg Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2025-06-03 15:24:43 +02:00
Evan Lezar	479df7134a	Add envvar to control debug logging in CDI hooks This change allows hooks to be configured with debug logging. This is currently only enabled for the hooks generated from the runtime. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-05-30 15:27:52 +02:00
Carlos Eduardo Arango Gutierrez	aaaa3c6275	Edit discover.mounts to have a deterministic output Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-05-22 15:25:03 +02:00
Carlos Eduardo Arango Gutierrez	cf3b9317ef	Refactor the way we create CDI Hooks Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2025-05-21 10:19:47 +02:00
Evan Lezar	6dfd63f4a8	Merge pull request #980 from elezar/add-rprivate-to-mount-options Add rprivate to CDI mount options	2025-05-16 07:53:39 +02:00
Evan Lezar	f4981f0876	Add cuda-compat-mode config option This change adds an nvidia-container-runtime.modes.legacy.cuda-compat-mode config option. This can be set to one of four values: * ldconfig (default): the --cuda-compat-mode=ldconfig flag is passed to the nvidia-container-cli * mount: the --cuda-compat-mode=mount flag is passed to the nvidia-container-cli * disabled: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli * hook: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli AND the enable-cuda-compat hook is used to provide forward compatibility. Note that the disable-cuda-compat-lib-hook feature flag will prevent the enable-cuda-compat hook from being used. This change also means that the allow-cuda-compat-libs-from-container feature flag no longer has any effect. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-05-13 21:49:53 +02:00
Evan Lezar	a4dc28bb3f	Fix mode detection on Thor-based systems This change updates github.com/NVIDIA/go-nvlib from v0.7.1 to v0.7.2 to allow Thor systems to be detected as Tegra-based. This allows automatic mode detection to work on these systems. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-05-13 21:25:11 +02:00
Evan Lezar	d0103aa6a3	Add rprivate to CDI mount options This ensures that mount propagation is set to rprivate for mounts from the host into the container. This aligns with the default in docker. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-05-09 15:16:13 +02:00
Evan Lezar	adb5e6719d	Merge pull request #1046 from elezar/resolve-ldcache-libs-on-arm64 Fix resolution of libs in LDCache on ARM	2025-05-09 15:04:00 +02:00
Evan Lezar	0c765c6536	Skip nil discoverers in merge When constructing a list of discoverers using discover.Merge we explicitly skip `nil` discoverers to simplify usage as we don't have to explicitly check validity when processing the discoverers in the list. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-05-07 12:51:38 +02:00
Evan Lezar	e6cd7a3b53	Fix resolution of libs in LDCache on ARM Since we explicitly check for the architecture of the libraries in the ldcache, we need to also check the architecture flag against the ARM constants. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-04-23 14:28:28 +02:00
Evan Lezar	986f3db971	Fix race condition in mounts cache This change switches to using the WithCache decorator for mounts instead of keeping track of a cache locally. This addresses a race condition when using the mounts structure. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-04-02 16:29:49 +02:00
Evan Lezar	be9d7b6db1	[no-relnote] Fix QF1002: could use tagged switch on info.SubType lint errors Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-04-02 14:18:32 +02:00
Evan Lezar	c1c6534b1f	[no-relnote] Fix QF1008: could remove embedded field lint errors Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-04-02 14:18:32 +02:00
Evan Lezar	2a9bae8e80	[no-relnotes] Update moq to use rm and goimports Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-03-12 13:12:13 +02:00
Evan Lezar	bc9ec77fdd	Merge pull request #943 from elezar/add-disable-imex-channels-feature Add ignore-imex-channel-requests feature flag	2025-02-28 17:53:28 +02:00
Evan Lezar	aff9301f2e	Add disable-cuda-compat-lib-hook feature flag Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-02-27 15:58:15 +02:00
Evan Lezar	2adef9903e	Ensure that mode hook is executed last Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-02-27 15:58:15 +02:00
Evan Lezar	b7fbd56f7e	Add ldconfig hook in legacy mode Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-02-27 15:58:15 +02:00
Evan Lezar	bd87c009ba	Add enable-cuda-compat hook if required This change adds the enable-cuda-compat hook to the incoming OCI runtime spec if the allow-cuda-compat-libs-from-container feature flag is not enabled. An update-ldcache hook is also injected to ensure that the required folders are processed. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-02-27 15:58:15 +02:00
Evan Lezar	352b55c8ce	Add ignore-imex-channel-requests feature flag This allows the NVIDIA Container Toolkit to ignore IMEX channel requests through the NVIDIA_IMEX_CHANNELS envvar or volume mounts and ensures that the NVIDIA Container Toolkit cannot be used to provide out-of-band access to an IMEX channel by simply specifying an environment variable, possibly bypassing other checks by an orchestration system such as kubernetes. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-02-26 17:46:36 +02:00
Evan Lezar	03152dba8d	Allow cdi mode to work with --gpus flag This change ensures that the cdi modifier also removes the NVIDIA Container Runtime Hook from the incoming spec. This aligns with what is done for CSV modifications and prevents an error when starting the container. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-02-05 19:01:43 +01:00
Evan Lezar	cf026dce9a	[no-relnote] Remove duplicate test case Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-02-05 19:01:43 +01:00
Carlos Eduardo Arango Gutierrez	bf9d618ff2	Rename test folder to tests Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-01-23 11:46:14 +01:00
Evan Lezar	ed3b52eb8d	Add allow-cuda-compat-libs-from-container feature flag This change adds an allow-cuda-compat-libs-from-container feature flag to the NVIDIA Container Toolkit config. This allows a user to opt-in to the previous default behaviour of overriding certain driver libraries with CUDA compat libraries from the container. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-01-22 17:34:20 +01:00
Evan Lezar	991b9c222f	Skip graphics modifier in CSV mode In CSV mode the CSV files at /etc/nvidia-container-runtime/host-files-for-container.d/ should be the source of truth for container modifications. This change skips graphics modifications to a container. This prevents conflicts when handling files such as vulkan icd files which are already defined in the CSV file. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-01-22 13:58:31 +01:00
Evan Lezar	fdad3927b4	[no-relnote] Refactor oci spec modifier list Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-01-22 13:58:31 +01:00
Evan Lezar	d584f9a40b	[no-relnote] Sort feature flags Signed-off-by: Evan Lezar <elezar@nvidia.com>	2025-01-15 13:27:14 +01:00
Evan Lezar	2529aebd6c	Fix create-device-node test when devices exist This change fixes the TestCreateControlDevices test on systems where device nodes exist. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-12-06 14:05:51 +01:00


449 Commits
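
The cuda-compat-mode commit (f4981f0876) describes a mapping from a config value to a nvidia-container-cli flag plus an optional hook. The Go sketch below restates that mapping as a small standalone function, purely as an illustration. The names `compatPlan` and `resolveCompatMode` are hypothetical and are not taken from the toolkit's source. Treating an empty value as the `ldconfig` default and rejecting unknown values with an error are assumptions as well.

```go
package main

import "fmt"

// compatPlan is a hypothetical type for this sketch: it records which flag
// would be passed to the nvidia-container-cli and whether the
// enable-cuda-compat hook would be used.
type compatPlan struct {
	CLIFlag string
	UseHook bool
}

// resolveCompatMode mirrors the four values listed in the commit message.
// hookDisabled stands in for the disable-cuda-compat-lib-hook feature flag.
// Assumption: an empty mode falls back to the documented default (ldconfig).
func resolveCompatMode(mode string, hookDisabled bool) (compatPlan, error) {
	switch mode {
	case "", "ldconfig":
		return compatPlan{CLIFlag: "--cuda-compat-mode=ldconfig"}, nil
	case "mount":
		return compatPlan{CLIFlag: "--cuda-compat-mode=mount"}, nil
	case "disabled":
		return compatPlan{CLIFlag: "--cuda-compat-mode=disabled"}, nil
	case "hook":
		// The CLI's own handling is disabled and the hook takes over,
		// unless the feature flag prevents the hook from being used.
		return compatPlan{CLIFlag: "--cuda-compat-mode=disabled", UseHook: !hookDisabled}, nil
	default:
		// Assumption: unknown values are rejected rather than ignored.
		return compatPlan{}, fmt.Errorf("invalid cuda-compat-mode %q", mode)
	}
}

func main() {
	for _, mode := range []string{"ldconfig", "mount", "disabled", "hook"} {
		plan, err := resolveCompatMode(mode, false)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-8s flag=%s hook=%t\n", mode, plan.CLIFlag, plan.UseHook)
	}
}
```

In this sketch, `hook` is the only value that can enable the enable-cuda-compat hook, and it still passes `--cuda-compat-mode=disabled` to the CLI, matching the commit message.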