Commit Graph

14 Commits

Author SHA1 Message Date
Carlos Eduardo Arango Gutierrez
d03a06029a
BUGFIX: modifier: respect GPU volume-mount device requests
The gated modifiers used to add support for GDS, Mofed, and CUDA Forward Comatibility
only check the NVIDIA_VISIBLE_DEVICES envvar to determine whether GPUs are requested
and modifications should be made. This means that use cases where volume mounts are
used to request devices are not supported.

This change ensures that device extraction is consistent for all use cases.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-17 11:55:00 +02:00
Carlos Eduardo Arango Gutierrez
cf3b9317ef
Refactor the way we create CDI Hooks
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-21 10:19:47 +02:00
Evan Lezar
f4981f0876
Add cuda-compat-mode config option
This change adds an nvidia-container-runtime.modes.legacy.cuda-compat-mode
config option. This can be set to one of four values:

* ldconfig (default): the --cuda-compat-mode=ldconfig flag is passed to the nvidia-container-cli
* mount: the --cuda-compat-mode=mount flag is passed to the nvidia-conainer-cli
* disabled: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli
* hook: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli AND the
  enable-cuda-compat hook is used to provide forward compatibility.

Note that the disable-cuda-compat-lib-hook feature flag will prevent the enable-cuda-compat
hook from being used. This change also means that the allow-cuda-compat-libs-from-container
feature flag no longer has any effect.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-13 21:49:53 +02:00
Evan Lezar
aff9301f2e
Add disable-cuda-compat-lib-hook feature flag
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
b7fbd56f7e
Add ldconfig hook in legacy mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
bd87c009ba
Add enable-cuda-compat hook if required
This change adds the enable-cuda-compat hook to the incomming OCI runtime spec
if the allow-cuda-compat-libs-from-container feature flag is not enabled.

An update-ldcache hook is also injected to ensure that the required folders
are processed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 15:58:15 +02:00
Evan Lezar
92df542f2f [no-relnote] Use image.CUDA to extract visible devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 16:53:17 +02:00
Evan Lezar
1d9d0acf7d [no-relnote] Remove feature flag for per-container features
This change REMOVES the ability to set opt-in features
(e.g. GDS, MOFED, GDRCOPY) in the config file. The existing
per-container envvars are unaffected.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-16 15:30:31 +02:00
Evan Lezar
5145b0a4b6 Revert "Merge pull request #694 from elezar/add-opt-in-to-sockets"
This reverts commit b061446694, reversing
changes made to c490baab63.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-20 20:26:45 +02:00
Evan Lezar
ba1ed3232f Skip injection of nvidia-persistenced socket by default
This changes skips the injection of the nvidia-persistenced socket by
default.

An include-persistenced-socket feature flag is added to allow the
injection of this socket to be explicitly requested.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:10:09 +02:00
Evan Lezar
09341a0934 Add support for feature flags
This change adds a features config that allows
individual features to be toggled at a global level. Each feature can (by default)
be controlled by an environment variable.

The GDS, MOFED, NVSWITCH, and GDRCOPY features are examples of such features.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-03 11:58:37 +02:00
Christopher Desiniotis
55097b3d7d Add a new gated modifier for GDRCopy which injects the gdrdrv device node
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2024-01-24 14:25:58 -08:00
Evan Lezar
efae501834 Add support for injecting NVSWITCH devices
This change adds support for an NVIDIA_NVSWITCH environment variable.
When set to `enabled` this striggers the injection of all available
/dev/nvidia-nvswitch* device nodes.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-22 21:59:39 +01:00
Evan Lezar
3045954cd9 Consolidate GDS and MOFED modifiers
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-22 21:59:17 +01:00