Commit Graph

343 Commits

Author SHA1 Message Date
Evan Lezar
997f23cf11
Merge pull request #953 from elezar/backport-enable-cdi-container
Add option in toolkit container to enable CDI in runtime
2025-03-06 10:50:43 +02:00
Evan Lezar
e4f8406139
Merge pull request #952 from elezar/add-disable-imex-channels-feature
Add ignore-imex-channel-requests feature flag.
2025-03-06 10:49:47 +02:00
Evan Lezar
aa0d4af51a
Merge pull request #951 from elezar/add-e2e-tests
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Add e2e tests to release-1.17 branch
2025-03-05 19:04:50 +02:00
Evan Lezar
267fb5987f
Merge pull request #950 from elezar/seal-ldconfig
Some checks failed
CodeQL / Analyze Go code with CodeQL (push) Has been cancelled
Golang / check (push) Has been cancelled
Golang / Unit test (push) Has been cancelled
Golang / Build (push) Has been cancelled
image / packages (${{github.event_name == 'pull_request'}}, centos7-aarch64) (push) Has been cancelled
image / packages (${{github.event_name == 'pull_request'}}, centos7-x86_64) (push) Has been cancelled
image / packages (${{github.event_name == 'pull_request'}}, centos8-ppc64le) (push) Has been cancelled
image / packages (${{github.event_name == 'pull_request'}}, ubuntu18.04-amd64) (push) Has been cancelled
image / packages (${{github.event_name == 'pull_request'}}, ubuntu18.04-arm64) (push) Has been cancelled
image / packages (${{github.event_name == 'pull_request'}}, ubuntu18.04-ppc64le) (push) Has been cancelled
image / image (packaging, ${{github.event_name == 'pull_request'}}) (push) Has been cancelled
image / image (ubi8, ${{github.event_name == 'pull_request'}}) (push) Has been cancelled
image / image (ubuntu20.04, ${{github.event_name == 'pull_request'}}) (push) Has been cancelled
Use memfd when running ldconfig
2025-02-28 22:12:57 +02:00
Evan Lezar
eb48d2d5fd
Enable CDI in runtime if CDI_ENABLED is set
This change also enables CDI in the configured runtime when the toolkit
is installed with CDI enabled.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-28 22:10:38 +02:00
Christopher Desiniotis
1bc9548a2f
Add EnableCDI() method to engine.Interface
This change adds an EnableCDI method to the container engine config files and
Updates the 'nvidia-ctk runtime configure' command to use this new method.

Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2025-02-28 18:33:21 +02:00
Evan Lezar
7c758c97b8
Add ignore-imex-channel-requests feature flag
This allows the NVIDIA Container Toolkit to ignore IMEX channel requests
through the NVIDIA_IMEX_CHANNELS envvar or volume mounts and ensures that
the NVIDIA Container Toolkit cannot be used to provide out-of-band access
to an IMEX channel by simply specifying an environment variable, possibly
bypassing other checks by an orchestration system such as kubernetes.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-28 17:54:11 +02:00
Carlos Eduardo Arango Gutierrez
868f385a01
Rename test folder to tests
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-28 17:24:22 +02:00
Evan Lezar
5bdf14b1e7
Use libcontainer execseal to run ldconfig
This change copies ldconfig into a memfd before executing it from
the createContainer hook.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-28 14:47:31 +02:00
Evan Lezar
b598826ff2
[no-relnote] Move root to separate file
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-28 14:01:50 +02:00
Evan Lezar
e1ae57eef9
Add enable-cuda-compat hook if required
This change adds the enable-cuda-compat hook to the incomming OCI runtime spec
if the allow-cuda-compat-libs-from-container feature flag is not enabled.

An update-ldcache hook is also injected to ensure that the required folders
are processed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 17:26:15 +02:00
Evan Lezar
76040ff2ad
Add enable-cuda-compat hook to allow compat libs to be discovered
This change adds an nvidia-cdi-hook enable-cuda-compat hook that checks the
container for cuda compat libs and updates /etc/ld.so.conf.d to include their
parent folder if their driver major version is sufficient.

This allows CUDA Forward Compatibility to be used when this is not available
through the libnvidia-container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-02-27 17:26:15 +02:00
Evan Lezar
2310ed76d8
Add allow-cuda-compat-libs-from-container feature flag
This change adds an allow-cuda-compat-libs-from-container feature flag
to the NVIDIA Container Toolkit config. This allows a user to opt-in
to the previous default behaviour of overriding certain driver
libraries with CUDA compat libraries from the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-01-23 10:59:52 +01:00
Evan Lezar
f2b3e8d381
Disable mounting of compat libs from container
This change passes the --no-cntlibs argument to the nvidia-container-cli
from the nvidia-container-runtime-hook to disable overwriting host
drivers with the compat libs from a container being started.

Note that this may be a breaking change for some applications.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-01-23 10:59:52 +01:00
Evan Lezar
c90338dd86
[no-relnote] Refactor config handling for hook
This change removes indirect calls to get the default config
from the nvidia-container-runtime-hook.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-11-26 13:37:00 +01:00
Evan Lezar
0322f85690
[no-relnote] Remove unused code
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-11-26 13:37:00 +01:00
Evan Lezar
edf5d970f4
Fix bug in default config file path
This fix ensures that the default config file path for the nvidia-ctk runtime configure
command is set consistently.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-11-08 15:58:12 -08:00
Evan Lezar
324096c979
Force symlink creation in create-symlink hook
This change updates the create-symlink hook to be equivalent to
ln -f -s target link

This ensures that links are updated even if they exist in the container
being run.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-11-05 09:39:11 -08:00
Evan Lezar
3fb1615d26
[no-relnote] Address lint errors in test
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-31 10:11:58 +01:00
Christopher Desiniotis
7e0cd45b1c
Check for valid paths in create-symlinks hook
This change updates the create-symlinks hook to always evaluate
link paths in the container's root filesystem. In addition the
executable is updated to return an error if a link could not
be created.

Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2024-10-29 12:16:51 -07:00
Christopher Desiniotis
a04e3ac4f7
Write failing test case for create-symlinks hook
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2024-10-29 12:16:51 -07:00
Christopher Desiniotis
92779e71b3
Handle case where symlink already exists in create-symlinks hook
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2024-10-29 12:16:51 -07:00
Christopher Desiniotis
23f1ba3e93
Add unit tests for create-symlinks hook
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-29 12:16:51 -07:00
Evan Lezar
d0d85a8c5c
Always use paths relative to the container root for links
This chagne ensures that we always treat the link path as a path
relative to the container root. Without this change, relative paths
in link paths would result links being created relative to the
current working directory where the hook is executed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-29 12:16:51 -07:00
Evan Lezar
bfea673d6a
[no-relnote] Remove unused hostRoot argument
The hostRoot argument is always empty and not applicable to
how links are specified.

Links are specified by the paths in the container filesystem and as such
the only transform required to change the root is a join of the filepath.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-29 12:16:50 -07:00
Evan Lezar
6a6a3e6055
[no-relnote] Remove redundant changeRoot for link target
Since hostRoot is always the empty string and we are changing the root in the
target path to /, the call to changeRoot is redundant.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-29 12:16:50 -07:00
Evan Lezar
fa59d12973
[no-relnote] Check created outside of create loop
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-29 12:16:49 -07:00
Evan Lezar
c802c3089c Remove unsupported print-ldcache command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-24 15:56:25 +02:00
Evan Lezar
5e3e91a010 [no-relnote] Minor cleanup in create-symlinks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-18 16:27:38 +02:00
Evan Lezar
dc0e191093 Remove csv-filename support from create-symlinks
This change removes support for specifying csv-filenames when
calling the create-symlinks hook. This is no longer required
as tegra-based systems generate hooks with `--link` arguments.

This also allows the hook to better serve as a reference implementation
for upstream projects wanting to implement a set of standard CDI hooks.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-18 16:27:27 +02:00
Evan Lezar
e5175c270e
Merge pull request #745 from elezar/fix-symlink-logging
Fix symlink resolution error message
2024-10-17 18:04:54 +02:00
Evan Lezar
2e6712d2bc Allow IMEX channels to be requested as volume mounts
This change allows IMEX channels to be requested using the
volume mount mechanism.

A mount from /dev/null to /var/run/nvidia-container-devices/imex/{{ .ChannelID }}
is equivalent to including {{ .ChannelID }} in the NVIDIA_IMEX_CHANNELS
envvironment variables.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 16:54:29 +02:00
Evan Lezar
92df542f2f [no-relnote] Use image.CUDA to extract visible devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 16:53:17 +02:00
Evan Lezar
1991b3ef2a [no-relnote] Use string slice for devices in hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 16:53:17 +02:00
Evan Lezar
cdf39fbad3 [no-relnote] Use symlinks.Resolve in hook
This change removes duplicate logic from the create-symlinks hook
and uses symlinks.Resolve instead.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 15:47:13 +02:00
Evan Lezar
457d71c170 Add disable-imex-channel-creation feature flag
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 14:26:24 +02:00
Tariq Ibrahim
b90ee5d100 [no-relnote] minor cleanup and improvements
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-11 16:14:41 +02:00
Tariq Ibrahim
f477dc0df1
fetch current container runtime config through the command line
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>

add default runtime binary path to runtimes field of toolkit config toml

Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>

[no-relnote] Get low-level runtimes consistently

We ensure that we use the same low-level runtimes regardless
of the runtime engine being configured. This ensures consistent
behaviour.

Signed-off-by: Evan Lezar <elezar@nvidia.com>

Co-authored-by: Evan Lezar <elezar@nvidia.com>

address review comment

Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
2024-10-10 01:13:20 -07:00
Evan Lezar
bf2bdfd35e Refactor Toml config handling
This change refactors the toml config file handlig for runtimes
such as containerd or crio. A toml.Loader is introduced that
encapsulates loading the required file.

This can be extended to allow other mechanisms for loading
loading the current config.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-30 14:24:18 +02:00
Evan Lezar
6c5f4eea63 Remove support for config overrides
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-27 13:23:35 +02:00
Evan Lezar
3121663537
Merge pull request #708 from elezar/revert-check-symlink-resolution
Revert "Merge pull request #696 from elezar/check-link-resolution"
2024-09-20 20:50:02 +02:00
Evan Lezar
dc2ccdd2fa Revert "Merge pull request #696 from elezar/check-link-resolution"
This reverts commit c490baab63, reversing
changes made to 32486cf1e9.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-20 20:29:42 +02:00
Evan Lezar
5145b0a4b6 Revert "Merge pull request #694 from elezar/add-opt-in-to-sockets"
This reverts commit b061446694, reversing
changes made to c490baab63.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-20 20:26:45 +02:00
Evan Lezar
a819cfdab4 Revert "Merge pull request #703 from elezar/fix-no-persistenced-flag"
This reverts commit c02b144ed4, reversing
changes made to b061446694.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-20 20:26:33 +02:00
Evan Lezar
16fdef50e6 [no-relnote] Move --no-persistenced flag to after configure
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:52:25 +02:00
Evan Lezar
70da6cfa50 Allow inclusion of persistenced socket in CDI specification
This change adds an include-persistenced-socket flag to the
nvidia-ctk cdi generate command that ensures that a generated
specification includes the nvidia-persistenced socket if present on
the host.

Note that for mangement mode, these sockets are always included
if detected.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:30:27 +02:00
Evan Lezar
ba1ed3232f Skip injection of nvidia-persistenced socket by default
This changes skips the injection of the nvidia-persistenced socket by
default.

An include-persistenced-socket feature flag is added to allow the
injection of this socket to be explicitly requested.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:10:09 +02:00
Evan Lezar
e383b75a4d Check for valid paths in create-symlinks hook
This change updates the create-symlinks hook to check whether
link paths resolve in the container's filesystem. In addition
the executable is updated to return an error if a link could
not be created.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 17:47:19 +02:00
Evan Lezar
89c12c1368 Use empty string for default runtime-config-override
This change ensures that we unnecessarily print warnings for
runtimes where these configs are not applicable.

This removes the following warnings:
WARN[0000] Ignoring runtime-config-override flag for docker

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-08-19 11:33:26 +02:00
Evan Lezar
b5743da52f
Merge pull request #526 from elezar/add-dev-root-to-create-device-nodes
Add dev-root option to create-device-nodes
2024-06-17 14:57:26 +02:00