This change adds an EnableCDI method to the container engine config files and
updates the 'nvidia-ctk runtime configure' command to use this new method.
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
This change adds an allow-cuda-compat-libs-from-container feature flag
to the NVIDIA Container Toolkit config. This allows a user to opt-in
to the previous default behaviour of overriding certain driver
libraries with CUDA compat libraries from the container.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change passes the --no-cntlibs argument to the nvidia-container-cli
from the nvidia-container-runtime-hook to disable overwriting host
drivers with the compat libs from a container being started.
Note that this may be a breaking change for some applications.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change moves the containerized installer from nvidia-toolkit to
cmd/nvidia-ctk-installer to allow for its use in CI.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This removes the untested watch option from the
nvidia-ctk system create-dev-char-symlinks command.
This also removes the direct dependency on fsnotify.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This fix ensures that the default config file path for the nvidia-ctk runtime configure
command is set consistently.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates the create-symlink hook to be equivalent to
ln -f -s target link
This ensures that links are updated even if they exist in the container
being run.
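
As a rough illustration of the `ln -f -s` semantics described above, a
minimal Go sketch could look as follows. The `forceSymlink` and `relinkDemo`
names are hypothetical and are not taken from the hook's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// forceSymlink is a hypothetical helper that mimics `ln -f -s target link`:
// any existing entry at link is removed before the symlink is created.
func forceSymlink(target, link string) error {
	// A missing entry is fine; any other removal error is reported.
	if err := os.Remove(link); err != nil && !errors.Is(err, fs.ErrNotExist) {
		return fmt.Errorf("failed to remove existing entry %q: %w", link, err)
	}
	if err := os.Symlink(target, link); err != nil {
		return fmt.Errorf("failed to create symlink %q -> %q: %w", link, target, err)
	}
	return nil
}

// relinkDemo creates the same link twice in a temporary directory and
// returns the target that the link points to afterwards.
func relinkDemo() (string, error) {
	dir, err := os.MkdirTemp("", "force-symlink")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	link := filepath.Join(dir, "libexample.so")
	for _, target := range []string{"libexample.so.1", "libexample.so.2"} {
		if err := forceSymlink(target, link); err != nil {
			return "", err
		}
	}
	return os.Readlink(link)
}

func main() {
	target, err := relinkDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(target)
}
```

In this sketch the second call replaces the link created by the first, so
the link ends up pointing at libexample.so.2.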
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates the create-symlinks hook to always evaluate
link paths in the container's root filesystem. In addition the
executable is updated to return an error if a link could not
be created.
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
This change ensures that we always treat the link path as a path
relative to the container root. Without this change, relative paths
in link paths would result in links being created relative to the
current working directory where the hook is executed.
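
The idea can be sketched in Go as a purely lexical join. The
`linkPathInRoot` name is hypothetical, and this sketch does not resolve
symlinks that may exist inside the container root.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// linkPathInRoot is a hypothetical helper that returns the host-side
// location of a link path specified relative to the container root.
func linkPathInRoot(containerRoot, link string) string {
	// Anchoring the link at "/" first means that relative paths and
	// leading ".." elements cannot end up outside containerRoot.
	return filepath.Join(containerRoot, filepath.Join("/", link))
}

func main() {
	for _, link := range []string{"usr/lib/libfoo.so", "/usr/lib/libfoo.so", "../../etc/passwd"} {
		fmt.Printf("%s -> %s\n", link, linkPathInRoot("/run/rootfs", link))
	}
}
```

Relative and absolute link paths map to the same location below the
container root, regardless of the hook's working directory.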
Signed-off-by: Evan Lezar <elezar@nvidia.com>
The hostRoot argument is always empty and not applicable to
how links are specified.
Links are specified by the paths in the container filesystem and as such
the only transform required to change the root is a join of the filepath.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Since hostRoot is always the empty string and we are changing the root in the
target path to /, the call to changeRoot is redundant.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change removes support for specifying csv-filenames when
calling the create-symlinks hook. This is no longer required
as tegra-based systems generate hooks with `--link` arguments.
This also allows the hook to better serve as a reference implementation
for upstream projects wanting to implement a set of standard CDI hooks.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change allows IMEX channels to be requested using the
volume mount mechanism.
A mount from /dev/null to /var/run/nvidia-container-devices/imex/{{ .ChannelID }}
is equivalent to including {{ .ChannelID }} in the NVIDIA_IMEX_CHANNELS
environment variable.
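
A minimal Go sketch of how channel IDs might be extracted from such mounts
is shown below. The `mount` type and the `imexChannelsFromMounts` function
are hypothetical, and the rule that nested paths are ignored is an
assumption of this sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// imexMountRoot is the mount destination prefix named in the commit message.
const imexMountRoot = "/var/run/nvidia-container-devices/imex"

// mount is a hypothetical, reduced view of a container mount.
type mount struct {
	Source      string
	Destination string
}

// imexChannelsFromMounts returns the channel IDs requested through mounts
// from /dev/null to <imexMountRoot>/<channel ID>.
func imexChannelsFromMounts(mounts []mount) []string {
	var channels []string
	for _, m := range mounts {
		if m.Source != "/dev/null" {
			continue
		}
		dest := filepath.Clean(m.Destination)
		if !strings.HasPrefix(dest, imexMountRoot+"/") {
			continue
		}
		id := strings.TrimPrefix(dest, imexMountRoot+"/")
		// Assumption: only direct children of the root name a channel.
		if id == "" || strings.Contains(id, "/") {
			continue
		}
		channels = append(channels, id)
	}
	return channels
}

func main() {
	mounts := []mount{
		{Source: "/dev/null", Destination: imexMountRoot + "/0"},
		{Source: "/dev/null", Destination: imexMountRoot + "/2047"},
		{Source: "/tmp", Destination: imexMountRoot + "/1"},
		{Source: "/dev/null", Destination: "/data"},
	}
	fmt.Println(imexChannelsFromMounts(mounts))
}
```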
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
Add the default runtime binary path to the runtimes field of the toolkit config TOML
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
[no-relnote] Get low-level runtimes consistently
We ensure that we use the same low-level runtimes regardless
of the runtime engine being configured. This ensures consistent
behaviour.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Co-authored-by: Evan Lezar <elezar@nvidia.com>
address review comment
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
This change refactors the toml config file handling for runtimes
such as containerd or crio. A toml.Loader is introduced that
encapsulates loading the required file.
This can be extended to allow other mechanisms for
loading the current config.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds an include-persistenced-socket flag to the
nvidia-ctk cdi generate command that ensures that a generated
specification includes the nvidia-persistenced socket if present on
the host.
Note that for management mode, these sockets are always included
if detected.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change skips the injection of the nvidia-persistenced socket by
default.
An include-persistenced-socket feature flag is added to allow the
injection of this socket to be explicitly requested.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates the create-symlinks hook to check whether
link paths resolve in the container's filesystem. In addition
the executable is updated to return an error if a link could
not be created.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that we do not unnecessarily print warnings for
runtimes where these configs are not applicable.
This removes the following warning:
WARN[0000] Ignoring runtime-config-override flag for docker
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that the created /etc/ld.so.conf.d file
has a higher priority to ensure that the injected libraries
take precedence over non-compat libraries.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change renames the nvidia-ctk system create-device-nodes
flag driver-root to root. This makes it clearer that this is
used to load the kernel modules and is not specific to the
user-mode driver installation.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change creates an nvidia-cdi-hook binary for implementing
CDI hooks. This allows for these hooks to be separated from the
nvidia-ctk command which may, for example, require libnvidia-ml
to support other functionality.
The nvidia-ctk hook subcommand is maintained as an alias for the
time being to allow for existing CDI specifications referring to
this path to work as expected.
Signed-off-by: Avi Deitcher <avi@deitcher.net>
This allows settings such as:
nvidia-ctk config --set nvidia-container-runtime.runtimes=crun:runc
to be applied correctly.
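
For illustration, a `--set` argument of this shape could be split into a
key and a list value as in the Go sketch below. The `parseSetting` function
is hypothetical, and treating ":" as the list separator is an assumption
inferred from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSetting is a hypothetical helper that splits a "key=value" argument
// and interprets the value as a ":"-separated list.
func parseSetting(arg string) (string, []string, error) {
	key, value, ok := strings.Cut(arg, "=")
	if !ok || key == "" {
		return "", nil, fmt.Errorf("invalid setting %q: expected key=value", arg)
	}
	return key, strings.Split(value, ":"), nil
}

func main() {
	key, values, err := parseSetting("nvidia-container-runtime.runtimes=crun:runc")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(key, values)
}
```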
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds a features config that allows
individual features to be toggled at a global level. Each feature can (by default)
be controlled by an environment variable.
The GDS, MOFED, NVSWITCH, and GDRCOPY features are examples of such features.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that CLI tools that require the path to the
driver root accept both the NVIDIA_DRIVER_ROOT and DRIVER_ROOT
environment variables in addition to the --driver-root command
line argument.
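
A minimal Go sketch of this lookup follows. The `resolveDriverRoot` name,
the precedence between the two environment variables, and the "/" fallback
are assumptions of this sketch, not taken from the change itself.

```go
package main

import (
	"fmt"
	"os"
)

// resolveDriverRoot is a hypothetical helper. An explicit flag value wins;
// otherwise the environment variables are consulted in order.
func resolveDriverRoot(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	// Assumption: NVIDIA_DRIVER_ROOT is checked before DRIVER_ROOT.
	for _, name := range []string{"NVIDIA_DRIVER_ROOT", "DRIVER_ROOT"} {
		if value := getenv(name); value != "" {
			return value
		}
	}
	// Assumption: fall back to the host root if nothing is specified.
	return "/"
}

func main() {
	fmt.Println(resolveDriverRoot("", os.Getenv))
}
```

Passing the lookup function as an argument keeps the helper easy to test
without modifying the process environment.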
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Instead of relying only on Experimental mode, the docker daemon
config requires that CDI is an opt-in feature.
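
As a sketch, opting in could amount to setting `features.cdi` in the Docker
daemon config, which is the key Docker documents for enabling CDI. The
`enableCDI` function below is hypothetical and is not the method added to
the toolkit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// enableCDI is a hypothetical helper that sets features.cdi=true in the
// contents of a Docker daemon.json file, preserving other settings.
func enableCDI(daemonJSON []byte) ([]byte, error) {
	config := map[string]interface{}{}
	if len(daemonJSON) > 0 {
		if err := json.Unmarshal(daemonJSON, &config); err != nil {
			return nil, fmt.Errorf("failed to parse daemon config: %w", err)
		}
	}
	if config == nil {
		// A literal JSON null unmarshals to a nil map.
		config = map[string]interface{}{}
	}
	features, ok := config["features"].(map[string]interface{})
	if !ok {
		features = map[string]interface{}{}
	}
	features["cdi"] = true
	config["features"] = features
	return json.Marshal(config)
}

func main() {
	out, err := enableCDI([]byte(`{"experimental": false}`))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(out))
}
```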
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Since the `update-ldcache` hook uses the host's `ldconfig`, the default
cache and config files configured on the host will be used. If those
defaults differ from what nvidia-ctk expects them to be (/etc/ld.so.cache
and /etc/ld.so.conf, respectively), then the hook will fail. This change
makes the call to ldconfig explicit in which cache and config files are
being used.
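
A sketch of what an explicit invocation might look like is given below.
The `-C` (cache file), `-f` (config file), and `-r` (root directory) options
are standard ldconfig flags; the `ldconfigArgs` helper and the use of `-r`
here are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// ldconfigArgs is a hypothetical helper that builds ldconfig arguments with
// the cache and config files named explicitly instead of relying on the
// defaults compiled into the host's ldconfig.
func ldconfigArgs(containerRoot string, dirs ...string) []string {
	args := []string{
		"-r", containerRoot, // assumption: operate relative to the container root
		"-C", "/etc/ld.so.cache",
		"-f", "/etc/ld.so.conf",
	}
	return append(args, dirs...)
}

func main() {
	args := ldconfigArgs("/run/rootfs", "/usr/lib64")
	fmt.Println("ldconfig " + strings.Join(args, " "))
}
```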
Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
Since the `createContainer` `runc` hook runs with the environment that
the container's config.json specifies, the path to `ldconfig` may not be
easily resolvable if the host environment differs enough from the
container (e.g. on a NixOS host where all binaries are under hashed
paths in /nix/store with an Ubuntu container whose PATH contains
FHS-style paths such as /bin and /usr/bin). This change allows for
specifying exactly where ldconfig comes from.
Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
This change adds a --relative-to option to the nvidia-ctk transform root
command. This defaults to "host" maintaining the existing behaviour.
If --relative-to=container is specified, the root transform is applied to
container paths in the CDI specification instead of host paths.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change renames the root transformer to indicate that it
operates on host paths and adds a container root transformer for
explicitly transforming container roots.
The transform.NewRootTransformer constructor still exists, but has
been marked as deprecated.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change switches to using the reflect package to determine
the type of config options instead of inferring the type from the
Toml data structure.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Instead of relying solely on a static config, we resolve the path
to ldconfig. The path is checked for existence and a .real suffix is preferred.
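
The resolution described above can be sketched in Go as follows. The
`resolveLDConfigPath` and `resolveDemo` names are hypothetical, and the
sketch only checks for existence, not for executability.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveLDConfigPath is a hypothetical helper that prefers path+".real"
// if it exists and otherwise falls back to path itself.
func resolveLDConfigPath(path string) (string, error) {
	for _, candidate := range []string{path + ".real", path} {
		if _, err := os.Stat(candidate); err == nil {
			return candidate, nil
		}
	}
	return "", fmt.Errorf("no ldconfig found at %q or %q", path+".real", path)
}

// resolveDemo resolves a path in a temporary directory, first with only
// "ldconfig" present and then with "ldconfig.real" added. It returns the
// base names of the two results.
func resolveDemo() (string, string, error) {
	dir, err := os.MkdirTemp("", "ldconfig-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "ldconfig")
	if err := os.WriteFile(path, nil, 0o755); err != nil {
		return "", "", err
	}
	first, err := resolveLDConfigPath(path)
	if err != nil {
		return "", "", err
	}
	if err := os.WriteFile(path+".real", nil, 0o755); err != nil {
		return "", "", err
	}
	second, err := resolveLDConfigPath(path)
	if err != nil {
		return "", "", err
	}
	return filepath.Base(first), filepath.Base(second), nil
}

func main() {
	first, second, err := resolveDemo()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(first, second)
}
```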
Signed-off-by: Evan Lezar <elezar@nvidia.com>
In some cases we might get a permission error when trying to chmod;
most likely this is due to something beyond our control,
such as the whole `/dev` being mounted.
Do not fail container creation in this case.
Because we lose control of the program after `exec()`-ing the `chmod(1)` program
and are therefore unable to handle errors,
refactor to use the `chmod(2)` syscall instead of `exec()`-ing `chmod(1)`.
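
A minimal sketch of this approach in Go is shown below. `os.Chmod` wraps
the `chmod(2)` syscall; the `chmodBestEffort` and `chmodDemo` names are
hypothetical, and silently dropping the permission error (rather than
logging it) is a simplification of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// chmodBestEffort is a hypothetical helper that changes the mode of path
// in-process and tolerates permission errors, so that the caller can keep
// going when the file is outside of its control.
func chmodBestEffort(path string, mode os.FileMode) error {
	err := os.Chmod(path, mode)
	if err == nil || errors.Is(err, fs.ErrPermission) {
		// Permission errors (EACCES or EPERM) are deliberately ignored.
		return nil
	}
	return fmt.Errorf("failed to chmod %q: %w", path, err)
}

// chmodDemo applies chmodBestEffort to a temporary file.
func chmodDemo() error {
	f, err := os.CreateTemp("", "chmod-demo")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name())
	if err := f.Close(); err != nil {
		return err
	}
	return chmodBestEffort(f.Name(), 0o666)
}

func main() {
	if err := chmodDemo(); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("ok")
}
```

Errors other than permission errors, such as a missing path, are still
returned to the caller in this sketch.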
Fixes: #143
Signed-off-by: Ievgen Popovych <jmennius@gmail.com>
This change skips the update of ld.so.cache in the container if it
doesn't exist. Instead, the -N flag is used to only create the
relevant symlinks.
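
The selection of the flag can be sketched as follows. The `-N` option is
the standard ldconfig flag for not rebuilding the cache while still
updating links; the `ldcacheFlags` and `ldcacheDemo` names and the
/etc/ld.so.cache location checked here are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ldcacheFlags is a hypothetical helper that returns extra ldconfig flags
// depending on whether the container already has an ld.so.cache.
func ldcacheFlags(containerRoot string) []string {
	cache := filepath.Join(containerRoot, "etc", "ld.so.cache")
	if _, err := os.Stat(cache); err != nil {
		// No cache in the container: only create the symlinks.
		return []string{"-N"}
	}
	return nil
}

// ldcacheDemo evaluates ldcacheFlags for a temporary root before and after
// an (empty) cache file is created in it.
func ldcacheDemo() (before, after []string, err error) {
	root, err := os.MkdirTemp("", "rootfs")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(root)

	before = ldcacheFlags(root)
	if err := os.MkdirAll(filepath.Join(root, "etc"), 0o755); err != nil {
		return nil, nil, err
	}
	if err := os.WriteFile(filepath.Join(root, "etc", "ld.so.cache"), nil, 0o644); err != nil {
		return nil, nil, err
	}
	after = ldcacheFlags(root)
	return before, after, nil
}

func main() {
	before, after, err := ldcacheDemo()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(before, after)
}
```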
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change allows CDI devices to be requested as mounts in the
container. This enables their use in environments such as kind
where environment variables or annotations cannot be used.
Signed-off-by: Evan Lezar <elezar@nvidia.com>