Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
add default runtime binary path to runtimes field of toolkit config toml
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
[no-relnote] Get low-level runtimes consistently
We ensure that we use the same low-level runtimes regardless
of the runtime engine being configured. This ensures consistent
behaviour.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Co-authored-by: Evan Lezar <elezar@nvidia.com>
address review comment
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
This change refactors the toml config file handling for runtimes
such as containerd or crio. A toml.Loader is introduced that
encapsulates loading the required file.
This can be extended to allow other mechanisms for
loading the current config.
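
A minimal Go sketch of the loader idea, for illustration only. The Loader
interface, fileLoader, and staticLoader below are hypothetical names and do
not reproduce the actual toml.Loader signature; treating a missing file as an
empty config is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Loader is a hypothetical interface that hides where the raw config
// contents come from.
type Loader interface {
	Load() ([]byte, error)
}

// fileLoader reads the config from a path on disk.
type fileLoader string

// Load returns the file contents. Treating a missing file as an empty
// config is an assumption made for this sketch.
func (f fileLoader) Load() ([]byte, error) {
	data, err := os.ReadFile(string(f))
	if errors.Is(err, os.ErrNotExist) {
		return nil, nil
	}
	return data, err
}

// staticLoader returns fixed contents, for example in a test.
type staticLoader []byte

func (s staticLoader) Load() ([]byte, error) { return s, nil }

func main() {
	loaders := []Loader{
		fileLoader("/etc/containerd/config.toml"), // illustrative path
		staticLoader("version = 2\n"),
	}
	for _, l := range loaders {
		data, err := l.Load()
		fmt.Printf("%d bytes, err=%v\n", len(data), err)
	}
}
```

Callers only depend on Load, so another source (for example, the output of a
command) could be added without touching the code that consumes the config.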
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds an include-persistenced-socket flag to the
nvidia-ctk cdi generate command that ensures that a generated
specification includes the nvidia-persistenced socket if present on
the host.
Note that for management mode, these sockets are always included
if detected.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change skips the injection of the nvidia-persistenced socket by
default.
An include-persistenced-socket feature flag is added to allow the
injection of this socket to be explicitly requested.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates the create-symlinks hook to check whether
link paths resolve in the container's filesystem. In addition
the executable is updated to return an error if a link could
not be created.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that we do not unnecessarily print warnings for
runtimes where these configs are not applicable.
This removes the following warnings:
WARN[0000] Ignoring runtime-config-override flag for docker
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that the created /etc/ld.so.conf.d file
has a higher priority to ensure that the injected libraries
take precedence over non-compat libraries.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change renames the nvidia-ctk system create-device-nodes
flag driver-root to root. This makes it clearer that this is
used to load the kernel modules and is not specific to the
user-mode driver installation.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change creates an nvidia-cdi-hook binary for implementing
CDI hooks. This allows for these hooks to be separated from the
nvidia-ctk command which may, for example, require libnvidia-ml
to support other functionality.
The nvidia-ctk hook subcommand is maintained as an alias for the
time being to allow for existing CDI specifications referring to
this path to work as expected.
Signed-off-by: Avi Deitcher <avi@deitcher.net>
This allows settings such as:
nvidia-ctk config --set nvidia-container-runtime.runtimes=crun:runc
to be applied correctly.
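
As a rough illustration of how such a setting could be interpreted, the Go
sketch below splits a key=value pair and treats the value as a
colon-separated list. The parseSetting helper is hypothetical and is not the
nvidia-ctk implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSetting is a hypothetical helper: it splits "key=value" and treats
// the value as a colon-separated list.
func parseSetting(s string) (string, []string, error) {
	key, value, ok := strings.Cut(s, "=")
	if !ok || key == "" {
		return "", nil, fmt.Errorf("invalid setting %q: expected key=value", s)
	}
	return key, strings.Split(value, ":"), nil
}

func main() {
	key, values, err := parseSetting("nvidia-container-runtime.runtimes=crun:runc")
	if err != nil {
		panic(err)
	}
	fmt.Println(key, values)
}
```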
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds a features config that allows
individual features to be toggled at a global level. Each feature can (by default)
be controlled by an environment variable.
The GDS, MOFED, NVSWITCH, and GDRCOPY features are examples of such features.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that CLI tools that require the path to the
driver root accept both the NVIDIA_DRIVER_ROOT and DRIVER_ROOT
environment variables in addition to the --driver-root command
line argument.
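
A possible lookup order is sketched below in Go. The resolveDriverRoot
helper, the precedence (flag first, then NVIDIA_DRIVER_ROOT, then
DRIVER_ROOT), and the "/" fallback are all assumptions for illustration; the
message above does not specify them.

```go
package main

import (
	"fmt"
	"os"
)

// resolveDriverRoot returns the flag value if set, otherwise the first
// non-empty environment variable. The precedence and the "/" fallback
// are assumptions made for this sketch.
func resolveDriverRoot(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	for _, name := range []string{"NVIDIA_DRIVER_ROOT", "DRIVER_ROOT"} {
		if v := getenv(name); v != "" {
			return v
		}
	}
	return "/"
}

func main() {
	fmt.Println(resolveDriverRoot("", os.Getenv))
}
```

Passing the getenv function as a parameter keeps the helper easy to test
without modifying the process environment.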
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Instead of relying only on Experimental mode, the docker daemon
config requires that CDI is an opt-in feature.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Since the `update-ldcache` hook uses the host's `ldconfig`, the default
cache and config files configured on the host will be used. If those
defaults differ from what nvidia-ctk expects them to be (/etc/ld.so.cache
and /etc/ld.so.conf, respectively), then the hook will fail. This change
makes the call to ldconfig explicit in which cache and config files are
being used.
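
For illustration, an explicit invocation could be assembled as in the Go
sketch below. The explicitLdconfigArgs helper is hypothetical; -C and -f are
the ldconfig options for selecting the cache and config files, and any other
arguments the hook passes are omitted.

```go
package main

import "fmt"

// explicitLdconfigArgs is a hypothetical helper that names the cache and
// config files explicitly instead of relying on the host's defaults.
func explicitLdconfigArgs(cachePath, configPath string) []string {
	return []string{"ldconfig", "-C", cachePath, "-f", configPath}
}

func main() {
	fmt.Println(explicitLdconfigArgs("/etc/ld.so.cache", "/etc/ld.so.conf"))
}
```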
Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
Since the `createContainer` `runc` hook runs with the environment that
the container's config.json specifies, the path to `ldconfig` may not be
easily resolvable if the host environment differs enough from the
container (e.g. on a NixOS host where all binaries are under hashed
paths in /nix/store with an Ubuntu container whose PATH contains
FHS-style paths such as /bin and /usr/bin). This change allows for
specifying exactly where ldconfig comes from.
Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
This change adds a --relative-to option to the nvidia-ctk transform root
command. This defaults to "host", maintaining the existing behaviour.
If --relative-to=container is specified, the root transform is applied to
container paths in the CDI specification instead of host paths.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change renames the root transformer to indicate that it
operates on host paths and adds a container root transformer for
explicitly transforming container roots.
The transform.NewRootTransformer constructor still exists, but has
been marked as deprecated.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change switches to using the reflect package to determine
the type of config options instead of inferring the type from the
Toml data structure.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Instead of relying solely on a static config, we resolve the path
to ldconfig. The path is checked for existence and a .real suffix is preferred.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
In some cases we might get a permission error trying to chmod -
most likely this is due to something beyond our control,
like the whole `/dev` being mounted.
Do not fail container creation in this case.
Because we lose control of the program after `exec()`-ing the `chmod(1)` program
and are therefore unable to handle errors,
refactor to use the `chmod(2)` syscall instead.
Fixes: #143
Signed-off-by: Ievgen Popovych <jmennius@gmail.com>
This change skips the update of ld.so.cache in the container if it
doesn't exist. Instead, the -N flag is used to only create the
relevant symlinks.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change allows CDI devices to be requested as mounts in the
container. This enables their use in environments such as kind
where environment variables or annotations cannot be used.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change refactors the use of the symlink filter to make it extensible.
A blocked filter can be set on the Tegra CSV discoverer to ensure that the correct
symlink libraries are filtered out. Here, globs can be used to select multiple libraries,
and a **/ prefix on the globs indicates that the pattern that follows is only applied to
the filename of the symlink entry in the CSV file.
A --csv.ignore-pattern command line argument is added to the nvidia-ctk cdi generate
command that allows this to be set.
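
The matching rule described above could look like the Go sketch below. The
isBlocked helper and the library names are hypothetical, and filepath.Match
is used here as a stand-in for whatever glob matching the filter really uses.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isBlocked is a hypothetical helper: a "**/" prefix means the rest of the
// pattern is matched against the filename only; otherwise the pattern is
// matched against the full path.
func isBlocked(pattern, path string) (bool, error) {
	if strings.HasPrefix(pattern, "**/") {
		name := strings.TrimPrefix(pattern, "**/")
		return filepath.Match(name, filepath.Base(path))
	}
	return filepath.Match(pattern, path)
}

func main() {
	path := "/usr/lib/aarch64-linux-gnu/libexample.so.1" // hypothetical entry
	for _, pattern := range []string{"**/libexample.so*", "libexample.so*"} {
		blocked, err := isBlocked(pattern, path)
		fmt.Println(pattern, blocked, err)
	}
}
```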
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds a "required" option to the new toml config
that controls whether a default config is returned or not.
This is useful from the NVIDIA Container Runtime Hook, where
/run/driver/nvidia/etc/nvidia-container-runtime/config.toml
is checked before the standard path.
This fixes a bug where the default config was always applied
when this config was not used.
See https://github.com/NVIDIA/nvidia-container-toolkit/issues/106
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change renames the csv.library-search-path option to
library-search-path so as to be more generally applicable in
future. Note that the option is still only applied in csv mode.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change simplifies the nvidia-ctk config default command.
By default it now outputs the default config to STDOUT, and can
optionally output this to file.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change introduces a config.Toml type that is used as the base for
config file processing and manipulation. This ensures that configs --
including commented values -- can be handled consistently.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that the Config structs from internal.Config
are used for the NVIDIA Container Runtime Hook config too.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change extends the nvidia-ctk runtime configure command
with a --config-mode=oci-hook that creates an OCI hook json file.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
If the config.toml has an empty root specified, this could be
passed to the NVIDIA Container CLI through the --root flag
which causes argument parsing to fail. This change only
adds the --root flag if the config option is specified
and is non-empty.
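
A minimal Go sketch of the guard, assuming a hypothetical cliArgs helper and
a --root=<value> argument form; the remaining arguments passed to the NVIDIA
Container CLI are omitted.

```go
package main

import "fmt"

// cliArgs is a hypothetical helper that only emits --root when the
// configured root is non-empty.
func cliArgs(root string) []string {
	var args []string
	if root != "" {
		args = append(args, "--root="+root)
	}
	return args
}

func main() {
	fmt.Println(cliArgs(""))
	fmt.Println(cliArgs("/run/nvidia/driver"))
}
```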
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that the nvidia-ctk config default command
generates a config file that is compatible with the official documentation
to, for example, disable cgroups in the NVIDIA Container CLI.
This requires that whitespace around comments is stripped before outputting the
contents.
This also adds an option to load a config and modify it in-place instead. This can
be triggered as a post-install step, for example.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change splits the functionality in the internal system package
into two packages: one for dealing with devices and one for dealing
with kernel modules. This removes ambiguity around the meaning of
driver / device roots in each case.
In each case, a root can be specified where device nodes are created
or kernel modules loaded.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds a --create-device-nodes option to the
nvidia-ctk system create-dev-char-symlinks command to create
device nodes. This currently only creates control device nodes.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
These changes add a --load-kernel-modules option to the
nvidia-ctk system commands. If specified the NVIDIA kernel modules
(nvidia, nvidia-uvm, and nvidia-modeset) are loaded before any
operations on device nodes are performed.
Signed-off-by: Evan Lezar <elezar@nvidia.com>