This change refactors the use of the symlink filter to make it extendible.
A blocked filter can be set on the Tegra CSV discoverer to ensure that the correct
symlink libraries are filtered out. Here, globs can be used to select mulitple libraries,
and a **/ prefix on the globs indicates that the pattern that follows is only applied to
the filename of the symlink entry in the CSV file.
A --csv.ignore-pattern command line argument is added to the nvidia-ctk cdi generate
command that allows this to be set.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds a "required" option to the new toml config
that controls whether a default config is returned or not.
This is useful from the NVIDIA Container Runtime Hook, where
/run/driver/nvidia/etc/nvidia-container-runtime/config.toml
is checked before the standard path.
This fixes a bug where the default config was always applied
when this config was not used.
See https://github.com/NVIDIA/nvidia-container-toolkit/issues/106
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds a UsesNVGPUModule function that checks whether the nvgpu
kernel module is used by NVML. This allows for more robust detection of
Tegra-based platforms where libnvidia-ml.so is supported to enumerate the
iGPU.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates go-nvlib to include logic to skip NVIDIA PCI-E
devices where the name or class id is not known.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change renames the csv.library-search-path option to
library-search-path so as to be more generally applicable in
future. Note that the option is still only applied in csv mode.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that the Config structs from internal.Config
are used for the NVIDIA Container Runtime Hook config too.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change removes installation of the oci-nvidia-hook files.
These files conflict with CDI use in runtimes that support it.
The use of the hook should be considered deprecated on these platforms.
If a hook is required, the
nvidia-ctk runtime configure --config-mode=oci-hook
command should be used to create the hook file(s).
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change extends the nvidia-ctk runtime configure command
with a --config-mode=oci-hook that creates an OCI hook json file.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
The debian and rpm packages are updated to trigger the generation of
of a default config if no config exists at the expected location.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds an nvidia-container-runtime-hook.path config option
to allow the path used for the prestart hook to be overridden. This
is useful in cases where multiple NVIDIA Container Toolkit installations
are present.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates go-nvlib to ensure that non-migcapable GPUs
are skipped when generating CDI specifications for MIG devices.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
The nvcid api is extended to allow for merged device options to
be specified. If any options are specified, then a merged device
is generated.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
By default, temporary files are created with permissions 600 and
this means that the files created when updating the ldcache are
not readable in non-root containers.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds an nvidia-container-runtime.modes.cdi.annotation-prefixes config
option that defaults to cdi.k8s.io/. This allows the annotation prefixes parsed
for CDI devices to be overridden in cases where CDI support in container engines such
as containerd or crio need to be overridden.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change allows nvcdi.New to return an error in addition to the
constructed library instead of panicing.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
As simplified CDI spec has no duplicate entities in any single set of container edits.
Furthermore, contianer edits defined at a spec-level are not included in the container
edits for a device.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Since we relied on finding libcuda.so in the LDCache to determine both the CUDA
version and the expected directory for the driver libraries, the generation of the
management CDI specifications fails in containers where the LDCache has not been updated.
This change falls back to searching a set of predefined paths instead when the lookup of
libcuda.so in the cache fails.
Signed-off-by: Evan Lezar <elezar@nvidia.com>