Commit Graph

302 Commits

Author SHA1 Message Date
Evan Lezar
d4e21fdd10 Add devRoot option to CDI api
A driverRoot defines both the driver library root and the
root for device nodes. In the case of preinstalled drivers or
the driver container, these are equal, but in cases such as GKE
they do not match. In this case, drivers are extracted to a folder
and devices exist at the root /.

The changes here add a devRoot option to the nvcdi API that allows the
parent of /dev to be specified explicitly.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-20 21:29:35 +01:00
Evan Lezar
e609e41a64 Allow multiple pattern matches for symlinks
Since we allow pattern inputs for locating symlinks we could have
multiples. The error being checked is resolved by the deduplication.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-17 10:43:52 +01:00
Evan Lezar
80ecd024ee Add tests for library locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-17 10:43:52 +01:00
Evan Lezar
e8dbb216a5 Return empty ldcache if cache does not exist
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-17 10:14:03 +01:00
Christopher Desiniotis
f5d8d248b7 Deduplicate symlinks
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-11-16 17:57:31 -08:00
Evan Lezar
c63fb35ba8 Use github.com/NVIDIA/go-nvlib imports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-15 21:38:26 +01:00
Evan Lezar
04b28d116c Make library lookups more robust
These changes make library lookups more robust. The core change is that
library lookups now first look a set of predefined locations before checking
the ldcache. This also handles cases where an ldcache is not available more
gracefully.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-06 12:15:28 -06:00
Evan Lezar
e56bb09889 Use tags.cncf.io for CDI imports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-01 12:40:51 +01:00
Evan Lezar
833254fa59 Support CDI devices as mounts
This change allows CDI devices to be requested as mounts in the
container. This enables their use in environments such as kind
where environment variables or annotations cannot be used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-27 21:24:53 +02:00
Evan Lezar
acc50969dc Fix ifElseChain lint errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
48d68e4eff Add nolint for exec calls
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
709e27bf4b Fix implicit memory aliasing in for loop
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
1b16b341dd Fix default permissions
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
2e1f94aedf Fix append assignments
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
f8870b31be Fix filepath.Join with single arg
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
73857eb8e3 Fix unnecessary conversion
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
dd2f218226 Use MustCompile for static regexp
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
8a9f367067 Check returned error values
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
e0df157f70 Remove unnecessary assignment to the blank identifier
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
12dc12ce09 Fix misspellings
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
73749285d5 Remove unused loadSaver interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
f63ad3d9e7 Refactor symlink filter
This change refactors the use of the symlink filter to make it extendible.
A blocked filter can be set on the Tegra CSV discoverer to ensure that the correct
symlink libraries are filtered out. Here, globs can be used to select mulitple libraries,
and a **/ prefix on the globs indicates that the pattern that follows is only applied to
the filename of the symlink entry in the CSV file.

A --csv.ignore-pattern command line argument is added to the nvidia-ctk cdi generate
command that allows this to be set.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-22 22:04:06 +02:00
Evan Lezar
c4b4478d1a Remove default symlink filter
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-22 22:02:51 +02:00
Evan Lezar
963250a58f Refactor CSV discovery for testability
This change improves the testibility of the CSV discoverer.
This is done by adding injection points for mocks for library discovery and
symlink resolution.

Note that this highlights a bug in the current implementation where the
library filter causes valid symlinks to be skipped.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-22 22:02:30 +02:00
Evan Lezar
4ec9bd751e Add required option to new toml config
This change adds a "required" option to the new toml config
that controls whether a default config is returned or not.
This is useful from the NVIDIA Container Runtime Hook, where
/run/driver/nvidia/etc/nvidia-container-runtime/config.toml
is checked before the standard path.

This fixes a bug where the default config was always applied
when this config was not used.

See https://github.com/NVIDIA/nvidia-container-toolkit/issues/106

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-07 11:56:01 +02:00
Evan Lezar
2a3afdd5d9 Merge branch 'fix-platform-detection' into 'main'
Add UsesNVGPUModule info function

See merge request nvidia/container-toolkit/container-toolkit!473
2023-08-28 15:58:41 +00:00
Evan Lezar
1dc028cdf2 Add UsesNVGPUModule info function
This change adds a UsesNVGPUModule function that checks whether the nvgpu
kernel module is used by NVML. This allows for more robust detection of
Tegra-based platforms where libnvidia-ml.so is supported to enumerate the
iGPU.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-25 11:24:34 +02:00
Evan Lezar
ca1055588d Remove /sys/devices/soc0/family from CDI spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-25 10:25:33 +02:00
Tariq Ibrahim
f904ec41eb Merge branch 'log-unresolved-devices' into 'main'
add a warning statement listing unresolved CDI devices

See merge request nvidia/container-toolkit/container-toolkit!461
2023-08-14 17:22:36 +00:00
Evan Lezar
4addb292b1 Extend nvidia-ctk config command to allow options to be set
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-14 11:33:26 +02:00
Evan Lezar
a69657dde7 Add config.Toml type to handle config files
This change introduced a config.Toml type that is used as the base for
config file processing and manipulation. This ensures that configs --
including commented values -- can be handled consistently.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-14 11:32:54 +02:00
Evan Lezar
c2d4de54b0 Add function to get config file path. 2023-08-14 11:32:54 +02:00
Evan Lezar
b18ac09f77 Refactor handling of DriverCapabilities
This change consolidates the handling of NVIDIA_DRIVER_CAPABILITIES in the
interal/image package.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-14 10:40:42 +02:00
Evan Lezar
4dcaa61167 Use internal/config structs in hook
This change ensures that the Config structs from internal.Config
are used for the NVIDIA Container Runtime Hook config too.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-14 10:40:41 +02:00
Evan Lezar
8bf52e1dec Export config.GetDefault function
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-14 10:35:33 +02:00
Tariq Ibrahim
6d3b29f3ca add a warning statement listing unresolved CDI devices 2023-08-10 08:38:33 -07:00
Evan Lezar
8553fce68a Specify library search paths for CSV CDI spec generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-04 16:49:30 +02:00
Evan Lezar
03a4e2f8a9 Skip symlinks to libraries
In order to properly handle systems with both iGPU and dGPU
drivers included, we skip "sym" mount specifications which
refer to .so or .so.[1-9] files.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-04 16:49:30 +02:00
Evan Lezar
918bd03488 Move tegra-specifics to new package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-04 16:49:30 +02:00
Evan Lezar
01a7f7bb8e Explicitly generate CDI spec for CSV mode
This change explicitly generates a CDI specification from
the supplied CSV files when cdi mode is detected. This
ensures consistency between the behaviour on Tegra-based
systems.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-04 16:49:30 +02:00
Evan Lezar
6b48cbd1dc Move CDI modifier to separate package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-04 16:49:30 +02:00
Evan Lezar
ec63533eb1 Ensure default config comments are consistent
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-19 14:37:49 +02:00
Evan Lezar
ce7d5f7a51 Use functional options when constructing direcory locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 15:36:03 +02:00
Evan Lezar
9b64d74f6a Use functional options when constructing Symlink locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 15:31:15 +02:00
Evan Lezar
cca343abb0 Pass image when constructing CSV modifier
Since the incoming OCI spec has already been parsed and used to
construct a CUDA image representation, pass this to the CSV
modifier constructor instead of re-creating an image representation.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 15:27:16 +02:00
Evan Lezar
e2f8d2a15f Set default spec dirs at config level
This change sets the default CDI spec dirs at a config level instead
of when a CDI runtime modifier is constructed. This makes this setting
consistent with other options such as the nvidia-ctk path.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 15:23:09 +02:00
Evan Lezar
481000b4ce Remove unused argument
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 15:20:24 +02:00
Evan Lezar
083b789102 Use cdi parser package for IsQualiedName
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 15:16:25 +02:00
Evan Lezar
6750ce1667 Print invalid version on parse error
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-11 13:47:39 +02:00
Evan Lezar
1081cecea9 Return empty requirements if NVIDIA_DISABLE_REQUIRE is true
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-11 13:47:37 +02:00