Commit Graph

87 Commits

Author SHA1 Message Date
Evan Lezar
b6be911eaa Replace go-nvlib/pkg/nvml with go-nvml/pkg/nvml
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-18 15:01:20 +02:00
Evan Lezar
29c0f82ed2
Merge pull request #327 from elezar/add-driver-config
Add config search path option to driver root
2024-04-11 16:58:33 +02:00
Evan Lezar
4cd86caf67 Use NewCache instead of GetRegistry
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-05 17:09:17 +02:00
Evan Lezar
2a9e3537ec Add config search paths option to driver root.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-02 23:03:05 +02:00
Evan Lezar
93763d25f0 Use functional options to construct driver root
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-25 20:13:25 +02:00
Evan Lezar
88ad42ccd1 Add NVIDIA_VISIBLE_DEVICES=void to CDI specs
This change ensures taht NVIDIA_VISIBLE_DEVICES=void is included in
generated CDI specs. This prevents the NVIDIA Container Runtime Hook
from injecting devices if NVIDIA_VISIBLE_DEVICES=all is set.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-04 16:10:06 +02:00
Evan Lezar
52da12cf9a Allow multiple device name strategies to be specified
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-02-13 16:38:05 +01:00
Jared Baur
838493b8b9
Allow for customizing the path to ldconfig
Since the `createContainer` `runc` hook runs with the environment that
the container's config.json specifies, the path to `ldconfig` may not be
easily resolvable if the host environment differs enough from the
container (e.g. on a NixOS host where all binaries are under hashed
paths in /nix/store with an Ubuntu container whose PATH contains
FHS-style paths such as /bin and /usr/bin). This change allows for
specifying exactly where ldconfig comes from.

Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
2024-01-17 21:07:00 -08:00
Evan Lezar
21fc1f24e4 Use devRoot to resolve MIG device nodes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-01-09 15:40:17 +01:00
Jakub Bujak
79acd7acff Add libnvdxgdmal library
This change adds the new libnvdxgdmal.so.1 library to the list of files copied from the DriverStore.

Signed-off-by: Jakub Bujak <jbujak@nvidia.com>
2024-01-09 15:29:55 +01:00
Christopher Desiniotis
895a5ed73a Update to github.com/NVIDIA/go-nvlib@f3264c8a6a7a
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-13 10:08:14 -08:00
Christopher Desiniotis
3158146946 Extend the 'runtime.nvidia.com/gpu' CDI device kind to support MIG devices specified by index or UUID
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-06 09:02:19 -08:00
Christopher Desiniotis
def7d09f85 Refactor how device identifiers are parsed before performing automatic CDI spec generation
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-06 09:02:19 -08:00
Christopher Desiniotis
b9ac54b922 Add GetDeviceSpecsByID() API to the nvcdi Interface
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-06 09:02:19 -08:00
Tariq Ibrahim
7627d48a5c run goimports -local against the entire codebase
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-12-01 11:13:17 +01:00
Evan Lezar
879cc99aac Add transformer for container roots
This change renames the root transformer to indicate that it
operates on host paths and adds a container root transformer for
explicitly transforming container roots.

The transform.NewRootTransformer constructor still exists, but has
been marked as deprecated.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-30 20:26:42 +01:00
Evan Lezar
bbd9222206 Add driver root abstraction
This change adds a driver root abstraction that defines how
libraries are located relative to the root. This allows for
this driver root to be constructed once and passed to discovery
code.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-22 13:27:48 +01:00
Evan Lezar
3a96a00362 Simplify meta device discovery
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-20 21:29:35 +01:00
Evan Lezar
d4e21fdd10 Add devRoot option to CDI api
A driverRoot defines both the driver library root and the
root for device nodes. In the case of preinstalled drivers or
the driver container, these are equal, but in cases such as GKE
they do not match. In this case, drivers are extracted to a folder
and devices exist at the root /.

The changes here add a devRoot option to the nvcdi API that allows the
parent of /dev to be specified explicitly.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-20 21:29:35 +01:00
Evan Lezar
c63fb35ba8 Use github.com/NVIDIA/go-nvlib imports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-15 21:38:26 +01:00
Evan Lezar
8d52cc18ce Make discovery of graphics libraries optional
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-03 22:15:41 +01:00
Evan Lezar
e56bb09889 Use tags.cncf.io for CDI imports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-01 12:40:51 +01:00
Evan Lezar
709e27bf4b Fix implicit memory aliasing in for loop
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
f8870b31be Fix filepath.Join with single arg
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
73857eb8e3 Fix unnecessary conversion
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
8a9f367067 Check returned error values
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
f2c9937ca8 Use cdi parser package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
12dc12ce09 Fix misspellings
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
f63ad3d9e7 Refactor symlink filter
This change refactors the use of the symlink filter to make it extendible.
A blocked filter can be set on the Tegra CSV discoverer to ensure that the correct
symlink libraries are filtered out. Here, globs can be used to select mulitple libraries,
and a **/ prefix on the globs indicates that the pattern that follows is only applied to
the filename of the symlink entry in the CSV file.

A --csv.ignore-pattern command line argument is added to the nvidia-ctk cdi generate
command that allows this to be set.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-22 22:04:06 +02:00
Evan Lezar
cbdbcd87ff Add sorter to simplifying transformer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-08 15:27:04 +02:00
Evan Lezar
7a4d2cff67 Add merged CDI spec transformer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-08 14:45:31 +02:00
Evan Lezar
5638f47cb0 Add sort CDI spec transoformer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-08 14:45:31 +02:00
Evan Lezar
8553fce68a Specify library search paths for CSV CDI spec generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-04 16:49:30 +02:00
Evan Lezar
918bd03488 Move tegra-specifics to new package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-04 16:49:30 +02:00
Evan Lezar
9f46c34587 Support device name strategies for Tegra devices
This change generates CDI specifications for Tegra devices
with the nvidia.com/gpu=0 name by default. The type-index
nameing strategy is also supported and will generate a device
with the name nvidia.com/gpu=gpu0.

The uuid naming strategy will raise an error if selected.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 16:13:38 +02:00
Evan Lezar
f07a0585fc Refactor device namer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 16:13:37 +02:00
Evan Lezar
81908c8cc9 Search custom firmware paths first
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-11 10:34:14 +02:00
Evan Lezar
d3d41a3e1d Simplify handling of custom firmware path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-11 10:31:50 +02:00
Evan Lezar
0a37f8798a Add firmware search paths when generating CDI specifications
Path to locate the GSP firmware is explicitly set to /lib/firmware/nvidia.
Users may chose to install the GSP firmware in alternate locations where
the kernel would look for firmware on the root filesystem.

Add locate functionality which looks for the GSP firmware files in the
same location as the kernel would
(https://docs.kernel.org/driver-api/firmware/fw_search_path.html).

The paths searched in order are:
- path described in /sys/module/firmware_class/parameters/path
- /lib/firmware/updates/UTS_RELEASE/
- /lib/firmware/updates/
- /lib/firmware/UTS_RELEASE/
- /lib/firmware/

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-11 10:31:50 +02:00
Evan Lezar
6265e34afb Fix bug with multiple driver store paths
This change uses the actual discovered path of nvidia-smi when
creating a symlink to the binary on WSL2 platforms.

This ensures that cases where multiple driver store paths are
detected are supported.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-06-26 21:37:14 +02:00
Evan Lezar
1d0a733487 Replace logger.Warn(f) with logger.Warning(f)
This aligns better with klog used in other projects.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-06-12 10:48:04 +02:00
Evan Lezar
a02bc27c3e Define a basic logger interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-06-12 10:46:10 +02:00
Evan Lezar
e7d2a9c212 Merge branch 'CNT-1876/cdi-specs-from-csv' into 'main'
Add csv mode to CDI spec generation

See merge request nvidia/container-toolkit/container-toolkit!393
2023-05-23 14:47:19 +00:00
Evan Lezar
fcb4e379e3 Fix mode resolution tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-23 16:02:07 +02:00
Evan Lezar
3ea02d13fc Merge branch 'use-major-minor-for-cuda-version' into 'main'
Use *.* pattern when locating libcuda.so

See merge request nvidia/container-toolkit/container-toolkit!397
2023-05-22 13:02:33 +00:00
Evan Lezar
e30fd0f4ad Add csv mode to nvidia-ctk cdi generate command
This chagne allows the csv mode option to specified in the
nvidia-ctk cdi generate command and adds a --csv.file option
that can be repeated to specify the CSV files to be processed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:56:45 +02:00
Evan Lezar
418e4014e6 Resolve to csv for CDI generation on tegra systems
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:56:00 +02:00
Evan Lezar
e78a4f5eac Add csv mode to nvcdi api
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:55:58 +02:00
Evan Lezar
c120c511d5 Merge branch 'CNT-3939/generate-all-device' into 'main'
Add options to generate all device to nvcdi API

See merge request nvidia/container-toolkit/container-toolkit!348
2023-05-22 11:53:39 +00:00
Evan Lezar
424b8c9d46 Use *.* pattern when locating libcuda.so
This change ensures that libcuda.so can be located on systems
where no patch version is specified in the driver version.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:53:19 +02:00