Commit Graph

101 Commits

Author SHA1 Message Date
Evan Lezar
a4bfccc3fe Use include-persistenced-socket feature for CDI mode
This change ensures that the internal CDI representation includes
the persistenced socket if the include-persistenced-socket feature
flag is enabled.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:30:27 +02:00
Evan Lezar
be11cf428b [no-relnote] Add MIG discoverer to dgpu package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-09 13:34:04 +02:00
Evan Lezar
b42a5d3e3a [no-relnote] Refactor dGPU device discovery
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-09 13:34:04 +02:00
Evan Lezar
71e0b8590f Set default CDI spec permissions to 644
Although the nvidia-ctk cdi generate command generates
specs with 644 permissions, the nvidia-ctk cdi transform
commands do not. This change sets the default permissions
to 644 instead of 600.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-06-05 11:27:03 +02:00
Evan Lezar
9208159263 Add dev-root option to toolkit container
This change adds an option to the toolkit container to allow
the dev root to be specified. This adds support for driver installations
where the driver files are at one root and the dev nodes are created
elsewhere -- most typically at /. This is the case, for example, for
GKE driver installations.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-06-03 20:40:30 +02:00
Evan Lezar
c5eda7af8e Ensure that libnvidia-ml.so.1 is found in driver root
This change ensures that the driver root is used to locate libnvidia-ml.so.1
if required.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-06-03 12:01:10 +02:00
Evan Lezar
8fc4b9c742 Add WithInfoLib option to CDI package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-28 13:30:31 +02:00
Evan Lezar
ef57c07199 Bump github.com/NVIDIA/go-nvlib to v0.5.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-28 13:28:28 +02:00
Evan Lezar
abb5abaea4 Ensure consistent construction order for libs
This change ensures that nvmllib and devicelib are constructed
before these are used to construct infolib.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-28 12:05:44 +02:00
Evan Lezar
17c044eef8 Set minimum version on save
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-24 15:36:18 +02:00
Evan Lezar
edda11d647 Merge pull request #428 from elezar/fix-cdi-mode-resolution
Fix cdi mode resolution
2024-05-21 13:22:10 +02:00
Evan Lezar
3defc6babb Use go-nvlib mode resolution
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-21 12:25:54 +02:00
Avi Deitcher
179d8655f9 Move nvidia-ctk hook command into own binary
This change creates an nvidia-cdi-hook binary for implementing
CDI hooks. This allows for these hooks to be separated from the
nvidia-ctk command which may, for example, require libnvidia-ml
to support other functionality.

The nvidia-ctk hook subcommand is maintained as an alias for the
time being to allow for existing CDI specifications referring to
this path to work as expected.

Signed-off-by: Avi Deitcher <avi@deitcher.net>
2024-05-21 12:19:44 +02:00
Christopher Desiniotis
35b23c5a2c Accept device.Identifiers for requesting CDI specs
This change moves from using strings to using device.Identifiers
as input for requesting CDI specifications for specific
devices.

Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-07 21:50:28 +02:00
Evan Lezar
082ce066ed Replace go-nvlib/pkg/nvml with go-nvml/pkg/nvml
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-18 14:53:37 +02:00
Evan Lezar
29c0f82ed2 Merge pull request #327 from elezar/add-driver-config
Add config search path option to driver root
2024-04-11 16:58:33 +02:00
Evan Lezar
4cd86caf67 Use NewCache instead of GetRegistry
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-05 17:09:17 +02:00
Evan Lezar
2a9e3537ec Add config search paths option to driver root.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-02 23:03:05 +02:00
Evan Lezar
93763d25f0 Use functional options to construct driver root
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-25 20:13:25 +02:00
Evan Lezar
88ad42ccd1 Add NVIDIA_VISIBLE_DEVICES=void to CDI specs
This change ensures that NVIDIA_VISIBLE_DEVICES=void is included in
generated CDI specs. This prevents the NVIDIA Container Runtime Hook
from injecting devices if NVIDIA_VISIBLE_DEVICES=all is set.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-04 16:10:06 +02:00
Evan Lezar
52da12cf9a Allow multiple device name strategies to be specified
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-02-13 16:38:05 +01:00
Jared Baur
838493b8b9 Allow for customizing the path to ldconfig
Since the `createContainer` `runc` hook runs with the environment that
the container's config.json specifies, the path to `ldconfig` may not be
easily resolvable if the host environment differs enough from the
container (e.g. on a NixOS host where all binaries are under hashed
paths in /nix/store with an Ubuntu container whose PATH contains
FHS-style paths such as /bin and /usr/bin). This change allows for
specifying exactly where ldconfig comes from.

Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
2024-01-17 21:07:00 -08:00
Evan Lezar
21fc1f24e4 Use devRoot to resolve MIG device nodes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-01-09 15:40:17 +01:00
Jakub Bujak
79acd7acff Add libnvdxgdmal library
This change adds the new libnvdxgdmal.so.1 library to the list of files copied from the DriverStore.

Signed-off-by: Jakub Bujak <jbujak@nvidia.com>
2024-01-09 15:29:55 +01:00
Christopher Desiniotis
895a5ed73a Update to github.com/NVIDIA/go-nvlib@f3264c8a6a7a
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-13 10:08:14 -08:00
Christopher Desiniotis
3158146946 Extend the 'runtime.nvidia.com/gpu' CDI device kind to support MIG devices specified by index or UUID
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-06 09:02:19 -08:00
Christopher Desiniotis
def7d09f85 Refactor how device identifiers are parsed before performing automatic CDI spec generation
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-06 09:02:19 -08:00
Christopher Desiniotis
b9ac54b922 Add GetDeviceSpecsByID() API to the nvcdi Interface
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-06 09:02:19 -08:00
Tariq Ibrahim
7627d48a5c run goimports -local against the entire codebase
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-12-01 11:13:17 +01:00
Evan Lezar
879cc99aac Add transformer for container roots
This change renames the root transformer to indicate that it
operates on host paths and adds a container root transformer for
explicitly transforming container roots.

The transform.NewRootTransformer constructor still exists, but has
been marked as deprecated.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-30 20:26:42 +01:00
Evan Lezar
bbd9222206 Add driver root abstraction
This change adds a driver root abstraction that defines how
libraries are located relative to the root. This allows for
this driver root to be constructed once and passed to discovery
code.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-22 13:27:48 +01:00
Evan Lezar
3a96a00362 Simplify meta device discovery
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-20 21:29:35 +01:00
Evan Lezar
d4e21fdd10 Add devRoot option to CDI api
A driverRoot defines both the driver library root and the
root for device nodes. In the case of preinstalled drivers or
the driver container, these are equal, but in cases such as GKE
they do not match. In this case, drivers are extracted to a folder
and devices exist at the root /.

The changes here add a devRoot option to the nvcdi API that allows the
parent of /dev to be specified explicitly.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-20 21:29:35 +01:00
Evan Lezar
c63fb35ba8 Use github.com/NVIDIA/go-nvlib imports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-15 21:38:26 +01:00
Evan Lezar
8d52cc18ce Make discovery of graphics libraries optional
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-03 22:15:41 +01:00
Evan Lezar
e56bb09889 Use tags.cncf.io for CDI imports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-01 12:40:51 +01:00
Evan Lezar
709e27bf4b Fix implicit memory aliasing in for loop
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
f8870b31be Fix filepath.Join with single arg
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
73857eb8e3 Fix unnecessary conversion
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
8a9f367067 Check returned error values
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
f2c9937ca8 Use cdi parser package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
12dc12ce09 Fix misspellings
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
f63ad3d9e7 Refactor symlink filter
This change refactors the use of the symlink filter to make it extendible.
A blocked filter can be set on the Tegra CSV discoverer to ensure that the correct
symlink libraries are filtered out. Here, globs can be used to select multiple libraries,
and a **/ prefix on the globs indicates that the pattern that follows is only applied to
the filename of the symlink entry in the CSV file.

A --csv.ignore-pattern command line argument is added to the nvidia-ctk cdi generate
command that allows this to be set.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-22 22:04:06 +02:00
Evan Lezar
cbdbcd87ff Add sorter to simplifying transformer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-08 15:27:04 +02:00
Evan Lezar
7a4d2cff67 Add merged CDI spec transformer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-08 14:45:31 +02:00
Evan Lezar
5638f47cb0 Add sort CDI spec transformer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-08 14:45:31 +02:00
Evan Lezar
8553fce68a Specify library search paths for CSV CDI spec generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-04 16:49:30 +02:00
Evan Lezar
918bd03488 Move tegra-specifics to new package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-04 16:49:30 +02:00
Evan Lezar
9f46c34587 Support device name strategies for Tegra devices
This change generates CDI specifications for Tegra devices
with the nvidia.com/gpu=0 name by default. The type-index
naming strategy is also supported and will generate a device
with the name nvidia.com/gpu=gpu0.

The uuid naming strategy will raise an error if selected.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 16:13:38 +02:00
Evan Lezar
f07a0585fc Refactor device namer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-18 16:13:37 +02:00