Commit Graph

20 Commits

Author SHA1 Message Date
Evan Lezar
a4bfccc3fe Use include-persistenced-socket feature for CDI mode
This change ensures that the internal CDI representation includes
the persistenced socket if the include-persistenced-socket feature
flag is enabled.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:30:27 +02:00
Avi Deitcher
179d8655f9 Move nvidia-ctk hook command into own binary
This change creates an nvidia-cdi-hook binary for implementing
CDI hooks. This allows for these hooks to be separated from the
nvidia-ctk command which may, for example, require libnvidia-ml
to support other functionality.

The nvidia-ctk hook subcommand is maintained as an alias for the
time being to allow for existing CDI specifications referring to
this path to work as expected.

Signed-off-by: Avi Deitcher <avi@deitcher.net>
2024-05-21 12:19:44 +02:00
Evan Lezar
082ce066ed Replace go-nvlib/pkg/nvml with go-nvml/pkg/nvml
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-18 14:53:37 +02:00
Jared Baur
838493b8b9
Allow for customizing the path to ldconfig
Since the `createContainer` `runc` hook runs with the environment that
the container's config.json specifies, the path to `ldconfig` may not be
easily resolvable if the host environment differs enough from the
container (e.g. on a NixOS host where all binaries are under hashed
paths in /nix/store with an Ubuntu container whose PATH contains
FHS-style paths such as /bin and /usr/bin). This change allows for
specifying exactly where ldconfig comes from.

Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
2024-01-17 21:07:00 -08:00
Tariq Ibrahim
7627d48a5c run goimports -local against the entire codebase
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-12-01 11:13:17 +01:00
Evan Lezar
bbd9222206 Add driver root abstraction
This change adds a driver root abstraction that defines how
libraries are located relative to the root. This allows for
this driver root to be constructed once and passed to discovery
code.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-22 13:27:48 +01:00
Evan Lezar
c63fb35ba8 Use github.com/NVIDIA/go-nvlib imports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-15 21:38:26 +01:00
Evan Lezar
f8870b31be Fix filepath.Join with single arg
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
8a9f367067 Check returned error values
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
12dc12ce09 Fix misspellings
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
81908c8cc9 Search custom firmware paths first
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-11 10:34:14 +02:00
Evan Lezar
d3d41a3e1d Simplify handling of custom firmware path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-11 10:31:50 +02:00
Evan Lezar
0a37f8798a Add firmware search paths when generating CDI specifications
Path to locate the GSP firmware is explicitly set to /lib/firmware/nvidia.
Users may chose to install the GSP firmware in alternate locations where
the kernel would look for firmware on the root filesystem.

Add locate functionality which looks for the GSP firmware files in the
same location as the kernel would
(https://docs.kernel.org/driver-api/firmware/fw_search_path.html).

The paths searched in order are:
- path described in /sys/module/firmware_class/parameters/path
- /lib/firmware/updates/UTS_RELEASE/
- /lib/firmware/updates/
- /lib/firmware/UTS_RELEASE/
- /lib/firmware/

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-11 10:31:50 +02:00
Evan Lezar
a02bc27c3e Define a basic logger interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-06-12 10:46:10 +02:00
Evan Lezar
8bb0235c92 Remove discover.Config
These changes remove the use of discover.Config which was used
to pass the driver root and the nvidiaCTK path in some cases.

Instead, the nvidiaCTKPath is resolved at the begining of runtime
invocation to ensure that this is valid at all points where it is
used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-10 15:03:37 +02:00
Evan Lezar
2abe679dd1 Move libcuda locator to internal/lookup package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-26 17:04:06 +02:00
Evan Lezar
9506bd9da0 Fix generation of management CDI spec in containers
Since we relied on finding libcuda.so in the LDCache to determine both the CUDA
version and the expected directory for the driver libraries, the generation of the
management CDI specifications fails in containers where the LDCache has not been updated.

This change falls back to searching a set of predefined paths instead when the lookup of
libcuda.so in the cache fails.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-23 15:59:01 +02:00
Evan Lezar
685802b1ce Only init nvml as required when generating CDI specs
CDI generation modes such as management and wsl don't require
NVML. This change removes the top-level instantiation of nvmllib
and replaces it with an instanitation in the nvml CDI spec generation
code.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-20 14:24:08 +02:00
Evan Lezar
29cbbe83f9 Add management mode to CDI spec generation API
These changes add support for generating a management spec to the nvcdi API.
A management spec consists of a single CDI device (`all`) which includes all expected
NVIDIA device nodes, driver libraries, binaries, and IPC sockets.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:53:43 +02:00
Evan Lezar
d226925fe7 Construct nvml-based CDI lib based on mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00