Since the `createContainer` `runc` hook runs with the environment that
the container's config.json specifies, the path to `ldconfig` may not be
easily resolvable if the host environment differs enough from the
container (e.g. on a NixOS host where all binaries are under hashed
paths in /nix/store with an Ubuntu container whose PATH contains
FHS-style paths such as /bin and /usr/bin). This change allows the
location of ldconfig to be specified explicitly.
Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
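As a rough sketch of the intended resolution order (the function name
and option handling here are illustrative, not the actual hook
implementation), an explicitly configured path takes precedence and a
PATH lookup is only a fallback:

    package main

    import (
        "fmt"
        "os/exec"
    )

    // resolveLDConfig prefers an explicitly configured location for
    // ldconfig over a lookup in the current PATH. On a NixOS host the
    // configured value might point into /nix/store.
    func resolveLDConfig(configured string) (string, error) {
        if configured != "" {
            return configured, nil
        }
        // Fall back to the PATH from the container's config.json
        // environment, which may not contain ldconfig at all.
        return exec.LookPath("ldconfig")
    }

    func main() {
        path, err := resolveLDConfig("/sbin/ldconfig")
        if err != nil {
            fmt.Println("ldconfig not found:", err)
            return
        }
        fmt.Println("using ldconfig at", path)
    }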
This change adds a driver root abstraction that defines how
libraries are located relative to that root. This allows the
driver root to be constructed once and passed to the discovery
code.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
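A minimal sketch of the abstraction, using hypothetical names
(driverRoot, locate) that may differ from the actual types: the root
is constructed once with its search directories and handed to
discovery code, which resolves library names relative to it.

    package main

    import (
        "fmt"
        "os"
        "path/filepath"
    )

    // driverRoot holds the root path and the library directories that
    // discovery code searches relative to that root.
    type driverRoot struct {
        root       string   // e.g. "/" or "/run/nvidia/driver"
        searchDirs []string // library directories relative to root
    }

    // locate returns the candidate paths for a library under the root.
    func (r driverRoot) locate(name string) []string {
        var found []string
        for _, dir := range r.searchDirs {
            candidate := filepath.Join(r.root, dir, name)
            if _, err := os.Stat(candidate); err == nil {
                found = append(found, candidate)
            }
        }
        return found
    }

    func main() {
        r := driverRoot{
            root:       "/run/nvidia/driver",
            searchDirs: []string{"usr/lib64", "usr/lib/x86_64-linux-gnu"},
        }
        fmt.Println(r.locate("libcuda.so.1"))
    }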
A driverRoot defines both the root for driver libraries and the
root for device nodes. In the case of preinstalled drivers or
the driver container these roots are equal, but in cases such as
GKE they do not match: there, the driver files are extracted to
a folder while the device nodes exist under the root filesystem
at /.
The changes here add a devRoot option to the nvcdi API that allows the
parent of /dev to be specified explicitly.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
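A minimal usage sketch, assuming the functional options are named
WithDriverRoot and WithDevRoot as described above and that New and
GetSpec have their current signatures; the driver path below is the
conventional GKE location and is only an example:

    package main

    import (
        "log"

        "github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi"
    )

    func main() {
        // On GKE the driver files are extracted to a folder while the
        // device nodes live under /, so the two roots are set
        // independently.
        lib, err := nvcdi.New(
            nvcdi.WithDriverRoot("/home/kubernetes/bin/nvidia"),
            nvcdi.WithDevRoot("/"),
        )
        if err != nil {
            log.Fatalf("failed to construct CDI library: %v", err)
        }

        if _, err := lib.GetSpec(); err != nil {
            log.Fatalf("failed to generate CDI spec: %v", err)
        }
        log.Println("generated CDI spec with a distinct devRoot")
    }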
This change ensures that libcuda.so can be located on systems
where no patch version is specified in the driver version.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
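The matching logic might look like the following sketch (the patterns
and search directory are illustrative): a driver version without a
patch component, such as 535.104, must match both libcuda.so.535.104
and libcuda.so.535.104.05.

    package main

    import (
        "fmt"
        "path/filepath"
    )

    // candidatePatterns returns glob patterns for the libcuda library
    // of a given driver version, with and without a patch suffix.
    func candidatePatterns(version string) []string {
        return []string{
            "libcuda.so." + version,        // exact version match
            "libcuda.so." + version + ".*", // version plus patch suffix
        }
    }

    func main() {
        for _, pattern := range candidatePatterns("535.104") {
            matches, _ := filepath.Glob(filepath.Join("/usr/lib64", pattern))
            if len(matches) > 0 {
                fmt.Println("found:", matches)
                return
            }
        }
        fmt.Println("libcuda.so not found")
    }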
Since we relied on finding libcuda.so in the LDCache to determine both the CUDA
version and the expected directory for the driver libraries, generating the
management CDI specification failed in containers where the LDCache has not been
updated. This change falls back to searching a set of predefined paths when the
lookup of libcuda.so in the cache fails.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
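A sketch of the fallback search; the directory list here is a
plausible set of well-known library paths, not necessarily the exact
list used by the toolkit:

    package main

    import (
        "fmt"
        "path/filepath"
    )

    // fallbackDirs are well-known library directories searched when the
    // LDCache lookup fails (e.g. ldconfig was never run in the container).
    var fallbackDirs = []string{
        "/usr/lib64",
        "/usr/lib/x86_64-linux-gnu",
        "/usr/lib/aarch64-linux-gnu",
        "/lib64",
    }

    func findLibcuda() (string, error) {
        for _, dir := range fallbackDirs {
            matches, err := filepath.Glob(filepath.Join(dir, "libcuda.so.*"))
            if err != nil || len(matches) == 0 {
                continue
            }
            // The versioned filename also yields the driver version and
            // the directory holding the remaining driver libraries.
            return matches[0], nil
        }
        return "", fmt.Errorf("libcuda.so not found in fallback paths")
    }

    func main() {
        path, err := findLibcuda()
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println("found libcuda at", path)
    }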
These changes add support to the nvcdi API for generating a management spec.
A management spec consists of a single CDI device (`all`) which includes all expected
NVIDIA device nodes, driver libraries, binaries, and IPC sockets.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
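A minimal sketch of generating such a spec through the nvcdi API,
assuming the management behaviour is selected via the package's mode
option and that the spec interface exposes a Save method (both as in
current releases; the details may differ in this change):

    package main

    import (
        "log"

        "github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi"
    )

    func main() {
        // Select management mode; the resulting spec contains a single
        // CDI device, "all", covering the expected device nodes, driver
        // libraries, binaries, and IPC sockets.
        lib, err := nvcdi.New(
            nvcdi.WithMode("management"),
        )
        if err != nil {
            log.Fatalf("failed to construct CDI library: %v", err)
        }

        spec, err := lib.GetSpec()
        if err != nil {
            log.Fatalf("failed to generate management spec: %v", err)
        }
        if err := spec.Save("/etc/cdi/management.yaml"); err != nil {
            log.Fatalf("failed to write spec: %v", err)
        }
    }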