Commit Graph

91 Commits

Author SHA1 Message Date
Evan Lezar
e774c51c97 Add nvidia-ctk system create-device-nodes command
This change adds an nvidia-ctk system create-device-nodes command for
creating NVIDIA device nodes. Currently this is limited to control devices
(nvidia-uvm, nvidia-uvm-tools, nvidia-modeset, nvidiactl).

A --dry-run mode is included for outputing commands that would be executed and
the driver root can be specified.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 11:29:45 +02:00
Evan Lezar
226c54613e Also return an error from nvcdi.New
This change allows nvcdi.New to return an error in addition to the
constructed library instead of panicing.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-26 16:13:12 +02:00
Evan Lezar
685802b1ce Only init nvml as required when generating CDI specs
CDI generation modes such as management and wsl don't require
NVML. This change removes the top-level instantiation of nvmllib
and replaces it with an instanitation in the nvml CDI spec generation
code.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-20 14:24:08 +02:00
Evan Lezar
3bac4fad09 Migrate cri-o config update to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
9fff19da23 Migrate docker config to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
e5bb4d2718 Move runtime config code from config to config/engine
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
cb5006c73f Merge branch 'CNT-3897/generate-management-container-spec' into 'main'
Generate CDI specs for management containers

See merge request nvidia/container-toolkit/container-toolkit!314
2023-03-06 16:23:13 +00:00
Evan Lezar
20d3bb189b Rename --discovery-mode to --mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 11:00:22 +02:00
Evan Lezar
f7e817cff6 Support management mode in nvidia-ctk cdi generate
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-06 10:53:43 +02:00
Evan Lezar
314059fcf0 Move path manipulation to spec.Save
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:49:04 +02:00
Evan Lezar
221781bd0b Use full path for output spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
8be6de177f Move formatJSON and formatYAML to nvcdi/spec package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
890a519121 Use nvcdi.spec package to write and validate spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
89321edae6 Add top-level GetSpec function to nvcdi API
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-01 13:48:28 +02:00
Evan Lezar
accba4ead5 Merge branch 'CNT-3965/clean-up-by-path-symlinks' into 'main'
Improve handling of /dev/dri devices and nested device paths

See merge request nvidia/container-toolkit/container-toolkit!307
2023-03-01 10:25:48 +00:00
Evan Lezar
b4dc1f338d Generate nested device folder permission hooks per device
This change generates device folder permission hooks per device instead of
at a spec level. This ensures that the hook is not injected for a device that
does not have any nested device nodes.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-22 17:16:23 +02:00
Evan Lezar
2542224d7b Skip paths with errors in chmod hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 11:47:11 +02:00
Evan Lezar
2680c45811 Add mode constants to nvcdi
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 16:33:51 +02:00
Evan Lezar
4ccb0b9a53 Add and resolve auto discovery mode for cdi generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 14:49:58 +02:00
Evan Lezar
b21dc929ef Add WSL2 discovery and spec generation
These changes add a wsl discovery mode to the nvidia-ctk cdi generate command.

If wsl mode is enabled, the driver store for the available devices is used as
the source for discovered entities.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
20d6e9af04 Add --discovery-mode to nvidia-ctk cdi generate command
This change adds --discovery-mode flag to the nvidia-ctk cdi generate
command and plumbs this through to the CDI API.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
a844749791 Ensure that generate uses a consistent nvidia-ctk path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:28:45 +02:00
Evan Lezar
5b110fba2d Add nvcdi package with basic CDI generation API
This change adds an nvcdi package that exposes a basic API for
CDI spec generation. This is used from the nvidia-ctk cdi generate
command and can be consumed by DRA implementations and the device plugin.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-14 19:52:31 +01:00
Evan Lezar
97008f2db6 Move IPC discoverer into DriverDiscoverer
This simplifies the construction of the required common edits
when constructing a CDI specification.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
076eed7eb4 Update ipcMount to add noexec option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
3b8c40c3e6 Move IPC discoverer to internal/discover package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
daceac9117 Rename discover.Config.Root to discover.Config.DriverRoot
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 15:57:15 +01:00
Evan Lezar
cfa2647260 Rename root to driverRoot for CDI generation
This makes the intent of the command line argument clearer since this
relates specifically to the root where the NVIDIA driver is installed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 15:42:04 +01:00
Evan Lezar
707e3479f8 Fix lint errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-30 13:39:57 +01:00
Evan Lezar
201232dae3 Add logging of minimum CDI version
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-30 13:39:08 +01:00
Evan Lezar
f768bb5783 Use device index as CDI device names by default
This change uses the `index` mode for the --device-name-strategy when
generating CDI specifications by default. This generates device names
such as nvidia.com/gpu=0 or nvidia.com/gpu=1:0 by default.

Note that this requires a CDI spec version of 0.5.0 and for consumers
(e.g. podman) that are only compatible with older versions one of the
other stragegies (`type-index` or `uuid`) should be used instead to
generate a v0.3.0 or v0.4.0 specification.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-30 13:36:17 +01:00
Evan Lezar
f0de3ccd9c Merge branch 'CNT-3718/allow-device-name-to-be-controlled' into 'main'
Add --device-name-strategy flag for CDI spec generation

See merge request nvidia/container-toolkit/container-toolkit!269
2023-01-30 12:28:38 +00:00
Evan Lezar
8188400c97 Move create-dev-char-symlinks subcommand from hook to system
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-27 12:12:54 +01:00
Evan Lezar
962d38e9dd Add nvidia-ctk system subcommand
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-27 12:12:54 +01:00
Evan Lezar
1d7e419008 Add --create-all mode to creation of dev/char symlinks
This change adds a --create-all mode to the create-dev-char-symlinks hook.
This mode creates all POSSIBLE symlinks to device nodes for regular and cap
devices. With the number of GPUs inferred from the PCI device information.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 13:43:43 +01:00
Evan Lezar
f9330a4c2c Add --watch option to create-dev-char-symlinks
This change adds a --watch option to the create-dev-char-symlinks hook. This
installs an fsnotify watcher that creates symlinks for ADDED device nodes under
/dev/char.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 13:43:43 +01:00
Evan Lezar
be0e4667a5 Add create-dev-char-symlinks hook
This change adds an nvidia-ctk hook create-dev-char-symlinks
subcommand that creates symlinks to device nodes (as required by
systemd) under /dev/char.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 13:43:43 +01:00
Evan Lezar
408eeae70f Allow locator to be marked as optional
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 10:38:11 +01:00
Evan Lezar
89bf81a9db Add --device-name-strategy flag for CDI spec generation
This change adds a --device-name-strategy flag for generating a CDI
specificaion. This allows a CDI spec to be generated with the following
names used for device:

* type-index: gpu0 and mig0:1
* index: 0 and 0:1
* uuid: GPU and MIG UUIDs

Note that the use of 'index' generates a v0.5.0 CDI specification since
this relaxes the restriction on the device names.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-20 16:17:32 +01:00
Evan Lezar
7a1cfb48b9 Merge branch 'update-cdi' into 'main'
Determine the minumum required spec version

See merge request nvidia/container-toolkit/container-toolkit!265
2023-01-19 11:57:19 +00:00
Evan Lezar
eaf9bdaeb4 Determine the minumum required spec version
This change uses functionality from the CDI package to determine
the minimum required CDI spec version. This allows for a spec with
the widest compatibility to be specified.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 12:14:00 +01:00
Evan Lezar
a77331f8f0 Reuse mount discovery for driver libraries
This change implements the discovery of versioned driver libaries
by reusing the mounts and update ldcache discoverers use for, for example,
CVS file discovery. This allows the container paths to be correctly generated
without requiring specific manipulation.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 12:11:13 +01:00
Evan Lezar
9c9e6cd324 Fix nvidia-ctk path in spec generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 12:03:07 +01:00
Evan Lezar
27347c98d9 Consolidate code to find nvidia-ctk
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 10:31:42 +01:00
Evan Lezar
35df24d63a Make handling of nvidia-ctk path consistent
This change adds an --nvidia-ctk-path to the nvidia-ctk cdi generate
command. This ensures that the executable path for the generated
hooks can be specified consistently.

Since the NVIDIA Container Runtime already allows for the executable
path to be specified in the config the utility code to update the
LDCache and create other nvidia-ctk hooks are also updated.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-18 17:02:42 +01:00
Evan Lezar
311e7a1feb Ensure existence of DRM devices nodes is checked
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-12 14:48:54 +01:00
Evan Lezar
41f1b93422 Use NewContainerEdits utility function for CDI generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-07 11:09:19 +01:00
Evan Lezar
0a2083df72 Merge branch 'CNT-3707/add-root-flag' into 'main'
Add --root flag to nvidia-ctk cdi generate command

See merge request nvidia/container-toolkit/container-toolkit!256
2022-12-02 15:54:27 +00:00
Evan Lezar
80c810bf9e Add --root flag to CDI generate command
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 16:13:53 +01:00
Evan Lezar
82ba424212 Simplify device folder permission hook
This simplifies the device folder permission hook to only handle
/dev/dri and /dev/nvidia-caps folders.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 16:13:53 +01:00