Commit Graph

394 Commits

Author SHA1 Message Date
Evan Lezar
9464953924 Use logger.Interface when resolving auto mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-06-12 10:46:11 +02:00
Evan Lezar
a02bc27c3e Define a basic logger interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-06-12 10:46:10 +02:00
Evan Lezar
3f03a71afd Skip additional modifications in CDI mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-06-05 15:01:58 +02:00
Evan Lezar
43c44a0f48 Merge branch 'treat-log-errors-as-non-fatal' into 'main'
Ignore errors when creating debug log file

See merge request nvidia/container-toolkit/container-toolkit!404
2023-06-01 07:44:56 +00:00
Evan Lezar
6b1e8171c8 Merge branch 'add-mod-probe' into 'main'
Add option to load NVIDIA kernel modules

See merge request nvidia/container-toolkit/container-toolkit!409
2023-05-31 18:14:45 +00:00
Evan Lezar
2e50b3da7c Merge branch 'ldcache-resolve-circular' into 'main'
Fix infinite recursion when resolving libraries in LDCache

Closes #13

See merge request nvidia/container-toolkit/container-toolkit!406
2023-05-31 17:35:27 +00:00
Evan Lezar
7b801a0ce0 Add option to load NVIDIA kernel modules
These changes add a --load-kernel-modules option to the
nvidia-ctk system commands. If specified the NVIDIA kernel modules
(nvidia, nvidia-uvm, and nvidia-modeset) are loaded before any
operations on device nodes are performed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-31 19:31:38 +02:00
Evan Lezar
528cbbb636 Merge branch 'fix-device-symlinks' into 'main'
Fix creation of device symlinks in /dev/char

See merge request nvidia/container-toolkit/container-toolkit!399
2023-05-31 17:31:04 +00:00
Evan Lezar
a6b0f45d2c Fix infinite recursion when resolving ldcache
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-30 11:03:36 +02:00
Evan Lezar
05632c0a40 Treat missing nvidia device majors as an error
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-26 10:24:36 +02:00
Evan Lezar
02656b624d Create log directory if required
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-25 15:17:00 +02:00
Evan Lezar
61af2aee8e Ignore errors when creating debug log file
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-25 14:44:00 +02:00
Evan Lezar
ac11727ec5 Add nvidia-contianer-runtime-hook.path config option
This change adds an nvidia-container-runtime-hook.path config option
to allow the path used for the prestart hook to be overridden. This
is useful in cases where multiple NVIDIA Container Toolkit installations
are present.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-25 12:05:33 +02:00
Evan Lezar
013a1b413b Fix ineffectual assignment
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-23 21:14:02 +02:00
Evan Lezar
3be16d8077 Create individual links instead of processing CSV
This change switches to generating a OCI runtime hook to create
individual symlinks instead of processing a CSV file in the hook.
This allows for better reuse of the logic generating CDI
specifications, for example.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-23 20:43:36 +02:00
Evan Lezar
927ec78b6e Add symlinks package with Resolve function
This change adds a symlinks.Resolve function for resolving symlinks and
updates usages across the code to make use of it.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-23 20:42:17 +02:00
Evan Lezar
e7d2a9c212 Merge branch 'CNT-1876/cdi-specs-from-csv' into 'main'
Add csv mode to CDI spec generation

See merge request nvidia/container-toolkit/container-toolkit!393
2023-05-23 14:47:19 +00:00
Evan Lezar
e30fd0f4ad Add csv mode to nvidia-ctk cdi generate command
This chagne allows the csv mode option to specified in the
nvidia-ctk cdi generate command and adds a --csv.file option
that can be repeated to specify the CSV files to be processed.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:56:45 +02:00
Evan Lezar
540dbcbc03 Move tegra system mounts to tegra-specific discoverer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:55:22 +02:00
Evan Lezar
a8265f8846 Add tegra discoverer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:55:22 +02:00
Evan Lezar
424b8c9d46 Use *.* pattern when locating libcuda.so
This change ensures that libcuda.so can be located on systems
where no patch version is specified in the driver version.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:53:19 +02:00
Evan Lezar
b7e5cef934 Include xorg discoverer with graphics mounts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-10 17:07:55 +02:00
Evan Lezar
9378d0cd0f Move discover.FindNvidiaCTK to config.ResolveNVIDIACTKPath
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-10 15:12:44 +02:00
Evan Lezar
8bb0235c92 Remove discover.Config
These changes remove the use of discover.Config which was used
to pass the driver root and the nvidiaCTK path in some cases.

Instead, the nvidiaCTKPath is resolved at the begining of runtime
invocation to ensure that this is valid at all points where it is
used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-10 15:03:37 +02:00
Evan Lezar
37c66fc33c Ensure that the nvidia-container-cli.user option is uncommented on suse
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-08 11:26:54 +02:00
Evan Lezar
1bd5798a99 Use toml representation to get defaults
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-08 11:26:53 +02:00
Evan Lezar
90c4c4811a Fallback to ldconfig if ldconfig.real does not exist
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-08 11:26:24 +02:00
Evan Lezar
49de170652 Generate default config.toml contents
This change adds a GetDefaultConfigToml function to the config package.

This function returns the default config in the form of raw TOML
including comments. This is useful for generating a default config at
installation time, with platform-specific differences codified.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-08 11:26:22 +02:00
Evan Lezar
2e3a12438a Fix toml definition in cli config struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-03 15:59:02 +02:00
Evan Lezar
ae2a683929 Run go fmt
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-04-25 11:27:58 +02:00
Evan Lezar
2b5eeb8d24 Regenerate mocks for formatting
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-04-25 11:26:55 +02:00
Carlos Eduardo Arango Gutierrez
81d8b94cdc
Export pkg config/engine
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-04-25 07:16:59 +02:00
Evan Lezar
f1e201d368 Refactor runtime configure cli
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-04-24 18:32:04 +02:00
Evan Lezar
fc7c8f7520 Resolve all symlinks in ldcache
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-04-21 17:28:49 +02:00
Evan Lezar
46c1c45d85 Add /usr/lib/current to search path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-04-21 11:47:42 +02:00
Evan Lezar
2136266d1d Make discovery of Xorg libraries optional
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-04-13 18:41:38 +02:00
Christopher Desiniotis
ee5be5e3f2 Merge branch 'CNT-4056/add-cdi-annotations' into 'main'
Add nvidia-container-runtime.modes.cdi.annotation-prefixes config option.

See merge request nvidia/container-toolkit/container-toolkit!356
2023-03-28 16:47:51 +00:00
Evan Lezar
149236b002 Configure containerd config based on specified annotation prefixes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 16:22:48 +02:00
Evan Lezar
e774c51c97 Add nvidia-ctk system create-device-nodes command
This change adds an nvidia-ctk system create-device-nodes command for
creating NVIDIA device nodes. Currently this is limited to control devices
(nvidia-uvm, nvidia-uvm-tools, nvidia-modeset, nvidiactl).

A --dry-run mode is included for outputing commands that would be executed and
the driver root can be specified.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-28 11:29:45 +02:00
Evan Lezar
c46b118f37 Add nvidia-container-runtime.modes.cdi.annotation-prefixes config option.
This change adds an nvidia-container-runtime.modes.cdi.annotation-prefixes config
option that defaults to cdi.k8s.io/. This allows the annotation prefixes parsed
for CDI devices to be overridden in cases where CDI support in container engines such
as containerd or crio need to be overridden.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-27 16:36:54 +02:00
Evan Lezar
c13c6ebadb Inject xorg libs and config in container
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-26 17:04:06 +02:00
Evan Lezar
2abe679dd1 Move libcuda locator to internal/lookup package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-26 17:04:06 +02:00
Christopher Desiniotis
b2aaa21b0a Instantiate a logger when constructing a library Locator
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-03-21 13:38:36 -07:00
Evan Lezar
df40fbe03e Locate persistenced and fabricmanager sockets at /run instead of /var/run
This chagne prefers (non-symlink) sockets at /run over /var/run for
nvidia-persistenced and nvidia-fabricmanager sockets.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-17 09:23:48 +02:00
Christopher Desiniotis
48414e97bb Return empty list of devices for unprivileged containers when 'accept-nvidia-visible-devices-envvar-unprivileged=false'
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-03-10 13:11:29 -08:00
Evan Lezar
3a11f6ee0a Add nvidia-container-runtime-hook.skip-mode-detection option to config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 20:15:40 +02:00
Evan Lezar
973e7bda5e Check accept-nvidia-visible-devices-envvar-when-unprivileged option for CDI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
154cd4ecf3 Add to config struct
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
936fad1d04 Move check for privileged images to config/image/ package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
510fb248fe Add cdi.k8s.io annotations to containerd config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-08 07:23:27 +02:00
Evan Lezar
1c696b1e39 Merge branch 'CNT-3894/configure-mode-specific-runtimes' into 'main'
Configure .cdi and .legacy executables in Toolkit Container

See merge request nvidia/container-toolkit/container-toolkit!308
2023-03-08 05:12:50 +00:00
Evan Lezar
dca8e3123f Migrate containerd config to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:55 +02:00
Evan Lezar
3bac4fad09 Migrate cri-o config update to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
9fff19da23 Migrate docker config to engine.Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
e5bb4d2718 Move runtime config code from config to config/engine
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:54 +02:00
Evan Lezar
5bfb51f801 Add API for interacting with runtime engine configs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 20:59:53 +02:00
Evan Lezar
6d220ed9a2 Rework selection of devices in CDI mode
The following changes are made:
* The default-cdi-kind config option is used to convert an envvar entry to a fully-qualified device name
* If annotation devices exist, these are used instead of the envvar devices.
* The `all` device is no longer treated as a special case and MUST exist in the CDI spec.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:18:53 +02:00
Evan Lezar
f00439c93e Add nvidia-container-runtime.modes.csv.default-kind config option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-07 16:18:53 +02:00
Evan Lezar
35fc57291f Deduplicate WSL driverstore paths
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-21 11:48:56 +02:00
Evan Lezar
7eb435eb73 Add basic dxcore bindings
This change copies dxcore.h and dxcore.c from libnvidia-container to
allow for the driver store path to be queried. Modifications are made
to dxcore to remove the code associated with checking the components
in the driver store path.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
5d011c1333 Add Discoverer to create a single symlink
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-20 10:30:13 +02:00
Evan Lezar
7789ac6331 Fix logger.Update and Reset
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-16 15:22:56 +01:00
Evan Lezar
7a3aabbbda Add logger test
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-16 15:22:56 +01:00
Evan Lezar
bf6babe07e Fix issue with blank nvidia-ctk path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-16 14:18:07 +01:00
Evan Lezar
456d2864a6 Log config in JSON if possible
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:46 +01:00
Evan Lezar
406a5ec76f Implement runtime package for creating runtime CLI
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:46 +01:00
Evan Lezar
f71c419cfb Move modifying OCI runtime wrapper to oci package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-13 16:09:46 +01:00
Evan Lezar
076eed7eb4 Update ipcMount to add noexec option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
33c7b056ea Add ipcMounts type
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
3b8c40c3e6 Move IPC discoverer to internal/discover package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
3f70521a63 Add Options to discover.Mount
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 09:06:07 +01:00
Evan Lezar
daceac9117 Rename discover.Config.Root to discover.Config.DriverRoot
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-02 15:57:15 +01:00
Evan Lezar
0c8379f681 Fix nvidia-ctk path for update ldcache hook
This change ensures that the update-ldcache hook is created in a manner
consistent with other nvidia-ctk hooks ensuring that a full path is
used.

Without this change the update-ldcache hook on Tegra-based sytems had an
invalid path.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 12:00:23 +01:00
Evan Lezar
92dc0506fe Add hook path to logger output
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 12:00:23 +01:00
Evan Lezar
7045a223d2 Only use configured nvidia-ctk path if it is a full path
If this is not done, the default config which sets the nvidia-ctk.path
option as "nvidia-ctk" will result in an invalid OCI spec if a hook is
injected. This change ensures that the path used is always an absolute
path as required by the hook spec.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-01 12:00:23 +01:00
Evan Lezar
14bcebd8b7 Fix relative link resolution for ldcache
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-31 13:51:48 +01:00
Evan Lezar
95394e0fc8 Add internal/info/proc/devices package to read device majors
This change adds basic functionality to process the /proc/devices
file to extract device majors.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 13:43:43 +01:00
Evan Lezar
408eeae70f Allow locator to be marked as optional
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-25 10:38:11 +01:00
Evan Lezar
6237477ba3 Limit number of candidates for executables
This change ensures that the first match of an executable in the path
is retured instead of a list of candidates. This prevents a CDI spec,
for example, from containing multiple entries for a single executable
(e.g. nvidia-smi).

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-20 15:10:24 +01:00
Evan Lezar
881b1c0e08 introduce resolveSelected helper
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 14:10:55 +01:00
Evan Lezar
3537d76726 Further refactoring of ldcache code
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 14:10:36 +01:00
Evan Lezar
ccd1961c60 Ensure root is included in absolute ldcache paths
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 14:09:43 +01:00
Evan Lezar
f350f0c0bb Refactor resolving of links in ldcache
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 14:09:41 +01:00
Evan Lezar
80672d33af Continue instead of break on error when listing libraries
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 13:54:24 +01:00
Evan Lezar
19cfb2774d Use common code to construct nvidia-ctk hooks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 10:37:10 +01:00
Evan Lezar
27347c98d9 Consolidate code to find nvidia-ctk
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-19 10:31:42 +01:00
Evan Lezar
ebbc47702d Remove 'Executable' from private struct member names
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-18 17:02:42 +01:00
Evan Lezar
09d42f0ad9 Remove 'Executable' from config struct member
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-18 17:02:42 +01:00
Evan Lezar
35df24d63a Make handling of nvidia-ctk path consistent
This change adds an --nvidia-ctk-path to the nvidia-ctk cdi generate
command. This ensures that the executable path for the generated
hooks can be specified consistently.

Since the NVIDIA Container Runtime already allows for the executable
path to be specified in the config the utility code to update the
LDCache and create other nvidia-ctk hooks are also updated.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-01-18 17:02:42 +01:00
Evan Lezar
3140810c95 Add NewContainerEdits utility function
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-07 11:03:45 +01:00
Evan Lezar
046d761f4c Ensure that an empty discoverer returns valid edits
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-06 14:01:35 +01:00
Evan Lezar
8604c255c4 Use Options to set FileLocator options
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:57:33 +01:00
Evan Lezar
bea8321205 Use prefix search for locating graphics files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
db962c4bf2 Use getSearchPrefixes for all locators
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
d1a3de7671 Add test for device locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
8da7e74408 Add tests for executable locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
55eb898186 Add support for specifying multiple prefixes
This change allows the file Locator to be instantiated with multiple
search prefixes.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
a7fc29d4bd Add tests for file locator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
fdb3e51294 Add egl_external_platform.d/10_nvidia_wayland.json to graphics mounts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 13:55:13 +01:00
Evan Lezar
d51c8fcfa7 Add utility function to generatee nvidia-ctk OCI hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
9b33c34a57 Allow graphics mount discoverer to be instantiated independently
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
0b6cd7e90e Add FromDiscoverer function to generate container edits
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
029a04c37d Use blank device hostPath if same as Path
The HostPath field was added in the v0.5.0 CDI specification.
The cdi package uses strict unmarshalling when loading specs
from file causing failures for unexpected fields.

Since the behaviour for HostPath == "" and HostPath == Path are
equivalent, we clear HostPath if it is equal to Path to ensure
compatibility with the widest range of specs.

This allows, for example, a v0.4.0 spec to be generated as required.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
60c1df4e9c Remove unneeded workaround for CDI edit generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 10:01:22 +01:00
Evan Lezar
5575b391ff Skip missing by-path symlinks instead of failing
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-23 22:21:58 +01:00
Evan Lezar
429ef4d4e9 Make NewVisibleDevices public
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-14 12:19:59 +01:00
Evan Lezar
0bc09665a8 Merge branch 'CNT-1380/add-crio-config' into 'main'
Add support for updating crio config

See merge request nvidia/container-toolkit/container-toolkit!176
2022-11-07 10:54:34 +00:00
Evan Lezar
877832da69 Consider all Swarm resource envvars
This change extends the support for multiple envvars when
specifying swarm resources to consider ALL of the specified
environment variables instead of the first match.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-04 10:01:28 +01:00
Evan Lezar
76b69f45de Add discovery of DRM devices
This change adds the discovery of DRM devices associated with requested
devices. This means that the /dev/dri/card* and /dev/dri/renderD*
devices associated with each requested NVIDIA GPU are injected into
the container and that the /dev/dri/by-path symlinks associated with
these devices are created in the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:49:08 +01:00
Evan Lezar
73e65edaa9 Also trigger graphics modifier for display capability
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:42:51 +01:00
Evan Lezar
cd7ee5a435 Add test for graphics modifier
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:42:51 +01:00
Evan Lezar
eac4faddc6 Use :: as link separator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:42:51 +01:00
Evan Lezar
bc8a73dde4 Add a Filter interface to the discover package
This change adds support for filtering entities by specifying a filter.
This can be used, for example, to check whether a mount or device
has a particular property and removing it from the set of discovered
entities if it does not.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:42:48 +01:00
Evan Lezar
624b9d8ee6 Add internal drm package for determining DRM devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:39:53 +01:00
Evan Lezar
9d6e2ff1b0 Add internal proc package for processing GPU information files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:39:53 +01:00
Evan Lezar
aca0c7bc5a Add Devices abstraction to CUDA image
This change adds a Devices abstraction to the CUDA image utilities. This
allows for checking whether a devices is selected, for example.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:39:53 +01:00
Evan Lezar
db47b58275 Add utilities for driver capabilities to image packages
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:35:42 +01:00
Evan Lezar
899fc72014 Correct constructin of MIG Caps
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-13 14:06:30 +02:00
Evan Lezar
1267c1d9a2 Refactor docker config update
This change updates the docker config update for simplicitly.
This also allows for the API to match the crio update code.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-11 11:42:38 +02:00
Evan Lezar
9a697e340b Add support for updating crio configs
This adds support for updating crio configs (instead of installing hooks)
and adds crio support to the nvidia-ctk runtime configure command.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-11 11:42:38 +02:00
Evan Lezar
9bbf7dcf96 Merge branch 'fix-hook-removal' into 'main'
Improve locating NVIDIA Container Runtime Hook

See merge request nvidia/container-toolkit/container-toolkit!215
2022-10-11 09:32:08 +00:00
Evan Lezar
4dedac6a24 Use base filename as first hook argument
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:14:12 +02:00
Evan Lezar
8c1b9b33c1 Use common code to construct ldconfig hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:12:42 +02:00
Evan Lezar
a0065456d0 Add internal/nvcaps package
This change adds an internal nvcaps pacakge.

This package will be migrated to go-nvlib.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:11:42 +02:00
Evan Lezar
b16d263ee7 Add tests for ldcache hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:11:40 +02:00
Evan Lezar
3ecd790206 Merge branch 'opengl-poc' into 'main'
Add support for injecting vulkan configs and libraries

See merge request nvidia/container-toolkit/container-toolkit!196
2022-09-29 09:23:54 +00:00
Evan Lezar
52bb9e186b Add vulkan support through OCI spec modification
This change allows the NVIDIA Container Runtime to inject vulkan
loaders and libraries by modifying the OCI runtime specification.

This allows vulkan applications to run in containers without
additional modifications.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 16:51:52 +02:00
Evan Lezar
68b6d1cab1 Add a locator for libraries
This change adds a Locator that can be used to locate libraries.
If library names are specified, the ldcache is searched otherwise
symlinks are resolved.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 16:43:21 +02:00
Evan Lezar
bdb67b4fba Add package for locating libraries in LDCache
This change adds a package that reads an ldcache and allows for libraries
to be searched by prefix.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 16:43:21 +02:00
Evan Lezar
fb016dca86 Use go-nvlib nvlib/info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 13:40:18 +02:00
Evan Lezar
5885fead8f Improve locating NVIDIA Container Runtime Hook
This change ensures that a more concrete error is provided by the NVIDIA
Container Runtime if the NVIDIA Container Runtime hook cannot be
located.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-19 15:29:29 +02:00
Evan Lezar
a9dc6550d5 Use nvinfo package from go-nvlib
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-08 17:11:42 +02:00
Evan Lezar
ffd6ec3c54 Add modifier to inject Tegra platform files
This change adds a modifier to that injects the tegra platform files
* /etc/nv_tegra_release
* /sys/devices/soc0/family

allowing these files to be used for platform detection in a containerized
context such as the GPU device plugin.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-08 16:04:20 +02:00
Evan Lezar
629a68937e Merge branch 'fix-relative-files' into 'main'
Fix adjusting relative paths for containerised devices and mounts.

See merge request nvidia/container-toolkit/container-toolkit!193
2022-07-20 11:40:28 +00:00
Evan Lezar
34e80abdea Add root to mounts type
This change adds a root member to the mounts type that is used to
perform most of the lookups for files and devices. This allows
for consistent handling of relative paths.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-18 14:37:02 +02:00
Evan Lezar
acc0afbb7a Remove Relative method from Locator
The Relative method added to the Locator interface was
not correctly implemented in the file type. The root was
never set when instantiating the object.

This change removes this method from the interface and the file
type, switching to a local implementation in the mounts type
instead.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-15 16:40:27 +02:00
Evan Lezar
7584044b3c Fix bug where ldcache may not contain symlinks
Since the creation of symlinks may include other libraries / folders
the ldcache should be updated AFTER the symlinks are created.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-15 12:18:40 +02:00
Evan Lezar
02c14e981c Add tests for identifying libraries
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-15 12:17:15 +02:00
Evan Lezar
37ee972f74 Merge branch 'CNT-2349/configure-docker' into 'main'
Add nvidia-ctk runtime configure command to update docker config

See merge request nvidia/container-toolkit/container-toolkit!166
2022-07-14 08:06:27 +00:00
Evan Lezar
3809407b6a Merge branch 'rename-to-nvidia-container-hook' into 'main'
Rename -toolkit executable to -runtime-hook

See merge request nvidia/container-toolkit/container-toolkit!189
2022-07-13 11:08:53 +00:00
Evan Lezar
f9547c447a Merge branch 'fix-cdi-refresh' into 'main'
Ensure that CDI registry is refreshed

See merge request nvidia/container-toolkit/container-toolkit!191
2022-07-13 09:38:45 +00:00
Evan Lezar
0e6dc3f7ea Move docker config handling to internal package
In preparation for adding a command to the nvidia-ctk CLI to modify
the docker config, this change refactors load, update, and flush logic
from the toolkit container docker CLI to an internal package.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-13 10:30:01 +02:00
Evan Lezar
1b4944e1de Ensure that CDI registry is refreshed
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-12 14:07:21 +02:00
Evan Lezar
83743e3613 Add runtime config option for CDI spec dirs
This change adds an nvidia-container-runtime.modes.cdi.spec-dirs
config option that allows the default spec dirs to be overridden.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-11 15:39:48 +02:00
Evan Lezar
87afcc3ef4 Reuse check for existing hook
This change reuse the code that checks for the existing NVIDIA
Container Runtime hook to ensure that both nvidia-container-toolkit
and nvidia-container-runtime-hook are detected.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-08 12:20:19 +02:00
Evan Lezar
b68b3c543b Use device host path to determine properties
This mirrors what is done in cri-o and allows for devices nodes
from, for example, the driver container to be injected into a
container at /dev instead of <ROOT>/dev

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-07 12:03:23 +02:00
Evan Lezar
8817dee66c Add support for specifying devices in annotations
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 16:53:36 +02:00
Evan Lezar
404e266222 Add cdi mode to NVIDIA Container Runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 16:53:05 +02:00
Evan Lezar
beff276a52 Add charDevices discoverer for devices
This change adds a charDevices discoverer and using this
for CSV, GDS, and MOFED discovery. Internally the discoverer
is a "mounts" discoverer with a charDevice locator.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 13:43:23 +02:00
Evan Lezar
55cb82c6c8 Create single discoverer per mount type for CSV
Instead of creating a set of discoverers per file, this change creates
a discoverer per type by first concatenating the mount specifications
from all files. This will allow all device nodes, for example, to
be treated as a single device.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 10:57:35 +02:00
Evan Lezar
9191074666 Rename discover.NewList to discover.Merge
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-05 10:28:40 +02:00
Evan Lezar
89824849d3 Merge branch 'refactor-envvar-devices' into 'main'
Add DevicesFromEnvvars function to CUDA image abstraction

See merge request nvidia/container-toolkit/container-toolkit!178
2022-07-04 08:47:28 +00:00
Evan Lezar
fd135f1a8b Add Relative function to Locator interface
This adds a Relative function to the Locator interface and uses
this to determine the host and container paths for located files
(and devices). This ensures that the root (e.g. the nvidia driver
root) is stripped from the container path.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 16:23:50 +02:00
Evan Lezar
4e08ec2405 Use CUDA.DevicesFromEnvvar to check if modifications are required
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 16:14:36 +02:00
Evan Lezar
925c348565 Add DevicesFromEnvvars function to CUDA image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 16:12:13 +02:00
Evan Lezar
a1c2f07b6e Add /etc/cufile.json to list of required mounts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:54:58 +02:00
Evan Lezar
7f7bec0668 Create GDS and MOFED modifiers
This change creates GDS and MOFED modifiers and adds them to the
modifer created for the selected runtime mode if the NVIDIA_GDS
and NVIDIA_MOFED envvars are set to "enabled", respectively.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:54:05 +02:00
Evan Lezar
cb34f7c6d1 Add discovery of GDS and MOFED devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:40:55 +02:00
Evan Lezar
7f47a61986 Allow globs in filenames for locators
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:30:33 +02:00
Evan Lezar
e8843c38f2 Move cmd/nvidia-container-runtime/modifier package to internal/modifier
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:28:40 +02:00
Evan Lezar
55ac8628c8 Add lists of modifiers to allow for modifier compositioning
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:25:18 +02:00
Evan Lezar
73a5b70a02 Return default config if config path is not found
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-25 13:22:45 +02:00
Evan Lezar
e07c7f0fa2 Ignore NVIDIA_REQUIRE_JETPACK* for image requirements
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-24 09:53:37 +02:00
Evan Lezar
084eae6e0d Fix bug in tegra detection
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-20 14:39:36 +02:00
Evan Lezar
55c1d7c256 Fix assertCharDevice matching on all files
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-20 10:08:00 +02:00
Evan Lezar
c77e86137e Add version output to CLIs
This change adds version output to the nvidia-continer-runtime,
nvidia-container-toolkit, and nvidia-ctk CLIs. The same version
is used in all cases and includes a version string and a git
revision if set.

The construction of the version string mirrors what is done in runc.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-13 07:31:11 +02:00
Evan Lezar
ff86ecb2a5 Include HasNVML check in ResolveAutoMode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:55:58 +02:00
Evan Lezar
ad9ec1efae Add HasNVML function to check if NVML is supported
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:55:13 +02:00
Evan Lezar
9db5f9c9e8 Remove unneeded legacy discovery
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:53:52 +02:00
Evan Lezar
e591f3f26b Replace experimental and discover-mode
These changes replace the nvidia-container-runtime config options
experimental and discover-mode with a single mode config option.

Note that mode is now a string with a default value of "auto"
and a mode value of "legacy" is equivalent to experimental == false.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:53:50 +02:00
Evan Lezar
e0ad82e467 Move ResolveAutoMode to info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00
Evan Lezar
3a1404f2f4 Move isTegraSystem to internal info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00
Evan Lezar
cf7bb91481 Update nvidia-container-runtime config options
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00
Evan Lezar
ba0e606df2 Use toml unmarshal to read runtime config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00
Evan Lezar
ae57a2fc93 Merge branch 'CNT-2875/create-specific-symlinks' into 'main'
Create specific symlinks for CSV mode

See merge request nvidia/container-toolkit/container-toolkit!150
2022-05-12 05:27:43 +00:00
Evan Lezar
1eb0e3c8b3 Merge branch 'fix-executable-locator' into 'main'
Fix location of executables in PATH

See merge request nvidia/container-toolkit/container-toolkit!148
2022-05-12 05:26:22 +00:00
Evan Lezar
675fbace01 Add hook to create specific links
This change updates the create-symlinks hook to also create symlinks for
libcuda.so, libGLX_indirect.so.0, and libnvidia-opticalflow.so

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-11 16:36:49 +02:00
Evan Lezar
d2516cb5d5 Merge branch 'fix-container-root' into 'main'
Fix bug in update-ldcache hook when OCI spec contains a relative root

See merge request nvidia/container-toolkit/container-toolkit!147
2022-05-10 22:01:14 +00:00
Evan Lezar
4696d7ee69 Merge branch 'fix-hook-flags' into 'main'
Use singular instead of plural for hook arguments

See merge request nvidia/container-toolkit/container-toolkit!146
2022-05-10 22:00:51 +00:00
Evan Lezar
ef6f48e9f7 Use singular instead of plural for hook arguments
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 19:55:31 +02:00
Evan Lezar
088db09180 Use executable locator to find low-level runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 15:21:48 +02:00
Evan Lezar
1d2e1bd403 Add lookup.GetPath and lookup.GetPaths functions
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 14:52:47 +02:00
Evan Lezar
395f6cecb2 Add GetContainerRoot to oci.State type
This change adds a GetContainerRoot to the oci.State type to
encapsulate the logic around determining the container root.
This Fixes a bug where relative roots (e.g. as generated by contianerd)
are not supported.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-10 11:48:43 +02:00
Evan Lezar
7574a0d7de Make output of bundle directory a debug message
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-09 09:38:16 +02:00
Evan Lezar
335de5a352 Switch to debug logging when locating runtimes
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-09 09:38:16 +02:00
Evan Lezar
c76946cbcc Add nvidia-container-runtime.runtimes config option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-09 09:38:12 +02:00
Evan Lezar
785f120c31 Fix form -> from in comment
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-06 13:22:34 +02:00
Evan Lezar
9e46d41dbe Add debug logging when checking requirements
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 14:14:01 +02:00
Evan Lezar
9f50ac95c4 Add CUDA ComputeCapability function
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 14:09:28 +02:00
Evan Lezar
583793b7ae Add processing for requirements and constraints
This change adds a Requirements abstraction that can be used to check
an images' NVIDIA_REQUIRE_* envvars against the host properties such
as CUDA version or architecture.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
5d7b3a4a96 Return raw spec from Spec.Load
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
a672713dba Add basic CUDA wrapper
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
8f0e1906c2 Add CUDA image abstraction
This change adds a CUDA image abstraction that encapsulates
the queries performed on a container image (e.g. envvars) to
check certain CUDA properties.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-05 13:43:13 +02:00
Evan Lezar
c224832a6d Add log-level config option for nvidia-container-runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 13:56:17 +02:00
Evan Lezar
5211960fc3 Merge branch 'detect-gpus-flag' into 'master'
Detect use of --gpus flag in experimental mode

See merge request nvidia/container-toolkit/container-toolkit!125
2022-04-08 11:18:11 +00:00
Evan Lezar
cfca18a5f8 Merge branch 'refactor-csv-mount-spec-discovery' into 'master'
Refactor CSV discovery to make char device discovery clearer

See merge request nvidia/container-toolkit/container-toolkit!129
2022-04-08 10:54:06 +00:00
Evan Lezar
dab6f4b768 Specify --force flag when invoking nvidia-container-runtime-hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 12:03:22 +02:00
Evan Lezar
2563c1b87c Export GetDefaultRuntimeConfig
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 12:03:22 +02:00
Evan Lezar
62f608a3fe Make order of discoverers deterministic
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 11:59:26 +02:00
Evan Lezar
2c1e356370 Refactor CSV discovery to make char device discovery clearer
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-04-08 11:47:47 +02:00