34 Commits

Author SHA1 Message Date
Evan Lezar
2e6712d2bc Allow IMEX channels to be requested as volume mounts
This change allows IMEX channels to be requested using the
volume mount mechanism.

A mount from /dev/null to /var/run/nvidia-container-devices/imex/{{ .ChannelID }}
is equivalent to including {{ .ChannelID }} in the NVIDIA_IMEX_CHANNELS
environment variable.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 16:54:29 +02:00
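The mount convention described in commit 2e6712d2bc can be sketched as follows. This is a minimal illustration, not the toolkit's actual code: the `mount` type and the `imexChannelsFromMounts` function are hypothetical names, and only the `/dev/null` source and the `/var/run/nvidia-container-devices/imex/` destination prefix come from the commit message.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// imexMountRoot is the destination prefix named in the commit message.
const imexMountRoot = "/var/run/nvidia-container-devices/imex"

// mount is a hypothetical, simplified stand-in for a container mount entry.
type mount struct {
	Source      string
	Destination string
}

// imexChannelsFromMounts returns the channel IDs requested through mounts
// from /dev/null to <imexMountRoot>/<channel ID>. All other mounts are ignored.
func imexChannelsFromMounts(mounts []mount) []string {
	var channels []string
	for _, m := range mounts {
		if m.Source != "/dev/null" {
			continue
		}
		dest := path.Clean(m.Destination)
		if !strings.HasPrefix(dest, imexMountRoot+"/") {
			continue
		}
		id := strings.TrimPrefix(dest, imexMountRoot+"/")
		// Assumption: a channel ID is a single path element.
		if strings.Contains(id, "/") {
			continue
		}
		channels = append(channels, id)
	}
	return channels
}

func main() {
	mounts := []mount{
		{Source: "/dev/null", Destination: imexMountRoot + "/0"},
		{Source: "/dev/null", Destination: imexMountRoot + "/2047"},
		{Source: "/tmp/data", Destination: "/data"},
	}
	// The result plays the same role as a comma-separated
	// NVIDIA_IMEX_CHANNELS value.
	fmt.Println(strings.Join(imexChannelsFromMounts(mounts), ","))
}
```

Under these assumptions, the two `/dev/null` mounts request the same channels as setting NVIDIA_IMEX_CHANNELS to `0,2047`.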
Evan Lezar
92df542f2f [no-relnote] Use image.CUDA to extract visible devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 16:53:17 +02:00
Evan Lezar
1991b3ef2a [no-relnote] Use string slice for devices in hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 16:53:17 +02:00
Evan Lezar
457d71c170 Add disable-imex-channel-creation feature flag
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-10-17 14:26:24 +02:00
Evan Lezar
5145b0a4b6 Revert "Merge pull request #694 from elezar/add-opt-in-to-sockets"
This reverts commit b061446694, reversing
changes made to c490baab63.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-20 20:26:45 +02:00
Evan Lezar
a819cfdab4 Revert "Merge pull request #703 from elezar/fix-no-persistenced-flag"
This reverts commit c02b144ed4, reversing
changes made to b061446694.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-20 20:26:33 +02:00
Evan Lezar
16fdef50e6 [no-relnote] Move --no-persistenced flag to after configure
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:52:25 +02:00
Evan Lezar
ba1ed3232f Skip injection of nvidia-persistenced socket by default
This change skips the injection of the nvidia-persistenced socket by
default.

An include-persistenced-socket feature flag is added to allow the
injection of this socket to be explicitly requested.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:10:09 +02:00
Kevin Klues
296d4560b0 Add support for an NVIDIA_IMEX_CHANNELS envvar
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2024-02-26 20:09:43 +01:00
Tariq Ibrahim
7627d48a5c run goimports -local against the entire codebase
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-12-01 11:13:17 +01:00
Evan Lezar
232df647c1 Resolve LDConfig path passed to nvidia-container-cli
Instead of relying solely on a static config, we resolve the path
to ldconfig. The path is checked for existence and a .real suffix is preferred.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-21 15:31:12 +01:00
Evan Lezar
833254fa59 Support CDI devices as mounts
This change allows CDI devices to be requested as mounts in the
container. This enables their use in environments such as kind
where environment variables or annotations cannot be used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-27 21:24:53 +02:00
Evan Lezar
48d68e4eff Add nolint for exec calls
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
12dc12ce09 Fix misspellings
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
73749285d5 Remove unused loadSaver interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
4ec9bd751e Add required option to new toml config
This change adds a "required" option to the new toml config
that controls whether a default config is returned or not.
This is useful from the NVIDIA Container Runtime Hook, where
/run/driver/nvidia/etc/nvidia-container-runtime/config.toml
is checked before the standard path.

This fixes a bug where the default config was always applied
when this config was not used.

See https://github.com/NVIDIA/nvidia-container-toolkit/issues/106

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-07 11:56:01 +02:00
Evan Lezar
a69657dde7 Add config.Toml type to handle config files
This change introduces a config.Toml type that is used as the base for
config file processing and manipulation. This ensures that configs --
including commented values -- can be handled consistently.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-14 11:32:54 +02:00
Evan Lezar
3670e7b89e Refactor loading of hook configs
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-14 10:40:42 +02:00
Evan Lezar
b18ac09f77 Refactor handling of DriverCapabilities
This change consolidates the handling of NVIDIA_DRIVER_CAPABILITIES in the
internal/image package.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-14 10:40:42 +02:00
Evan Lezar
4dcaa61167 Use internal/config structs in hook
This change ensures that the Config structs from internal.Config
are used for the NVIDIA Container Runtime Hook config too.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-08-14 10:40:41 +02:00
Evan Lezar
e51621aa7f Handle empty root in config
If the config.toml has an empty root specified, this could be
passed to the NVIDIA Container CLI through the --root flag
which causes argument parsing to fail. This change only
adds the --root flag if the config option is specified
and is non-empty.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-19 14:02:23 +02:00
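The guard described in commit e51621aa7f can be illustrated with a short sketch. The helper name `rootArgs` and the use of a `*string` to model "specified or not" are assumptions made for illustration; only the `--root` flag and the non-empty condition come from the commit message.

```go
package main

import "fmt"

// rootArgs returns the --root argument for the NVIDIA Container CLI only
// when a root is specified (non-nil) and non-empty. An empty value would
// otherwise be passed through and break argument parsing.
// Assumption: the flag is rendered in the --root=<value> form.
func rootArgs(root *string) []string {
	if root == nil || *root == "" {
		return nil
	}
	return []string{"--root=" + *root}
}

func main() {
	empty := ""
	driverRoot := "/run/nvidia/driver" // example value only

	fmt.Println(len(rootArgs(nil)))    // option not specified
	fmt.Println(len(rootArgs(&empty))) // option specified but empty
	fmt.Println(rootArgs(&driverRoot)) // option specified and non-empty
}
```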
Evan Lezar
1081cecea9 Return empty requirements if NVIDIA_DISABLE_REQUIRE is true
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-07-11 13:47:37 +02:00
Evan Lezar
82347eb9bc Resolve auto mode as cdi for fully-qualified names
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-06-13 16:05:37 +02:00
Evan Lezar
9464953924 Use logger.Interface when resolving auto mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-06-12 10:46:11 +02:00
Evan Lezar
1bd5798a99 Use toml representation to get defaults
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-08 11:26:53 +02:00
Evan Lezar
3a11f6ee0a Add nvidia-container-runtime-hook.skip-mode-detection option to config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 20:15:40 +02:00
Evan Lezar
936fad1d04 Move check for privileged images to config/image/ package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-03-09 11:15:53 +02:00
Evan Lezar
877832da69 Consider all Swarm resource envvars
This change extends the support for multiple envvars when
specifying swarm resources to consider ALL of the specified
environment variables instead of the first match.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-04 10:01:28 +01:00
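As a rough sketch of the behaviour in commit 877832da69, the function below walks every environment variable named in a comma-separated swarm-resource option rather than stopping at the first match. The function name, the map-based environment, and the way the values are merged into one device list are assumptions for illustration, not the hook's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// devicesFromSwarmEnvvars collects devices from ALL environment variables
// listed in the comma-separated swarmResource option.
// Assumption: each variable holds a comma-separated device list and the
// lists are concatenated in the order the variables are named.
func devicesFromSwarmEnvvars(swarmResource string, env map[string]string) []string {
	var devices []string
	for _, name := range strings.Split(swarmResource, ",") {
		name = strings.TrimSpace(name)
		if name == "" {
			continue
		}
		value, ok := env[name]
		if !ok || value == "" {
			continue
		}
		for _, d := range strings.Split(value, ",") {
			if d = strings.TrimSpace(d); d != "" {
				devices = append(devices, d)
			}
		}
	}
	return devices
}

func main() {
	// The variable names and device identifiers are example values.
	env := map[string]string{
		"DOCKER_RESOURCE_GPU": "GPU-a",
		"DOCKER_RESOURCE_MIG": "MIG-b,MIG-c",
	}
	fmt.Println(devicesFromSwarmEnvvars("DOCKER_RESOURCE_GPU,DOCKER_RESOURCE_MIG", env))
}
```

Under the earlier first-match behaviour (commit f0bdfbebe4), only the first variable found in the environment would have contributed devices.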
Evan Lezar
aca0c7bc5a Add Devices abstraction to CUDA image
This change adds a Devices abstraction to the CUDA image utilities. This
allows for checking whether a device is selected, for example.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:39:53 +01:00
Evan Lezar
a35236a8f6 Correct test cases for NVIDIA_VISIBLE_DEVICES=void
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-04 14:14:44 +02:00
Evan Lezar
f0bdfbebe4 Add support for multiple swarm resource envvars
This change allows the swarm-resource config option to specify a
comma-separated list of environment variables instead of a single
environment variable.

The first environment variable matched is considered and other
environment variables are ignored.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-04 14:11:10 +02:00
Christopher Desiniotis
3a9de13f4e Apply 1 suggestion(s) to 1 file(s)
2022-07-21 08:03:39 +00:00
Evan Lezar
1161b21166 Make error message clearer
This change improves the error message when invoking the NVIDIA
Runtime Hook in non-legacy mode. This should guide users to specify
the --runtime=nvidia flag when using docker.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-18 13:09:59 +02:00
Evan Lezar
f50aecb84e Rename -toolkit executable to -runtime-hook
This change renames the nvidia-container-toolkit executable
to nvidia-container-runtime-hook. Here nvidia-container-toolkit
is created as a symlink to nvidia-container-runtime-hook.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-08 12:09:11 +02:00