Commit Graph

119 Commits

Author SHA1 Message Date
Evan Lezar
e4e1de82ec Refactor nvidia-ctk cdi generate command
This change refactors the generation of CDI specifications
to use discoverers and generate the CDI specifications from these
discoverers. This allows for better reuse.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-02 11:49:37 +01:00
Evan Lezar
932b39fd08 Remove unused jsonMode and fix output
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-01 16:21:50 +01:00
Evan Lezar
584e792a5a Ensure output folder exists for CDI spec
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-30 19:40:58 +01:00
Evan Lezar
d45ec7bd28 Switch to string-based flag for CDI output format
This change replaces the `--json` flag of the nvidia-ctk cdi generate
command with a --format flag that accepts a string format of either
json or yaml.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-29 16:56:26 +01:00
Evan Lezar
9faf11ddf3 Fix error message
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-23 22:21:58 +01:00
Evan Lezar
b6d9c2c1ad Add graphics devices and libraries to CDI specification
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-14 13:55:56 +01:00
Evan Lezar
ce3d94af1a Add README for generating CDI specifications
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-08 15:15:27 +01:00
Evan Lezar
0bc09665a8 Merge branch 'CNT-1380/add-crio-config' into 'main'
Add support for updating crio config

See merge request nvidia/container-toolkit/container-toolkit!176
2022-11-07 10:54:34 +00:00
Evan Lezar
877832da69 Consider all Swarm resource envvars
This change extends the support for multiple envvars when
specifying swarm resources to consider ALL of the specified
environment variables instead of the first match.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-04 10:01:28 +01:00
Evan Lezar
a2fb017208 Merge branch 'rework-cdi-cli' into 'main'
Rename nvidia-ctk info generate-cdi command

See merge request nvidia/container-toolkit/container-toolkit!236
2022-11-03 09:31:26 +00:00
Evan Lezar
c793fc27d8 Output YAML separator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 15:03:18 +01:00
Evan Lezar
3d2328bdfd Rename nvidia-ctk info generate-cdi command
This change renames the nvidia-ctk info generate-cdi command as

nvidia-ctk cdi generate

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:56:26 +01:00
Evan Lezar
eac4faddc6 Use :: as link separator
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:42:51 +01:00
Evan Lezar
aca0c7bc5a Add Devices abstraction to CUDA image
This change adds a Devices abstraction to the CUDA image utilities. This
allows for checking whether a devices is selected, for example.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-02 14:39:53 +01:00
Evan Lezar
59bf7607ce Merge branch 'ipc-rw' into 'main'
Mount IPC sockets with noexec flag

See merge request nvidia/container-toolkit/container-toolkit!234
2022-11-02 12:15:47 +00:00
Evan Lezar
523fc57ab4 Use an Executable Locator to lookup chmod
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-26 16:24:11 +02:00
Evan Lezar
ae18c5d847 Include chmod hook for device subfolders in CDI spec generation
This change generates one or more createContainer hooks for ensuring
that subfolders in /dev have the required permissions in the container.
As an example, a user requires read permissions to the /dev/nvidia-caps
in addition to including the specific caps devices under this folder.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-26 16:08:13 +02:00
Evan Lezar
4abdc2f35d Add nvidia-ctk hook chmod command to set permissions
This change adds an nvidia-ctk hook chmod command that can be used
to update the permissions for paths in the container.

This prepends the container root to the paths to allow these to be
updated by runtime executables.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-26 16:01:52 +02:00
Evan Lezar
f8748bfa9a Mount IPC sockets with noexec flag
This change ensures that the CDI spec mounts the ipc sockets with the
noexec flag to allow these to function in rootless mode with podman.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-21 16:44:02 +02:00
Evan Lezar
1267c1d9a2 Refactor docker config update
This change updates the docker config update for simplicitly.
This also allows for the API to match the crio update code.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-11 11:42:38 +02:00
Evan Lezar
9a697e340b Add support for updating crio configs
This adds support for updating crio configs (instead of installing hooks)
and adds crio support to the nvidia-ctk runtime configure command.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-11 11:42:38 +02:00
Evan Lezar
9bbf7dcf96 Merge branch 'fix-hook-removal' into 'main'
Improve locating NVIDIA Container Runtime Hook

See merge request nvidia/container-toolkit/container-toolkit!215
2022-10-11 09:32:08 +00:00
Evan Lezar
1597ede2af Add all device
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-10 10:19:08 +02:00
Evan Lezar
3dd8020695 Include meta devices in generated CDI spec
This change includes meta devices (e.g. /dev/nvidiactl) in the
generated CDI spec. Missing device nodes are ignored.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-07 16:23:37 +02:00
Evan Lezar
dfa041991f Generate v0.4.0 CDI spec
This change generates a v0.4.0 CDI spec instead of a v0.5.0 spec.
This allows older versions of podman, for example, to be used.

This requires that the device names do not start on a numeric character
and that the HostPath for a device is unspecified.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-07 16:10:47 +02:00
Evan Lezar
a35236a8f6 Correct test cases for NVIDIA_VISIBLE_DEVICES=void
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-04 14:14:44 +02:00
Evan Lezar
f0bdfbebe4 Add support for multiple swarm resource envvars
This change allows the swarm-resource config option to specify a
comma-separated list of environment variables instead of a single
environment variable.

The first environment variable matched is considered and other
environment variables are ignored.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-04 14:11:10 +02:00
Evan Lezar
8c1b9b33c1 Use common code to construct ldconfig hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:12:42 +02:00
Evan Lezar
d37c17857e Add nvidia-ctk info generate-cdi command
This change adds functionality to generate CDI specifications
for all devices detected on the system. A specification containing
all GPUs and MIG devices is generated. All libraries on the host
ldcache that have an NVIDIA Driver Version suffix are included as
are the required binaries and IPC sockets.

A hook (based on the nvidia-ctk hook subcommand) to update the ldcache
in the container for the libraries being injected is also added to the
CDI specificiation.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-29 12:11:42 +02:00
Evan Lezar
52bb9e186b Add vulkan support through OCI spec modification
This change allows the NVIDIA Container Runtime to inject vulkan
loaders and libraries by modifying the OCI runtime specification.

This allows vulkan applications to run in containers without
additional modifications.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-28 16:51:52 +02:00
Evan Lezar
2b08a79206 Ensure that errors are logged
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-19 15:29:29 +02:00
Evan Lezar
ffd6ec3c54 Add modifier to inject Tegra platform files
This change adds a modifier to that injects the tegra platform files
* /etc/nv_tegra_release
* /sys/devices/soc0/family

allowing these files to be used for platform detection in a containerized
context such as the GPU device plugin.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-08 16:04:20 +02:00
Christopher Desiniotis
3a9de13f4e Apply 1 suggestion(s) to 1 file(s) 2022-07-21 08:03:39 +00:00
Evan Lezar
1161b21166 Make error message clearer
This change improves the error message when invoking the NVIDIA
Runtime Hook in non-legacy mode. This should guide users to specifying
the --runtime=nvidia flag when using docker.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-18 13:09:59 +02:00
Evan Lezar
37ee972f74 Merge branch 'CNT-2349/configure-docker' into 'main'
Add nvidia-ctk runtime configure command to update docker config

See merge request nvidia/container-toolkit/container-toolkit!166
2022-07-14 08:06:27 +00:00
Evan Lezar
9f0060f651 Add nvidia-ctk runtime configure command
This change adds a `runtime configure` command to the nvidia-ctk CLI. This
command is currently limited to configuring the docker config on the
system by modifying the daemon.json config file associated with docker.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-13 10:33:46 +02:00
Evan Lezar
f50aecb84e Rename -toolkit executable to -runtime-hook
This change renames the nvidia-container-toolkit executable
to nvidia-container-runtime-hook. Here nvidia-container-toolkit
is created as a symlink to nvidia-container-runtime-hook.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-08 12:09:11 +02:00
Evan Lezar
404e266222 Add cdi mode to NVIDIA Container Runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-06 16:53:05 +02:00
Evan Lezar
925c348565 Add DevicesFromEnvvars function to CUDA image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 16:12:13 +02:00
Evan Lezar
7f7bec0668 Create GDS and MOFED modifiers
This change creates GDS and MOFED modifiers and adds them to the
modifer created for the selected runtime mode if the NVIDIA_GDS
and NVIDIA_MOFED envvars are set to "enabled", respectively.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:54:05 +02:00
Evan Lezar
e8843c38f2 Move cmd/nvidia-container-runtime/modifier package to internal/modifier
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:28:40 +02:00
Evan Lezar
d66c00dd1d Use modifier list and discoverModifer
This change uses modifier compositioning and the discoverModifier to
refactor the existing CSV modifier.

This change adds a discoverModifier to the internal/modifier package and
refactors the CSV modifier to use this abstraction.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-07-01 14:25:19 +02:00
Evan Lezar
7ca0e5db60 Update NVIDIA Container Runtime readme
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-06-03 13:10:21 +02:00
Evan Lezar
c77e86137e Add version output to CLIs
This change adds version output to the nvidia-continer-runtime,
nvidia-container-toolkit, and nvidia-ctk CLIs. The same version
is used in all cases and includes a version string and a git
revision if set.

The construction of the version string mirrors what is done in runc.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-13 07:31:11 +02:00
Evan Lezar
60dacb76b6 Call logger.Reset() to ensure errors are captured
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 15:42:42 +02:00
Evan Lezar
19138a2110 Skip setting of log file for --version flag
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 14:34:37 +02:00
Evan Lezar
4c49f75365 Remove --force flag from nvidia-container-runtime-hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:53:52 +02:00
Evan Lezar
e591f3f26b Replace experimental and discover-mode
These changes replace the nvidia-container-runtime config options
experimental and discover-mode with a single mode config option.

Note that mode is now a string with a default value of "auto"
and a mode value of "legacy" is equivalent to experimental == false.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:53:50 +02:00
Evan Lezar
e0ad82e467 Move ResolveAutoMode to info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00
Evan Lezar
3a1404f2f4 Move isTegraSystem to internal info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-05-12 10:28:56 +02:00