nvidia-container-toolkit

mirror of https://github.com/NVIDIA/nvidia-container-toolkit synced 2025-06-26 18:18:24 +00:00

Author	SHA1	Message	Date
Jean-Francois Roy	d9c52ecd4e	remove runtime wrapper driver detection This will be handled by the runtime itself. Signed-off-by: Jean-Francois Roy <jeroy@nvidia.com>	2024-11-06 14:02:26 -08:00
Jean-Francois Roy	0c1e76a221	inline execve Signed-off-by: Jean-Francois Roy <jeroy@nvidia.com>	2024-11-06 14:02:26 -08:00
Jean-Francois Roy	8d8dbd38c3	use the new Go wrapper program This patch modifies the container toolkit installer, used by the GPU operator, to use the new Go wrapper program. Signed-off-by: Jean-Francois Roy <jeroy@nvidia.com>	2024-11-06 14:02:26 -08:00
Jean-Francois Roy	dc79fa8513	implement golang wrapper to replace shell scripts Some platforms and Kubernetes distributions do not include a shell. This patch replaces the shell wrapper scripts with a small Go program. Signed-off-by: Jean-Francois Roy <jeroy@nvidia.com>	2024-11-06 14:02:26 -08:00
Evan Lezar	0c687be794	[no-relnote] Also validate CDI management spec Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-11-05 14:23:36 -08:00
Evan Lezar	8d869acce5	[no-relnote] Add toolkit install unit test This change adds basic toolkit installation unit tests. This required that the source for files be specified when installing to allow for a testdata folder to be used. This replaces the currently unused shell-based tests in /test/container. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-11-05 14:23:35 -08:00
Evan Lezar	1145ce2283	Add aliases for runtime-specific envvars This change ensures that the toolkit works with older versions of the GPU Operator where runtime-specific envvars are used to set options such as the config file location. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-10-18 12:16:50 +02:00
Evan Lezar	bc9180b59d	Expose opt-in features in toolkit-container This change enables opt-in (off-by-default) features to be opted into. These features can be toggled by name by specifying the (repeated) --opt-in-features command line argument or as a comma-separated list in the NVIDIA_CONTAINER_TOOLKIT_OPT_IN_FEATURES environment variable. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-10-17 14:26:24 +02:00
Tariq Ibrahim	b90ee5d100	[no-relnote] minor cleanup and improvements Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com> Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-10-11 16:14:41 +02:00
Tariq Ibrahim	f477dc0df1	fetch current container runtime config through the command line Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com> add default runtime binary path to runtimes field of toolkit config toml Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com> [no-relnote] Get low-level runtimes consistently We ensure that we use the same low-level runtimes regardless of the runtime engine being configured. This ensures consistent behaviour. Signed-off-by: Evan Lezar <elezar@nvidia.com> Co-authored-by: Evan Lezar <elezar@nvidia.com> address review comment Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>	2024-10-10 01:13:20 -07:00
Evan Lezar	3ee678f4f6	Convert crio to runtime package Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-09-30 14:40:30 -07:00
Evan Lezar	103375e504	Convert containerd to runtime package Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-09-30 14:39:52 -07:00
Evan Lezar	5bedbc2b50	Convert docker to runtime package Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-09-30 14:36:35 -07:00
Evan Lezar	94337b7427	Add runtime package for runtime setup Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-09-30 14:36:35 -07:00
Evan Lezar	046a05921f	Convert toolkit to go package This change converts the toolkit installation logic to a go package and invokes this installation over the go API instead of starting this executable. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-09-30 14:36:35 -07:00
Evan Lezar	bf2bdfd35e	Refactor Toml config handling This change refactors the toml config file handling for runtimes such as containerd or crio. A toml.Loader is introduced that encapsulates loading the required file. This can be extended to allow other mechanisms for loading the current config. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-09-30 14:24:18 +02:00
Evan Lezar	6c5f4eea63	Remove support for config overrides Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-09-27 13:23:35 +02:00
Evan Lezar	5145b0a4b6	Revert "Merge pull request #694 from elezar/add-opt-in-to-sockets" This reverts commit `b061446694`, reversing changes made to `c490baab63`. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-09-20 20:26:45 +02:00
Evan Lezar	9c2476c98d	Expose opt-in features in toolkit-container This change enables opt-in (off-by-default) features to be opted into. These features can be toggled by name by specifying the (repeated) --opt-in-feature command line argument or as a comma-separated list in the NVIDIA_CONTAINER_TOOLKIT_OPT_IN_FEATURES environment variable. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-09-18 22:30:27 +02:00
Evan Lezar	46838b1a44	Merge pull request #576 from elezar/get-options-from-default Extract options from default runtime if runc does not exist	2024-07-09 13:34:57 +02:00
Evan Lezar	b8389283d5	Extract options from default runtime if runc does not exist This change updates the logic to populate the options for the nvidia runtime configs added to containerd or crio from a default runtime if this is specified and a runc entry is not found. This allows the default runtime values for settings such as SystemdCgroup to be applied correctly. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-07-04 12:44:08 +02:00
Tariq Ibrahim	70ef0fb973	avoid using map pointers as maps are always passed by reference Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>	2024-07-02 17:35:44 -07:00
Evan Lezar	876d479308	Allow toolkit.pid path to be specified This change makes the following changes: * Allows the toolkit.pid path to be specified * Creates the toolkit.pid file at /run/nvidia/toolkit/toolkit.pid by default * Handles failures to remove the /run/nvidia/toolkit folder Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-06-17 11:26:23 +02:00
Evan Lezar	9208159263	Add dev-root option to toolkit container This change adds an option to the toolkit container to allow the dev root to be specified. This adds support for driver installations where the driver files are at one root and the dev nodes are created elsewhere -- most typically at /. This is the case, for example, for GKE driver installations. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-06-03 20:40:30 +02:00
Avi Deitcher	179d8655f9	Move nvidia-ctk hook command into own binary This change creates an nvidia-cdi-hook binary for implementing CDI hooks. This allows for these hooks to be separated from the nvidia-ctk command which may, for example, require libnvidia-ml to support other functionality. The nvidia-ctk hook subcommand is maintained as an alias for the time being to allow for existing CDI specifications referring to this path to work as expected. Signed-off-by: Avi Deitcher <avi@deitcher.net>	2024-05-21 12:19:44 +02:00
Evan Lezar	b435b797af	Add support for adding additional containerd configs This allows for options such as SystemdCgroup to be optionally set. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-05-17 12:58:08 +02:00
Evan Lezar	cd7d586afa	Also ignore CDI errors if required Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-02-13 12:37:41 +01:00
Evan Lezar	cc4c2783a3	Add --create-device-nodes option to toolkit config This change adds a --create-device-nodes option to the toolkit config CLI. Most notably, this allows the creation of control devices to be skipped when CDI spec generation is enabled. Currently values of "", "node", and "control" are supported and can be set via the command line flag or the CREATE_DEVICE_NODES environment variable. The default value of CREATE_DEVICE_NODES=control will trigger the creation of control device nodes. Setting this envvar to include the (comma-separated) strings of "" or "none" will disable device node creation regardless of whether other supported strings are included. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-02-13 12:37:41 +01:00
Evan Lezar	f89cef307d	Specify DRIVER_ROOT consistently This change ensures that CLI tools that require the path to the driver root accept both the NVIDIA_DRIVER_ROOT and DRIVER_ROOT environment variables in addition to the --driver-root command line argument. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2024-02-09 14:28:56 +01:00
Tariq Ibrahim	7627d48a5c	run goimports -local against the entire codebase Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com> Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-12-01 11:13:17 +01:00
Evan Lezar	879cc99aac	Add transformer for container roots This change renames the root transformer to indicate that it operates on host paths and adds a container root transformer for explicitly transforming container roots. The transform.NewRootTransformer constructor still exists, but has been marked as deprecated. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-11-30 20:26:42 +01:00
Evan Lezar	a545810981	Allow make check to run on non-linux platforms Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-11-27 14:10:34 +01:00
Evan Lezar	232df647c1	Resolve LDConfig path passed to nvidia-container-cli Instead of relying solely on a static config, we resolve the path to ldconfig. The path is checked for existence and a .real suffix is preferred. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-11-21 15:31:12 +01:00
Evan Lezar	e56bb09889	Use tags.cncf.io for CDI imports Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-11-01 12:40:51 +01:00
Evan Lezar	48d68e4eff	Add nolint for exec calls Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-10-24 20:11:34 +02:00
Evan Lezar	709e27bf4b	Fix implicit memory aliasing in for loop Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-10-24 20:11:34 +02:00
Evan Lezar	8a9f367067	Check returned error values Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-10-24 20:00:24 +02:00
Evan Lezar	f2c9937ca8	Use cdi parser package Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-10-24 20:00:24 +02:00
Evan Lezar	12dc12ce09	Fix misspellings Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-10-24 20:00:24 +02:00
Evan Lezar	73749285d5	Remove unused loadSaver interface Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-10-24 20:00:24 +02:00
Evan Lezar	f6a4986c15	Add support for creating oci hook to nvidia-ctk This change extends the nvidia-ctk runtime configure command with a --config-mode=oci-hook that creates an OCI hook json file. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-08-11 16:34:58 +02:00
Evan Lezar	0938576618	Remove NVIDIA experimental runtime from toolkit container Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-07-10 11:44:55 +02:00
Evan Lezar	4d2e8d1913	Ensure common envvars have higher precedence Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-06-27 14:45:15 +02:00
Evan Lezar	d52dbeaa7a	Split internal system package This change splits the functionality in the internal system package into two packages: one for dealing with devices and one for dealing with kernel modules. This removes ambiguity around the meaning of driver / device roots in each case. In each case, a root can be specified where device nodes are created or kernel modules loaded. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-06-15 09:01:13 +02:00
Evan Lezar	1d0a733487	Replace logger.Warn(f) with logger.Warning(f) This aligns better with klog used in other projects. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-06-12 10:48:04 +02:00
Evan Lezar	2bc0f45a52	Remove unused constants and variables Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-06-11 11:38:22 +02:00
Evan Lezar	178eb5c5a8	Rework restart logic Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-06-10 12:41:53 +02:00
Evan Lezar	761fc29567	Add version info to config CLIs Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-06-09 18:49:17 +02:00
Evan Lezar	9f5c82420a	Refactor toolking to setup and cleanup configs Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-06-09 18:49:15 +02:00
Evan Lezar	23041be511	Add runtimeDir as argument This change adds the --nvidia-runtime-dir as a command line argument when configuring container runtimes in the toolkit container. This removes the need to set it via the command line. Signed-off-by: Evan Lezar <elezar@nvidia.com>	2023-06-09 18:48:34 +02:00
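
The CREATE_DEVICE_NODES behaviour described in commit cc4c2783a3 can be illustrated with a small Go sketch. This is not the toolkit's actual code: the function name `createControlDeviceNodes` is hypothetical, and the sketch assumes the disabling keyword is "none" (the commit message mentions both "node" and "none"). It only shows the rule stated in the message: "control" enables control device node creation, and any empty or "none" entry in the comma-separated list disables creation regardless of the other entries.

```go
package main

import (
	"fmt"
	"strings"
)

// createControlDeviceNodes is a hypothetical helper, not part of the
// nvidia-container-toolkit API. It interprets a comma-separated value such
// as the one passed via CREATE_DEVICE_NODES and reports whether control
// device nodes should be created.
func createControlDeviceNodes(value string) (bool, error) {
	disabled := false
	control := false
	for _, part := range strings.Split(value, ",") {
		switch mode := strings.TrimSpace(part); mode {
		case "", "none":
			// Assumption: an empty or "none" entry disables creation,
			// regardless of any other supported entries in the list.
			disabled = true
		case "control":
			control = true
		default:
			return false, fmt.Errorf("unsupported device node mode %q", mode)
		}
	}
	if disabled {
		return false, nil
	}
	return control, nil
}

func main() {
	// Print the decision for a few representative inputs.
	for _, v := range []string{"control", "", "none", "control,none", "bogus"} {
		create, err := createControlDeviceNodes(v)
		fmt.Printf("value=%q create=%v err=%v\n", v, create, err)
	}
}
```

Scanning the whole list before deciding keeps the result independent of the order of the entries, which matches the "regardless of whether other supported strings are included" wording in the commit message.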

115 Commits