290 Commits

Author SHA1 Message Date
Avi Deitcher
179d8655f9 Move nvidia-ctk hook command into own binary
This change creates an nvidia-cdi-hook binary for implementing
CDI hooks. This allows for these hooks to be separated from the
nvidia-ctk command which may, for example, require libnvidia-ml
to support other functionality.

The nvidia-ctk hook subcommand is maintained as an alias for the
time being to allow for existing CDI specifications referring to
this path to work as expected.

Signed-off-by: Avi Deitcher <avi@deitcher.net>
2024-05-21 12:19:44 +02:00
Evan Lezar
b435b797af Add support for adding additional containerd configs
This allows options such as SystemdCgroup to be optionally set.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-17 12:58:08 +02:00
Evan Lezar
5a3eda4cba Use : as a config --set list separator
This allows settings such as:

nvidia-ctk config --set nvidia-container-runtime.runtimes=crun:runc

to be applied correctly.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-08 17:26:59 +02:00
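The separator handling described in commit 5a3eda4cba can be sketched roughly as follows. This is a minimal illustration, not the toolkit's implementation: the function name `parseSetOption` and its error handling are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSetOption splits a "key=value" argument and treats ':' in the value
// as a list separator. The function name and the handling of malformed
// input are illustrative assumptions, not the toolkit's code.
func parseSetOption(arg string) (string, []string, error) {
	key, value, ok := strings.Cut(arg, "=")
	if !ok || key == "" {
		return "", nil, fmt.Errorf("expected key=value, got %q", arg)
	}
	return key, strings.Split(value, ":"), nil
}

func main() {
	key, values, err := parseSetOption("nvidia-container-runtime.runtimes=crun:runc")
	if err != nil {
		panic(err)
	}
	fmt.Println(key, values)
}
```

Using `:` avoids clashing with a separator such as `,` that a command line parser may already consume when an option is given more than once.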
Evan Lezar
29c0f82ed2 Merge pull request #327 from elezar/add-driver-config
Add config search path option to driver root
2024-04-11 16:58:33 +02:00
Evan Lezar
09341a0934 Add support for feature flags
This change adds a features config that allows
individual features to be toggled at a global level. Each feature can (by default)
be controlled by an environment variable.

The GDS, MOFED, NVSWITCH, and GDRCOPY features are examples of such features.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-03 11:58:37 +02:00
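One way to picture the feature toggles from commit 09341a0934 is sketched below. The precedence (a global config value wins over the environment variable), the accepted values, and the `EXAMPLE_*` variable names are all assumptions for illustration; the commit message does not specify them.

```go
package main

import (
	"fmt"
	"strings"
)

// featureEnabled reports whether a feature is on. A global config value, when
// present, is assumed to take precedence; otherwise the feature's environment
// variable decides. Both the precedence and the accepted values are assumptions.
func featureEnabled(name string, config map[string]bool, envvars map[string]string, getenv func(string) string) bool {
	if enabled, ok := config[name]; ok {
		return enabled
	}
	envvar, ok := envvars[name]
	if !ok {
		return false
	}
	switch strings.ToLower(getenv(envvar)) {
	case "1", "true", "enabled":
		return true
	}
	return false
}

func main() {
	// Hypothetical feature-to-envvar mapping and environment.
	envvars := map[string]string{"gds": "EXAMPLE_GDS", "mofed": "EXAMPLE_MOFED"}
	env := map[string]string{"EXAMPLE_GDS": "enabled"}
	getenv := func(k string) string { return env[k] }
	config := map[string]bool{"mofed": true}

	fmt.Println(featureEnabled("gds", config, envvars, getenv))
	fmt.Println(featureEnabled("mofed", config, envvars, getenv))
}
```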
Evan Lezar
2a9e3537ec Add config search paths option to driver root.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-02 23:03:05 +02:00
rongfu.leng
a78a7f866f fix doc
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
2024-03-28 13:55:25 +08:00
Kevin Klues
296d4560b0 Add support for an NVIDIA_IMEX_CHANNELS envvar
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2024-02-26 20:09:43 +01:00
Evan Lezar
b6efd3091d Use index and uuid as default device-name-strategies
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-02-13 16:38:18 +01:00
Evan Lezar
52da12cf9a Allow multiple device name strategies to be specified
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-02-13 16:38:05 +01:00
Evan Lezar
f89cef307d Specify DRIVER_ROOT consistently
This change ensures that CLI tools that require the path to the
driver root accept both the NVIDIA_DRIVER_ROOT and DRIVER_ROOT
environment variables in addition to the --driver-root command
line argument.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-02-09 14:28:56 +01:00
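The lookup described in commit f89cef307d might look like the sketch below. The helper name, the order in which the sources are consulted, and the `/` fallback are assumptions; the commit only states that both environment variables are accepted alongside `--driver-root`.

```go
package main

import (
	"fmt"
	"os"
)

// resolveDriverRoot returns the driver root from, in order, the command line
// value, NVIDIA_DRIVER_ROOT, and DRIVER_ROOT. The precedence order and the
// "/" fallback are assumptions made for this sketch.
func resolveDriverRoot(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	for _, name := range []string{"NVIDIA_DRIVER_ROOT", "DRIVER_ROOT"} {
		if v := getenv(name); v != "" {
			return v
		}
	}
	return "/"
}

func main() {
	// An empty flag value falls through to the environment.
	fmt.Println(resolveDriverRoot("", os.Getenv))
}
```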
Evan Lezar
bab4ec30af Improve error reporting for cdi list
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-02-08 14:58:48 +01:00
Evan Lezar
b6ab444529 Add spec-dirs argument to cdi list
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-02-08 14:50:14 +01:00
Evan Lezar
862f071557 Fix bug in update-ldcache hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-01-29 15:20:21 +01:00
Evan Lezar
f936f4c0bc Add --cdi.enable as alias for --cdi.enabled
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-01-23 14:57:15 +01:00
Evan Lezar
ab598f004d Fix --cdi.enabled for Docker
Instead of relying only on Experimental mode, the docker daemon
config requires that CDI is an opt-in feature.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-01-23 14:56:08 +01:00
Jared Baur
d80657dd0a Explicitly set ldconfig cache and config file
Since the `update-ldcache` hook uses the host's `ldconfig`, the default
cache and config files configured on the host will be used. If those
defaults differ from what nvidia-ctk expects (/etc/ld.so.cache
and /etc/ld.so.conf, respectively), then the hook will fail. This change
makes the call to ldconfig explicit about which cache and config files are
being used.

Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
2024-01-18 02:23:27 -08:00
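Commit d80657dd0a amounts to passing the cache and config files on the ldconfig command line. A minimal sketch, assuming a hypothetical helper `newLdconfigCommand`; only the `-C` (cache file) and `-f` (config file) flags are standard ldconfig options, and the command is constructed here without being run.

```go
package main

import (
	"fmt"
	"os/exec"
)

// newLdconfigCommand builds an ldconfig invocation that names the cache (-C)
// and config (-f) files explicitly instead of relying on the defaults
// compiled into the host's ldconfig. The helper itself is hypothetical.
func newLdconfigCommand(ldconfigPath, cache, conf string) *exec.Cmd {
	return exec.Command(ldconfigPath, "-C", cache, "-f", conf)
}

func main() {
	cmd := newLdconfigCommand("/sbin/ldconfig", "/etc/ld.so.cache", "/etc/ld.so.conf")
	// Print the argument vector rather than executing the command.
	fmt.Println(cmd.Args)
}
```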
Jared Baur
838493b8b9 Allow for customizing the path to ldconfig
Since the `createContainer` `runc` hook runs with the environment that
the container's config.json specifies, the path to `ldconfig` may not be
easily resolvable if the host environment differs enough from the
container (e.g. on a NixOS host where all binaries are under hashed
paths in /nix/store with an Ubuntu container whose PATH contains
FHS-style paths such as /bin and /usr/bin). This change allows for
specifying exactly where ldconfig comes from.

Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
2024-01-17 21:07:00 -08:00
Christopher Desiniotis
83ad09b179 Refactor the engine.Interface such that the Set() API does not return an extraneous error
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-01 15:59:34 -08:00
Tariq Ibrahim
7627d48a5c run goimports -local against the entire codebase
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-12-01 11:13:17 +01:00
Evan Lezar
bc4e19aa48 Add --relative-to option to nvidia-ctk transform root
This change adds a --relative-to option to the nvidia-ctk transform root
command. This defaults to "host", maintaining the existing behaviour.

If --relative-to=container is specified, the root transform is applied to
container paths in the CDI specification instead of host paths.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-30 20:26:42 +01:00
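The effect of `--relative-to` in commit bc4e19aa48 can be illustrated with a simplified model. The `mount` type and both functions below are hypothetical stand-ins; the real transformer operates on full CDI specifications.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// mount is a simplified stand-in for a CDI mount entry.
type mount struct {
	HostPath      string
	ContainerPath string
}

// replaceRoot swaps the leading root `from` for `to` when p lies under `from`;
// paths outside `from` are returned unchanged.
func replaceRoot(p, from, to string) string {
	from = strings.TrimSuffix(from, "/")
	if p != from && !strings.HasPrefix(p, from+"/") {
		return p
	}
	return path.Join(to, strings.TrimPrefix(p, from))
}

// transformRoot applies the root replacement to host paths by default and to
// container paths when relativeTo is "container".
func transformRoot(m mount, from, to, relativeTo string) mount {
	if relativeTo == "container" {
		m.ContainerPath = replaceRoot(m.ContainerPath, from, to)
	} else {
		m.HostPath = replaceRoot(m.HostPath, from, to)
	}
	return m
}

func main() {
	m := mount{HostPath: "/driver/usr/lib/libcuda.so", ContainerPath: "/driver/usr/lib/libcuda.so"}
	fmt.Printf("%+v\n", transformRoot(m, "/driver", "/", "host"))
	fmt.Printf("%+v\n", transformRoot(m, "/driver", "/", "container"))
}
```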
Evan Lezar
879cc99aac Add transformer for container roots
This change renames the root transformer to indicate that it
operates on host paths and adds a container root transformer for
explicitly transforming container roots.

The transform.NewRootTransformer constructor still exists, but has
been marked as deprecated.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-30 20:26:42 +01:00
Evan Lezar
893b3c1824 Fix incorrect ldconfig path
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-24 11:03:51 +01:00
Evan Lezar
671d787a42 Switch to reflect package for config updates
This change switches to using the reflect package to determine
the type of config options instead of inferring the type from the
TOML data structure.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-23 10:29:38 +01:00
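The approach in commit 671d787a42 can be sketched with the standard `reflect` package: the Go type of the target field, rather than the parsed TOML value, decides how a string is converted. The `exampleConfig` struct, its field names, and `setOption` are hypothetical and only illustrate the technique.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// exampleConfig is a hypothetical config struct; the field names are illustrative.
type exampleConfig struct {
	Debug    bool     `toml:"debug"`
	Mode     string   `toml:"mode"`
	Runtimes []string `toml:"runtimes"`
}

// setOption finds the field tagged with key and converts value according to
// the field's Go type as reported by reflect. cfg must be a pointer to a struct.
func setOption(cfg any, key, value string) error {
	v := reflect.ValueOf(cfg).Elem()
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		if t.Field(i).Tag.Get("toml") != key {
			continue
		}
		field := v.Field(i)
		switch field.Kind() {
		case reflect.Bool:
			b, err := strconv.ParseBool(value)
			if err != nil {
				return fmt.Errorf("invalid boolean for %s: %w", key, err)
			}
			field.SetBool(b)
		case reflect.String:
			field.SetString(value)
		case reflect.Slice:
			if field.Type().Elem().Kind() != reflect.String {
				return fmt.Errorf("unsupported slice type for %s", key)
			}
			field.Set(reflect.ValueOf(strings.Split(value, ":")))
		default:
			return fmt.Errorf("unsupported type %v for %s", field.Kind(), key)
		}
		return nil
	}
	return fmt.Errorf("unknown option %q", key)
}

func main() {
	cfg := &exampleConfig{}
	for _, kv := range [][2]string{{"debug", "true"}, {"mode", "cdi"}, {"runtimes", "crun:runc"}} {
		if err := setOption(cfg, kv[0], kv[1]); err != nil {
			panic(err)
		}
	}
	fmt.Printf("%+v\n", *cfg)
}
```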
Christopher Desiniotis
64fb26b086 Add option to nvidia-ctk to enable CDI in docker
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-23 10:15:58 +01:00
Evan Lezar
fc8c5f82dc Merge branch 'fix-ldconfig-resolution' into 'main'
Resolve LDConfig path

See merge request nvidia/container-toolkit/container-toolkit!490
2023-11-21 16:45:21 +00:00
Evan Lezar
d792e64f38 Resolve ldconfig path in update-ldcache hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-21 15:31:12 +01:00
Evan Lezar
232df647c1 Resolve LDConfig path passed to nvidia-container-cli
Instead of relying solely on a static config, we resolve the path
to ldconfig. The path is checked for existence and a .real suffix is preferred.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-21 15:31:12 +01:00
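The resolution step in commit 232df647c1 (check for existence, prefer a `.real` suffix) could be sketched as below. The function names and the injected `exists` callback are assumptions chosen so the logic can be exercised without touching the filesystem.

```go
package main

import (
	"fmt"
	"os"
)

// resolveLdconfig prefers "<path>.real" when it exists and otherwise falls
// back to path itself, returning an error if neither is present.
func resolveLdconfig(path string, exists func(string) bool) (string, error) {
	for _, candidate := range []string{path + ".real", path} {
		if exists(candidate) {
			return candidate, nil
		}
	}
	return "", fmt.Errorf("ldconfig not found at %q", path)
}

// fileExists reports whether p names an existing non-directory entry.
func fileExists(p string) bool {
	info, err := os.Stat(p)
	return err == nil && !info.IsDir()
}

func main() {
	p, err := resolveLdconfig("/sbin/ldconfig", fileExists)
	fmt.Println(p, err)
}
```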
Evan Lezar
adc516fd59 Merge branch 'ctk-hook-chmod-improve-eperm-handling' into 'main'
nvidia-ctk hook chmod: Improve permission error handling

See merge request nvidia/container-toolkit/container-toolkit!496
2023-11-21 11:05:03 +00:00
Evan Lezar
00a712d018 Add --dev-root option to CDI spec generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-20 21:29:35 +01:00
Ievgen Popovych
9085cb7dd5 nvidia-ctk hook chmod: Move file mode parsing into flag validation function
Signed-off-by: Ievgen Popovych <jmennius@gmail.com>
2023-11-20 14:49:29 +02:00
Ievgen Popovych
eb35d9b30a nvidia-ctk hook chmod: Ignore permission errors
In some cases we might get a permission error when trying to chmod.
This is most likely due to something beyond our control,
such as the whole `/dev` being mounted.
Do not fail container creation in this case.

Because we lose control of the program after `exec()`-ing the `chmod(1)` program
and therefore cannot handle its errors,
refactor to use the `chmod(2)` syscall instead of `exec()`-ing the `chmod(1)` program.

Fixes: #143
Signed-off-by: Ievgen Popovych <jmennius@gmail.com>
2023-11-20 01:29:51 +02:00
Ievgen Popovych
f1d32f2cd3 nvidia-ctk hook chmod: Only chmod if desired permissions are different
This is to avoid any unnecessary potential errors (e.g. due to permissions).

Fixes: #143
Signed-off-by: Ievgen Popovych <jmennius@gmail.com>
2023-11-20 01:18:36 +02:00
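The two chmod changes above (eb35d9b30a and f1d32f2cd3) combine into a pattern that can be sketched as follows. The function names are hypothetical, and silently returning nil on a permission error is an assumption about the policy; a real hook may also log the skipped path.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// needsChmod reports whether the permission bits differ from the desired ones.
func needsChmod(current, desired fs.FileMode) bool {
	return current.Perm() != desired.Perm()
}

// chmodIfNeeded changes the mode only when required and treats a permission
// error from the underlying chmod(2) call as non-fatal.
func chmodIfNeeded(path string, desired fs.FileMode) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	if !needsChmod(info.Mode(), desired) {
		return nil // nothing to do, so no chance of a spurious error
	}
	err = os.Chmod(path, desired)
	if errors.Is(err, fs.ErrPermission) {
		return nil // assumed policy: skip rather than fail container creation
	}
	return err
}

func main() {
	f, err := os.CreateTemp("", "chmod-example")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	f.Close()

	fmt.Println(chmodIfNeeded(f.Name(), 0o644))
}
```

Calling `os.Chmod` in-process keeps the error value available to the caller, which is not possible once the process image has been replaced by an `exec()` of `chmod(1)`.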
Evan Lezar
6dc9ee3f33 Allow ldcache update in container to be skipped
This change skips the update of ld.cache in the container if it
doesn't exist. Instead, the -N flag is used to only create the
relevant symlinks.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-17 11:56:19 +01:00
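The behaviour in commit 6dc9ee3f33 can be approximated as below. The helper name and the assumed cache location (`etc/ld.so.cache` under the container root) are illustrative; `-r` (root directory) and `-N` (do not rebuild the cache, links are still updated) are standard ldconfig options.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ldcacheArgs returns ldconfig arguments for a container rooted at root.
// When the container has no ld.so.cache, -N is added so that the cache is
// not rebuilt while the library symlinks are still created.
func ldcacheArgs(root string, exists func(string) bool) []string {
	args := []string{"-r", root}
	if !exists(filepath.Join(root, "etc", "ld.so.cache")) {
		args = append(args, "-N")
	}
	return args
}

func main() {
	exists := func(p string) bool {
		_, err := os.Stat(p)
		return err == nil
	}
	// The container root below is a placeholder path.
	fmt.Println(ldcacheArgs("/path/to/container/rootfs", exists))
}
```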
Evan Lezar
c63fb35ba8 Use github.com/NVIDIA/go-nvlib imports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-15 21:38:26 +01:00
Evan Lezar
c25376afa0 Merge branch 'update-cdi' into 'main'
Use tags.cncf.io for CDI imports

See merge request nvidia/container-toolkit/container-toolkit!487
2023-11-02 09:14:30 +00:00
Evan Lezar
e56bb09889 Use tags.cncf.io for CDI imports
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-01 12:40:51 +01:00
Evan Lezar
61595aa0fa Add cdi.enabled option to runtime configure
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-31 17:23:55 +01:00
Evan Lezar
833254fa59 Support CDI devices as mounts
This change allows CDI devices to be requested as mounts in the
container. This enables their use in environments such as kind
where environment variables or annotations cannot be used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-27 21:24:53 +02:00
Evan Lezar
acc50969dc Fix ifElseChain lint errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
48d68e4eff Add nolint for exec calls
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
2f48ab99c3 Address singleCaseSwitch errors
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
73857eb8e3 Fix unnecessary conversion
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:11:34 +02:00
Evan Lezar
8a9f367067 Check returned error values
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
f2c9937ca8 Use cdi parser package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
12dc12ce09 Fix misspellings
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
2fad708556 Address ioutil deprecation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
73749285d5 Remove unused loadSaver interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-24 20:00:24 +02:00
Evan Lezar
ebff62f56b Update nvidia-container-runtime README
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-23 13:35:19 +02:00
Evan Lezar
f63ad3d9e7 Refactor symlink filter
This change refactors the use of the symlink filter to make it extendible.
A blocked filter can be set on the Tegra CSV discoverer to ensure that the correct
symlink libraries are filtered out. Here, globs can be used to select multiple libraries,
and a **/ prefix on the globs indicates that the pattern that follows is only applied to
the filename of the symlink entry in the CSV file.

A --csv.ignore-pattern command line argument is added to the nvidia-ctk cdi generate
command that allows this to be set.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-22 22:04:06 +02:00
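The glob semantics described in commit f63ad3d9e7 might be modelled as in the sketch below. It uses `filepath.Match` from the standard library; the function name `isBlocked` and the example paths are hypothetical, and the toolkit's own matcher may differ in detail.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isBlocked reports whether a symlink path matches any ignore pattern. A
// "**/" prefix restricts the rest of the pattern to the file name; otherwise
// the pattern is matched against the full path.
func isBlocked(symlink string, patterns []string) (bool, error) {
	for _, pattern := range patterns {
		target := symlink
		if strings.HasPrefix(pattern, "**/") {
			pattern = strings.TrimPrefix(pattern, "**/")
			target = filepath.Base(symlink)
		}
		matched, err := filepath.Match(pattern, target)
		if err != nil {
			return false, err
		}
		if matched {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	link := "/usr/lib/aarch64-linux-gnu/libcuda.so.1"
	fmt.Println(isBlocked(link, []string{"**/libcuda.so*"}))
	fmt.Println(isBlocked(link, []string{"libcuda.so*"}))
}
```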