Compare commits

..

32 Commits

Author SHA1 Message Date
Carlos Eduardo Arango Gutierrez
55a8d1e1e5 Load settings from config.toml file during CDI generation
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-06-26 13:14:27 +02:00
Carlos Eduardo Arango Gutierrez
1676931fe0 [no-relnote] Migrate to urfave v3
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-06-25 15:46:27 +02:00
Evan Lezar
ced79e51ed Merge pull request #1131 from elezar/bump-release-v1.18.0-rc.1
Bump release v1.18.0 rc.1
2025-06-25 12:04:28 +02:00
Evan Lezar
d1e25abd6c [no-relnote] Add v1.18.0-rc.1 changelog
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-25 12:02:19 +02:00
Evan Lezar
7d3defccd2 Bump version for v1.18.0-rc.1 release
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 19:59:15 +02:00
Evan Lezar
c95d36db52 Merge pull request #947 from elezar/ensure-libcuda.so-in-ldcache
Some checks are pending
CI Pipeline / code-scanning (push) Waiting to run
CI Pipeline / variables (push) Waiting to run
CI Pipeline / golang (push) Waiting to run
CI Pipeline / image (push) Blocked by required conditions
CI Pipeline / e2e-test (push) Blocked by required conditions
Ensure that libcuda.so is in the ldcache
2025-06-24 19:55:51 +02:00
Evan Lezar
36950ba03f Merge pull request #1100 from elezar/pin-libnvidia-container-tools-version
Require matching version of libnvidia-container-tools
2025-06-24 19:55:10 +02:00
Evan Lezar
d1286bceed Merge pull request #763 from elezar/allow-config-override-by-envvar
Add support for specifying the config file path in an environment variable
2025-06-24 19:54:36 +02:00
Evan Lezar
2f204147f9 Merge pull request #1030 from elezar/add-driver-lib-root-envvar
Add envvar for libcuda.so parent dir to CDI spec
2025-06-24 14:00:46 +02:00
Evan Lezar
39975fc77b [no-relnote] Refactor ldconfig hooks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:53:20 +02:00
Evan Lezar
4bf7421a80 Add create-soname-symlinks hook
This change adds a create-soname-symlinks hook that can be used to ensure
that the soname symlinks for injected libraries exist in a container.

This is done by calling ldconfig -n -N for the directories containing the injected
libraries.

This also ensures that libcuda.so is present in the ldcache when the update-ldcache
hook is run.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:49:24 +02:00
Evan Lezar
f0ea60a28f Require matching version of libnvidia-container-tools
Since the nvidia-container-tools and libnvidia-container* packages
are now released at the same time with the same version, a more
restrictive version makes sense. Here we specifically require an
equal version for the nvidia-container-toolkit* packages and the
libnvidia-container* packages.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:26:31 +02:00
Evan Lezar
4bab94baa6 Add envvar for libcuda.so parent dir to CDI spec
This change adds an NVIDIA_CTK_LIBCUDA_DIR envvar to
a generated CDI specification. This reports where the `libcuda.so.*`
libraries will be injected into the container.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:24:50 +02:00
Evan Lezar
f642825ad4 Add EnvVar to Discover interface
This change adds environment variables to the Discover interface.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-24 13:24:50 +02:00
Evan Lezar
5bc2f50299 Merge pull request #1154 from elezar/switch-to-distroless
Switch to distroless Base image
2025-06-24 11:05:35 +02:00
Evan Lezar
60706815a5 Create /work/nvidia-toolkit symlink
This change ensures that a symlink from /work/nvidia-toolkit to
/work/nvidia-ctk-installer exists to allow GPU Operator versions
that override the entrypoint and assume nvidia-toolkit as the
original entrypoint.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-19 10:23:09 +02:00
Evan Lezar
69b0f0ba61 [no-relnote] Update release scripts for distroless
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-19 10:20:59 +02:00
Evan Lezar
7abf5fa6a4 Use Apache license for images
This change removes the NGC-DL-CONTAINER-LICENSE (since this
is not available in the distroless images) and includes the
repo's Apache LICENSE file in the image.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-19 10:20:59 +02:00
Evan Lezar
0dddd5cfd8 Merge pull request #910 from elezar/default-to-cdi
Some checks failed
CI Pipeline / code-scanning (push) Has been cancelled
CI Pipeline / variables (push) Has been cancelled
CI Pipeline / golang (push) Has been cancelled
CI Pipeline / image (push) Has been cancelled
CI Pipeline / e2e-test (push) Has been cancelled
Use just-in-time CDI spec generation by default in the NVIDIA Container Runtime
2025-06-18 23:40:44 +02:00
Evan Lezar
28ddc1454c Switch to golang distroless image
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:39:26 +02:00
Evan Lezar
17c5d1dc87 Resolve to legacy by default in nvidia-container-runtime-hook
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:36:12 +02:00
Evan Lezar
6149592bf6 Default to jit-cdi mode in the nvidia runtime
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:36:12 +02:00
Evan Lezar
d3ece78bc9 [no-relnote] Add RuntimeMode type
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:36:12 +02:00
Evan Lezar
980ca5d1bc Use functional options to construct runtime mode resolver
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 23:36:12 +02:00
Evan Lezar
76b71a5498 Merge pull request #1083 from elezar/bump-image-in-tests
[no-relnote] Use cuda 12.9.0 image in tests
2025-06-18 23:33:54 +02:00
Evan Lezar
5a1b4e7c1e [no-relnote] Fix tests for compat mode
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:22:08 +02:00
Evan Lezar
39fd15d273 [no-relnote] Use 550 driver in tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:18:21 +02:00
Evan Lezar
da6b849cf6 [no-relnote] Use cuda 12.9.0 image in tests
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:18:21 +02:00
Evan Lezar
849691d290 [no-relnote] Use NVIDIA_CTK_CONFIG_FILE_PATH in toolkit install
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:06:00 +02:00
Evan Lezar
f672d38aa5 [no-relnote] Rename config constants
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:02:35 +02:00
Evan Lezar
eb39b972a5 Add NVIDIA_CTK_CONFIG_FILE_PATH envvar
This change adds support for explicitly specifying the path
to the config file through an environment variable.
The NVIDIA_CTK_CONFIG_FILE_PATH variable is used for both the nvidia-container-runtime
and nvidia-container-runtime-hook.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 15:01:46 +02:00
Evan Lezar
d9c7ec9714 [no-relnote] Don't refer to target image distribution
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-18 14:32:54 +02:00
219 changed files with 8508 additions and 20322 deletions

View File

@@ -79,8 +79,8 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
dist:
- ubi9
target:
- application
- packaging
needs: packages
steps:
@@ -117,4 +117,4 @@ jobs:
BUILD_MULTI_ARCH_IMAGES: ${{ inputs.build_multi_arch_images }}
run: |
echo "${VERSION}"
make -f deployments/container/Makefile build-${{ matrix.dist }}
make -f deployments/container/Makefile build-${{ matrix.target }}

View File

@@ -1,34 +1,139 @@
# NVIDIA Container Toolkit Changelog
## v1.17.4
- Disable mounting of compat libs from container by default
## v1.18.0-rc.1
- Add create-soname-symlinks hook
- Require matching version of libnvidia-container-tools
- Add envvar for libcuda.so parent dir to CDI spec
- Add EnvVar to Discover interface
- Resolve to legacy by default in nvidia-container-runtime-hook
- Default to jit-cdi mode in the nvidia runtime
- Use functional options to construct runtime mode resolver
- Add NVIDIA_CTK_CONFIG_FILE_PATH envvar
- Switch to cuda ubi9 base image
- Use single version tag for image
- BUGFIX: modifier: respect GPU volume-mount device requests
- Ensure consistent sorting of annotation devices
- Extract deb and rpm packages to single image
- Remove docker-run as default runtime candidate
- Return annotation devices from VisibleDevices
- Make CDI device requests consistent with other methods
- Construct container info once
- Add logic to extract annotation device requests to image type
- Add IsPrivileged function to CUDA container type
- Add device IDs to nvcdi.GetSpec API
- Refactor extracting requested devices from the container image
- Add EnvVars option for all nvidia-ctk cdi commands
- Add nvidia-cdi-refresh service
- Add discovery of arch-specific vulkan ICD
- Add disabled-device-node-modification hook to CDI spec
- Add a hook to disable device node creation in a container
- Remove redundant deduplication of search paths for WSL
- Added ability to disable specific (or all) CDI hooks
- Consolidate HookName functionality on internal/discover pkg
- Add envvar to control debug logging in CDI hooks
- Add FeatureFlags to the nvcdi API
- Reenable nvsandboxutils for driver discovery
- Edit discover.mounts to have a deterministic output
- Refactor the way we create CDI Hooks
- Issue warning on unsupported CDI hook
- Run update-ldcache in isolated namespaces
- Add cuda-compat-mode config option
- Fix mode detection on Thor-based systems
- Add rprivate to CDI mount options
- Skip nil discoverers in merge
- bump runc go dep to v1.3.0
- Fix resolution of libs in LDCache on ARM
- Updated .release:staging to stage images in nvstaging
- Refactor toolkit installer
- Allow container runtime executable path to be specified
- Add support for building ubuntu22.04 on arm64
- Fix race condition in mounts cache
- Add support for building ubuntu22.04 on amd64
- Fix update-ldcache arguments
- Remove positional arguments from nvidia-ctk-installer
- Remove deprecated --runtime-args from nvidia-ctk-installer
- Add version info to nvidia-ctk-installer
- Update nvidia-ctk-installer app name to match binary name
- Allow nvidia-ctk config --set to accept comma-separated lists
- Disable enable-cuda-compat hook for management containers
- Allow enable-cuda-compat hook to be disabled in CDI spec generation
- Add disable-cuda-compat-lib-hook feature flag
- Add basic integration tests for forward compat
- Ensure that mode hook is executed last
- Add enable-cuda-compat hook to CDI spec generation
- Add ldconfig hook in legacy mode
- Add enable-cuda-compat hook if required
- Add enable-cuda-compat hook to allow compat libs to be discovered
- Use libcontainer execseal to run ldconfig
- Add ignore-imex-channel-requests feature flag
- Disable nvsandboxutils in nvcdi API
- Allow cdi mode to work with --gpus flag
- Add E2E GitHub Action for Container Toolkit
- Add remote-test option for E2E
- Enable CDI in runtime if CDI_ENABLED is set
- Fix overwriting docker feature flags
- Add option in toolkit container to enable CDI in runtime
- Remove Set from engine config API
- Add EnableCDI() method to engine.Interface
- Add IMEX binaries to CDI discovery
- Rename test folder to tests
- Add allow-cuda-compat-libs-from-container feature flag
- Disable mounting of compat libs from container
- Skip graphics modifier in CSV mode
- Properly pass configSearchPaths to a Driver constructor
- Move nvidia-toolkit to nvidia-ctk-installer
- Automated regression testing for the NVIDIA Container Toolkit
- Add support for containerd version 3 config
- Remove watch option from create-dev-char-symlinks
- Add string TOML source
- Improve the implementation for UseLegacyConfig
- Properly pass configSearchPaths to a Driver constructor
- Fix create-device-node test when devices exist
- Add imex mode to CDI spec generation
- Only allow host-relative LDConfig paths
- Fix NVIDIA_IMEX_CHANNELS handling on legacy images
- Fix bug in default config file path
- Fix fsnotify.Remove logic function.
- Force symlink creation in create-symlink hook
### Changes in the Toolkit Container
- Create /work/nvidia-toolkit symlink
- Use Apache license for images
- Switch to golang distroless image
- Switch to cuda ubi9 base image
- Use single version tag for image
- Extract deb and rpm packages to single image
- Bump nvidia/cuda in /deployments/container
- Bump nvidia/cuda in /deployments/container
- Add E2E GitHub Action for Container Toolkit
- Bump nvidia/cuda in /deployments/container
- Move nvidia-toolkit to nvidia-ctk-installer
- Add support for containerd version 3 config
- Improve the implementation for UseLegacyConfig
- Bump nvidia/cuda in /deployments/container
- Add imex mode to CDI spec generation
- Only allow host-relative LDConfig paths
- Fallback to file for runtime config
### Changes in libnvidia-container
- Fix pointer accessing local variable out of scope
- Require version match between libnvidia-container-tools and libnvidia-container1
- Add libnvidia-gpucomp.so to the list of compute libs
- Use VERSION_ prefix for version parts in makefiles
- Add additional logging
- Do not discard container flags when --cuda-compat-mode is not specified
- Remove unneeded --no-cntlibs argument from list command
- Add cuda-compat-mode flag to configure command
- Skip files when user has insufficient permissions
- Fix building with Go 1.24
- Add no-cntlibs CLI option to nvidia-container-cli
### Changes in the Toolkit Container
- Bump CUDA base image version to 12.6.3
## v1.17.3
- Only allow host-relative LDConfig paths by default.
### Changes in libnvidia-container
- Fix always using fallback
- Add fallback for systems without memfd_create()
- Create virtual copy of host ldconfig binary before calling fexecve()
- Fix some typos in text.
## v1.17.2
- Fixed a bug where legacy images would set imex channels as `all`.
## v1.17.1
- Fixed a bug where specific symlinks existing in a container image could cause a container to fail to start.
- Fixed a bug on Tegra-based systems where a container would fail to start.
- Fixed a bug where the default container runtime config path was not properly set.
### Changes in the Toolkit Container
- Fallback to using a config file if the current runtime config can not be determined from the command line.
## v1.17.0
- Promote v1.17.0-rc.2 to v1.17.0

View File

@@ -17,6 +17,7 @@
package chmod
import (
"context"
"errors"
"fmt"
"io/fs"
@@ -25,7 +26,7 @@ import (
"strconv"
"strings"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
@@ -36,7 +37,7 @@ type command struct {
}
type config struct {
paths cli.StringSlice
paths []string
modeStr string
mode fs.FileMode
containerSpec string
@@ -58,11 +59,11 @@ func (m command) build() *cli.Command {
c := cli.Command{
Name: "chmod",
Usage: "Set the permissions of folders in the container by running chmod. The container root is prefixed to the specified paths.",
Before: func(c *cli.Context) error {
return validateFlags(c, &cfg)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, validateFlags(cmd, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(cmd, &cfg)
},
}
@@ -87,7 +88,7 @@ func (m command) build() *cli.Command {
return &c
}
func validateFlags(c *cli.Context, cfg *config) error {
func validateFlags(c *cli.Command, cfg *config) error {
if strings.TrimSpace(cfg.modeStr) == "" {
return fmt.Errorf("a non-empty mode must be specified")
}
@@ -98,7 +99,7 @@ func validateFlags(c *cli.Context, cfg *config) error {
}
cfg.mode = fs.FileMode(modeInt)
for _, p := range cfg.paths.Value() {
for _, p := range cfg.paths {
if strings.TrimSpace(p) == "" {
return fmt.Errorf("paths must not be empty")
}
@@ -107,7 +108,7 @@ func validateFlags(c *cli.Context, cfg *config) error {
return nil
}
func (m command) run(c *cli.Context, cfg *config) error {
func (m command) run(c *cli.Command, cfg *config) error {
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %v", err)
@@ -121,7 +122,7 @@ func (m command) run(c *cli.Context, cfg *config) error {
return fmt.Errorf("empty container root detected")
}
paths := m.getPaths(containerRoot, cfg.paths.Value(), cfg.mode)
paths := m.getPaths(containerRoot, cfg.paths, cfg.mode)
if len(paths) == 0 {
m.logger.Debugf("No paths specified; exiting")
return nil

View File

@@ -17,9 +17,10 @@
package commands
import (
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/chmod"
createsonamesymlinks "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/create-soname-symlinks"
symlinks "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/create-symlinks"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/cudacompat"
disabledevicenodemodification "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/disable-device-node-modification"
@@ -35,6 +36,7 @@ func New(logger logger.Interface) []*cli.Command {
symlinks.NewCommand(logger),
chmod.NewCommand(logger),
cudacompat.NewCommand(logger),
createsonamesymlinks.NewCommand(logger),
disabledevicenodemodification.NewCommand(logger),
}
}
@@ -43,7 +45,7 @@ func New(logger logger.Interface) []*cli.Command {
// hook has been specified.
// This happens if a subcommand is provided that does not match one of the
// subcommands that has been explicitly specified.
func IssueUnsupportedHookWarning(logger logger.Interface, c *cli.Context) {
func IssueUnsupportedHookWarning(logger logger.Interface, c *cli.Command) {
args := c.Args().Slice()
if len(args) == 0 {
logger.Warningf("No CDI hook specified")

View File

@@ -0,0 +1,167 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package create_soname_symlinks
import (
"context"
"errors"
"fmt"
"log"
"os"
"github.com/moby/sys/reexec"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/ldconfig"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
)
const (
reexecUpdateLdCacheCommandName = "reexec-create-soname-symlinks"
)
type command struct {
logger logger.Interface
}
type options struct {
folders []string
ldconfigPath string
containerSpec string
}
func init() {
reexec.Register(reexecUpdateLdCacheCommandName, createSonameSymlinksHandler)
if reexec.Init() {
os.Exit(0)
}
}
// NewCommand constructs an create-soname-symlinks command with the specified logger
func NewCommand(logger logger.Interface) *cli.Command {
c := command{
logger: logger,
}
return c.build()
}
// build the create-soname-symlinks command
func (m command) build() *cli.Command {
cfg := options{}
// Create the 'create-soname-symlinks' command
c := cli.Command{
Name: "create-soname-symlinks",
Usage: "Create soname symlinks libraries in specified directories",
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, m.validateFlags(cmd, &cfg)
},
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(cmd, &cfg)
},
}
c.Flags = []cli.Flag{
&cli.StringSliceFlag{
Name: "folder",
Usage: "Specify a directory to generate soname symlinks in. Can be specified multiple times",
Destination: &cfg.folders,
},
&cli.StringFlag{
Name: "ldconfig-path",
Usage: "Specify the path to ldconfig on the host",
Destination: &cfg.ldconfigPath,
Value: "/sbin/ldconfig",
},
&cli.StringFlag{
Name: "container-spec",
Usage: "Specify the path to the OCI container spec. If empty or '-' the spec will be read from STDIN",
Destination: &cfg.containerSpec,
},
}
return &c
}
func (m command) validateFlags(c *cli.Command, cfg *options) error {
if cfg.ldconfigPath == "" {
return errors.New("ldconfig-path must be specified")
}
return nil
}
func (m command) run(c *cli.Command, cfg *options) error {
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %v", err)
}
containerRootDir, err := s.GetContainerRoot()
if err != nil || containerRootDir == "" || containerRootDir == "/" {
return fmt.Errorf("failed to determined container root: %v", err)
}
cmd, err := ldconfig.NewRunner(
reexecUpdateLdCacheCommandName,
cfg.ldconfigPath,
containerRootDir,
cfg.folders...,
)
if err != nil {
return err
}
return cmd.Run()
}
// createSonameSymlinksHandler wraps createSonameSymlinks with error handling.
func createSonameSymlinksHandler() {
if err := createSonameSymlinks(os.Args); err != nil {
log.Printf("Error updating ldcache: %v", err)
os.Exit(1)
}
}
// createSonameSymlinks ensures that soname symlinks are created in the
// specified directories.
// It is invoked from a reexec'd handler and provides namespace isolation for
// the operations performed by this hook. At the point where this is invoked,
// we are in a new mount namespace that is cloned from the parent.
//
// args[0] is the reexec initializer function name
// args[1] is the path of the ldconfig binary on the host
// args[2] is the container root directory
// The remaining args are directories where soname symlinks need to be created.
func createSonameSymlinks(args []string) error {
if len(args) < 3 {
return fmt.Errorf("incorrect arguments: %v", args)
}
hostLdconfigPath := args[1]
containerRootDirPath := args[2]
ldconfig, err := ldconfig.New(
hostLdconfigPath,
containerRootDirPath,
)
if err != nil {
return fmt.Errorf("failed to construct ldconfig runner: %w", err)
}
return ldconfig.CreateSonameSymlinks(args[3:]...)
}

View File

@@ -17,6 +17,7 @@
package symlinks
import (
"context"
"errors"
"fmt"
"os"
@@ -24,7 +25,7 @@ import (
"strings"
"github.com/moby/sys/symlink"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup/symlinks"
@@ -36,7 +37,7 @@ type command struct {
}
type config struct {
links cli.StringSlice
links []string
containerSpec string
}
@@ -55,8 +56,8 @@ func (m command) build() *cli.Command {
c := cli.Command{
Name: "create-symlinks",
Usage: "A hook to create symlinks in the container.",
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(ctx, cmd, &cfg)
},
}
@@ -78,7 +79,7 @@ func (m command) build() *cli.Command {
return &c
}
func (m command) run(c *cli.Context, cfg *config) error {
func (m command) run(ctx context.Context, cmd *cli.Command, cfg *config) error {
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %v", err)
@@ -90,7 +91,7 @@ func (m command) run(c *cli.Context, cfg *config) error {
}
created := make(map[string]bool)
for _, l := range cfg.links.Value() {
for _, l := range cfg.links {
if created[l] {
m.logger.Debugf("Link %v already processed", l)
continue

View File

@@ -17,13 +17,14 @@
package cudacompat
import (
"context"
"fmt"
"os"
"path/filepath"
"strconv"
"strings"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
@@ -63,11 +64,11 @@ func (m command) build() *cli.Command {
c := cli.Command{
Name: "enable-cuda-compat",
Usage: "This hook ensures that the folder containing the CUDA compat libraries is added to the ldconfig search path if required.",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &cfg)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, m.validateFlags(cmd, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(ctx, cmd, &cfg)
},
}
@@ -89,11 +90,11 @@ func (m command) build() *cli.Command {
return &c
}
func (m command) validateFlags(_ *cli.Context, cfg *options) error {
func (m command) validateFlags(cmd *cli.Command, cfg *options) error {
return nil
}
func (m command) run(_ *cli.Context, cfg *options) error {
func (m command) run(ctx context.Context, cmd *cli.Command, cfg *options) error {
if cfg.hostDriverVersion == "" {
return nil
}

View File

@@ -20,13 +20,14 @@ package disabledevicenodemodification
import (
"bufio"
"bytes"
"context"
"errors"
"fmt"
"io"
"os"
"strings"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
@@ -47,11 +48,11 @@ func NewCommand(logger logger.Interface) *cli.Command {
c := cli.Command{
Name: "disable-device-node-modification",
Usage: "Ensure that the /proc/driver/nvidia/params file present in the container does not allow device node modifications.",
Before: func(c *cli.Context) error {
return validateFlags(c, &cfg)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, validateFlags(cmd, &cfg)
},
Action: func(c *cli.Context) error {
return run(c, &cfg)
Action: func(ctx context.Context, cmd *cli.Command) error {
return run(ctx, cmd, &cfg)
},
}
@@ -67,11 +68,11 @@ func NewCommand(logger logger.Interface) *cli.Command {
return &c
}
func validateFlags(c *cli.Context, cfg *options) error {
func validateFlags(c *cli.Command, cfg *options) error {
return nil
}
func run(_ *cli.Context, cfg *options) error {
func run(ctx context.Context, cmd *cli.Command, cfg *options) error {
modifiedParamsFileContents, err := getModifiedNVIDIAParamsContents()
if err != nil {
return fmt.Errorf("failed to get modified params file contents: %w", err)

View File

@@ -17,13 +17,14 @@
package main
import (
"context"
"os"
"github.com/sirupsen/logrus"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info"
cli "github.com/urfave/cli/v2"
cli "github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/commands"
)
@@ -44,10 +45,10 @@ func main() {
opts := options{}
// Create the top-level CLI
c := cli.NewApp()
c := cli.Command{}
c.Name = "NVIDIA CDI Hook"
c.UseShortOptionHandling = true
c.EnableBashCompletion = true
c.EnableShellCompletion = true
c.Usage = "Command to structure files for usage inside a container, called as hooks from a container runtime, defined in a CDI yaml file"
c.Version = info.GetVersionString()
@@ -58,8 +59,8 @@ func main() {
// referring to a new hook that is not yet supported by an older NVIDIA
// Container Toolkit version or a hook that has been removed in newer
// version.
c.Action = func(ctx *cli.Context) error {
commands.IssueUnsupportedHookWarning(logger, ctx)
c.Action = func(ctx context.Context, cmd *cli.Command) error {
commands.IssueUnsupportedHookWarning(logger, cmd)
return nil
}
@@ -71,19 +72,19 @@ func main() {
Usage: "Enable debug-level logging",
Destination: &opts.Debug,
// TODO: Support for NVIDIA_CDI_DEBUG is deprecated and NVIDIA_CTK_DEBUG should be used instead.
EnvVars: []string{"NVIDIA_CTK_DEBUG", "NVIDIA_CDI_DEBUG"},
Sources: cli.EnvVars("NVIDIA_CTK_DEBUG", "NVIDIA_CDI_DEBUG"),
},
&cli.BoolFlag{
Name: "quiet",
Usage: "Suppress all output except for errors; overrides --debug",
Destination: &opts.Quiet,
// TODO: Support for NVIDIA_CDI_QUIET is deprecated and NVIDIA_CTK_QUIET should be used instead.
EnvVars: []string{"NVDIA_CTK_QUIET", "NVIDIA_CDI_QUIET"},
Sources: cli.EnvVars("NVIDIA_CTK_QUIET", "NVIDIA_CDI_QUIET"),
},
}
// Set log-level for all subcommands
c.Before = func(c *cli.Context) error {
c.Before = func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
logLevel := logrus.InfoLevel
if opts.Debug {
logLevel = logrus.DebugLevel
@@ -92,14 +93,14 @@ func main() {
logLevel = logrus.ErrorLevel
}
logger.SetLevel(logLevel)
return nil
return ctx, nil
}
// Define the subcommands
c.Commands = commands.New(logger)
// Run the CLI
err := c.Run(os.Args)
err := c.Run(context.Background(), os.Args)
if err != nil {
logger.Errorf("%v", err)
os.Exit(1)

View File

@@ -1,46 +0,0 @@
/**
# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package ldcache
import (
"os"
"path/filepath"
"github.com/moby/sys/symlink"
)
// A containerRoot represents the root filesystem of a container.
type containerRoot string
// hasPath checks whether the specified path exists in the root.
func (r containerRoot) hasPath(path string) bool {
resolved, err := r.resolve(path)
if err != nil {
return false
}
if _, err := os.Stat(resolved); err != nil && os.IsNotExist(err) {
return false
}
return true
}
// resolve returns the absolute path including root path.
// Symlinks are resolved, but are guaranteed to resolve in the root.
func (r containerRoot) resolve(path string) (string, error) {
absolute := filepath.Clean(filepath.Join(string(r), path))
return symlink.FollowSymlinkInScope(absolute, string(r))
}

View File

@@ -17,28 +17,21 @@
package ldcache
import (
"context"
"errors"
"fmt"
"log"
"os"
"strings"
"github.com/moby/sys/reexec"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
"github.com/NVIDIA/nvidia-container-toolkit/internal/ldconfig"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
)
const (
// ldsoconfdFilenamePattern specifies the pattern for the filename
// in ld.so.conf.d that includes references to the specified directories.
// The 00-nvcr prefix is chosen to ensure that these libraries have a
// higher precedence than other libraries on the system, but lower than
// the 00-cuda-compat that is included in some containers.
ldsoconfdFilenamePattern = "00-nvcr-*.conf"
reexecUpdateLdCacheCommandName = "reexec-update-ldcache"
)
@@ -47,7 +40,7 @@ type command struct {
}
type options struct {
folders cli.StringSlice
folders []string
ldconfigPath string
containerSpec string
}
@@ -75,11 +68,11 @@ func (m command) build() *cli.Command {
c := cli.Command{
Name: "update-ldcache",
Usage: "Update ldcache in a container by running ldconfig",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &cfg)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, m.validateFlags(cmd, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(ctx, cmd, &cfg)
},
}
@@ -105,14 +98,14 @@ func (m command) build() *cli.Command {
return &c
}
func (m command) validateFlags(c *cli.Context, cfg *options) error {
func (m command) validateFlags(cmd *cli.Command, cfg *options) error {
if cfg.ldconfigPath == "" {
return errors.New("ldconfig-path must be specified")
}
return nil
}
func (m command) run(c *cli.Context, cfg *options) error {
func (m command) run(ctx context.Context, cmd *cli.Command, cfg *options) error {
s, err := oci.LoadContainerState(cfg.containerSpec)
if err != nil {
return fmt.Errorf("failed to load container state: %v", err)
@@ -123,16 +116,16 @@ func (m command) run(c *cli.Context, cfg *options) error {
return fmt.Errorf("failed to determined container root: %v", err)
}
args := []string{
runner, err := ldconfig.NewRunner(
reexecUpdateLdCacheCommandName,
strings.TrimPrefix(config.NormalizeLDConfigPath("@"+cfg.ldconfigPath), "@"),
cfg.ldconfigPath,
containerRootDir,
cfg.folders...,
)
if err != nil {
return err
}
args = append(args, cfg.folders.Value()...)
cmd := createReexecCommand(args)
return cmd.Run()
return runner.Run()
}
// updateLdCacheHandler wraps updateLdCache with error handling.
@@ -143,15 +136,16 @@ func updateLdCacheHandler() {
}
}
// updateLdCache is invoked from a reexec'd handler and provides namespace
// isolation for the operations performed by this hook.
// At the point where this is invoked, we are in a new mount namespace that is
// cloned from the parent.
// updateLdCache ensures that the ldcache in the container is updated to include
// libraries that are mounted from the host.
// It is invoked from a reexec'd handler and provides namespace isolation for
// the operations performed by this hook. At the point where this is invoked,
// we are in a new mount namespace that is cloned from the parent.
//
// args[0] is the reexec initializer function name
// args[1] is the path of the ldconfig binary on the host
// args[2] is the container root directory
// The remaining args are folders that need to be added to the ldcache.
// The remaining args are folders where soname symlinks need to be created.
func updateLdCache(args []string) error {
if len(args) < 3 {
return fmt.Errorf("incorrect arguments: %v", args)
@@ -159,97 +153,13 @@ func updateLdCache(args []string) error {
hostLdconfigPath := args[1]
containerRootDirPath := args[2]
// To prevent leaking the parent proc filesystem, we create a new proc mount
// in the container root.
if err := mountProc(containerRootDirPath); err != nil {
return fmt.Errorf("error mounting /proc: %w", err)
}
// We mount the host ldconfig before we pivot root since host paths are not
// visible after the pivot root operation.
ldconfigPath, err := mountLdConfig(hostLdconfigPath, containerRootDirPath)
ldconfig, err := ldconfig.New(
hostLdconfigPath,
containerRootDirPath,
)
if err != nil {
return fmt.Errorf("error mounting host ldconfig: %w", err)
return fmt.Errorf("failed to construct ldconfig runner: %w", err)
}
// We pivot to the container root for the new process, this further limits
// access to the host.
if err := pivotRoot(containerRootDirPath); err != nil {
return fmt.Errorf("error running pivot_root: %w", err)
}
return runLdconfig(ldconfigPath, args[3:]...)
}
// runLdconfig runs the ldconfig binary and ensures that the specified directories
// are processed for the ldcache.
func runLdconfig(ldconfigPath string, directories ...string) error {
args := []string{
"ldconfig",
// Explicitly specify using /etc/ld.so.conf since the host's ldconfig may
// be configured to use a different config file by default.
// Note that since we apply the `-r {{ .containerRootDir }}` argument, /etc/ld.so.conf is
// in the container.
"-f", "/etc/ld.so.conf",
}
containerRoot := containerRoot("/")
if containerRoot.hasPath("/etc/ld.so.cache") {
args = append(args, "-C", "/etc/ld.so.cache")
} else {
args = append(args, "-N")
}
if containerRoot.hasPath("/etc/ld.so.conf.d") {
err := createLdsoconfdFile(ldsoconfdFilenamePattern, directories...)
if err != nil {
return fmt.Errorf("failed to update ld.so.conf.d: %w", err)
}
} else {
args = append(args, directories...)
}
return SafeExec(ldconfigPath, args, nil)
}
// createLdsoconfdFile creates a file at /etc/ld.so.conf.d/.
// The file is created at /etc/ld.so.conf.d/{{ .pattern }} using `CreateTemp` and
// contains the specified directories on each line.
func createLdsoconfdFile(pattern string, dirs ...string) error {
if len(dirs) == 0 {
return nil
}
ldsoconfdDir := "/etc/ld.so.conf.d"
if err := os.MkdirAll(ldsoconfdDir, 0755); err != nil {
return fmt.Errorf("failed to create ld.so.conf.d: %w", err)
}
configFile, err := os.CreateTemp(ldsoconfdDir, pattern)
if err != nil {
return fmt.Errorf("failed to create config file: %w", err)
}
defer func() {
_ = configFile.Close()
}()
added := make(map[string]bool)
for _, dir := range dirs {
if added[dir] {
continue
}
_, err = fmt.Fprintf(configFile, "%s\n", dir)
if err != nil {
return fmt.Errorf("failed to update config file: %w", err)
}
added[dir] = true
}
// The created file needs to be world readable for the cases where the container is run as a non-root user.
if err := configFile.Chmod(0644); err != nil {
return fmt.Errorf("failed to chmod config file: %w", err)
}
return nil
return ldconfig.UpdateLDCache(args[3:]...)
}

View File

@@ -242,7 +242,14 @@ func (hookConfig *hookConfig) getNvidiaConfig(image image.CUDA, privileged bool)
}
}
func (hookConfig *hookConfig) getContainerConfig() (config containerConfig) {
func (hookConfig *hookConfig) getContainerConfig() (config *containerConfig) {
hookConfig.Lock()
defer hookConfig.Unlock()
if hookConfig.containerConfig != nil {
return hookConfig.containerConfig
}
var h HookState
d := json.NewDecoder(os.Stdin)
if err := d.Decode(&h); err != nil {
@@ -271,10 +278,13 @@ func (hookConfig *hookConfig) getContainerConfig() (config containerConfig) {
log.Panicln(err)
}
return containerConfig{
cc := containerConfig{
Pid: h.Pid,
Rootfs: s.Root.Path,
Image: i,
Nvidia: hookConfig.getNvidiaConfig(i, privileged),
}
hookConfig.containerConfig = &cc
return hookConfig.containerConfig
}

View File

@@ -487,7 +487,7 @@ func TestGetNvidiaConfig(t *testing.T) {
hookCfg := tc.hookConfig
if hookCfg == nil {
defaultConfig, _ := config.GetDefault()
hookCfg = &hookConfig{defaultConfig}
hookCfg = &hookConfig{Config: defaultConfig}
}
cfg = hookCfg.getNvidiaConfig(image, tc.privileged)
}

View File

@@ -4,50 +4,46 @@ import (
"fmt"
"log"
"os"
"path"
"reflect"
"strings"
"sync"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
)
const (
configPath = "/etc/nvidia-container-runtime/config.toml"
driverPath = "/run/nvidia/driver"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info"
)
// hookConfig wraps the toolkit config.
// This allows for functions to be defined on the local type.
type hookConfig struct {
sync.Mutex
*config.Config
containerConfig *containerConfig
}
// loadConfig loads the required paths for the hook config.
func loadConfig() (*config.Config, error) {
var configPaths []string
var required bool
if len(*configflag) != 0 {
configPaths = append(configPaths, *configflag)
required = true
} else {
configPaths = append(configPaths, path.Join(driverPath, configPath), configPath)
configFilePath, required := getConfigFilePath()
cfg, err := config.New(
config.WithConfigFile(configFilePath),
config.WithRequired(true),
)
if err == nil {
return cfg.Config()
} else if os.IsNotExist(err) && !required {
return config.GetDefault()
}
return nil, fmt.Errorf("couldn't open required configuration file: %v", err)
}
for _, p := range configPaths {
cfg, err := config.New(
config.WithConfigFile(p),
config.WithRequired(true),
)
if err == nil {
return cfg.Config()
} else if os.IsNotExist(err) && !required {
continue
}
return nil, fmt.Errorf("couldn't open required configuration file: %v", err)
func getConfigFilePath() (string, bool) {
if configFromFlag := *configflag; configFromFlag != "" {
return configFromFlag, true
}
return config.GetDefault()
if configFromEnvvar := os.Getenv(config.FilePathOverrideEnvVar); configFromEnvvar != "" {
return configFromEnvvar, true
}
return config.GetConfigFilePath(), false
}
func getHookConfig() (*hookConfig, error) {
@@ -55,7 +51,7 @@ func getHookConfig() (*hookConfig, error) {
if err != nil {
return nil, fmt.Errorf("failed to load config: %v", err)
}
config := &hookConfig{cfg}
config := &hookConfig{Config: cfg}
allSupportedDriverCapabilities := image.SupportedDriverCapabilities
if config.SupportedDriverCapabilities == "all" {
@@ -73,8 +69,8 @@ func getHookConfig() (*hookConfig, error) {
// getConfigOption returns the toml config option associated with the
// specified struct field.
func (c hookConfig) getConfigOption(fieldName string) string {
t := reflect.TypeOf(c)
func (c *hookConfig) getConfigOption(fieldName string) string {
t := reflect.TypeOf(&c)
f, ok := t.FieldByName(fieldName)
if !ok {
return fieldName
@@ -127,3 +123,21 @@ func (c *hookConfig) nvidiaContainerCliCUDACompatModeFlags() []string {
}
return []string{flag}
}
func (c *hookConfig) assertModeIsLegacy() error {
if c.NVIDIAContainerRuntimeHookConfig.SkipModeDetection {
return nil
}
mr := info.NewRuntimeModeResolver(
info.WithLogger(&logInterceptor{}),
info.WithImage(&c.containerConfig.Image),
info.WithDefaultMode(info.LegacyRuntimeMode),
)
mode := mr.ResolveRuntimeMode(c.NVIDIAContainerRuntimeConfig.Mode)
if mode == "legacy" {
return nil
}
return fmt.Errorf("invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime (e.g. specify the --runtime=nvidia flag) instead")
}

View File

@@ -90,10 +90,10 @@ func TestGetHookConfig(t *testing.T) {
}
}
var cfg hookConfig
var cfg *hookConfig
getHookConfig := func() {
c, _ := getHookConfig()
cfg = *c
cfg = c
}
if tc.expectedPanic {

View File

@@ -55,7 +55,7 @@ func getCLIPath(config config.ContainerCLIConfig) string {
}
// getRootfsPath returns an absolute path. We don't need to resolve symlinks for now.
func getRootfsPath(config containerConfig) string {
func getRootfsPath(config *containerConfig) string {
rootfs, err := filepath.Abs(config.Rootfs)
if err != nil {
log.Panicln(err)
@@ -82,8 +82,8 @@ func doPrestart() {
return
}
if !hook.NVIDIAContainerRuntimeHookConfig.SkipModeDetection && info.ResolveAutoMode(&logInterceptor{}, hook.NVIDIAContainerRuntimeConfig.Mode, container.Image) != "legacy" {
log.Panicln("invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime (e.g. specify the --runtime=nvidia flag) instead.")
if err := hook.assertModeIsLegacy(); err != nil {
log.Panicf("%v", err)
}
rootfs := getRootfsPath(container)

View File

@@ -122,11 +122,10 @@ func TestGoodInput(t *testing.T) {
err = cmdCreate.Run()
require.NoError(t, err, "runtime should not return an error")
// Check config.json for NVIDIA prestart hook
// Check config.json to ensure that the NVIDIA prestart was not inserted.
spec, err = cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.NotEmpty(t, spec.Hooks, "there should be hooks in config.json")
require.Equal(t, 1, nvidiaHookCount(spec.Hooks), "exactly one nvidia prestart hook should be inserted correctly into config.json")
require.Empty(t, spec.Hooks, "there should be no hooks in config.json")
}
// NVIDIA prestart hook already present in config file
@@ -168,11 +167,10 @@ func TestDuplicateHook(t *testing.T) {
output, err := cmdCreate.CombinedOutput()
require.NoErrorf(t, err, "runtime should not return an error", "output=%v", string(output))
// Check config.json for NVIDIA prestart hook
// Check config.json to ensure that the NVIDIA prestart hook was removed.
spec, err = cfg.getRuntimeSpec()
require.NoError(t, err, "should be no errors when reading and parsing spec from config.json")
require.NotEmpty(t, spec.Hooks, "there should be hooks in config.json")
require.Equal(t, 1, nvidiaHookCount(spec.Hooks), "exactly one nvidia prestart hook should be inserted correctly into config.json")
require.Empty(t, spec.Hooks, "there should be no hooks in config.json")
}
// addNVIDIAHook is a basic wrapper for an addHookModifier that is used for
@@ -240,18 +238,3 @@ func (c testConfig) generateNewRuntimeSpec() error {
}
return nil
}
// Return number of valid NVIDIA prestart hooks in runtime spec
func nvidiaHookCount(hooks *specs.Hooks) int {
if hooks == nil {
return 0
}
count := 0
for _, hook := range hooks.Prestart {
if strings.Contains(hook.Path, nvidiaHook) {
count++
}
}
return count
}

View File

@@ -22,7 +22,7 @@ import (
"os/exec"
"github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/operator"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/config/engine"
@@ -53,7 +53,7 @@ type Options struct {
}
// ParseArgs parses the command line arguments to the CLI
func ParseArgs(c *cli.Context, o *Options) error {
func ParseArgs(c *cli.Command, o *Options) error {
if o.RuntimeDir != "" {
logrus.Debug("Runtime directory already set; ignoring arguments")
return nil
@@ -61,7 +61,7 @@ func ParseArgs(c *cli.Context, o *Options) error {
args := c.Args()
logrus.Infof("Parsing arguments: %v", args.Slice())
logrus.Infof("Parsing arguments: %v", args)
if c.NArg() != 1 {
return fmt.Errorf("incorrect number of arguments")
}

View File

@@ -21,7 +21,7 @@ import (
"fmt"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
cli "github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/config/engine"
@@ -44,7 +44,7 @@ type Options struct {
useLegacyConfig bool
runtimeType string
ContainerRuntimeModesCDIAnnotationPrefixes cli.StringSlice
ContainerRuntimeModesCDIAnnotationPrefixes []string
runtimeConfigOverrideJSON string
}
@@ -58,26 +58,26 @@ func Flags(opts *Options) []cli.Flag {
"containerd.runtimes.default_runtime config section is used to define the default " +
"runtime instead of container.default_runtime_name.",
Destination: &opts.useLegacyConfig,
EnvVars: []string{"CONTAINERD_USE_LEGACY_CONFIG"},
Sources: cli.EnvVars("CONTAINERD_USE_LEGACY_CONFIG"),
},
&cli.StringFlag{
Name: "runtime-type",
Usage: "The runtime_type to use for the configured runtime classes",
Value: defaultRuntmeType,
Destination: &opts.runtimeType,
EnvVars: []string{"CONTAINERD_RUNTIME_TYPE"},
Sources: cli.EnvVars("CONTAINERD_RUNTIME_TYPE"),
},
&cli.StringSliceFlag{
Name: "nvidia-container-runtime-modes.cdi.annotation-prefixes",
Destination: &opts.ContainerRuntimeModesCDIAnnotationPrefixes,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_MODES_CDI_ANNOTATION_PREFIXES"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_RUNTIME_MODES_CDI_ANNOTATION_PREFIXES"),
},
&cli.StringFlag{
Name: "runtime-config-override",
Destination: &opts.runtimeConfigOverrideJSON,
Usage: "specify additional runtime options as a JSON string. The paths are relative to the runtime config.",
Value: "{}",
EnvVars: []string{"RUNTIME_CONFIG_OVERRIDE", "CONTAINERD_RUNTIME_CONFIG_OVERRIDE"},
Sources: cli.EnvVars("RUNTIME_CONFIG_OVERRIDE", "CONTAINERD_RUNTIME_CONFIG_OVERRIDE"),
},
}
@@ -85,8 +85,8 @@ func Flags(opts *Options) []cli.Flag {
}
// Setup updates a containerd configuration to include the nvidia-containerd-runtime and reloads it
func Setup(c *cli.Context, o *container.Options, co *Options) error {
log.Infof("Starting 'setup' for %v", c.App.Name)
func Setup(c *cli.Command, o *container.Options, co *Options) error {
log.Infof("Starting 'setup' for %v", c.Name)
cfg, err := getRuntimeConfig(o, co)
if err != nil {
@@ -103,14 +103,14 @@ func Setup(c *cli.Context, o *container.Options, co *Options) error {
return fmt.Errorf("unable to restart containerd: %v", err)
}
log.Infof("Completed 'setup' for %v", c.App.Name)
log.Infof("Completed 'setup' for %v", c.Name)
return nil
}
// Cleanup reverts a containerd configuration to remove the nvidia-containerd-runtime and reloads it
func Cleanup(c *cli.Context, o *container.Options, co *Options) error {
log.Infof("Starting 'cleanup' for %v", c.App.Name)
func Cleanup(c *cli.Command, o *container.Options, co *Options) error {
log.Infof("Starting 'cleanup' for %v", c.Name)
cfg, err := getRuntimeConfig(o, co)
if err != nil {
@@ -127,7 +127,7 @@ func Cleanup(c *cli.Context, o *container.Options, co *Options) error {
return fmt.Errorf("unable to restart containerd: %v", err)
}
log.Infof("Completed 'cleanup' for %v", c.App.Name)
log.Infof("Completed 'cleanup' for %v", c.Name)
return nil
}
@@ -140,7 +140,7 @@ func RestartContainerd(o *container.Options) error {
// containerAnnotationsFromCDIPrefixes returns the container annotations to set for the given CDI prefixes.
func (o *Options) containerAnnotationsFromCDIPrefixes() []string {
var annotations []string
for _, prefix := range o.ContainerRuntimeModesCDIAnnotationPrefixes.Value() {
for _, prefix := range o.ContainerRuntimeModesCDIAnnotationPrefixes {
annotations = append(annotations, prefix+"*")
}

View File

@@ -22,7 +22,7 @@ import (
"path/filepath"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
cli "github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
@@ -63,7 +63,7 @@ func Flags(opts *Options) []cli.Flag {
Usage: "path to the cri-o hooks directory",
Value: defaultHooksDir,
Destination: &opts.hooksDir,
EnvVars: []string{"CRIO_HOOKS_DIR"},
Sources: cli.EnvVars("CRIO_HOOKS_DIR"),
DefaultText: defaultHooksDir,
},
&cli.StringFlag{
@@ -71,7 +71,7 @@ func Flags(opts *Options) []cli.Flag {
Usage: "filename of the cri-o hook that will be created / removed in the hooks directory",
Value: defaultHookFilename,
Destination: &opts.hookFilename,
EnvVars: []string{"CRIO_HOOK_FILENAME"},
Sources: cli.EnvVars("CRIO_HOOK_FILENAME"),
DefaultText: defaultHookFilename,
},
&cli.StringFlag{
@@ -79,7 +79,7 @@ func Flags(opts *Options) []cli.Flag {
Usage: "the configuration mode to use. One of [hook | config]",
Value: defaultConfigMode,
Destination: &opts.configMode,
EnvVars: []string{"CRIO_CONFIG_MODE"},
Sources: cli.EnvVars("CRIO_CONFIG_MODE"),
},
}
@@ -87,8 +87,8 @@ func Flags(opts *Options) []cli.Flag {
}
// Setup installs the prestart hook required to launch GPU-enabled containers
func Setup(c *cli.Context, o *container.Options, co *Options) error {
log.Infof("Starting 'setup' for %v", c.App.Name)
func Setup(c *cli.Command, o *container.Options, co *Options) error {
log.Infof("Starting 'setup' for %v", c.Name)
switch co.configMode {
case "hook":
@@ -136,8 +136,8 @@ func setupConfig(o *container.Options) error {
}
// Cleanup removes the specified prestart hook
func Cleanup(c *cli.Context, o *container.Options, co *Options) error {
log.Infof("Starting 'cleanup' for %v", c.App.Name)
func Cleanup(c *cli.Command, o *container.Options, co *Options) error {
log.Infof("Starting 'cleanup' for %v", c.Name)
switch co.configMode {
case "hook":

View File

@@ -20,7 +20,7 @@ import (
"fmt"
log "github.com/sirupsen/logrus"
cli "github.com/urfave/cli/v2"
cli "github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/config/engine"
@@ -42,8 +42,8 @@ func Flags(opts *Options) []cli.Flag {
}
// Setup updates docker configuration to include the nvidia runtime and reloads it
func Setup(c *cli.Context, o *container.Options) error {
log.Infof("Starting 'setup' for %v", c.App.Name)
func Setup(c *cli.Command, o *container.Options) error {
log.Infof("Starting 'setup' for %v", c.Name)
cfg, err := getRuntimeConfig(o)
if err != nil {
@@ -60,14 +60,14 @@ func Setup(c *cli.Context, o *container.Options) error {
return fmt.Errorf("unable to restart docker: %v", err)
}
log.Infof("Completed 'setup' for %v", c.App.Name)
log.Infof("Completed 'setup' for %v", c.Name)
return nil
}
// Cleanup reverts docker configuration to remove the nvidia runtime and reloads it
func Cleanup(c *cli.Context, o *container.Options) error {
log.Infof("Starting 'cleanup' for %v", c.App.Name)
func Cleanup(c *cli.Command, o *container.Options) error {
log.Infof("Starting 'cleanup' for %v", c.Name)
cfg, err := getRuntimeConfig(o)
if err != nil {
@@ -84,7 +84,7 @@ func Cleanup(c *cli.Context, o *container.Options) error {
return fmt.Errorf("unable to signal docker: %v", err)
}
log.Infof("Completed 'cleanup' for %v", c.App.Name)
log.Infof("Completed 'cleanup' for %v", c.Name)
return nil
}

View File

@@ -19,7 +19,7 @@ package runtime
import (
"fmt"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/runtime/containerd"
@@ -52,40 +52,40 @@ func Flags(opts *Options) []cli.Flag {
Usage: "Path to the runtime config file",
Value: runtimeSpecificDefault,
Destination: &opts.Config,
EnvVars: []string{"RUNTIME_CONFIG", "CONTAINERD_CONFIG", "DOCKER_CONFIG"},
Sources: cli.EnvVars("RUNTIME_CONFIG", "CONTAINERD_CONFIG", "DOCKER_CONFIG"),
},
&cli.StringFlag{
Name: "executable-path",
Usage: "The path to the runtime executable. This is used to extract the current config",
Destination: &opts.ExecutablePath,
EnvVars: []string{"RUNTIME_EXECUTABLE_PATH"},
Sources: cli.EnvVars("RUNTIME_EXECUTABLE_PATH"),
},
&cli.StringFlag{
Name: "socket",
Usage: "Path to the runtime socket file",
Value: runtimeSpecificDefault,
Destination: &opts.Socket,
EnvVars: []string{"RUNTIME_SOCKET", "CONTAINERD_SOCKET", "DOCKER_SOCKET"},
Sources: cli.EnvVars("RUNTIME_SOCKET", "CONTAINERD_SOCKET", "DOCKER_SOCKET"),
},
&cli.StringFlag{
Name: "restart-mode",
Usage: "Specify how the runtime should be restarted; If 'none' is selected it will not be restarted [signal | systemd | none ]",
Value: runtimeSpecificDefault,
Destination: &opts.RestartMode,
EnvVars: []string{"RUNTIME_RESTART_MODE"},
Sources: cli.EnvVars("RUNTIME_RESTART_MODE"),
},
&cli.BoolFlag{
Name: "enable-cdi-in-runtime",
Usage: "Enable CDI in the configured runt ime",
Destination: &opts.EnableCDI,
EnvVars: []string{"RUNTIME_ENABLE_CDI"},
Sources: cli.EnvVars("RUNTIME_ENABLE_CDI"),
},
&cli.StringFlag{
Name: "host-root",
Usage: "Specify the path to the host root to be used when restarting the runtime using systemd",
Value: defaultHostRootMount,
Destination: &opts.HostRootMount,
EnvVars: []string{"HOST_ROOT_MOUNT"},
Sources: cli.EnvVars("HOST_ROOT_MOUNT"),
},
&cli.StringFlag{
Name: "runtime-name",
@@ -93,14 +93,14 @@ func Flags(opts *Options) []cli.Flag {
Usage: "Specify the name of the `nvidia` runtime. If set-as-default is selected, the runtime is used as the default runtime.",
Value: defaultRuntimeName,
Destination: &opts.RuntimeName,
EnvVars: []string{"NVIDIA_RUNTIME_NAME", "CONTAINERD_RUNTIME_CLASS", "DOCKER_RUNTIME_NAME"},
Sources: cli.EnvVars("NVIDIA_RUNTIME_NAME", "CONTAINERD_RUNTIME_CLASS", "DOCKER_RUNTIME_NAME"),
},
&cli.BoolFlag{
Name: "set-as-default",
Usage: "Set the `nvidia` runtime as the default runtime.",
Value: defaultSetAsDefault,
Destination: &opts.SetAsDefault,
EnvVars: []string{"NVIDIA_RUNTIME_SET_AS_DEFAULT", "CONTAINERD_SET_AS_DEFAULT", "DOCKER_SET_AS_DEFAULT"},
Sources: cli.EnvVars("NVIDIA_RUNTIME_SET_AS_DEFAULT", "CONTAINERD_SET_AS_DEFAULT", "DOCKER_SET_AS_DEFAULT"),
Hidden: true,
},
}
@@ -112,7 +112,7 @@ func Flags(opts *Options) []cli.Flag {
}
// Validate checks whether the specified options are valid
func (opts *Options) Validate(logger logger.Interface, c *cli.Context, runtime string, toolkitRoot string, to *toolkit.Options) error {
func (opts *Options) Validate(logger logger.Interface, c *cli.Command, runtime string, toolkitRoot string, to *toolkit.Options) error {
// We set this option here to ensure that it is available in future calls.
opts.RuntimeDir = toolkitRoot
@@ -164,7 +164,7 @@ func (opts *Options) Validate(logger logger.Interface, c *cli.Context, runtime s
return nil
}
func Setup(c *cli.Context, opts *Options, runtime string) error {
func Setup(c *cli.Command, opts *Options, runtime string) error {
switch runtime {
case containerd.Name:
return containerd.Setup(c, &opts.Options, &opts.containerdOptions)
@@ -177,7 +177,7 @@ func Setup(c *cli.Context, opts *Options, runtime string) error {
}
}
func Cleanup(c *cli.Context, opts *Options, runtime string) error {
func Cleanup(c *cli.Command, opts *Options, runtime string) error {
switch runtime {
case containerd.Name:
return containerd.Cleanup(c, &opts.Options, &opts.containerdOptions)

View File

@@ -1,13 +1,14 @@
package main
import (
"context"
"fmt"
"os"
"os/signal"
"path/filepath"
"syscall"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"golang.org/x/sys/unix"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/runtime"
@@ -57,7 +58,7 @@ func main() {
// Run the CLI
logger.Infof("Starting %v", c.Name)
if err := c.Run(os.Args); err != nil {
if err := c.Run(context.Background(), os.Args); err != nil {
logger.Errorf("error running %v: %v", c.Name, err)
os.Exit(1)
}
@@ -73,27 +74,27 @@ type app struct {
}
// NewApp creates the CLI app fro the specified options.
func NewApp(logger logger.Interface) *cli.App {
func NewApp(logger logger.Interface) *cli.Command {
a := app{
logger: logger,
}
return a.build()
}
func (a app) build() *cli.App {
func (a app) build() *cli.Command {
options := options{
toolkitOptions: toolkit.Options{},
}
// Create the top-level CLI
c := cli.NewApp()
c := cli.Command{}
c.Name = "nvidia-ctk-installer"
c.Usage = "Install the NVIDIA Container Toolkit and configure the specified runtime to use the `nvidia` runtime."
c.Version = info.GetVersionString()
c.Before = func(ctx *cli.Context) error {
return a.Before(ctx, &options)
c.Before = func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, a.Before(cmd, &options)
}
c.Action = func(ctx *cli.Context) error {
return a.Run(ctx, &options)
c.Action = func(ctx context.Context, cmd *cli.Command) error {
return a.Run(cmd, &options)
}
// Setup flags for the CLI
@@ -103,7 +104,7 @@ func (a app) build() *cli.App {
Aliases: []string{"n"},
Usage: "terminate immediately after setting up the runtime. Note that no cleanup will be performed",
Destination: &options.noDaemon,
EnvVars: []string{"NO_DAEMON"},
Sources: cli.EnvVars("NO_DAEMON"),
},
&cli.StringFlag{
Name: "runtime",
@@ -111,7 +112,7 @@ func (a app) build() *cli.App {
Usage: "the runtime to setup on this node. One of {'docker', 'crio', 'containerd'}",
Value: defaultRuntime,
Destination: &options.runtime,
EnvVars: []string{"RUNTIME"},
Sources: cli.EnvVars("RUNTIME"),
},
&cli.StringFlag{
Name: "toolkit-install-dir",
@@ -122,37 +123,37 @@ func (a app) build() *cli.App {
"recommended that this match the path on the host.",
Value: defaultToolkitInstallDir,
Destination: &options.toolkitInstallDir,
EnvVars: []string{"TOOLKIT_INSTALL_DIR", "ROOT"},
Sources: cli.EnvVars("TOOLKIT_INSTALL_DIR", "ROOT"),
},
&cli.StringFlag{
Name: "toolkit-source-root",
Usage: "The folder where the required toolkit artifacts can be found. If this is not specified, the path /artifacts/{{ .ToolkitPackageType }} is used where ToolkitPackageType is the resolved package type",
Destination: &options.sourceRoot,
EnvVars: []string{"TOOLKIT_SOURCE_ROOT"},
Sources: cli.EnvVars("TOOLKIT_SOURCE_ROOT"),
},
&cli.StringFlag{
Name: "toolkit-package-type",
Usage: "specify the package type to use for the toolkit. One of ['deb', 'rpm', 'auto', '']. If 'auto' or '' are used, the type is inferred automatically.",
Value: "auto",
Destination: &options.packageType,
EnvVars: []string{"TOOLKIT_PACKAGE_TYPE"},
Sources: cli.EnvVars("TOOLKIT_PACKAGE_TYPE"),
},
&cli.StringFlag{
Name: "pid-file",
Value: defaultPidFile,
Usage: "the path to a toolkit.pid file to ensure that only a single configuration instance is running",
Destination: &options.pidFile,
EnvVars: []string{"TOOLKIT_PID_FILE", "PID_FILE"},
Sources: cli.EnvVars("TOOLKIT_PID_FILE", "PID_FILE"),
},
}
c.Flags = append(c.Flags, toolkit.Flags(&options.toolkitOptions)...)
c.Flags = append(c.Flags, runtime.Flags(&options.runtimeOptions)...)
return c
return &c
}
func (a *app) Before(c *cli.Context, o *options) error {
func (a *app) Before(c *cli.Command, o *options) error {
if o.sourceRoot == "" {
sourceRoot, err := a.resolveSourceRoot(o.runtimeOptions.HostRootMount, o.packageType)
if err != nil {
@@ -170,7 +171,7 @@ func (a *app) Before(c *cli.Context, o *options) error {
return a.validateFlags(c, o)
}
func (a *app) validateFlags(c *cli.Context, o *options) error {
func (a *app) validateFlags(c *cli.Command, o *options) error {
if o.toolkitInstallDir == "" {
return fmt.Errorf("the install root must be specified")
}
@@ -193,21 +194,21 @@ func (a *app) validateFlags(c *cli.Context, o *options) error {
// Run installs the NVIDIA Container Toolkit and updates the requested runtime.
// If the application is run as a daemon, the application waits and unconfigures
// the runtime on termination.
func (a *app) Run(c *cli.Context, o *options) error {
func (a *app) Run(c *cli.Command, o *options) error {
err := a.initialize(o.pidFile)
if err != nil {
return fmt.Errorf("unable to initialize: %v", err)
}
defer a.shutdown(o.pidFile)
if len(o.toolkitOptions.ContainerRuntimeRuntimes.Value()) == 0 {
if len(o.toolkitOptions.ContainerRuntimeRuntimes) == 0 {
lowlevelRuntimePaths, err := runtime.GetLowlevelRuntimePaths(&o.runtimeOptions, o.runtime)
if err != nil {
return fmt.Errorf("unable to determine runtime options: %w", err)
}
lowlevelRuntimePaths = append(lowlevelRuntimePaths, defaultLowLevelRuntimes...)
o.toolkitOptions.ContainerRuntimeRuntimes = *cli.NewStringSlice(lowlevelRuntimePaths...)
o.toolkitOptions.ContainerRuntimeRuntimes = lowlevelRuntimePaths
}
err = a.toolkit.Install(c, &o.toolkitOptions)

View File

@@ -17,6 +17,7 @@
package main
import (
"context"
"os"
"path/filepath"
"strings"
@@ -436,7 +437,7 @@ swarm-resource = ""
"--toolkit-source-root=" + filepath.Join(artifactRoot, "deb"),
}
err := app.Run(append(testArgs, tc.args...))
err := app.Run(context.Background(), append(testArgs, tc.args...))
require.NoError(t, err)

View File

@@ -28,7 +28,7 @@ type createDirectory struct {
logger logger.Interface
}
func (t *toolkitInstaller) createDirectory() Installer {
func (t *ToolkitInstaller) createDirectory() Installer {
return &createDirectory{
logger: t.logger,
}

View File

@@ -28,20 +28,18 @@ import (
log "github.com/sirupsen/logrus"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk-installer/container/operator"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
)
type executable struct {
requiresKernelModule bool
path string
symlink string
args []string
env map[string]string
}
func (t *toolkitInstaller) collectExecutables(destDir string) ([]Installer, error) {
configHome := filepath.Join(destDir, ".config")
configDir := filepath.Join(configHome, "nvidia-container-runtime")
configPath := filepath.Join(configDir, "config.toml")
func (t *ToolkitInstaller) collectExecutables(destDir string) ([]Installer, error) {
configFilePath := t.ConfigFilePath(destDir)
executables := []executable{
{
@@ -56,7 +54,7 @@ func (t *toolkitInstaller) collectExecutables(destDir string) ([]Installer, erro
path: runtime.Path,
requiresKernelModule: true,
env: map[string]string{
"XDG_CONFIG_HOME": configHome,
config.FilePathOverrideEnvVar: configFilePath,
},
}
executables = append(executables, e)
@@ -72,7 +70,9 @@ func (t *toolkitInstaller) collectExecutables(destDir string) ([]Installer, erro
executable{
path: "nvidia-container-runtime-hook",
symlink: "nvidia-container-toolkit",
args: []string{fmt.Sprintf("-config %s", configPath)},
env: map[string]string{
config.FilePathOverrideEnvVar: configFilePath,
},
},
)
@@ -94,7 +94,6 @@ func (t *toolkitInstaller) collectExecutables(destDir string) ([]Installer, erro
Source: executablePath,
WrappedExecutable: dotRealFilename,
CheckModules: executable.requiresKernelModule,
Args: executable.args,
Envvars: map[string]string{
"PATH": strings.Join([]string{destDir, "$PATH"}, ":"),
},
@@ -124,7 +123,6 @@ type wrapper struct {
Envvars map[string]string
WrappedExecutable string
CheckModules bool
Args []string
}
type render struct {
@@ -165,9 +163,6 @@ fi
{{$key}}={{$value}} \
{{- end }}
{{ .DestDir }}/{{ .WrappedExecutable }} \
{{- range $arg := .Args }}
{{$arg}} \
{{- end }}
"$@"
`

View File

@@ -68,19 +68,6 @@ fi
PATH=/foo/bar/baz \
/dest-dir/some-runtime \
"$@"
`,
},
{
description: "args are added",
w: &wrapper{
WrappedExecutable: "some-runtime",
Args: []string{"--config foo", "bar"},
},
expected: `#! /bin/sh
/dest-dir/some-runtime \
--config foo \
bar \
"$@"
`,
},
}

View File

@@ -33,7 +33,7 @@ type Installer interface {
Install(string) error
}
type toolkitInstaller struct {
type ToolkitInstaller struct {
logger logger.Interface
ignoreErrors bool
sourceRoot string
@@ -43,11 +43,11 @@ type toolkitInstaller struct {
ensureTargetDirectory Installer
}
var _ Installer = (*toolkitInstaller)(nil)
var _ Installer = (*ToolkitInstaller)(nil)
// New creates a toolkit installer with the specified options.
func New(opts ...Option) (Installer, error) {
t := &toolkitInstaller{
func New(opts ...Option) (*ToolkitInstaller, error) {
t := &ToolkitInstaller{
sourceRoot: "/",
}
for _, opt := range opts {
@@ -73,7 +73,7 @@ func New(opts ...Option) (Installer, error) {
}
// Install ensures that the required toolkit files are installed in the specified directory.
func (t *toolkitInstaller) Install(destDir string) error {
func (t *ToolkitInstaller) Install(destDir string) error {
var installers []Installer
installers = append(installers, t.ensureTargetDirectory)
@@ -98,6 +98,11 @@ func (t *toolkitInstaller) Install(destDir string) error {
return errs
}
func (t *ToolkitInstaller) ConfigFilePath(destDir string) string {
toolkitConfigDir := filepath.Join(destDir, ".config", "nvidia-container-runtime")
return filepath.Join(toolkitConfigDir, "config.toml")
}
type symlink struct {
linkname string
target string

View File

@@ -112,7 +112,7 @@ func TestToolkitInstaller(t *testing.T) {
return nil
},
}
i := toolkitInstaller{
i := ToolkitInstaller{
logger: logger,
artifactRoot: r,
ensureTargetDirectory: createDirectory,
@@ -172,8 +172,8 @@ if [ "${?}" != "0" ]; then
echo "nvidia driver modules are not yet loaded, invoking runc directly"
exec runc "$@"
fi
NVIDIA_CTK_CONFIG_FILE_PATH=/foo/bar/baz/.config/nvidia-container-runtime/config.toml \
PATH=/foo/bar/baz:$PATH \
XDG_CONFIG_HOME=/foo/bar/baz/.config \
/foo/bar/baz/nvidia-container-runtime.real \
"$@"
`,
@@ -187,8 +187,8 @@ if [ "${?}" != "0" ]; then
echo "nvidia driver modules are not yet loaded, invoking runc directly"
exec runc "$@"
fi
NVIDIA_CTK_CONFIG_FILE_PATH=/foo/bar/baz/.config/nvidia-container-runtime/config.toml \
PATH=/foo/bar/baz:$PATH \
XDG_CONFIG_HOME=/foo/bar/baz/.config \
/foo/bar/baz/nvidia-container-runtime.cdi.real \
"$@"
`,
@@ -202,8 +202,8 @@ if [ "${?}" != "0" ]; then
echo "nvidia driver modules are not yet loaded, invoking runc directly"
exec runc "$@"
fi
NVIDIA_CTK_CONFIG_FILE_PATH=/foo/bar/baz/.config/nvidia-container-runtime/config.toml \
PATH=/foo/bar/baz:$PATH \
XDG_CONFIG_HOME=/foo/bar/baz/.config \
/foo/bar/baz/nvidia-container-runtime.legacy.real \
"$@"
`,
@@ -240,9 +240,9 @@ PATH=/foo/bar/baz:$PATH \
path: "/foo/bar/baz/nvidia-container-runtime-hook",
mode: 0777,
wrapper: `#! /bin/sh
NVIDIA_CTK_CONFIG_FILE_PATH=/foo/bar/baz/.config/nvidia-container-runtime/config.toml \
PATH=/foo/bar/baz:$PATH \
/foo/bar/baz/nvidia-container-runtime-hook.real \
-config /foo/bar/baz/.config/nvidia-container-runtime/config.toml \
"$@"
`,
},

View File

@@ -28,7 +28,7 @@ import (
// A predefined set of library candidates are considered, with the first one
// resulting in success being installed to the toolkit folder. The install process
// resolves the symlink for the library and copies the versioned library itself.
func (t *toolkitInstaller) collectLibraries() ([]Installer, error) {
func (t *ToolkitInstaller) collectLibraries() ([]Installer, error) {
requiredLibraries := []string{
"libnvidia-container.so.1",
"libnvidia-container-go.so.1",

View File

@@ -19,29 +19,29 @@ package installer
import "github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
type Option func(*toolkitInstaller)
type Option func(*ToolkitInstaller)
func WithLogger(logger logger.Interface) Option {
return func(ti *toolkitInstaller) {
return func(ti *ToolkitInstaller) {
ti.logger = logger
}
}
func WithArtifactRoot(artifactRoot *artifactRoot) Option {
return func(ti *toolkitInstaller) {
return func(ti *ToolkitInstaller) {
ti.artifactRoot = artifactRoot
}
}
func WithIgnoreErrors(ignoreErrors bool) Option {
return func(ti *toolkitInstaller) {
return func(ti *ToolkitInstaller) {
ti.ignoreErrors = ignoreErrors
}
}
// WithSourceRoot sets the root directory for locating artifacts to be installed.
func WithSourceRoot(sourceRoot string) Option {
return func(ti *toolkitInstaller) {
return func(ti *ToolkitInstaller) {
ti.sourceRoot = sourceRoot
}
}

View File

@@ -22,7 +22,7 @@ import (
"path/filepath"
"strings"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"tags.cncf.io/container-device-interface/pkg/cdi"
"tags.cncf.io/container-device-interface/pkg/parser"
@@ -37,8 +37,6 @@ import (
const (
// DefaultNvidiaDriverRoot specifies the default NVIDIA driver run directory
DefaultNvidiaDriverRoot = "/run/nvidia/driver"
configFilename = "config.toml"
)
type cdiOptions struct {
@@ -60,9 +58,9 @@ type Options struct {
ContainerRuntimeLogLevel string
ContainerRuntimeModesCdiDefaultKind string
ContainerRuntimeModesCDIAnnotationPrefixes cli.StringSlice
ContainerRuntimeModesCDIAnnotationPrefixes []string
ContainerRuntimeRuntimes cli.StringSlice
ContainerRuntimeRuntimes []string
ContainerRuntimeHookSkipModeDetection bool
@@ -71,14 +69,14 @@ type Options struct {
// CDI stores the CDI options for the toolkit.
CDI cdiOptions
createDeviceNodes cli.StringSlice
createDeviceNodes []string
acceptNVIDIAVisibleDevicesWhenUnprivileged bool
acceptNVIDIAVisibleDevicesAsVolumeMounts bool
ignoreErrors bool
optInFeatures cli.StringSlice
optInFeatures []string
}
func Flags(opts *Options) []cli.Flag {
@@ -88,106 +86,106 @@ func Flags(opts *Options) []cli.Flag {
Aliases: []string{"nvidia-driver-root"},
Value: DefaultNvidiaDriverRoot,
Destination: &opts.DriverRoot,
EnvVars: []string{"NVIDIA_DRIVER_ROOT", "DRIVER_ROOT"},
Sources: cli.EnvVars("NVIDIA_DRIVER_ROOT", "DRIVER_ROOT"),
},
&cli.StringFlag{
Name: "driver-root-ctr-path",
Value: DefaultNvidiaDriverRoot,
Destination: &opts.DriverRootCtrPath,
EnvVars: []string{"DRIVER_ROOT_CTR_PATH"},
Sources: cli.EnvVars("DRIVER_ROOT_CTR_PATH"),
},
&cli.StringFlag{
Name: "dev-root",
Usage: "Specify the root where `/dev` is located. If this is not specified, the driver-root is assumed.",
Destination: &opts.DevRoot,
EnvVars: []string{"NVIDIA_DEV_ROOT", "DEV_ROOT"},
Sources: cli.EnvVars("NVIDIA_DEV_ROOT", "DEV_ROOT"),
},
&cli.StringFlag{
Name: "dev-root-ctr-path",
Usage: "Specify the root where `/dev` is located in the container. If this is not specified, the driver-root-ctr-path is assumed.",
Destination: &opts.DevRootCtrPath,
EnvVars: []string{"DEV_ROOT_CTR_PATH"},
Sources: cli.EnvVars("DEV_ROOT_CTR_PATH"),
},
&cli.StringFlag{
Name: "nvidia-container-runtime.debug",
Aliases: []string{"nvidia-container-runtime-debug"},
Usage: "Specify the location of the debug log file for the NVIDIA Container Runtime",
Destination: &opts.ContainerRuntimeDebug,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_DEBUG"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_RUNTIME_DEBUG"),
},
&cli.StringFlag{
Name: "nvidia-container-runtime.log-level",
Aliases: []string{"nvidia-container-runtime-debug-log-level"},
Destination: &opts.ContainerRuntimeLogLevel,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_LOG_LEVEL"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_RUNTIME_LOG_LEVEL"),
},
&cli.StringFlag{
Name: "nvidia-container-runtime.mode",
Aliases: []string{"nvidia-container-runtime-mode"},
Destination: &opts.ContainerRuntimeMode,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_MODE"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_RUNTIME_MODE"),
},
&cli.StringFlag{
Name: "nvidia-container-runtime.modes.cdi.default-kind",
Destination: &opts.ContainerRuntimeModesCdiDefaultKind,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_MODES_CDI_DEFAULT_KIND"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_RUNTIME_MODES_CDI_DEFAULT_KIND"),
},
&cli.StringSliceFlag{
Name: "nvidia-container-runtime.modes.cdi.annotation-prefixes",
Destination: &opts.ContainerRuntimeModesCDIAnnotationPrefixes,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_MODES_CDI_ANNOTATION_PREFIXES"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_RUNTIME_MODES_CDI_ANNOTATION_PREFIXES"),
},
&cli.StringSliceFlag{
Name: "nvidia-container-runtime.runtimes",
Destination: &opts.ContainerRuntimeRuntimes,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_RUNTIMES"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_RUNTIME_RUNTIMES"),
},
&cli.BoolFlag{
Name: "nvidia-container-runtime-hook.skip-mode-detection",
Value: true,
Destination: &opts.ContainerRuntimeHookSkipModeDetection,
EnvVars: []string{"NVIDIA_CONTAINER_RUNTIME_HOOK_SKIP_MODE_DETECTION"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_RUNTIME_HOOK_SKIP_MODE_DETECTION"),
},
&cli.StringFlag{
Name: "nvidia-container-cli.debug",
Aliases: []string{"nvidia-container-cli-debug"},
Usage: "Specify the location of the debug log file for the NVIDIA Container CLI",
Destination: &opts.ContainerCLIDebug,
EnvVars: []string{"NVIDIA_CONTAINER_CLI_DEBUG"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_CLI_DEBUG"),
},
&cli.BoolFlag{
Name: "accept-nvidia-visible-devices-envvar-when-unprivileged",
Usage: "Set the accept-nvidia-visible-devices-envvar-when-unprivileged config option",
Value: true,
Destination: &opts.acceptNVIDIAVisibleDevicesWhenUnprivileged,
EnvVars: []string{"ACCEPT_NVIDIA_VISIBLE_DEVICES_ENVVAR_WHEN_UNPRIVILEGED"},
Sources: cli.EnvVars("ACCEPT_NVIDIA_VISIBLE_DEVICES_ENVVAR_WHEN_UNPRIVILEGED"),
},
&cli.BoolFlag{
Name: "accept-nvidia-visible-devices-as-volume-mounts",
Usage: "Set the accept-nvidia-visible-devices-as-volume-mounts config option",
Destination: &opts.acceptNVIDIAVisibleDevicesAsVolumeMounts,
EnvVars: []string{"ACCEPT_NVIDIA_VISIBLE_DEVICES_AS_VOLUME_MOUNTS"},
Sources: cli.EnvVars("ACCEPT_NVIDIA_VISIBLE_DEVICES_AS_VOLUME_MOUNTS"),
},
&cli.BoolFlag{
Name: "cdi-enabled",
Aliases: []string{"enable-cdi"},
Usage: "enable the generation of a CDI specification",
Destination: &opts.CDI.Enabled,
EnvVars: []string{"CDI_ENABLED", "ENABLE_CDI"},
Sources: cli.EnvVars("CDI_ENABLED", "ENABLE_CDI"),
},
&cli.StringFlag{
Name: "cdi-output-dir",
Usage: "the directory where the CDI output files are to be written. If this is set to '', no CDI specification is generated.",
Value: "/var/run/cdi",
Destination: &opts.CDI.outputDir,
EnvVars: []string{"CDI_OUTPUT_DIR"},
Sources: cli.EnvVars("CDI_OUTPUT_DIR"),
},
&cli.StringFlag{
Name: "cdi-kind",
Usage: "the vendor string to use for the generated CDI specification",
Value: "management.nvidia.com/gpu",
Destination: &opts.CDI.kind,
EnvVars: []string{"CDI_KIND"},
Sources: cli.EnvVars("CDI_KIND"),
},
&cli.BoolFlag{
Name: "ignore-errors",
@@ -198,15 +196,15 @@ func Flags(opts *Options) []cli.Flag {
&cli.StringSliceFlag{
Name: "create-device-nodes",
Usage: "(Only applicable with --cdi-enabled) specifies which device nodes should be created. If any one of the options is set to '' or 'none', no device nodes will be created.",
Value: cli.NewStringSlice("control"),
Value: []string{"control"},
Destination: &opts.createDeviceNodes,
EnvVars: []string{"CREATE_DEVICE_NODES"},
Sources: cli.EnvVars("CREATE_DEVICE_NODES"),
},
&cli.StringSliceFlag{
Name: "opt-in-features",
Hidden: true,
Destination: &opts.optInFeatures,
EnvVars: []string{"NVIDIA_CONTAINER_TOOLKIT_OPT_IN_FEATURES"},
Sources: cli.EnvVars("NVIDIA_CONTAINER_TOOLKIT_OPT_IN_FEATURES"),
},
}
@@ -261,7 +259,7 @@ func (t *Installer) ValidateOptions(opts *Options) error {
}
isDisabled := false
for _, mode := range opts.createDeviceNodes.Value() {
for _, mode := range opts.createDeviceNodes {
if mode != "" && mode != "none" && mode != "control" {
return fmt.Errorf("invalid --create-device-nodes value: %v", mode)
}
@@ -275,7 +273,7 @@ func (t *Installer) ValidateOptions(opts *Options) error {
isDisabled = true
}
if isDisabled {
opts.createDeviceNodes = *cli.NewStringSlice()
opts.createDeviceNodes = []string{}
}
return nil
@@ -283,7 +281,7 @@ func (t *Installer) ValidateOptions(opts *Options) error {
// Install installs the components of the NVIDIA container toolkit.
// Any existing installation is removed.
func (t *Installer) Install(cli *cli.Context, opts *Options) error {
func (t *Installer) Install(cli *cli.Command, opts *Options) error {
if t == nil {
return fmt.Errorf("toolkit installer is not initilized")
}
@@ -316,7 +314,7 @@ func (t *Installer) Install(cli *cli.Context, opts *Options) error {
t.logger.Errorf("Ignoring error: %v", fmt.Errorf("could not install toolkit components: %w", err))
}
err = t.installToolkitConfig(cli, opts)
err = t.installToolkitConfig(cli, opts, toolkit.ConfigFilePath(t.toolkitRoot))
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("error installing NVIDIA container toolkit config: %v", err)
} else if err != nil {
@@ -343,13 +341,11 @@ func (t *Installer) Install(cli *cli.Context, opts *Options) error {
// installToolkitConfig installs the config file for the NVIDIA container toolkit ensuring
// that the settings are updated to match the desired install and nvidia driver directories.
func (t *Installer) installToolkitConfig(c *cli.Context, opts *Options) error {
toolkitConfigDir := filepath.Join(t.toolkitRoot, ".config", "nvidia-container-runtime")
toolkitConfigPath := filepath.Join(toolkitConfigDir, configFilename)
func (t *Installer) installToolkitConfig(c *cli.Command, opts *Options, toolkitConfigPath string) error {
t.logger.Infof("Installing NVIDIA container toolkit config '%v'", toolkitConfigPath)
err := t.createDirectories(toolkitConfigDir)
err := t.createDirectories(filepath.Dir(toolkitConfigPath))
if err != nil && !opts.ignoreErrors {
return fmt.Errorf("could not create required directories: %v", err)
} else if err != nil {
@@ -391,12 +387,12 @@ func (t *Installer) installToolkitConfig(c *cli.Context, opts *Options) error {
"nvidia-container-runtime-hook.skip-mode-detection": opts.ContainerRuntimeHookSkipModeDetection,
}
toolkitRuntimeList := opts.ContainerRuntimeRuntimes.Value()
toolkitRuntimeList := opts.ContainerRuntimeRuntimes
if len(toolkitRuntimeList) > 0 {
configValues["nvidia-container-runtime.runtimes"] = toolkitRuntimeList
}
for _, optInFeature := range opts.optInFeatures.Value() {
for _, optInFeature := range opts.optInFeatures {
configValues["features."+optInFeature] = true
}
@@ -466,7 +462,7 @@ func (t *Installer) createDirectories(dir ...string) error {
}
func (t *Installer) createDeviceNodes(opts *Options) error {
modes := opts.createDeviceNodes.Value()
modes := opts.createDeviceNodes
if len(modes) == 0 {
return nil
}

View File

@@ -25,7 +25,7 @@ import (
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup/symlinks"
@@ -86,6 +86,7 @@ devices:
hostPath: /host/driver/root/dev/nvidia-caps-imex-channels/channel2047
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
hooks:
- hookName: createContainer
@@ -97,6 +98,15 @@ containerEdits:
- libcuda.so.1::/lib/x86_64-linux-gnu/libcuda.so
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: {{ .toolkitRoot }}/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-soname-symlinks
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: {{ .toolkitRoot }}/nvidia-cdi-hook
args:
@@ -143,7 +153,7 @@ containerEdits:
)
require.NoError(t, ti.ValidateOptions(&options))
err := ti.Install(&cli.Context{}, &options)
err := ti.Install(&cli.Command{}, &options)
if tc.expectedError == nil {
require.NoError(t, err)
} else {

View File

@@ -17,7 +17,7 @@
package cdi
import (
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/generate"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/list"
@@ -45,7 +45,7 @@ func (m command) build() *cli.Command {
Usage: "Provide tools for interacting with Container Device Interface specifications",
}
hook.Subcommands = []*cli.Command{
hook.Commands = []*cli.Command{
generate.NewCommand(m.logger),
transform.NewCommand(m.logger),
list.NewCommand(m.logger),

View File

@@ -17,12 +17,13 @@
package generate
import (
"context"
"fmt"
"os"
"path/filepath"
"strings"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
cdi "tags.cncf.io/container-device-interface/pkg/parser"
"github.com/NVIDIA/go-nvml/pkg/nvml"
@@ -46,7 +47,7 @@ type command struct {
type options struct {
output string
format string
deviceNameStrategies cli.StringSlice
deviceNameStrategies []string
driverRoot string
devRoot string
nvidiaCDIHookPath string
@@ -55,13 +56,15 @@ type options struct {
vendor string
class string
configSearchPaths cli.StringSlice
librarySearchPaths cli.StringSlice
disabledHooks cli.StringSlice
configSearchPaths []string
librarySearchPaths []string
disabledHooks []string
config string
csv struct {
files cli.StringSlice
ignorePatterns cli.StringSlice
files []string
ignorePatterns []string
}
// the following are used for dependency injection during spec generation.
@@ -80,37 +83,73 @@ func NewCommand(logger logger.Interface) *cli.Command {
func (m command) build() *cli.Command {
opts := options{}
var flags []cli.Flag
// Create the 'generate-cdi' command
c := cli.Command{
Name: "generate",
Usage: "Generate CDI specifications for use with CDI-enabled runtimes",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &opts)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
// First, determine the config file path
var configFilePath string
if cmd.IsSet("config") {
configFilePath = cmd.String("config")
} else {
configFilePath = config.GetConfigFilePath()
}
// Load the config file
configToml, err := config.New(
config.WithConfigFile(configFilePath),
)
if err != nil {
return nil, err
}
// Get the typed config
cfg, err := configToml.Config()
if err != nil {
return nil, err
}
// Apply config defaults to flags that haven't been set via env vars or CLI
// we can't use urfave/cli-altsrc here because we need to check the config file
// for defaults, and we can't use the config file as an input source for the
// flags because the config file is not yet loaded.
m.applyConfigDefaults(cmd, &opts, cfg)
return ctx, m.validateFlags(cmd, &opts)
},
Action: func(c *cli.Context) error {
return m.run(c, &opts)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(cmd, &opts)
},
}
c.Flags = []cli.Flag{
flags = []cli.Flag{
&cli.StringFlag{
Name: "config",
Usage: "Specify the path to the config file to use",
Destination: &opts.config,
Sources: cli.EnvVars("NVIDIA_CTK_CONFIG_FILE"),
},
&cli.StringSliceFlag{
Name: "config-search-path",
Usage: "Specify the path to search for config files when discovering the entities that should be included in the CDI specification.",
Destination: &opts.configSearchPaths,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_CONFIG_SEARCH_PATHS"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_CONFIG_SEARCH_PATHS"),
},
&cli.StringFlag{
Name: "output",
Usage: "Specify the file to output the generated CDI specification to. If this is '' the specification is output to STDOUT",
Destination: &opts.output,
EnvVars: []string{"NVIDIA_CTK_CDI_OUTPUT_FILE_PATH"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_OUTPUT_FILE_PATH"),
},
&cli.StringFlag{
Name: "format",
Usage: "The output format for the generated spec [json | yaml]. This overrides the format defined by the output file extension (if specified).",
Value: spec.FormatYAML,
Destination: &opts.format,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_OUTPUT_FORMAT"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_OUTPUT_FORMAT"),
},
&cli.StringFlag{
Name: "mode",
@@ -120,32 +159,32 @@ func (m command) build() *cli.Command {
"If mode is set to 'auto' the mode will be determined based on the system configuration.",
Value: string(nvcdi.ModeAuto),
Destination: &opts.mode,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_MODE"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_MODE"),
},
&cli.StringFlag{
Name: "dev-root",
Usage: "Specify the root where `/dev` is located. If this is not specified, the driver-root is assumed.",
Destination: &opts.devRoot,
EnvVars: []string{"NVIDIA_CTK_DEV_ROOT"},
Sources: cli.EnvVars("NVIDIA_CTK_DEV_ROOT"),
},
&cli.StringSliceFlag{
Name: "device-name-strategy",
Usage: "Specify the strategy for generating device names. If this is specified multiple times, the devices will be duplicated for each strategy. One of [index | uuid | type-index]",
Value: cli.NewStringSlice(nvcdi.DeviceNameStrategyIndex, nvcdi.DeviceNameStrategyUUID),
Value: []string{nvcdi.DeviceNameStrategyIndex, nvcdi.DeviceNameStrategyUUID},
Destination: &opts.deviceNameStrategies,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_DEVICE_NAME_STRATEGIES"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_DEVICE_NAME_STRATEGIES"),
},
&cli.StringFlag{
Name: "driver-root",
Usage: "Specify the NVIDIA GPU driver root to use when discovering the entities that should be included in the CDI specification.",
Destination: &opts.driverRoot,
EnvVars: []string{"NVIDIA_CTK_DRIVER_ROOT"},
Sources: cli.EnvVars("NVIDIA_CTK_DRIVER_ROOT"),
},
&cli.StringSliceFlag{
Name: "library-search-path",
Usage: "Specify the path to search for libraries when discovering the entities that should be included in the CDI specification.\n\tNote: This option only applies to CSV mode.",
Destination: &opts.librarySearchPaths,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_LIBRARY_SEARCH_PATHS"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_LIBRARY_SEARCH_PATHS"),
},
&cli.StringFlag{
Name: "nvidia-cdi-hook-path",
@@ -154,13 +193,13 @@ func (m command) build() *cli.Command {
"If not specified, the PATH will be searched for `nvidia-cdi-hook`. " +
"NOTE: That if this is specified as `nvidia-ctk`, the PATH will be searched for `nvidia-ctk` instead.",
Destination: &opts.nvidiaCDIHookPath,
EnvVars: []string{"NVIDIA_CTK_CDI_HOOK_PATH"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_HOOK_PATH"),
},
&cli.StringFlag{
Name: "ldconfig-path",
Usage: "Specify the path to use for ldconfig in the generated CDI specification",
Destination: &opts.ldconfigPath,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_LDCONFIG_PATH"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_LDCONFIG_PATH"),
},
&cli.StringFlag{
Name: "vendor",
@@ -168,7 +207,7 @@ func (m command) build() *cli.Command {
Usage: "the vendor string to use for the generated CDI specification.",
Value: "nvidia.com",
Destination: &opts.vendor,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_VENDOR"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_VENDOR"),
},
&cli.StringFlag{
Name: "class",
@@ -176,20 +215,20 @@ func (m command) build() *cli.Command {
Usage: "the class string to use for the generated CDI specification.",
Value: "gpu",
Destination: &opts.class,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_CLASS"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_CLASS"),
},
&cli.StringSliceFlag{
Name: "csv.file",
Usage: "The path to the list of CSV files to use when generating the CDI specification in CSV mode.",
Value: cli.NewStringSlice(csv.DefaultFileList()...),
Value: csv.DefaultFileList(),
Destination: &opts.csv.files,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_CSV_FILES"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_CSV_FILES"),
},
&cli.StringSliceFlag{
Name: "csv.ignore-pattern",
Usage: "specify a pattern the CSV mount specifications.",
Destination: &opts.csv.ignorePatterns,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_CSV_IGNORE_PATTERNS"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_CSV_IGNORE_PATTERNS"),
},
&cli.StringSliceFlag{
Name: "disable-hook",
@@ -199,14 +238,38 @@ func (m command) build() *cli.Command {
"special hook name 'all' can be used ensure that the generated " +
"CDI specification does not include any hooks.",
Destination: &opts.disabledHooks,
EnvVars: []string{"NVIDIA_CTK_CDI_GENERATE_DISABLED_HOOKS"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_GENERATE_DISABLED_HOOKS"),
},
}
c.Flags = flags
return &c
}
func (m command) validateFlags(c *cli.Context, opts *options) error {
// applyConfigDefaults applies default values from the config file to flags that haven't been set
// via environment variables or CLI flags
func (m command) applyConfigDefaults(cmd *cli.Command, opts *options, cfg *config.Config) {
// Apply mode from config if not set via env var or CLI
if !cmd.IsSet("mode") && cfg.NVIDIAContainerRuntimeConfig.Mode != "" {
opts.mode = cfg.NVIDIAContainerRuntimeConfig.Mode
m.logger.Debugf("Setting mode from config: %s", opts.mode)
}
// Apply ldconfig path from config if not set via env var or CLI
if !cmd.IsSet("ldconfig-path") && cfg.NVIDIAContainerCLIConfig.Ldconfig != "" {
opts.ldconfigPath = string(cfg.NVIDIAContainerCLIConfig.Ldconfig)
m.logger.Debugf("Setting ldconfig-path from config: %s", opts.ldconfigPath)
}
// Apply driver-root
if !cmd.IsSet("driver-root") && cfg.NVIDIAContainerCLIConfig.Root != "" {
opts.driverRoot = cfg.NVIDIAContainerCLIConfig.Root
m.logger.Debugf("Setting driver-root from config: %s", opts.driverRoot)
}
}
func (m command) validateFlags(c *cli.Command, opts *options) error {
opts.format = strings.ToLower(opts.format)
switch opts.format {
case spec.FormatJSON:
@@ -220,7 +283,7 @@ func (m command) validateFlags(c *cli.Context, opts *options) error {
return fmt.Errorf("invalid discovery mode: %v", opts.mode)
}
for _, strategy := range opts.deviceNameStrategies.Value() {
for _, strategy := range opts.deviceNameStrategies {
_, err := nvcdi.NewDeviceNamer(strategy)
if err != nil {
return err
@@ -247,7 +310,7 @@ func (m command) validateFlags(c *cli.Context, opts *options) error {
return nil
}
func (m command) run(c *cli.Context, opts *options) error {
func (m command) run(c *cli.Command, opts *options) error {
spec, err := m.generateSpec(opts)
if err != nil {
return fmt.Errorf("failed to generate CDI spec: %v", err)
@@ -279,7 +342,7 @@ func formatFromFilename(filename string) string {
func (m command) generateSpec(opts *options) (spec.Interface, error) {
var deviceNamers []nvcdi.DeviceNamer
for _, strategy := range opts.deviceNameStrategies.Value() {
for _, strategy := range opts.deviceNameStrategies {
deviceNamer, err := nvcdi.NewDeviceNamer(strategy)
if err != nil {
return nil, fmt.Errorf("failed to create device namer: %v", err)
@@ -295,15 +358,15 @@ func (m command) generateSpec(opts *options) (spec.Interface, error) {
nvcdi.WithLdconfigPath(opts.ldconfigPath),
nvcdi.WithDeviceNamers(deviceNamers...),
nvcdi.WithMode(opts.mode),
nvcdi.WithConfigSearchPaths(opts.configSearchPaths.Value()),
nvcdi.WithLibrarySearchPaths(opts.librarySearchPaths.Value()),
nvcdi.WithCSVFiles(opts.csv.files.Value()),
nvcdi.WithCSVIgnorePatterns(opts.csv.ignorePatterns.Value()),
nvcdi.WithConfigSearchPaths(opts.configSearchPaths),
nvcdi.WithLibrarySearchPaths(opts.librarySearchPaths),
nvcdi.WithCSVFiles(opts.csv.files),
nvcdi.WithCSVIgnorePatterns(opts.csv.ignorePatterns),
// We set the following to allow for dependency injection:
nvcdi.WithNvmlLib(opts.nvmllib),
}
for _, hook := range opts.disabledHooks.Value() {
for _, hook := range opts.disabledHooks {
cdiOptions = append(cdiOptions, nvcdi.WithDisabledHook(hook))
}

View File

@@ -26,7 +26,6 @@ import (
"github.com/NVIDIA/go-nvml/pkg/nvml/mock/dgxa100"
testlog "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/require"
"github.com/urfave/cli/v2"
"github.com/NVIDIA/nvidia-container-toolkit/internal/test"
)
@@ -80,6 +79,7 @@ devices:
hostPath: {{ .driverRoot }}/dev/nvidia0
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
deviceNodes:
- path: /dev/nvidiactl
@@ -102,6 +102,15 @@ containerEdits:
- --host-driver-version=999.88.77
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-soname-symlinks
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
@@ -137,7 +146,7 @@ containerEdits:
vendor: "example.com",
class: "device",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("enable-cuda-compat")),
disabledHooks: []string{"enable-cuda-compat"},
},
expectedOptions: options{
format: "yaml",
@@ -146,7 +155,7 @@ containerEdits:
class: "device",
nvidiaCDIHookPath: "/usr/bin/nvidia-cdi-hook",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("enable-cuda-compat")),
disabledHooks: []string{"enable-cuda-compat"},
},
expectedSpec: `---
cdiVersion: 0.5.0
@@ -164,6 +173,7 @@ devices:
hostPath: {{ .driverRoot }}/dev/nvidia0
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
deviceNodes:
- path: /dev/nvidiactl
@@ -178,6 +188,15 @@ containerEdits:
- libcuda.so.1::/lib/x86_64-linux-gnu/libcuda.so
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-soname-symlinks
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
@@ -213,7 +232,7 @@ containerEdits:
vendor: "example.com",
class: "device",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("enable-cuda-compat", "update-ldcache")),
disabledHooks: []string{"enable-cuda-compat", "update-ldcache"},
},
expectedOptions: options{
format: "yaml",
@@ -222,7 +241,7 @@ containerEdits:
class: "device",
nvidiaCDIHookPath: "/usr/bin/nvidia-cdi-hook",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("enable-cuda-compat", "update-ldcache")),
disabledHooks: []string{"enable-cuda-compat", "update-ldcache"},
},
expectedSpec: `---
cdiVersion: 0.5.0
@@ -240,6 +259,7 @@ devices:
hostPath: {{ .driverRoot }}/dev/nvidia0
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
deviceNodes:
- path: /dev/nvidiactl
@@ -254,6 +274,15 @@ containerEdits:
- libcuda.so.1::/lib/x86_64-linux-gnu/libcuda.so
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
- nvidia-cdi-hook
- create-soname-symlinks
- --folder
- /lib/x86_64-linux-gnu
env:
- NVIDIA_CTK_DEBUG=false
- hookName: createContainer
path: /usr/bin/nvidia-cdi-hook
args:
@@ -280,7 +309,7 @@ containerEdits:
vendor: "example.com",
class: "device",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("all")),
disabledHooks: []string{"all"},
},
expectedOptions: options{
format: "yaml",
@@ -289,7 +318,7 @@ containerEdits:
class: "device",
nvidiaCDIHookPath: "/usr/bin/nvidia-cdi-hook",
driverRoot: driverRoot,
disabledHooks: valueOf(cli.NewStringSlice("all")),
disabledHooks: []string{"all"},
},
expectedSpec: `---
cdiVersion: 0.5.0
@@ -307,6 +336,7 @@ devices:
hostPath: {{ .driverRoot }}/dev/nvidia0
containerEdits:
env:
- NVIDIA_CTK_LIBCUDA_DIR=/lib/x86_64-linux-gnu
- NVIDIA_VISIBLE_DEVICES=void
deviceNodes:
- path: /dev/nvidiactl
@@ -363,9 +393,3 @@ containerEdits:
})
}
}
// valueOf returns the value of a pointer.
// Note that this does not check for a nil pointer and is only used for testing.
func valueOf[T any](v *T) T {
return *v
}

View File

@@ -17,10 +17,11 @@
package list
import (
"context"
"errors"
"fmt"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"tags.cncf.io/container-device-interface/pkg/cdi"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
@@ -31,7 +32,7 @@ type command struct {
}
type config struct {
cdiSpecDirs cli.StringSlice
cdiSpecDirs []string
}
// NewCommand constructs a cdi list command with the specified logger
@@ -50,11 +51,11 @@ func (m command) build() *cli.Command {
c := cli.Command{
Name: "list",
Usage: "List the available CDI devices",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &cfg)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, m.validateFlags(cmd, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(cmd, &cfg)
},
}
@@ -62,26 +63,26 @@ func (m command) build() *cli.Command {
&cli.StringSliceFlag{
Name: "spec-dir",
Usage: "specify the directories to scan for CDI specifications",
Value: cli.NewStringSlice(cdi.DefaultSpecDirs...),
Value: cdi.DefaultSpecDirs,
Destination: &cfg.cdiSpecDirs,
EnvVars: []string{"NVIDIA_CTK_CDI_SPEC_DIRS"},
Sources: cli.EnvVars("NVIDIA_CTK_CDI_SPEC_DIRS"),
},
}
return &c
}
func (m command) validateFlags(c *cli.Context, cfg *config) error {
if len(cfg.cdiSpecDirs.Value()) == 0 {
func (m command) validateFlags(c *cli.Command, cfg *config) error {
if len(cfg.cdiSpecDirs) == 0 {
return errors.New("at least one CDI specification directory must be specified")
}
return nil
}
func (m command) run(c *cli.Context, cfg *config) error {
func (m command) run(c *cli.Command, cfg *config) error {
registry, err := cdi.NewCache(
cdi.WithAutoRefresh(false),
cdi.WithSpecDirs(cfg.cdiSpecDirs.Value()...),
cdi.WithSpecDirs(cfg.cdiSpecDirs...),
)
if err != nil {
return fmt.Errorf("failed to create CDI cache: %v", err)

View File

@@ -17,11 +17,12 @@
package root
import (
"context"
"fmt"
"io"
"os"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"tags.cncf.io/container-device-interface/pkg/cdi"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
@@ -60,11 +61,11 @@ func (m command) build() *cli.Command {
c := cli.Command{
Name: "root",
Usage: "Apply a root transform to a CDI specification",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &opts)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, m.validateFlags(cmd, &opts)
},
Action: func(c *cli.Context) error {
return m.run(c, &opts)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(cmd, &opts)
},
}
@@ -102,7 +103,7 @@ func (m command) build() *cli.Command {
return &c
}
func (m command) validateFlags(c *cli.Context, opts *options) error {
func (m command) validateFlags(c *cli.Command, opts *options) error {
switch opts.relativeTo {
case "host":
case "container":
@@ -112,7 +113,7 @@ func (m command) validateFlags(c *cli.Context, opts *options) error {
return nil
}
func (m command) run(c *cli.Context, opts *options) error {
func (m command) run(c *cli.Command, opts *options) error {
spec, err := opts.Load()
if err != nil {
return fmt.Errorf("failed to load CDI specification: %w", err)

View File

@@ -17,7 +17,7 @@
package transform
import (
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/transform/root"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
@@ -44,7 +44,7 @@ func (m command) build() *cli.Command {
c.Flags = []cli.Flag{}
c.Subcommands = []*cli.Command{
c.Commands = []*cli.Command{
root.NewCommand(m.logger),
}

View File

@@ -17,13 +17,14 @@
package config
import (
"context"
"errors"
"fmt"
"reflect"
"strconv"
"strings"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
createdefault "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/config/create-default"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/config/flags"
@@ -39,7 +40,7 @@ type command struct {
type options struct {
flags.Options
setListSeparator string
sets cli.StringSlice
sets []string
}
// NewCommand constructs an config command with the specified logger
@@ -58,11 +59,11 @@ func (m command) build() *cli.Command {
c := cli.Command{
Name: "config",
Usage: "Interact with the NVIDIA Container Toolkit configuration",
Before: func(ctx *cli.Context) error {
return validateFlags(ctx, &opts)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, validateFlags(cmd, &opts)
},
Action: func(ctx *cli.Context) error {
return run(ctx, &opts)
Action: func(ctx context.Context, cmd *cli.Command) error {
return run(cmd, &opts)
},
}
@@ -104,21 +105,21 @@ func (m command) build() *cli.Command {
},
}
c.Subcommands = []*cli.Command{
c.Commands = []*cli.Command{
createdefault.NewCommand(m.logger),
}
return &c
}
func validateFlags(c *cli.Context, opts *options) error {
func validateFlags(c *cli.Command, opts *options) error {
if opts.setListSeparator == "" {
return fmt.Errorf("set-list-separator must be set")
}
return nil
}
func run(c *cli.Context, opts *options) error {
func run(c *cli.Command, opts *options) error {
cfgToml, err := config.New(
config.WithConfigFile(opts.Config),
)
@@ -126,7 +127,7 @@ func run(c *cli.Context, opts *options) error {
return fmt.Errorf("unable to create config: %v", err)
}
for _, set := range opts.sets.Value() {
for _, set := range opts.sets {
key, value, err := setFlagToKeyValue(set, opts.setListSeparator)
if err != nil {
return fmt.Errorf("invalid --set option %v: %w", set, err)

View File

@@ -17,9 +17,10 @@
package defaultsubcommand
import (
"context"
"fmt"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/config/flags"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
@@ -47,11 +48,11 @@ func (m command) build() *cli.Command {
Name: "default",
Aliases: []string{"create-default", "generate-default"},
Usage: "Generate the default NVIDIA Container Toolkit configuration file",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &opts)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, m.validateFlags(cmd, &opts)
},
Action: func(c *cli.Context) error {
return m.run(c, &opts)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(cmd, &opts)
},
}
@@ -67,11 +68,11 @@ func (m command) build() *cli.Command {
return &c
}
func (m command) validateFlags(c *cli.Context, opts *flags.Options) error {
func (m command) validateFlags(c *cli.Command, opts *flags.Options) error {
return opts.Validate()
}
func (m command) run(c *cli.Context, opts *flags.Options) error {
func (m command) run(c *cli.Command, opts *flags.Options) error {
cfgToml, err := config.New()
if err != nil {
return fmt.Errorf("unable to load or create config: %v", err)

View File

@@ -17,10 +17,12 @@
package hook
import (
"context"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/commands"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
)
type hookCommand struct {
@@ -48,13 +50,13 @@ func (m hookCommand) build() *cli.Command {
// referring to a new hook that is not yet supported by an older NVIDIA
// Container Toolkit version or a hook that has been removed in newer
// version.
Action: func(ctx *cli.Context) error {
commands.IssueUnsupportedHookWarning(m.logger, ctx)
Action: func(ctx context.Context, cmd *cli.Command) error {
commands.IssueUnsupportedHookWarning(m.logger, cmd)
return nil
},
}
hook.Subcommands = commands.New(m.logger)
hook.Commands = commands.New(m.logger)
return &hook
}

View File

@@ -17,7 +17,7 @@
package info
import (
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)
@@ -42,7 +42,7 @@ func (m command) build() *cli.Command {
Usage: "Provide information about the system",
}
info.Subcommands = []*cli.Command{}
info.Commands = []*cli.Command{}
return &info
}

View File

@@ -17,6 +17,7 @@
package main
import (
"context"
"os"
"github.com/sirupsen/logrus"
@@ -29,7 +30,7 @@ import (
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/system"
"github.com/NVIDIA/nvidia-container-toolkit/internal/info"
cli "github.com/urfave/cli/v2"
cli "github.com/urfave/cli/v3"
)
// options defines the options that can be set for the CLI through config files,
@@ -48,11 +49,11 @@ func main() {
opts := options{}
// Create the top-level CLI
c := cli.NewApp()
c := cli.Command{}
c.DisableSliceFlagSeparator = true
c.Name = "NVIDIA Container Toolkit CLI"
c.UseShortOptionHandling = true
c.EnableBashCompletion = true
c.EnableShellCompletion = true
c.Usage = "Tools to configure the NVIDIA Container Toolkit"
c.Version = info.GetVersionString()
@@ -63,18 +64,18 @@ func main() {
Aliases: []string{"d"},
Usage: "Enable debug-level logging",
Destination: &opts.Debug,
EnvVars: []string{"NVIDIA_CTK_DEBUG"},
Sources: cli.EnvVars("NVIDIA_CTK_DEBUG"),
},
&cli.BoolFlag{
Name: "quiet",
Usage: "Suppress all output except for errors; overrides --debug",
Destination: &opts.Quiet,
EnvVars: []string{"NVIDIA_CTK_QUIET"},
Sources: cli.EnvVars("NVIDIA_CTK_QUIET"),
},
}
// Set log-level for all subcommands
c.Before = func(c *cli.Context) error {
c.Before = func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
logLevel := logrus.InfoLevel
if opts.Debug {
logLevel = logrus.DebugLevel
@@ -83,7 +84,7 @@ func main() {
logLevel = logrus.ErrorLevel
}
logger.SetLevel(logLevel)
return nil
return ctx, nil
}
// Define the subcommands
@@ -97,7 +98,7 @@ func main() {
}
// Run the CLI
err := c.Run(os.Args)
err := c.Run(context.Background(), os.Args)
if err != nil {
logger.Errorf("%v", err)
os.Exit(1)

View File

@@ -17,10 +17,11 @@
package configure
import (
"context"
"fmt"
"path/filepath"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/config/engine"
@@ -94,11 +95,11 @@ func (m command) build() *cli.Command {
configure := cli.Command{
Name: "configure",
Usage: "Add a runtime to the specified container engine",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &config)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, m.validateFlags(cmd, &config)
},
Action: func(c *cli.Context) error {
return m.configureWrapper(c, &config)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.configureWrapper(cmd, &config)
},
}
@@ -176,7 +177,7 @@ func (m command) build() *cli.Command {
return &configure
}
func (m command) validateFlags(c *cli.Context, config *config) error {
func (m command) validateFlags(c *cli.Command, config *config) error {
if config.mode == "oci-hook" {
if !filepath.IsAbs(config.nvidiaRuntime.hookPath) {
return fmt.Errorf("the NVIDIA runtime hook path %q is not an absolute path", config.nvidiaRuntime.hookPath)
@@ -244,7 +245,7 @@ func (m command) validateFlags(c *cli.Context, config *config) error {
}
// configureWrapper updates the specified container engine config to enable the NVIDIA runtime
func (m command) configureWrapper(c *cli.Context, config *config) error {
func (m command) configureWrapper(c *cli.Command, config *config) error {
switch config.mode {
case "oci-hook":
return m.configureOCIHook(c, config)
@@ -255,7 +256,7 @@ func (m command) configureWrapper(c *cli.Context, config *config) error {
}
// configureConfigFile updates the specified container engine config file to enable the NVIDIA runtime.
func (m command) configureConfigFile(c *cli.Context, config *config) error {
func (m command) configureConfigFile(c *cli.Command, config *config) error {
configSource, err := config.resolveConfigSource()
if err != nil {
return err
@@ -350,7 +351,7 @@ func (c *config) getOutputConfigPath() string {
}
// configureOCIHook creates and configures the OCI hook for the NVIDIA runtime
func (m *command) configureOCIHook(c *cli.Context, config *config) error {
func (m *command) configureOCIHook(c *cli.Command, config *config) error {
err := ocihook.CreateHook(config.hookFilePath, config.nvidiaRuntime.hookPath)
if err != nil {
return fmt.Errorf("error creating OCI hook: %v", err)

View File

@@ -17,7 +17,7 @@
package runtime
import (
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/runtime/configure"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
@@ -42,7 +42,7 @@ func (m runtimeCommand) build() *cli.Command {
Usage: "A collection of runtime-related utilities for the NVIDIA Container Toolkit",
}
runtime.Subcommands = []*cli.Command{
runtime.Commands = []*cli.Command{
configure.NewCommand(m.logger),
}

View File

@@ -17,11 +17,12 @@
package devchar
import (
"context"
"fmt"
"os"
"path/filepath"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/system/nvdevices"
@@ -61,11 +62,11 @@ func (m command) build() *cli.Command {
c := cli.Command{
Name: "create-dev-char-symlinks",
Usage: "A utility to create symlinks to possible /dev/nv* devices in /dev/char",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &cfg)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, m.validateFlags(cmd, &cfg)
},
Action: func(c *cli.Context) error {
return m.run(c, &cfg)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(cmd, &cfg)
},
}
@@ -75,46 +76,46 @@ func (m command) build() *cli.Command {
Usage: "The path at which the symlinks will be created. Symlinks will be created as `DEV_CHAR`/MAJOR:MINOR where MAJOR and MINOR are the major and minor numbers of a corresponding device node.",
Value: defaultDevCharPath,
Destination: &cfg.devCharPath,
EnvVars: []string{"DEV_CHAR_PATH"},
Sources: cli.EnvVars("DEV_CHAR_PATH"),
},
&cli.StringFlag{
Name: "driver-root",
Usage: "The path to the driver root. `DRIVER_ROOT`/dev is searched for NVIDIA device nodes.",
Value: "/",
Destination: &cfg.driverRoot,
EnvVars: []string{"NVIDIA_DRIVER_ROOT", "DRIVER_ROOT"},
Sources: cli.EnvVars("NVIDIA_DRIVER_ROOT", "DRIVER_ROOT"),
},
&cli.BoolFlag{
Name: "create-all",
Usage: "Create all possible /dev/char symlinks instead of limiting these to existing device nodes.",
Destination: &cfg.createAll,
EnvVars: []string{"CREATE_ALL"},
Sources: cli.EnvVars("CREATE_ALL"),
},
&cli.BoolFlag{
Name: "load-kernel-modules",
Usage: "Load the NVIDIA kernel modules before creating symlinks. This is only applicable when --create-all is set.",
Destination: &cfg.loadKernelModules,
EnvVars: []string{"LOAD_KERNEL_MODULES"},
Sources: cli.EnvVars("LOAD_KERNEL_MODULES"),
},
&cli.BoolFlag{
Name: "create-device-nodes",
Usage: "Create the NVIDIA control device nodes in the driver root if they do not exist. This is only applicable when --create-all is set",
Destination: &cfg.createDeviceNodes,
EnvVars: []string{"CREATE_DEVICE_NODES"},
Sources: cli.EnvVars("CREATE_DEVICE_NODES"),
},
&cli.BoolFlag{
Name: "dry-run",
Usage: "If set, the command will not create any symlinks.",
Value: false,
Destination: &cfg.dryRun,
EnvVars: []string{"DRY_RUN"},
Sources: cli.EnvVars("DRY_RUN"),
},
}
return &c
}
func (m command) validateFlags(r *cli.Context, cfg *config) error {
func (m command) validateFlags(r *cli.Command, cfg *config) error {
if cfg.createAll {
return fmt.Errorf("create-all and watch are mutually exclusive")
}
@@ -132,7 +133,7 @@ func (m command) validateFlags(r *cli.Context, cfg *config) error {
return nil
}
func (m command) run(c *cli.Context, cfg *config) error {
func (m command) run(c *cli.Command, cfg *config) error {
l, err := NewSymlinkCreator(
WithLogger(m.logger),
WithDevCharPath(cfg.devCharPath),

View File

@@ -17,9 +17,10 @@
package createdevicenodes
import (
"context"
"fmt"
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
"github.com/NVIDIA/nvidia-container-toolkit/internal/system/nvdevices"
@@ -56,11 +57,11 @@ func (m command) build() *cli.Command {
c := cli.Command{
Name: "create-device-nodes",
Usage: "A utility to create NVIDIA device nodes",
Before: func(c *cli.Context) error {
return m.validateFlags(c, &opts)
Before: func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
return ctx, m.validateFlags(cmd, &opts)
},
Action: func(c *cli.Context) error {
return m.run(c, &opts)
Action: func(ctx context.Context, cmd *cli.Command) error {
return m.run(cmd, &opts)
},
}
@@ -74,13 +75,13 @@ func (m command) build() *cli.Command {
Value: "/",
Destination: &opts.root,
// TODO: Remove the NVIDIA_DRIVER_ROOT and DRIVER_ROOT envvars.
EnvVars: []string{"ROOT", "NVIDIA_DRIVER_ROOT", "DRIVER_ROOT"},
Sources: cli.EnvVars("ROOT", "NVIDIA_DRIVER_ROOT", "DRIVER_ROOT"),
},
&cli.StringFlag{
Name: "dev-root",
Usage: "specify the root where `/dev` is located. If this is not specified, the root is assumed.",
Destination: &opts.devRoot,
EnvVars: []string{"NVIDIA_DEV_ROOT", "DEV_ROOT"},
Sources: cli.EnvVars("NVIDIA_DEV_ROOT", "DEV_ROOT"),
},
&cli.BoolFlag{
Name: "control-devices",
@@ -97,14 +98,14 @@ func (m command) build() *cli.Command {
Usage: "if set, the command will not perform any operations",
Value: false,
Destination: &opts.dryRun,
EnvVars: []string{"DRY_RUN"},
Sources: cli.EnvVars("DRY_RUN"),
},
}
return &c
}
func (m command) validateFlags(r *cli.Context, opts *options) error {
func (m command) validateFlags(r *cli.Command, opts *options) error {
if opts.devRoot == "" && opts.root != "" {
m.logger.Infof("Using dev-root %q", opts.root)
opts.devRoot = opts.root
@@ -112,7 +113,7 @@ func (m command) validateFlags(r *cli.Context, opts *options) error {
return nil
}
func (m command) run(c *cli.Context, opts *options) error {
func (m command) run(c *cli.Command, opts *options) error {
if opts.loadKernelModules {
modules := nvmodules.New(
nvmodules.WithLogger(m.logger),

View File

@@ -17,7 +17,7 @@
package system
import (
"github.com/urfave/cli/v2"
"github.com/urfave/cli/v3"
devchar "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/system/create-dev-char-symlinks"
devicenodes "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/system/create-device-nodes"
@@ -43,7 +43,7 @@ func (m command) build() *cli.Command {
Usage: "A collection of system-related utilities for the NVIDIA Container Toolkit",
}
system.Subcommands = []*cli.Command{
system.Commands = []*cli.Command{
devchar.NewCommand(m.logger),
devicenodes.NewCommand(m.logger),
}

View File

@@ -48,14 +48,18 @@ ARG VERSION="N/A"
ARG GIT_COMMIT="unknown"
RUN make PREFIX=/artifacts/bin cmd-nvidia-ctk-installer
# The packaging stage collects the deb and rpm packages built for supported
# architectures.
FROM nvcr.io/nvidia/cuda:12.9.0-base-ubi9 AS packaging
# The packaging stage collects the deb and rpm packages built for
# supported architectures.
FROM nvcr.io/nvidia/distroless/go:v3.1.9-dev AS packaging
USER 0:0
SHELL ["/busybox/sh", "-c"]
RUN ln -s /busybox/sh /bin/sh
ARG ARTIFACTS_ROOT
COPY ${ARTIFACTS_ROOT} /artifacts/packages/
WORKDIR /artifacts/packages
WORKDIR /artifacts
# build-args are added to the manifest.txt file below.
ARG PACKAGE_VERSION
@@ -70,7 +74,14 @@ RUN echo "#IMAGE_EPOCH=$(date '+%s')" > /artifacts/manifest.txt && \
env | sed 's/^/#/g' >> /artifacts/manifest.txt && \
find /artifacts/packages -iname '*.deb' -o -iname '*.rpm' >> /artifacts/manifest.txt
RUN mkdir /licenses && mv /NGC-DL-CONTAINER-LICENSE /licenses/NGC-DL-CONTAINER-LICENSE
LABEL name="NVIDIA Container Toolkit Packages"
LABEL vendor="NVIDIA"
LABEL version="${VERSION}"
LABEL release="N/A"
LABEL summary="deb and rpm packages for the NVIDIA Container Toolkit"
LABEL description="See summary"
COPY LICENSE /licenses/
# The debpackages stage is used to extract the contents of deb packages.
FROM nvcr.io/nvidia/cuda:12.9.0-base-ubuntu20.04 AS debpackages
@@ -116,13 +127,19 @@ RUN set -eux; \
# - The extracted deb packages
# - The extracted rpm packages
# - The nvidia-ctk-installer binary
FROM nvcr.io/nvidia/cuda:12.9.0-base-ubi9 AS artifacts
FROM scratch AS artifacts
COPY --from=rpmpackages /artifacts/rpm /artifacts/rpm
COPY --from=debpackages /artifacts/deb /artifacts/deb
COPY --from=build /artifacts/bin /artifacts/build
FROM nvcr.io/nvidia/cuda:12.9.0-base-ubi9
# The application stage contains the application used as a GPU Operator
# operand.
FROM nvcr.io/nvidia/distroless/go:v3.1.9-dev AS application
USER 0:0
SHELL ["/busybox/sh", "-c"]
RUN ln -s /busybox/sh /bin/sh
ENV NVIDIA_DISABLE_REQUIRE="true"
ENV NVIDIA_VISIBLE_DEVICES=void
@@ -144,6 +161,11 @@ LABEL release="N/A"
LABEL summary="Automatically Configure your Container Runtime for GPU support."
LABEL description="See summary"
RUN mkdir /licenses && mv /NGC-DL-CONTAINER-LICENSE /licenses/NGC-DL-CONTAINER-LICENSE
COPY LICENSE /licenses/
ENTRYPOINT ["/work/nvidia-ctk-installer"]
# The GPU Operator exec's nvidia-toolkit in its entrypoint.
# We create a symlink here to ensure compatibility with older
# GPU Operator versions.
RUN ln -s /work/nvidia-ctk-installer /work/nvidia-toolkit

View File

@@ -38,7 +38,7 @@ OUT_IMAGE_TAG = $(OUT_IMAGE_VERSION)
OUT_IMAGE = $(OUT_IMAGE_NAME):$(OUT_IMAGE_TAG)
##### Public rules #####
DEFAULT_PUSH_TARGET := ubi9
DEFAULT_PUSH_TARGET := application
DISTRIBUTIONS := $(DEFAULT_PUSH_TARGET)
META_TARGETS := packaging
@@ -102,8 +102,6 @@ build: build-$(DEFAULT_PUSH_TARGET)
push: push-$(DEFAULT_PUSH_TARGET)
# Test targets
test-%: DIST = $(*)
TEST_CASES ?= docker crio containerd
$(TEST_TARGETS): test-%:
TEST_CASES="$(TEST_CASES)" bash -x $(CURDIR)/test/container/main.sh run \

View File

@@ -57,9 +57,6 @@ WORKDIR $DIST_DIR
COPY packaging/debian ./debian
COPY deployments/systemd/ .
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
RUN dch --create --package="${PKG_NAME}" \
--newversion "${REVISION}" \
"See https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/blob/${GIT_COMMIT}/CHANGELOG.md for the changelog" && \
@@ -68,6 +65,6 @@ RUN dch --create --package="${PKG_NAME}" \
if [ "$REVISION" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
CMD export DISTRIB="$(lsb_release -cs)" && \
debuild -eDISTRIB -eSECTION -eLIBNVIDIA_CONTAINER_TOOLS_VERSION -eVERSION="${REVISION}" \
debuild -eDISTRIB -eSECTION -eVERSION="${REVISION}" \
--dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
mv /tmp/*.deb /dist

View File

@@ -48,16 +48,12 @@ WORKDIR $DIST_DIR/..
COPY packaging/rpm .
COPY deployments/systemd/ .
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
-D "_topdir $PWD" \
-D "release_date $(date +'%a %b %d %Y')" \
-D "git_commit ${GIT_COMMIT}" \
-D "version ${PKG_VERS}" \
-D "libnvidia_container_tools_version ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}" \
-D "release ${PKG_REV}" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist

View File

@@ -73,16 +73,12 @@ WORKDIR $DIST_DIR/..
COPY packaging/rpm .
COPY deployments/systemd/ ${DIST_DIR}/
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
CMD arch=$(uname -m) && \
rpmbuild --clean --target=$arch -bb \
-D "_topdir $PWD" \
-D "release_date $(date +'%a %b %d %Y')" \
-D "git_commit ${GIT_COMMIT}" \
-D "version ${PKG_VERS}" \
-D "libnvidia_container_tools_version ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}" \
-D "release ${PKG_REV}" \
SPECS/nvidia-container-toolkit.spec && \
mv RPMS/$arch/*.rpm /dist

View File

@@ -55,17 +55,14 @@ WORKDIR $DIST_DIR
COPY packaging/debian ./debian
COPY deployments/systemd/ .
ARG LIBNVIDIA_CONTAINER_TOOLS_VERSION
ENV LIBNVIDIA_CONTAINER_TOOLS_VERSION ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}
RUN dch --create --package="${PKG_NAME}" \
--newversion "${REVISION}" \
"See https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/blob/${GIT_COMMIT}/CHANGELOG.md for the changelog" && \
dch --append "Bump libnvidia-container dependency to ${LIBNVIDIA_CONTAINER_TOOLS_VERSION}" && \
dch --append "Bump libnvidia-container dependency to ${REVISION}" && \
dch -r "" && \
if [ "$REVISION" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
CMD export DISTRIB="$(lsb_release -cs)" && \
debuild -eDISTRIB -eSECTION -eLIBNVIDIA_CONTAINER_TOOLS_VERSION -eVERSION="${REVISION}" \
debuild -eDISTRIB -eSECTION -eVERSION="${REVISION}" \
--dpkg-buildpackage-hook='sh debian/prepare' -i -us -uc -b && \
mv /tmp/*.deb /dist

View File

@@ -85,11 +85,6 @@ docker-all: $(AMD64_TARGETS) $(X86_64_TARGETS) \
--%: docker-build-%
@
LIBNVIDIA_CONTAINER_VERSION ?= $(LIB_VERSION)
LIBNVIDIA_CONTAINER_TAG ?= $(LIB_TAG)
LIBNVIDIA_CONTAINER_TOOLS_VERSION := $(LIBNVIDIA_CONTAINER_VERSION)$(if $(LIBNVIDIA_CONTAINER_TAG),~$(LIBNVIDIA_CONTAINER_TAG))-1
# private ubuntu target
--ubuntu%: OS := ubuntu
@@ -129,7 +124,6 @@ docker-build-%:
--build-arg PKG_NAME="$(LIB_NAME)" \
--build-arg PKG_VERS="$(PACKAGE_VERSION)" \
--build-arg PKG_REV="$(PACKAGE_REVISION)" \
--build-arg LIBNVIDIA_CONTAINER_TOOLS_VERSION="$(LIBNVIDIA_CONTAINER_TOOLS_VERSION)" \
--build-arg GIT_COMMIT="$(GIT_COMMIT)" \
--tag $(BUILDIMAGE) \
--file $(DOCKERFILE) .

10
go.mod
View File

@@ -1,6 +1,8 @@
module github.com/NVIDIA/nvidia-container-toolkit
go 1.23.0
go 1.23.2
toolchain go1.24.4
require (
github.com/NVIDIA/go-nvlib v0.7.3
@@ -13,7 +15,7 @@ require (
github.com/pelletier/go-toml v1.9.5
github.com/sirupsen/logrus v1.9.3
github.com/stretchr/testify v1.10.0
github.com/urfave/cli/v2 v2.27.7
github.com/urfave/cli/v3 v3.3.8
golang.org/x/mod v0.25.0
golang.org/x/sys v0.33.0
tags.cncf.io/container-device-interface v1.0.1
@@ -21,20 +23,16 @@ require (
)
require (
github.com/cpuguy83/go-md2man/v2 v2.0.7 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/fsnotify/fsnotify v1.7.0 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/hashicorp/errwrap v1.1.0 // indirect
github.com/kr/pretty v0.3.1 // indirect
github.com/opencontainers/cgroups v0.0.1 // indirect
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/rogpeppe/go-internal v1.11.0 // indirect
github.com/russross/blackfriday/v2 v2.1.0 // indirect
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635 // indirect
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb // indirect
github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 // indirect
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
sigs.k8s.io/yaml v1.4.0 // indirect

12
go.sum
View File

@@ -4,8 +4,6 @@ github.com/NVIDIA/go-nvml v0.12.9-0 h1:e344UK8ZkeMeeLkdQtRhmXRxNf+u532LDZPGMtkdu
github.com/NVIDIA/go-nvml v0.12.9-0/go.mod h1:+KNA7c7gIBH7SKSJ1ntlwkfN80zdx8ovl4hrK3LmPt4=
github.com/blang/semver/v4 v4.0.0 h1:1PFHFE6yCCTv8C1TeyNNarDzntLi7wMI5i/pzqYIsAM=
github.com/blang/semver/v4 v4.0.0/go.mod h1:IbckMUScFkM3pff0VJDNKRiT6TG/YpiHIM2yvyW5YoQ=
github.com/cpuguy83/go-md2man/v2 v2.0.7 h1:zbFlGlXEAKlwXpmvle3d8Oe3YnkKIK4xSRTd3sHPnBo=
github.com/cpuguy83/go-md2man/v2 v2.0.7/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
github.com/cyphar/filepath-securejoin v0.4.1 h1:JyxxyPEaktOD+GAnqIqTf9A8tHyAG22rowi7HkoSU1s=
github.com/cyphar/filepath-securejoin v0.4.1/go.mod h1:Sdj7gXlvMcPZsbhwhQ33GguGLDGQL7h7bg04C/+u9jI=
@@ -37,8 +35,6 @@ github.com/moby/sys/reexec v0.1.0/go.mod h1:EqjBg8F3X7iZe5pU6nRZnYCMUTXoxsjiIfHu
github.com/moby/sys/symlink v0.3.0 h1:GZX89mEZ9u53f97npBy4Rc3vJKj7JBDj/PN2I22GrNU=
github.com/moby/sys/symlink v0.3.0/go.mod h1:3eNdhduHmYPcgsJtZXW1W4XUJdZGBIkttZ8xKqPUJq0=
github.com/mrunalp/fileutils v0.5.0/go.mod h1:M1WthSahJixYnrXQl/DFQuteStB1weuxD2QJNHXfbSQ=
github.com/opencontainers/cgroups v0.0.1 h1:MXjMkkFpKv6kpuirUa4USFBas573sSAY082B4CiHEVA=
github.com/opencontainers/cgroups v0.0.1/go.mod h1:s8lktyhlGUqM7OSRL5P7eAW6Wb+kWPNvt4qvVfzA5vs=
github.com/opencontainers/runc v1.3.0 h1:cvP7xbEvD0QQAs0nZKLzkVog2OPZhI/V2w3WmTmUSXI=
github.com/opencontainers/runc v1.3.0/go.mod h1:9wbWt42gV+KRxKRVVugNP6D5+PQciRbenB4fLVsqGPs=
github.com/opencontainers/runtime-spec v1.0.3-0.20220825212826-86290f6a00fb/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
@@ -57,8 +53,6 @@ github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZN
github.com/rogpeppe/go-internal v1.9.0/go.mod h1:WtVeX8xhTBvf0smdhujwtBcq4Qrzq/fJaraNFVN+nFs=
github.com/rogpeppe/go-internal v1.11.0 h1:cWPaGQEPrBb5/AsnsZesgZZ9yb1OQ+GOISoDNXVBh4M=
github.com/rogpeppe/go-internal v1.11.0/go.mod h1:ddIwULY96R17DhadqLgMfk9H9tvdUzkipdSkR5nkCZA=
github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/sirupsen/logrus v1.8.1/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0=
github.com/sirupsen/logrus v1.9.3 h1:dueUQJ1C2q9oE3F7wvmSGAaVtTmUizReu6fjN8uqzbQ=
github.com/sirupsen/logrus v1.9.3/go.mod h1:naHLuLoDiP4jHNo9R0sCBMtWGeIprob74mVsIT4qYEQ=
@@ -71,8 +65,8 @@ github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635 h1:kdXcSzyDtseVEc4yCz2qF8ZrQvIDBJLl4S1c3GCXmoI=
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635/go.mod h1:hkRG7XYTFWNJGYcbNJQlaLq0fg1yr4J4t/NcTQtrfww=
github.com/urfave/cli v1.19.1/go.mod h1:70zkFmudgCuE/ngEzBv17Jvp/497gISqfk5gWijbERA=
github.com/urfave/cli/v2 v2.27.7 h1:bH59vdhbjLv3LAvIu6gd0usJHgoTTPhCFib8qqOwXYU=
github.com/urfave/cli/v2 v2.27.7/go.mod h1:CyNAG/xg+iAOg0N4MPGZqVmv2rCoP267496AOXUZjA4=
github.com/urfave/cli/v3 v3.3.8 h1:BzolUExliMdet9NlJ/u4m5vHSotJ3PzEqSAZ1oPMa/E=
github.com/urfave/cli/v3 v3.3.8/go.mod h1:FJSKtM/9AiiTOJL4fJ6TbMUkxBXn7GO9guZqoZtpYpo=
github.com/xeipuuv/gojsonpointer v0.0.0-20180127040702-4e3ac2762d5f/go.mod h1:N2zxlSyiKSe5eX1tZViRH5QA0qijqEDrYZiPEAiq3wU=
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb h1:zGWFAtiMcyryUHoUjUJX0/lt1H2+i2Ka2n+D3DImSNo=
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb/go.mod h1:N2zxlSyiKSe5eX1tZViRH5QA0qijqEDrYZiPEAiq3wU=
@@ -80,8 +74,6 @@ github.com/xeipuuv/gojsonreference v0.0.0-20180127040603-bd5ef7bd5415 h1:EzJWgHo
github.com/xeipuuv/gojsonreference v0.0.0-20180127040603-bd5ef7bd5415/go.mod h1:GwrjFmJcFw6At/Gs6z4yjiIwzuJ1/+UwLxMQDVQXShQ=
github.com/xeipuuv/gojsonschema v1.2.0 h1:LhYJRs+L4fBtjZUfuSZIKGeVu0QRy8e5Xi7D17UxZ74=
github.com/xeipuuv/gojsonschema v1.2.0/go.mod h1:anYRn/JVcOK2ZgGU+IjEV4nwlhoK5sQluxsYJ78Id3Y=
github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 h1:gEOO8jv9F4OT7lGCjxCBTO/36wtF6j2nSip77qHd4x4=
github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1/go.mod h1:Ohn+xnUBiLI6FVj/9LpzZWtj1/D6lUovWYBkxHVV3aM=
golang.org/x/mod v0.25.0 h1:n7a+ZbQKQA/Ysbyb0/6IbB1H/X41mKgbhfv7AfG/44w=
golang.org/x/mod v0.25.0/go.mod h1:IXM97Txy2VM4PJ3gI61r1YEk/gAj6zAHN3AdZt6S9Ww=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=

View File

@@ -53,6 +53,6 @@ docker run --rm \
-v $(pwd):$(pwd) \
-w $(pwd) \
-u $(id -u):$(id -g) \
--entrypoint="bash" \
--entrypoint="sh" \
${IMAGE} \
-c "cp --preserve=timestamps -R /artifacts/* ${DIST_DIR}"
-c "cp -p -R /artifacts/* ${DIST_DIR}"

View File

@@ -31,8 +31,10 @@ import (
)
const (
configOverride = "XDG_CONFIG_HOME"
configFilePath = "nvidia-container-runtime/config.toml"
FilePathOverrideEnvVar = "NVIDIA_CTK_CONFIG_FILE_PATH"
RelativeFilePath = "nvidia-container-runtime/config.toml"
configRootOverride = "XDG_CONFIG_HOME"
nvidiaCTKExecutable = "nvidia-ctk"
nvidiaCTKDefaultFilePath = "/usr/bin/nvidia-ctk"
@@ -74,11 +76,15 @@ type Config struct {
// GetConfigFilePath returns the path to the config file for the configured system
func GetConfigFilePath() string {
if XDGConfigDir := os.Getenv(configOverride); len(XDGConfigDir) != 0 {
return filepath.Join(XDGConfigDir, configFilePath)
if configFilePathOverride := os.Getenv(FilePathOverrideEnvVar); configFilePathOverride != "" {
return configFilePathOverride
}
configRoot := "/etc"
if XDGConfigDir := os.Getenv(configRootOverride); len(XDGConfigDir) != 0 {
configRoot = XDGConfigDir
}
return filepath.Join("/etc", configFilePath)
return filepath.Join(configRoot, RelativeFilePath)
}
// GetConfig sets up the config struct. Values are read from a toml file

View File

@@ -27,9 +27,26 @@ import (
func TestGetConfigWithCustomConfig(t *testing.T) {
testDir := t.TempDir()
t.Setenv(configOverride, testDir)
t.Setenv(configRootOverride, testDir)
filename := filepath.Join(testDir, configFilePath)
filename := filepath.Join(testDir, RelativeFilePath)
// By default debug is disabled
contents := []byte("[nvidia-container-runtime]\ndebug = \"/nvidia-container-toolkit.log\"")
require.NoError(t, os.MkdirAll(filepath.Dir(filename), 0766))
require.NoError(t, os.WriteFile(filename, contents, 0600))
cfg, err := GetConfig()
require.NoError(t, err)
require.Equal(t, "/nvidia-container-toolkit.log", cfg.NVIDIAContainerRuntimeConfig.DebugFilePath)
}
func TestGetConfigWithConfigFilePathOverride(t *testing.T) {
testDir := t.TempDir()
filename := filepath.Join(testDir, RelativeFilePath)
t.Setenv(FilePathOverrideEnvVar, filename)
// By default debug is disabled
contents := []byte("[nvidia-container-runtime]\ndebug = \"/nvidia-container-toolkit.log\"")

View File

@@ -23,6 +23,7 @@ type cache struct {
sync.Mutex
devices []Device
envVars []EnvVar
hooks []Hook
mounts []Mount
}
@@ -51,6 +52,20 @@ func (c *cache) Devices() ([]Device, error) {
return c.devices, nil
}
func (c *cache) EnvVars() ([]EnvVar, error) {
c.Lock()
defer c.Unlock()
if c.envVars == nil {
envVars, err := c.d.EnvVars()
if err != nil {
return nil, err
}
c.envVars = envVars
}
return c.envVars, nil
}
func (c *cache) Hooks() ([]Hook, error) {
c.Lock()
defer c.Unlock()

View File

@@ -22,6 +22,12 @@ type Device struct {
Path string
}
// EnvVar represents a discovered environment variable.
type EnvVar struct {
Name string
Value string
}
// Mount represents a discovered mount.
type Mount struct {
HostPath string
@@ -42,6 +48,7 @@ type Hook struct {
//go:generate moq -rm -fmt=goimports -stub -out discover_mock.go . Discover
type Discover interface {
Devices() ([]Device, error)
EnvVars() ([]EnvVar, error)
Mounts() ([]Mount, error)
Hooks() ([]Hook, error)
}

View File

@@ -20,6 +20,9 @@ var _ Discover = &DiscoverMock{}
// DevicesFunc: func() ([]Device, error) {
// panic("mock out the Devices method")
// },
// EnvVarsFunc: func() ([]EnvVar, error) {
// panic("mock out the EnvVars method")
// },
// HooksFunc: func() ([]Hook, error) {
// panic("mock out the Hooks method")
// },
@@ -36,6 +39,9 @@ type DiscoverMock struct {
// DevicesFunc mocks the Devices method.
DevicesFunc func() ([]Device, error)
// EnvVarsFunc mocks the EnvVars method.
EnvVarsFunc func() ([]EnvVar, error)
// HooksFunc mocks the Hooks method.
HooksFunc func() ([]Hook, error)
@@ -47,6 +53,9 @@ type DiscoverMock struct {
// Devices holds details about calls to the Devices method.
Devices []struct {
}
// EnvVars holds details about calls to the EnvVars method.
EnvVars []struct {
}
// Hooks holds details about calls to the Hooks method.
Hooks []struct {
}
@@ -55,6 +64,7 @@ type DiscoverMock struct {
}
}
lockDevices sync.RWMutex
lockEnvVars sync.RWMutex
lockHooks sync.RWMutex
lockMounts sync.RWMutex
}
@@ -90,6 +100,37 @@ func (mock *DiscoverMock) DevicesCalls() []struct {
return calls
}
// EnvVars calls EnvVarsFunc.
func (mock *DiscoverMock) EnvVars() ([]EnvVar, error) {
callInfo := struct {
}{}
mock.lockEnvVars.Lock()
mock.calls.EnvVars = append(mock.calls.EnvVars, callInfo)
mock.lockEnvVars.Unlock()
if mock.EnvVarsFunc == nil {
var (
envVarsOut []EnvVar
errOut error
)
return envVarsOut, errOut
}
return mock.EnvVarsFunc()
}
// EnvVarsCalls gets all the calls that were made to EnvVars.
// Check the length with:
//
// len(mockedDiscover.EnvVarsCalls())
func (mock *DiscoverMock) EnvVarsCalls() []struct {
} {
var calls []struct {
}
mock.lockEnvVars.RLock()
calls = mock.calls.EnvVars
mock.lockEnvVars.RUnlock()
return calls
}
// Hooks calls HooksFunc.
func (mock *DiscoverMock) Hooks() ([]Hook, error) {
callInfo := struct {

View File

@@ -0,0 +1,41 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package discover
var _ Discover = (*EnvVar)(nil)
// Devices returns an empty list of devices for a EnvVar discoverer.
func (e EnvVar) Devices() ([]Device, error) {
return nil, nil
}
// EnvVars returns an empty list of envs for a EnvVar discoverer.
func (e EnvVar) EnvVars() ([]EnvVar, error) {
return []EnvVar{e}, nil
}
// Mounts returns an empty list of mounts for a EnvVar discoverer.
func (e EnvVar) Mounts() ([]Mount, error) {
return nil, nil
}
// Hooks allows the Hook type to also implement the Discoverer interface.
// It returns a single hook
func (e EnvVar) Hooks() ([]Hook, error) {
return nil, nil
}

View File

@@ -45,6 +45,19 @@ func (f firstOf) Devices() ([]Device, error) {
return nil, errs
}
func (f firstOf) EnvVars() ([]EnvVar, error) {
var errs error
for _, d := range f {
envs, err := d.EnvVars()
if err != nil {
errs = errors.Join(errs, err)
continue
}
return envs, nil
}
return nil, errs
}
func (f firstOf) Hooks() ([]Hook, error) {
var errs error
for _, d := range f {

View File

@@ -46,6 +46,9 @@ const (
// An UpdateLDCacheHook is the hook used to update the ldcache in the
// container. This allows injected libraries to be discoverable.
UpdateLDCacheHook = HookName("update-ldcache")
// A CreateSonameSymlinksHook is the hook used to ensure that soname symlinks
// for injected libraries exist in the container.
CreateSonameSymlinksHook = HookName("create-soname-symlinks")
defaultNvidiaCDIHookPath = "/usr/bin/nvidia-cdi-hook"
)
@@ -57,6 +60,11 @@ func (h *Hook) Devices() ([]Device, error) {
return nil, nil
}
// EnvVars returns an empty list of envs for a Hook discoverer.
func (h *Hook) EnvVars() ([]EnvVar, error) {
return nil, nil
}
// Mounts returns an empty list of mounts for a Hook discoverer.
func (h *Hook) Mounts() ([]Mount, error) {
return nil, nil

View File

@@ -51,28 +51,22 @@ func (d ldconfig) Hooks() ([]Hook, error) {
return nil, fmt.Errorf("failed to discover mounts for ldcache update: %v", err)
}
h := createLDCacheUpdateHook(
d.hookCreator,
d.ldconfigPath,
getLibraryPaths(mounts),
)
return h.Hooks()
}
// createLDCacheUpdateHook locates the NVIDIA Container Toolkit CLI and creates a hook for updating the LD Cache
func createLDCacheUpdateHook(hookCreator HookCreator, ldconfig string, libraries []string) *Hook {
var args []string
if ldconfig != "" {
args = append(args, "--ldconfig-path", ldconfig)
if d.ldconfigPath != "" {
args = append(args, "--ldconfig-path", d.ldconfigPath)
}
for _, f := range uniqueFolders(libraries) {
for _, f := range uniqueFolders(getLibraryPaths(mounts)) {
args = append(args, "--folder", f)
}
return hookCreator.Create(UpdateLDCacheHook, args...)
h := Merge(
d.hookCreator.Create(CreateSonameSymlinksHook, args...),
d.hookCreator.Create(UpdateLDCacheHook, args...),
)
return h.Hooks()
}
// getLibraryPaths extracts the library dirs from the specified mounts

View File

@@ -39,11 +39,24 @@ func TestLDCacheUpdateHook(t *testing.T) {
mounts []Mount
mountError error
expectedError error
expectedArgs []string
expectedHooks []Hook
}{
{
description: "empty mounts",
expectedArgs: []string{"nvidia-cdi-hook", "update-ldcache"},
description: "empty mounts",
expectedHooks: []Hook{
{
Lifecycle: "createContainer",
Path: testNvidiaCDIHookPath,
Args: []string{"nvidia-cdi-hook", "create-soname-symlinks"},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
{
Lifecycle: "createContainer",
Path: testNvidiaCDIHookPath,
Args: []string{"nvidia-cdi-hook", "update-ldcache"},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
},
},
{
description: "mount error",
@@ -66,7 +79,20 @@ func TestLDCacheUpdateHook(t *testing.T) {
Path: "/usr/local/lib/libbar.so",
},
},
expectedArgs: []string{"nvidia-cdi-hook", "update-ldcache", "--folder", "/usr/local/lib", "--folder", "/usr/local/libother"},
expectedHooks: []Hook{
{
Lifecycle: "createContainer",
Path: testNvidiaCDIHookPath,
Args: []string{"nvidia-cdi-hook", "create-soname-symlinks", "--folder", "/usr/local/lib", "--folder", "/usr/local/libother"},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
{
Lifecycle: "createContainer",
Path: testNvidiaCDIHookPath,
Args: []string{"nvidia-cdi-hook", "update-ldcache", "--folder", "/usr/local/lib", "--folder", "/usr/local/libother"},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
},
},
{
description: "host paths are ignored",
@@ -76,12 +102,38 @@ func TestLDCacheUpdateHook(t *testing.T) {
Path: "/usr/local/lib/libfoo.so",
},
},
expectedArgs: []string{"nvidia-cdi-hook", "update-ldcache", "--folder", "/usr/local/lib"},
expectedHooks: []Hook{
{
Lifecycle: "createContainer",
Path: testNvidiaCDIHookPath,
Args: []string{"nvidia-cdi-hook", "create-soname-symlinks", "--folder", "/usr/local/lib"},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
{
Lifecycle: "createContainer",
Path: testNvidiaCDIHookPath,
Args: []string{"nvidia-cdi-hook", "update-ldcache", "--folder", "/usr/local/lib"},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
},
},
{
description: "explicit ldconfig path is passed",
ldconfigPath: testLdconfigPath,
expectedArgs: []string{"nvidia-cdi-hook", "update-ldcache", "--ldconfig-path", testLdconfigPath},
expectedHooks: []Hook{
{
Lifecycle: "createContainer",
Path: testNvidiaCDIHookPath,
Args: []string{"nvidia-cdi-hook", "create-soname-symlinks", "--ldconfig-path", testLdconfigPath},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
{
Lifecycle: "createContainer",
Path: testNvidiaCDIHookPath,
Args: []string{"nvidia-cdi-hook", "update-ldcache", "--ldconfig-path", testLdconfigPath},
Env: []string{"NVIDIA_CTK_DEBUG=false"},
},
},
},
}
@@ -92,13 +144,6 @@ func TestLDCacheUpdateHook(t *testing.T) {
return tc.mounts, tc.mountError
},
}
expectedHook := Hook{
Path: testNvidiaCDIHookPath,
Args: tc.expectedArgs,
Lifecycle: "createContainer",
Env: []string{"NVIDIA_CTK_DEBUG=false"},
}
d, err := NewLDCacheUpdateHook(logger, mountMock, hookCreator, tc.ldconfigPath)
require.NoError(t, err)
@@ -112,9 +157,7 @@ func TestLDCacheUpdateHook(t *testing.T) {
}
require.NoError(t, err)
require.Len(t, hooks, 1)
require.EqualValues(t, hooks[0], expectedHook)
require.EqualValues(t, tc.expectedHooks, hooks)
devices, err := d.Devices()
require.NoError(t, err)

View File

@@ -53,6 +53,21 @@ func (d list) Devices() ([]Device, error) {
return allDevices, nil
}
// EnvVars returns all environment variables from the included discoverers.
func (d list) EnvVars() ([]EnvVar, error) {
var allEnvs []EnvVar
for i, di := range d {
envs, err := di.EnvVars()
if err != nil {
return nil, fmt.Errorf("error discovering envs for discoverer %v: %w", i, err)
}
allEnvs = append(allEnvs, envs...)
}
return allEnvs, nil
}
// Mounts returns all mounts from the included discoverers
func (d list) Mounts() ([]Mount, error) {
var allMounts []Mount

View File

@@ -27,6 +27,11 @@ func (e None) Devices() ([]Device, error) {
return nil, nil
}
// EnvVars returns an empty list of devices
func (e None) EnvVars() ([]EnvVar, error) {
return nil, nil
}
// Mounts returns an empty list of mounts
func (e None) Mounts() ([]Mount, error) {
return nil, nil

View File

@@ -20,8 +20,6 @@ import (
"tags.cncf.io/container-device-interface/pkg/cdi"
"tags.cncf.io/container-device-interface/specs-go"
"github.com/opencontainers/runc/libcontainer/devices"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover"
)
@@ -45,37 +43,19 @@ func (d device) toEdits() (*cdi.ContainerEdits, error) {
// toSpec converts a discovered Device to a CDI Spec Device. Note
// that missing info is filled in when edits are applied by querying the Device node.
func (d device) toSpec() (*specs.DeviceNode, error) {
s := d.fromPathOrDefault()
// The HostPath field was added in the v0.5.0 CDI specification.
// The cdi package uses strict unmarshalling when loading specs from file causing failures for
// unexpected fields.
// Since the behaviour for HostPath == "" and HostPath == Path are equivalent, we clear HostPath
// if it is equal to Path to ensure compatibility with the widest range of specs.
if s.HostPath == d.Path {
s.HostPath = ""
hostPath := d.HostPath
if hostPath == d.Path {
hostPath = ""
}
return s, nil
}
// fromPathOrDefault attempts to return the returns the information about the
// CDI device from the specified host path.
// If this fails a minimal device is returned so that this information can be
// queried by the container runtime such as containerd.
func (d device) fromPathOrDefault() *specs.DeviceNode {
dn, err := devices.DeviceFromPath(d.HostPath, "rwm")
if err != nil {
return &specs.DeviceNode{
HostPath: d.HostPath,
Path: d.Path,
}
}
return &specs.DeviceNode{
HostPath: d.HostPath,
s := specs.DeviceNode{
HostPath: hostPath,
Path: d.Path,
Major: dn.Major,
Minor: dn.Minor,
FileMode: &dn.FileMode,
}
return &s, nil
}

View File

@@ -55,6 +55,11 @@ func FromDiscoverer(d discover.Discover) (*cdi.ContainerEdits, error) {
return nil, fmt.Errorf("failed to discover devices: %v", err)
}
envs, err := d.EnvVars()
if err != nil {
return nil, fmt.Errorf("failed to discover environment variables: %w", err)
}
mounts, err := d.Mounts()
if err != nil {
return nil, fmt.Errorf("failed to discover mounts: %v", err)
@@ -74,6 +79,10 @@ func FromDiscoverer(d discover.Discover) (*cdi.ContainerEdits, error) {
c.Append(edits)
}
for _, e := range envs {
c.Append(envvar(e).toEdits())
}
for _, m := range mounts {
c.Append(mount(m).toEdits())
}

39
internal/edits/envvar.go Normal file
View File

@@ -0,0 +1,39 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package edits
import (
"fmt"
"tags.cncf.io/container-device-interface/pkg/cdi"
"tags.cncf.io/container-device-interface/specs-go"
"github.com/NVIDIA/nvidia-container-toolkit/internal/discover"
)
type envvar discover.EnvVar
// toEdits converts a discovered envvar to CDI Container Edits.
func (d envvar) toEdits() *cdi.ContainerEdits {
e := cdi.ContainerEdits{
ContainerEdits: &specs.ContainerEdits{
Env: []string{fmt.Sprintf("%s=%s", d.Name, d.Value)},
},
}
return &e
}

View File

@@ -23,34 +23,114 @@ import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)
// ResolveAutoMode determines the correct mode for the platform if set to "auto"
func ResolveAutoMode(logger logger.Interface, mode string, image image.CUDA) (rmode string) {
return resolveMode(logger, mode, image, nil)
// A RuntimeMode is used to select a specific mode of operation for the NVIDIA Container Runtime.
type RuntimeMode string
const (
// In LegacyRuntimeMode the nvidia-container-runtime injects the
// nvidia-container-runtime-hook as a prestart hook into the incoming
// container config. This hook invokes the nvidia-container-cli to perform
// the required modifications to the container.
LegacyRuntimeMode = RuntimeMode("legacy")
// In CSVRuntimeMode the nvidia-container-runtime processes a set of CSV
// files to determine which container modification are required. The
// contents of these CSV files are used to generate an in-memory CDI
// specification which is used to modify the container config.
CSVRuntimeMode = RuntimeMode("csv")
// In CDIRuntimeMode the nvidia-container-runtime applies the modifications
// to the container config required for the requested CDI devices in the
// same way that other CDI clients would.
CDIRuntimeMode = RuntimeMode("cdi")
// In JitCDIRuntimeMode the nvidia-container-runtime generates in-memory CDI
// specifications for requested NVIDIA devices.
JitCDIRuntimeMode = RuntimeMode("jit-cdi")
)
type RuntimeModeResolver interface {
ResolveRuntimeMode(string) RuntimeMode
}
func resolveMode(logger logger.Interface, mode string, image image.CUDA, propertyExtractor info.PropertyExtractor) (rmode string) {
type modeResolver struct {
logger logger.Interface
// TODO: This only needs to consider the requested devices.
image *image.CUDA
propertyExtractor info.PropertyExtractor
defaultMode RuntimeMode
}
type Option func(*modeResolver)
func WithDefaultMode(defaultMode RuntimeMode) Option {
return func(mr *modeResolver) {
mr.defaultMode = defaultMode
}
}
func WithLogger(logger logger.Interface) Option {
return func(mr *modeResolver) {
mr.logger = logger
}
}
func WithImage(image *image.CUDA) Option {
return func(mr *modeResolver) {
mr.image = image
}
}
func WithPropertyExtractor(propertyExtractor info.PropertyExtractor) Option {
return func(mr *modeResolver) {
mr.propertyExtractor = propertyExtractor
}
}
func NewRuntimeModeResolver(opts ...Option) RuntimeModeResolver {
r := &modeResolver{
defaultMode: JitCDIRuntimeMode,
}
for _, opt := range opts {
opt(r)
}
if r.logger == nil {
r.logger = &logger.NullLogger{}
}
return r
}
// ResolveAutoMode determines the correct mode for the platform if set to "auto"
func ResolveAutoMode(logger logger.Interface, mode string, image image.CUDA) (rmode RuntimeMode) {
r := modeResolver{
logger: logger,
image: &image,
propertyExtractor: nil,
}
return r.ResolveRuntimeMode(mode)
}
func (m *modeResolver) ResolveRuntimeMode(mode string) (rmode RuntimeMode) {
if mode != "auto" {
logger.Infof("Using requested mode '%s'", mode)
return mode
m.logger.Infof("Using requested mode '%s'", mode)
return RuntimeMode(mode)
}
defer func() {
logger.Infof("Auto-detected mode as '%v'", rmode)
m.logger.Infof("Auto-detected mode as '%v'", rmode)
}()
if image.OnlyFullyQualifiedCDIDevices() {
return "cdi"
if m.image.OnlyFullyQualifiedCDIDevices() {
return CDIRuntimeMode
}
nvinfo := info.New(
info.WithLogger(logger),
info.WithPropertyExtractor(propertyExtractor),
info.WithLogger(m.logger),
info.WithPropertyExtractor(m.propertyExtractor),
)
switch nvinfo.ResolvePlatform() {
case info.PlatformNVML, info.PlatformWSL:
return "legacy"
return m.defaultMode
case info.PlatformTegra:
return "csv"
return CSVRuntimeMode
}
return "legacy"
return m.defaultMode
}

View File

@@ -43,11 +43,16 @@ func TestResolveAutoMode(t *testing.T) {
mode: "not-auto",
expectedMode: "not-auto",
},
{
description: "legacy resolves to legacy",
mode: "legacy",
expectedMode: "legacy",
},
{
description: "no info defaults to legacy",
mode: "auto",
info: map[string]bool{},
expectedMode: "legacy",
expectedMode: "jit-cdi",
},
{
description: "non-nvml, non-tegra, nvgpu resolves to csv",
@@ -80,14 +85,14 @@ func TestResolveAutoMode(t *testing.T) {
expectedMode: "csv",
},
{
description: "nvml, non-tegra, non-nvgpu resolves to legacy",
description: "nvml, non-tegra, non-nvgpu resolves to jit-cdi",
mode: "auto",
info: map[string]bool{
"nvml": true,
"tegra": false,
"nvgpu": false,
},
expectedMode: "legacy",
expectedMode: "jit-cdi",
},
{
description: "nvml, non-tegra, nvgpu resolves to csv",
@@ -100,14 +105,14 @@ func TestResolveAutoMode(t *testing.T) {
expectedMode: "csv",
},
{
description: "nvml, tegra, non-nvgpu resolves to legacy",
description: "nvml, tegra, non-nvgpu resolves to jit-cdi",
mode: "auto",
info: map[string]bool{
"nvml": true,
"tegra": true,
"nvgpu": false,
},
expectedMode: "legacy",
expectedMode: "jit-cdi",
},
{
description: "nvml, tegra, nvgpu resolves to csv",
@@ -136,7 +141,7 @@ func TestResolveAutoMode(t *testing.T) {
},
},
{
description: "at least one non-cdi device resolves to legacy",
description: "at least one non-cdi device resolves to jit-cdi",
mode: "auto",
envmap: map[string]string{
"NVIDIA_VISIBLE_DEVICES": "nvidia.com/gpu=0,0",
@@ -146,7 +151,7 @@ func TestResolveAutoMode(t *testing.T) {
"tegra": false,
"nvgpu": false,
},
expectedMode: "legacy",
expectedMode: "jit-cdi",
},
{
description: "at least one non-cdi device resolves to csv",
@@ -170,7 +175,7 @@ func TestResolveAutoMode(t *testing.T) {
expectedMode: "cdi",
},
{
description: "cdi mount and non-CDI devices resolves to legacy",
description: "cdi mount and non-CDI devices resolves to jit-cdi",
mode: "auto",
mounts: []string{
"/var/run/nvidia-container-devices/cdi/nvidia.com/gpu/0",
@@ -181,7 +186,7 @@ func TestResolveAutoMode(t *testing.T) {
"tegra": false,
"nvgpu": false,
},
expectedMode: "legacy",
expectedMode: "jit-cdi",
},
{
description: "cdi mount and non-CDI envvar resolves to cdi",
@@ -199,22 +204,6 @@ func TestResolveAutoMode(t *testing.T) {
},
expectedMode: "cdi",
},
{
description: "non-cdi mount and CDI envvar resolves to legacy",
mode: "auto",
envmap: map[string]string{
"NVIDIA_VISIBLE_DEVICES": "nvidia.com/gpu=0",
},
mounts: []string{
"/var/run/nvidia-container-devices/0",
},
info: map[string]bool{
"nvml": true,
"tegra": false,
"nvgpu": false,
},
expectedMode: "legacy",
},
}
for _, tc := range testCases {
@@ -251,7 +240,12 @@ func TestResolveAutoMode(t *testing.T) {
image.WithAcceptDeviceListAsVolumeMounts(true),
image.WithAcceptEnvvarUnprivileged(true),
)
mode := resolveMode(logger, tc.mode, image, properties)
mr := NewRuntimeModeResolver(
WithLogger(logger),
WithImage(&image),
WithPropertyExtractor(properties),
)
mode := mr.ResolveRuntimeMode(tc.mode)
require.EqualValues(t, tc.expectedMode, mode)
})
}

View File

@@ -0,0 +1,206 @@
/**
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/
package ldconfig
import (
"fmt"
"os"
"os/exec"
"path/filepath"
"strings"
"github.com/NVIDIA/nvidia-container-toolkit/internal/config"
)
const (
// ldsoconfdFilenamePattern specifies the pattern for the filename
// in ld.so.conf.d that includes references to the specified directories.
// The 00-nvcr prefix is chosen to ensure that these libraries have a
// higher precedence than other libraries on the system, but lower than
// the 00-cuda-compat that is included in some containers.
ldsoconfdFilenamePattern = "00-nvcr-*.conf"
)
type Ldconfig struct {
ldconfigPath string
inRoot string
}
// NewRunner creates an exec.Cmd that can be used to run ldconfig.
func NewRunner(id string, ldconfigPath string, containerRoot string, additionalargs ...string) (*exec.Cmd, error) {
args := []string{
id,
strings.TrimPrefix(config.NormalizeLDConfigPath("@"+ldconfigPath), "@"),
containerRoot,
}
args = append(args, additionalargs...)
return createReexecCommand(args)
}
// New creates an Ldconfig struct that is used to perform operations on the
// ldcache and libraries in a particular root (e.g. a container).
func New(ldconfigPath string, inRoot string) (*Ldconfig, error) {
l := &Ldconfig{
ldconfigPath: ldconfigPath,
inRoot: inRoot,
}
if ldconfigPath == "" {
return nil, fmt.Errorf("an ldconfig path must be specified")
}
if inRoot == "" || inRoot == "/" {
return nil, fmt.Errorf("ldconfig must be run in the non-system root")
}
return l, nil
}
// CreateSonameSymlinks uses ldconfig to create the soname symlinks in the
// specified directories.
func (l *Ldconfig) CreateSonameSymlinks(directories ...string) error {
if len(directories) == 0 {
return nil
}
ldconfigPath, err := l.prepareRoot()
if err != nil {
return err
}
args := []string{
filepath.Base(ldconfigPath),
// Explicitly disable updating the LDCache.
"-N",
// Specify -n to only process the specified directories.
"-n",
}
args = append(args, directories...)
return SafeExec(ldconfigPath, args, nil)
}
func (l *Ldconfig) UpdateLDCache(directories ...string) error {
ldconfigPath, err := l.prepareRoot()
if err != nil {
return err
}
args := []string{
filepath.Base(ldconfigPath),
// Explicitly specify using /etc/ld.so.conf since the host's ldconfig may
// be configured to use a different config file by default.
"-f", "/etc/ld.so.conf",
}
if l.ldcacheExists() {
args = append(args, "-C", "/etc/ld.so.cache")
} else {
args = append(args, "-N")
}
// If the ld.so.conf.d directory exists, we create a config file there
// containing the required directories, otherwise we add the specified
// directories to the ldconfig command directly.
if l.ldsoconfdDirectoryExists() {
err := createLdsoconfdFile(ldsoconfdFilenamePattern, directories...)
if err != nil {
return fmt.Errorf("failed to update ld.so.conf.d: %w", err)
}
} else {
args = append(args, directories...)
}
return SafeExec(ldconfigPath, args, nil)
}
func (l *Ldconfig) prepareRoot() (string, error) {
// To prevent leaking the parent proc filesystem, we create a new proc mount
// in the specified root.
if err := mountProc(l.inRoot); err != nil {
return "", fmt.Errorf("error mounting /proc: %w", err)
}
// We mount the host ldconfig before we pivot root since host paths are not
// visible after the pivot root operation.
ldconfigPath, err := mountLdConfig(l.ldconfigPath, l.inRoot)
if err != nil {
return "", fmt.Errorf("error mounting host ldconfig: %w", err)
}
// We pivot to the container root for the new process, this further limits
// access to the host.
if err := pivotRoot(l.inRoot); err != nil {
return "", fmt.Errorf("error running pivot_root: %w", err)
}
return ldconfigPath, nil
}
func (l *Ldconfig) ldcacheExists() bool {
if _, err := os.Stat("/etc/ld.so.cache"); err != nil && os.IsNotExist(err) {
return false
}
return true
}
func (l *Ldconfig) ldsoconfdDirectoryExists() bool {
info, err := os.Stat("/etc/ld.so.conf.d")
if os.IsNotExist(err) {
return false
}
return info.IsDir()
}
// createLdsoconfdFile creates a file at /etc/ld.so.conf.d/.
// The file is created at /etc/ld.so.conf.d/{{ .pattern }} using `CreateTemp` and
// contains the specified directories on each line.
func createLdsoconfdFile(pattern string, dirs ...string) error {
if len(dirs) == 0 {
return nil
}
ldsoconfdDir := "/etc/ld.so.conf.d"
if err := os.MkdirAll(ldsoconfdDir, 0755); err != nil {
return fmt.Errorf("failed to create ld.so.conf.d: %w", err)
}
configFile, err := os.CreateTemp(ldsoconfdDir, pattern)
if err != nil {
return fmt.Errorf("failed to create config file: %w", err)
}
defer func() {
_ = configFile.Close()
}()
added := make(map[string]bool)
for _, dir := range dirs {
if added[dir] {
continue
}
_, err = fmt.Fprintf(configFile, "%s\n", dir)
if err != nil {
return fmt.Errorf("failed to update config file: %w", err)
}
added[dir] = true
}
// The created file needs to be world readable for the cases where the container is run as a non-root user.
if err := configFile.Chmod(0644); err != nil {
return fmt.Errorf("failed to chmod config file: %w", err)
}
return nil
}

View File

@@ -17,7 +17,7 @@
# limitations under the License.
**/
package ldcache
package ldconfig
import (
"errors"
@@ -29,8 +29,8 @@ import (
"syscall"
securejoin "github.com/cyphar/filepath-securejoin"
"github.com/moby/sys/reexec"
"github.com/opencontainers/runc/libcontainer/utils"
"golang.org/x/sys/unix"
)
@@ -182,7 +182,7 @@ func createTmpFs(target string, size int) error {
// createReexecCommand creates a command that can be used to trigger the reexec
// initializer.
// On linux this command runs in new namespaces.
func createReexecCommand(args []string) *exec.Cmd {
func createReexecCommand(args []string) (*exec.Cmd, error) {
cmd := reexec.Command(args...)
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
@@ -196,5 +196,5 @@ func createReexecCommand(args []string) *exec.Cmd {
syscall.CLONE_NEWNET,
}
return cmd
return cmd, nil
}

View File

@@ -17,14 +17,11 @@
# limitations under the License.
**/
package ldcache
package ldconfig
import (
"fmt"
"os"
"os/exec"
"github.com/moby/sys/reexec"
)
func pivotRoot(newroot string) error {
@@ -39,13 +36,6 @@ func mountProc(newroot string) error {
return fmt.Errorf("not supported")
}
// createReexecCommand creates a command that can be used ot trigger the reexec
// initializer.
func createReexecCommand(args []string) *exec.Cmd {
cmd := reexec.Command(args...)
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
return cmd
func createReexecCommand(args []string) (*exec.Cmd, error) {
return nil, fmt.Errorf("not supported")
}

View File

@@ -16,7 +16,7 @@
# limitations under the License.
**/
package ldcache
package ldconfig
import (
"fmt"

View File

@@ -16,7 +16,7 @@
# limitations under the License.
**/
package ldcache
package ldconfig
import "syscall"

View File

@@ -18,6 +18,7 @@ package modifier
import (
"fmt"
"strings"
"tags.cncf.io/container-device-interface/pkg/parser"
@@ -27,17 +28,27 @@ import (
"github.com/NVIDIA/nvidia-container-toolkit/internal/modifier/cdi"
"github.com/NVIDIA/nvidia-container-toolkit/internal/oci"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi/spec"
)
const (
automaticDeviceVendor = "runtime.nvidia.com"
automaticDeviceClass = "gpu"
automaticDeviceKind = automaticDeviceVendor + "/" + automaticDeviceClass
automaticDevicePrefix = automaticDeviceKind + "="
)
// NewCDIModifier creates an OCI spec modifier that determines the modifications to make based on the
// CDI specifications available on the system. The NVIDIA_VISIBLE_DEVICES environment variable is
// used to select the devices to include.
func NewCDIModifier(logger logger.Interface, cfg *config.Config, image image.CUDA) (oci.SpecModifier, error) {
func NewCDIModifier(logger logger.Interface, cfg *config.Config, image image.CUDA, isJitCDI bool) (oci.SpecModifier, error) {
defaultKind := cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.DefaultKind
if isJitCDI {
defaultKind = automaticDeviceKind
}
deviceRequestor := newCDIDeviceRequestor(
logger,
image,
cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.DefaultKind,
defaultKind,
)
devices := deviceRequestor.DeviceRequests()
if len(devices) == 0 {
@@ -107,17 +118,34 @@ func (c *cdiDeviceRequestor) DeviceRequests() []string {
func filterAutomaticDevices(devices []string) []string {
var automatic []string
for _, device := range devices {
vendor, class, _ := parser.ParseDevice(device)
if vendor == "runtime.nvidia.com" && class == "gpu" {
automatic = append(automatic, device)
if !strings.HasPrefix(device, automaticDevicePrefix) {
continue
}
automatic = append(automatic, device)
}
return automatic
}
func newAutomaticCDISpecModifier(logger logger.Interface, cfg *config.Config, devices []string) (oci.SpecModifier, error) {
logger.Debugf("Generating in-memory CDI specs for devices %v", devices)
spec, err := generateAutomaticCDISpec(logger, cfg, devices)
var identifiers []string
for _, device := range devices {
identifiers = append(identifiers, strings.TrimPrefix(device, automaticDevicePrefix))
}
cdilib, err := nvcdi.New(
nvcdi.WithLogger(logger),
nvcdi.WithNVIDIACDIHookPath(cfg.NVIDIACTKConfig.Path),
nvcdi.WithDriverRoot(cfg.NVIDIAContainerCLIConfig.Root),
nvcdi.WithVendor(automaticDeviceVendor),
nvcdi.WithClass(automaticDeviceClass),
)
if err != nil {
return nil, fmt.Errorf("failed to construct CDI library: %w", err)
}
spec, err := cdilib.GetSpec(identifiers...)
if err != nil {
return nil, fmt.Errorf("failed to generate CDI spec: %w", err)
}
@@ -132,27 +160,6 @@ func newAutomaticCDISpecModifier(logger logger.Interface, cfg *config.Config, de
return cdiDeviceRequestor, nil
}
func generateAutomaticCDISpec(logger logger.Interface, cfg *config.Config, devices []string) (spec.Interface, error) {
cdilib, err := nvcdi.New(
nvcdi.WithLogger(logger),
nvcdi.WithNVIDIACDIHookPath(cfg.NVIDIACTKConfig.Path),
nvcdi.WithDriverRoot(cfg.NVIDIAContainerCLIConfig.Root),
nvcdi.WithVendor("runtime.nvidia.com"),
nvcdi.WithClass("gpu"),
)
if err != nil {
return nil, fmt.Errorf("failed to construct CDI library: %w", err)
}
var identifiers []string
for _, device := range devices {
_, _, id := parser.ParseDevice(device)
identifiers = append(identifiers, id)
}
return cdilib.GetSpec(identifiers...)
}
type deduplicatedDeviceRequestor struct {
deviceRequestor
}

View File

@@ -70,6 +70,18 @@ func TestDeviceRequests(t *testing.T) {
},
expectedDevices: []string{"nvidia.com/gpu=0", "example.com/class=device"},
},
{
description: "cdi devices from envvar with default kind",
input: cdiDeviceRequestor{
defaultKind: "runtime.nvidia.com/gpu",
},
spec: &specs.Spec{
Process: &specs.Process{
Env: []string{"NVIDIA_VISIBLE_DEVICES=all"},
},
},
expectedDevices: []string{"runtime.nvidia.com/gpu=all"},
},
{
description: "no matching annotations",
prefixes: []string{"not-prefix/"},

View File

@@ -41,6 +41,11 @@ func (d *byPathHookDiscoverer) Devices() ([]discover.Device, error) {
return nil, nil
}
// EnvVars returns the empty list for the by-path hook discoverer
func (d *byPathHookDiscoverer) EnvVars() ([]discover.EnvVar, error) {
return nil, nil
}
// Hooks returns the hooks for the GPU device.
// The following hooks are detected:
// 1. A hook to create /dev/dri/by-path symlinks

View File

@@ -106,6 +106,10 @@ func (d *nvsandboxutilsDGPU) Devices() ([]discover.Device, error) {
return devices, nil
}
func (d *nvsandboxutilsDGPU) EnvVars() ([]discover.EnvVar, error) {
return nil, nil
}
// Hooks returns a hook to create the by-path symlinks for the discovered devices.
func (d *nvsandboxutilsDGPU) Hooks() ([]discover.Hook, error) {
if len(d.deviceLinks) == 0 {

View File

@@ -101,14 +101,14 @@ func newSpecModifier(logger logger.Interface, cfg *config.Config, ociSpec oci.Sp
return modifiers, nil
}
func newModeModifier(logger logger.Interface, mode string, cfg *config.Config, image image.CUDA) (oci.SpecModifier, error) {
func newModeModifier(logger logger.Interface, mode info.RuntimeMode, cfg *config.Config, image image.CUDA) (oci.SpecModifier, error) {
switch mode {
case "legacy":
case info.LegacyRuntimeMode:
return modifier.NewStableRuntimeModifier(logger, cfg.NVIDIAContainerRuntimeHookConfig.Path), nil
case "csv":
case info.CSVRuntimeMode:
return modifier.NewCSVModifier(logger, cfg, image)
case "cdi":
return modifier.NewCDIModifier(logger, cfg, image)
case info.CDIRuntimeMode, info.JitCDIRuntimeMode:
return modifier.NewCDIModifier(logger, cfg, image, mode == info.JitCDIRuntimeMode)
}
return nil, fmt.Errorf("invalid runtime mode: %v", cfg.NVIDIAContainerRuntimeConfig.Mode)
@@ -119,7 +119,7 @@ func newModeModifier(logger logger.Interface, mode string, cfg *config.Config, i
// The image is also used to determine the runtime mode to apply.
// If a non-CDI mode is detected we ensure that the image does not process
// annotation devices.
func initRuntimeModeAndImage(logger logger.Interface, cfg *config.Config, ociSpec oci.Spec) (string, *image.CUDA, error) {
func initRuntimeModeAndImage(logger logger.Interface, cfg *config.Config, ociSpec oci.Spec) (info.RuntimeMode, *image.CUDA, error) {
rawSpec, err := ociSpec.Load()
if err != nil {
return "", nil, fmt.Errorf("failed to load OCI spec: %v", err)
@@ -136,9 +136,13 @@ func initRuntimeModeAndImage(logger logger.Interface, cfg *config.Config, ociSpe
return "", nil, err
}
mode := info.ResolveAutoMode(logger, cfg.NVIDIAContainerRuntimeConfig.Mode, image)
modeResolver := info.NewRuntimeModeResolver(
info.WithLogger(logger),
info.WithImage(&image),
)
mode := modeResolver.ResolveRuntimeMode(cfg.NVIDIAContainerRuntimeConfig.Mode)
// We update the mode here so that we can continue passing just the config to other functions.
cfg.NVIDIAContainerRuntimeConfig.Mode = mode
cfg.NVIDIAContainerRuntimeConfig.Mode = string(mode)
if mode == "cdi" || len(cfg.NVIDIAContainerRuntimeConfig.Modes.CDI.AnnotationPrefixes) == 0 {
return mode, &image, nil
@@ -154,12 +158,12 @@ func initRuntimeModeAndImage(logger logger.Interface, cfg *config.Config, ociSpe
}
// supportedModifierTypes returns the modifiers supported for a specific runtime mode.
func supportedModifierTypes(mode string) []string {
func supportedModifierTypes(mode info.RuntimeMode) []string {
switch mode {
case "cdi":
case info.CDIRuntimeMode, info.JitCDIRuntimeMode:
// For CDI mode we make no additional modifications.
return []string{"nvidia-hook-remover", "mode"}
case "csv":
case info.CSVRuntimeMode:
// For CSV mode we support mode and feature-gated modification.
return []string{"nvidia-hook-remover", "feature-gated", "mode"}
default:

View File

@@ -10,7 +10,7 @@ Build-Depends: debhelper (>= 9)
Package: nvidia-container-toolkit
Architecture: any
Depends: ${misc:Depends}, nvidia-container-toolkit-base (= @VERSION@), libnvidia-container-tools (>= @LIBNVIDIA_CONTAINER_TOOLS_VERSION@), libnvidia-container-tools (<< 2.0.0)
Depends: ${misc:Depends}, nvidia-container-toolkit-base (= @VERSION@), libnvidia-container-tools (= @VERSION@), libnvidia-container-tools (<< 2.0.0)
Breaks: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook
Replaces: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook
Description: NVIDIA Container toolkit

View File

@@ -3,7 +3,6 @@
set -e
sed -i "s;@SECTION@;${SECTION:+$SECTION/};g" debian/control
sed -i "s;@LIBNVIDIA_CONTAINER_TOOLS_VERSION@;${LIBNVIDIA_CONTAINER_TOOLS_VERSION:+$LIBNVIDIA_CONTAINER_TOOLS_VERSION};g" debian/control
sed -i "s;@VERSION@;${VERSION:+$VERSION};g" debian/control
if [ -n "$DISTRIB" ]; then

View File

@@ -23,7 +23,7 @@ Source8: nvidia-cdi-refresh.path
Obsoletes: nvidia-container-runtime <= 3.5.0-1, nvidia-container-runtime-hook <= 1.4.0-2
Provides: nvidia-container-runtime
Provides: nvidia-container-runtime-hook
Requires: libnvidia-container-tools >= %{libnvidia_container_tools_version}, libnvidia-container-tools < 2.0.0
Requires: libnvidia-container-tools == %{version}-%{release}, libnvidia-container-tools < 2.0.0
Requires: nvidia-container-toolkit-base == %{version}-%{release}
%description
@@ -86,7 +86,7 @@ fi
# As of 1.10.0-1 we generate the release information automatically
* %{release_date} NVIDIA CORPORATION <cudatools@nvidia.com> %{version}-%{release}
- See https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/blob/%{git_commit}/CHANGELOG.md
- Bump libnvidia-container dependency to libnvidia-container-tools >= %{libnvidia_container_tools_version}
- Bump libnvidia-container dependency to libnvidia-container-tools == %{version}-%{release}
# The BASE package consists of the NVIDIA Container Runtime and the NVIDIA Container Toolkit CLI.
# This allows the package to be installed on systems where no NVIDIA Container CLI is available.

View File

@@ -56,6 +56,9 @@ const (
EnableCudaCompatHook = discover.EnableCudaCompatHook
// An UpdateLDCacheHook is used to update the ldcache in the container.
UpdateLDCacheHook = discover.UpdateLDCacheHook
// A CreateSonameSymlinksHook is the hook used to ensure that soname symlinks
// for injected libraries exist in the container.
CreateSonameSymlinksHook = discover.CreateSonameSymlinksHook
// Deprecated: Use CreateSymlinksHook instead.
HookCreateSymlinks = CreateSymlinksHook

View File

@@ -82,7 +82,7 @@ func (l *nvcdilib) newDriverVersionDiscoverer(version string) (discover.Discover
// NewDriverLibraryDiscoverer creates a discoverer for the libraries associated with the specified driver version.
func (l *nvcdilib) NewDriverLibraryDiscoverer(version string) (discover.Discover, error) {
libraryPaths, err := getVersionLibs(l.logger, l.driver, version)
libraryPaths, libCudaDirectoryPath, err := getVersionLibs(l.logger, l.driver, version)
if err != nil {
return nil, fmt.Errorf("failed to get libraries for driver version: %v", err)
}
@@ -116,6 +116,12 @@ func (l *nvcdilib) NewDriverLibraryDiscoverer(version string) (discover.Discover
disableDeviceNodeModification := l.hookCreator.Create(DisableDeviceNodeModificationHook)
discoverers = append(discoverers, disableDeviceNodeModification)
environmentVariable := &discover.EnvVar{
Name: "NVIDIA_CTK_LIBCUDA_DIR",
Value: libCudaDirectoryPath,
}
discoverers = append(discoverers, environmentVariable)
d := discover.Merge(discoverers...)
return d, nil
@@ -203,39 +209,41 @@ func NewDriverBinariesDiscoverer(logger logger.Interface, driverRoot string) dis
// getVersionLibs checks the LDCache for libraries ending in the specified driver version.
// Although the ldcache at the specified driverRoot is queried, the paths are returned relative to this driverRoot.
// This allows the standard mount location logic to be used for resolving the mounts.
func getVersionLibs(logger logger.Interface, driver *root.Driver, version string) ([]string, error) {
func getVersionLibs(logger logger.Interface, driver *root.Driver, version string) ([]string, string, error) {
logger.Infof("Using driver version %v", version)
libCudaPaths, err := cuda.New(
driver.Libraries(),
).Locate("." + version)
if err != nil {
return nil, fmt.Errorf("failed to locate libcuda.so.%v: %v", version, err)
return nil, "", fmt.Errorf("failed to locate libcuda.so.%v: %v", version, err)
}
libRoot := filepath.Dir(libCudaPaths[0])
libCudaDirectoryPath := filepath.Dir(libCudaPaths[0])
libraries := lookup.NewFileLocator(
lookup.WithLogger(logger),
lookup.WithSearchPaths(
libRoot,
filepath.Join(libRoot, "vdpau"),
libCudaDirectoryPath,
filepath.Join(libCudaDirectoryPath, "vdpau"),
),
lookup.WithOptional(true),
)
libs, err := libraries.Locate("*.so." + version)
if err != nil {
return nil, fmt.Errorf("failed to locate libraries for driver version %v: %v", version, err)
return nil, "", fmt.Errorf("failed to locate libraries for driver version %v: %v", version, err)
}
if driver.Root == "/" || driver.Root == "" {
return libs, nil
return libs, libCudaDirectoryPath, nil
}
libCudaDirectoryPath = driver.RelativeToRoot(libCudaDirectoryPath)
var relative []string
for _, l := range libs {
relative = append(relative, strings.TrimPrefix(l, driver.Root))
}
return relative, nil
return relative, libCudaDirectoryPath, nil
}

View File

@@ -55,6 +55,11 @@ func (d *deviceFolderPermissions) Devices() ([]discover.Device, error) {
return nil, nil
}
// EnvVars are empty for this discoverer
func (d *deviceFolderPermissions) EnvVars() ([]discover.EnvVar, error) {
return nil, nil
}
// Hooks returns a set of hooks that sets the file mode to 755 of parent folders for nested device nodes.
func (d *deviceFolderPermissions) Hooks() ([]discover.Hook, error) {
folders, err := d.getDeviceSubfolders()

View File

@@ -70,9 +70,9 @@ function copy-file() {
-v "$(pwd):$(pwd)" \
-w "$(pwd)" \
-u "$(id -u):$(id -g)" \
--entrypoint="bash" \
--entrypoint="sh" \
"${image}" \
-c "cp ${path_in_image} ${path_on_host}"
-c "cp -p ${path_in_image} ${path_on_host}"
fi
}

View File

@@ -96,9 +96,9 @@ function copy_file() {
-v "$(pwd):$(pwd)" \
-w "$(pwd)" \
-u "$(id -u):$(id -g)" \
--entrypoint="bash" \
--entrypoint="sh" \
"${image}" \
-c "cp ${path_in_image} ${path_on_host}"
-c "cp -p ${path_in_image} ${path_on_host}"
fi
}

View File

@@ -28,3 +28,4 @@ spec:
install: false
nvidiaDriver:
install: true
branch: 550

View File

@@ -173,10 +173,10 @@ var _ = Describe("docker", Ordered, ContinueOnFailure, func() {
When("Testing CUDA Forward compatibility", Ordered, func() {
BeforeAll(func(ctx context.Context) {
_, _, err := runner.Run("docker pull nvcr.io/nvidia/cuda:12.8.0-base-ubi8")
_, _, err := runner.Run("docker pull nvcr.io/nvidia/cuda:12.9.0-base-ubi8")
Expect(err).ToNot(HaveOccurred())
compatOutput, _, err := runner.Run("docker run --rm -i -e NVIDIA_VISIBLE_DEVICES=void nvcr.io/nvidia/cuda:12.8.0-base-ubi8 bash -c \"ls /usr/local/cuda/compat/libcuda.*.*\"")
compatOutput, _, err := runner.Run("docker run --rm -i -e NVIDIA_VISIBLE_DEVICES=void nvcr.io/nvidia/cuda:12.9.0-base-ubi8 bash -c \"ls /usr/local/cuda/compat/libcuda.*.*\"")
Expect(err).ToNot(HaveOccurred())
Expect(compatOutput).ToNot(BeEmpty())
@@ -199,21 +199,21 @@ var _ = Describe("docker", Ordered, ContinueOnFailure, func() {
})
It("should work with the nvidia runtime in legacy mode", func(ctx context.Context) {
ldconfigOut, _, err := runner.Run("docker run --rm -i -e NVIDIA_DISABLE_REQUIRE=true --runtime=nvidia --gpus all nvcr.io/nvidia/cuda:12.8.0-base-ubi8 bash -c \"ldconfig -p | grep libcuda.so.1\"")
ldconfigOut, _, err := runner.Run("docker run --rm -i -e NVIDIA_DISABLE_REQUIRE=true --runtime=nvidia --gpus all nvcr.io/nvidia/cuda:12.9.0-base-ubi8 bash -c \"ldconfig -p | grep libcuda.so.1\"")
Expect(err).ToNot(HaveOccurred())
Expect(ldconfigOut).To(ContainSubstring("/usr/local/cuda/compat"))
Expect(ldconfigOut).To(ContainSubstring("/usr/local/cuda-12.9/compat/"))
})
It("should work with the nvidia runtime in CDI mode", func(ctx context.Context) {
ldconfigOut, _, err := runner.Run("docker run --rm -i -e NVIDIA_DISABLE_REQUIRE=true --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=runtime.nvidia.com/gpu=all nvcr.io/nvidia/cuda:12.8.0-base-ubi8 bash -c \"ldconfig -p | grep libcuda.so.1\"")
ldconfigOut, _, err := runner.Run("docker run --rm -i -e NVIDIA_DISABLE_REQUIRE=true --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=runtime.nvidia.com/gpu=all nvcr.io/nvidia/cuda:12.9.0-base-ubi8 bash -c \"ldconfig -p | grep libcuda.so.1\"")
Expect(err).ToNot(HaveOccurred())
Expect(ldconfigOut).To(ContainSubstring("/usr/local/cuda/compat"))
Expect(ldconfigOut).To(ContainSubstring("/usr/local/cuda-12.9/compat/"))
})
It("should NOT work with nvidia-container-runtime-hook", func(ctx context.Context) {
ldconfigOut, _, err := runner.Run("docker run --rm -i -e NVIDIA_DISABLE_REQUIRE=true --runtime=runc --gpus all nvcr.io/nvidia/cuda:12.8.0-base-ubi8 bash -c \"ldconfig -p | grep libcuda.so.1\"")
It("should work with nvidia-container-runtime-hook", func(ctx context.Context) {
ldconfigOut, _, err := runner.Run("docker run --rm -i -e NVIDIA_DISABLE_REQUIRE=true --runtime=runc --gpus all nvcr.io/nvidia/cuda:12.9.0-base-ubi8 bash -c \"ldconfig -p | grep libcuda.so.1\"")
Expect(err).ToNot(HaveOccurred())
Expect(ldconfigOut).To(ContainSubstring("/usr/lib64"))
Expect(ldconfigOut).To(ContainSubstring("/usr/local/cuda-12.9/compat/"))
})
})
@@ -235,4 +235,26 @@ var _ = Describe("docker", Ordered, ContinueOnFailure, func() {
Expect(output).To(Equal("ModifyDeviceFiles: 0\n"))
})
})
When("A container is run using CDI", Ordered, func() {
BeforeAll(func(ctx context.Context) {
_, _, err := runner.Run("docker pull ubuntu")
Expect(err).ToNot(HaveOccurred())
})
It("should include libcuda.so in the ldcache", func(ctx context.Context) {
ldcacheOutput, _, err := runner.Run("docker run --rm -i --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=runtime.nvidia.com/gpu=all ubuntu bash -c \"ldconfig -p | grep 'libcuda.so'\"")
Expect(err).ToNot(HaveOccurred())
Expect(ldcacheOutput).ToNot(BeEmpty())
ldcacheLines := strings.Split(ldcacheOutput, "\n")
var libs []string
for _, line := range ldcacheLines {
parts := strings.SplitN(line, " (", 2)
libs = append(libs, strings.TrimSpace(parts[0]))
}
Expect(libs).To(ContainElements([]string{"libcuda.so", "libcuda.so.1"}))
})
})
})

View File

@@ -1,21 +0,0 @@
The MIT License (MIT)
Copyright (c) 2014 Brian Goff
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -1,62 +0,0 @@
package md2man
import (
"fmt"
"io"
"os"
"strings"
"github.com/russross/blackfriday/v2"
)
func fmtListFlags(flags blackfriday.ListType) string {
knownFlags := []struct {
name string
flag blackfriday.ListType
}{
{"ListTypeOrdered", blackfriday.ListTypeOrdered},
{"ListTypeDefinition", blackfriday.ListTypeDefinition},
{"ListTypeTerm", blackfriday.ListTypeTerm},
{"ListItemContainsBlock", blackfriday.ListItemContainsBlock},
{"ListItemBeginningOfList", blackfriday.ListItemBeginningOfList},
{"ListItemEndOfList", blackfriday.ListItemEndOfList},
}
var f []string
for _, kf := range knownFlags {
if flags&kf.flag != 0 {
f = append(f, kf.name)
flags &^= kf.flag
}
}
if flags != 0 {
f = append(f, fmt.Sprintf("Unknown(%#x)", flags))
}
return strings.Join(f, "|")
}
type debugDecorator struct {
blackfriday.Renderer
}
func depth(node *blackfriday.Node) int {
d := 0
for n := node.Parent; n != nil; n = n.Parent {
d++
}
return d
}
func (d *debugDecorator) RenderNode(w io.Writer, node *blackfriday.Node, entering bool) blackfriday.WalkStatus {
fmt.Fprintf(os.Stderr, "%s%s %v %v\n",
strings.Repeat(" ", depth(node)),
map[bool]string{true: "+", false: "-"}[entering],
node,
fmtListFlags(node.ListFlags))
var b strings.Builder
status := d.Renderer.RenderNode(io.MultiWriter(&b, w), node, entering)
if b.Len() > 0 {
fmt.Fprintf(os.Stderr, ">> %q\n", b.String())
}
return status
}

View File

@@ -1,24 +0,0 @@
// Package md2man aims in converting markdown into roff (man pages).
package md2man
import (
"os"
"strconv"
"github.com/russross/blackfriday/v2"
)
// Render converts a markdown document into a roff formatted document.
func Render(doc []byte) []byte {
renderer := NewRoffRenderer()
var r blackfriday.Renderer = renderer
if v, _ := strconv.ParseBool(os.Getenv("MD2MAN_DEBUG")); v {
r = &debugDecorator{Renderer: r}
}
return blackfriday.Run(doc,
[]blackfriday.Option{
blackfriday.WithRenderer(r),
blackfriday.WithExtensions(renderer.GetExtensions()),
}...)
}

Some files were not shown because too many files have changed in this diff Show More