This change removes docker-runc as the highest-priority
default candidate for the low-level runtimes supported by
the nvidia-container-runtime.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change includes annotation devices in CUDA.VisibleDevices
with the highest priority. This allows the CDI device
request extraction to be consistent across all request mechanisms.
Note that this does change behaviour in the following ways:
1. Annotations are considered when resolving the runtime mode.
2. Incorrectly formed device names in annotations are no longer treated as an error.
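For context, CDI annotations follow the container-device-interface
convention: a key with the cdi.k8s.io/ prefix whose value is a
comma-separated list of fully-qualified device names, for example:

    cdi.k8s.io/nvidia-devices: nvidia.com/gpu=0

(The key suffix here is illustrative; only the prefix is significant
when extracting device requests.)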
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Following the refactoring of device request extraction, we can
now make CDI device requests consistent with other methods.
This change moves to using image.VisibleDevices instead of
separate calls to CDIDevicesFromMounts and VisibleDevicesFromEnvVar.
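A minimal sketch of the consolidated flow (the NewCUDAImageFromSpec
constructor is an assumption for illustration; only VisibleDevices,
CDIDevicesFromMounts, and VisibleDevicesFromEnvVar are named above):

    import (
        "github.com/NVIDIA/nvidia-container-toolkit/internal/config/image"
        "github.com/opencontainers/runtime-spec/specs-go"
    )

    func requestedDevices(spec *specs.Spec) ([]string, error) {
        cudaImage, err := image.NewCUDAImageFromSpec(spec)
        if err != nil {
            return nil, err
        }
        // A single call now covers env vars, mounts, and annotations,
        // replacing the separate CDIDevicesFromMounts and
        // VisibleDevicesFromEnvVar calls.
        return cudaImage.VisibleDevices(), nil
    }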
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates the image.CUDA type to also extract CDI
device requests. These are only relevant if CDI prefixes are
explicitly set.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change allows device IDs to be specified in the GetSpec API.
This simplifies cases where CDI specs are being generated for specific
devices by ID.
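A minimal sketch of the updated API (nvcdi.New is the existing
constructor; passing device IDs directly to GetSpec is the addition
described here):

    import "github.com/NVIDIA/nvidia-container-toolkit/pkg/nvcdi"

    func specForDevices() error {
        lib, err := nvcdi.New()
        if err != nil {
            return err
        }
        // Generate a CDI spec for the selected devices only, rather
        // than generating a spec for all devices and filtering it.
        spec, err := lib.GetSpec("0", "1")
        if err != nil {
            return err
        }
        _ = spec
        return nil
    }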
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change consolidates the logic for determining requested devices
from the container image. The logic for this has been integrated into
the image.CUDA type so that multiple implementations are not required.
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Evan Lezar <elezar@nvidia.com>
On some RPM-based platforms, the path of the Vulkan ICD file
includes an architecture-specific infix to distinguish it from
other architectures (most notably x86_64 vs i686). This change
attempts to discover the arch-specific ICD file in addition to
the standard nvidia_icd.json.
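On such platforms the file may, for example, be named
nvidia_icd.x86_64.json instead of nvidia_icd.json.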
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds the ability to disable specific (or all) CDI hooks in
both the nvidia-ctk cdi generate command and the nvcdi API.
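As a purely hypothetical invocation (the flag name is an assumption
and is not confirmed here), this might look like:

    nvidia-ctk cdi generate --disable-hooks=enable-cuda-compat

with a special value such as all disabling hook generation entirely.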
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change allows hooks to be configured with debug logging. This
is currently only enabled for the hooks generated from the runtime.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds an nvidia-container-runtime.modes.legacy.cuda-compat-mode
config option. This can be set to one of four values:
* ldconfig (default): the --cuda-compat-mode=ldconfig flag is passed to the nvidia-container-cli
* mount: the --cuda-compat-mode=mount flag is passed to the nvidia-container-cli
* disabled: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli
* hook: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli AND the
enable-cuda-compat hook is used to provide forward compatibility.
Note that the disable-cuda-compat-lib-hook feature flag will prevent the enable-cuda-compat
hook from being used. This change also means that the allow-cuda-compat-libs-from-container
feature flag no longer has any effect.
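A minimal config fragment selecting the default mode (assuming the
standard config.toml layout, where the dotted option path maps to a
TOML table):

    [nvidia-container-runtime.modes.legacy]
    cuda-compat-mode = "ldconfig"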
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates github.com/NVIDIA/go-nvlib from v0.7.1 to v0.7.2
to allow Thor systems to be detected as Tegra-based. This fixes
automatic mode detection on these systems.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This ensures that mount propagation is set to rprivate for
mounts from the host into the container. This aligns with the
default in Docker.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
When constructing a list of discoverers using discover.Merge, we
explicitly skip `nil` discoverers. This simplifies usage, since we
no longer have to check validity explicitly when processing the
discoverers in the list.
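A minimal sketch (only discover.Merge is named above; the wrapper
function is illustrative):

    import "github.com/NVIDIA/nvidia-container-toolkit/internal/discover"

    func combine(required, optional discover.Discover) discover.Discover {
        // optional may be nil: Merge now skips nil entries, so no
        // explicit validity check is needed before including it.
        return discover.Merge(required, optional)
    }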
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Since we explicitly check the architecture of the
libraries in the ldcache, we also need to check the architecture
flag against the ARM constants.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change switches to using the WithCache decorator for
mounts instead of keeping track of a cache locally.
This addresses a race condition when using the mounts structure.
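A minimal sketch (only WithCache is named above; the wrapper function
is illustrative):

    import "github.com/NVIDIA/nvidia-container-toolkit/internal/discover"

    func cachedMounts(m discover.Discover) discover.Discover {
        // WithCache memoizes the wrapped discoverer's results so that
        // concurrent callers share one computation instead of racing
        // on state tracked inside the mounts implementation.
        return discover.WithCache(m)
    }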
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds the enable-cuda-compat hook to the incoming OCI runtime spec
if the allow-cuda-compat-libs-from-container feature flag is not enabled.
An update-ldcache hook is also injected to ensure that the required folders
are processed.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This allows the NVIDIA Container Toolkit to ignore IMEX channel
requests made through the NVIDIA_IMEX_CHANNELS envvar or volume
mounts. This ensures that the NVIDIA Container Toolkit cannot be
used to provide out-of-band access to an IMEX channel by simply
specifying an environment variable, possibly bypassing other checks
by an orchestration system such as Kubernetes.
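For example, a container started with NVIDIA_IMEX_CHANNELS=0, or with
an explicit mount of a channel device node such as
/dev/nvidia-caps-imex-channels/channel0 (path shown for illustration),
no longer has the channel injected on that basis alone.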
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change ensures that the cdi modifier also removes the NVIDIA
Container Runtime Hook from the incoming spec. This aligns with what is
done for CSV modifications and prevents an error when starting the
container.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds an allow-cuda-compat-libs-from-container feature flag
to the NVIDIA Container Toolkit config. This allows a user to opt in
to the previous default behaviour of overriding certain driver
libraries with CUDA compat libraries from the container.
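A minimal opt-in fragment (assuming the flag lives under the
[features] table of config.toml, alongside other feature flags):

    [features]
    allow-cuda-compat-libs-from-container = true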
Signed-off-by: Evan Lezar <elezar@nvidia.com>
In CSV mode, the CSV files at /etc/nvidia-container-runtime/host-files-for-container.d/
should be the source of truth for container modifications. This change skips graphics
modifications to a container in this mode. This prevents conflicts when handling files
such as Vulkan ICD files, which are already defined in the CSV files.
Signed-off-by: Evan Lezar <elezar@nvidia.com>