Evan Lezar
b55255e31f
Merge pull request #1110 from ArangoGutierrez/i/1049
...
Refactor extracting requested devices from the container image
2025-06-05 13:00:32 +02:00
Carlos Eduardo Arango Gutierrez
dede03f322
Refactor extracting requested devices from the container image
...
This change consolidates the logic for determining requested devices
from the container image. The logic for this has been integrated into
the image.CUDA type so that multiple implementations are not required.
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Evan Lezar <elezar@nvidia.com>
2025-06-05 12:38:45 +02:00
Evan Lezar
fdcd250362
Merge pull request #1129 from elezar/fix-deduplicate-driver-store-wsl
...
Minor cleanup of WSL2 CDI spec generation
2025-06-04 10:16:36 +02:00
Evan Lezar
b66d37bedb
[no-relnote] Minor code cleanup in WSL2 discoverer
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-03 23:43:17 +02:00
Evan Lezar
0c905d0de2
[no-relnote] Remove unneeded indirection
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-03 23:43:17 +02:00
Evan Lezar
0d0b56816e
Remove redundant deduplication of search paths for WSL
...
The GetDriverStorePaths function is implemented so as to
remove duplicate driver store paths which means that the
additional deduplication (which had a bug) can be removed.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-03 23:43:10 +02:00
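For illustration, the order-preserving deduplication that commit 0d0b56816e relies on could look roughly like the sketch below. The helper name `dedupePaths` and the sample paths are hypothetical; this is not the actual GetDriverStorePaths implementation.

```go
package main

import "fmt"

// dedupePaths returns paths with duplicates removed. The first occurrence of
// each entry is kept, so the original search order is preserved.
// NOTE: hypothetical helper for illustration only.
func dedupePaths(paths []string) []string {
	seen := make(map[string]struct{}, len(paths))
	unique := make([]string, 0, len(paths))
	for _, p := range paths {
		if _, ok := seen[p]; ok {
			continue
		}
		seen[p] = struct{}{}
		unique = append(unique, p)
	}
	return unique
}

func main() {
	// Made-up driver store paths; real values come from the host.
	paths := []string{"/store/a", "/store/b", "/store/a"}
	fmt.Println(dedupePaths(paths)) // prints [/store/a /store/b]
}
```

If the source of the paths already deduplicates in this way, a second pass over the result is redundant, which is the reasoning given in the commit message.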
Evan Lezar
d59fd3da11
Merge pull request #1077 from ArangoGutierrez/1074
...
Add the ability to disable specific (or all) CDI hooks when generating a CDI specification
2025-06-03 16:03:18 +02:00
Carlos Eduardo Arango Gutierrez
6cf0248321
Added ability to disable specific (or all) CDI hooks
...
This change adds the ability to disable specific (or all) CDI hooks to
both the nvidia-ctk cdi generate command and the nvcdi API.
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-06-03 16:01:20 +02:00
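As a rough sketch of the idea in commit 6cf0248321, disabling specific (or all) hooks can be modelled as a set of names that is checked before a hook is emitted. The type `disabledHooks`, the sentinel value `all`, and the function names are assumptions made for this example; they are not the nvcdi API or the nvidia-ctk flag syntax.

```go
package main

import "fmt"

// allHooks is a hypothetical sentinel meaning "disable every hook".
const allHooks = "all"

// disabledHooks is a set of hook names that should not be generated.
type disabledHooks map[string]struct{}

// newDisabledHooks builds the set from a list of names.
func newDisabledHooks(names ...string) disabledHooks {
	d := make(disabledHooks, len(names))
	for _, n := range names {
		d[n] = struct{}{}
	}
	return d
}

// isDisabled reports whether the named hook is disabled, either explicitly
// or through the allHooks sentinel.
func (d disabledHooks) isDisabled(name string) bool {
	if _, ok := d[allHooks]; ok {
		return true
	}
	_, ok := d[name]
	return ok
}

// filterHooks returns only the hooks that have not been disabled.
func filterHooks(hooks []string, d disabledHooks) []string {
	var enabled []string
	for _, h := range hooks {
		if !d.isDisabled(h) {
			enabled = append(enabled, h)
		}
	}
	return enabled
}

func main() {
	hooks := []string{"update-ldcache", "enable-cuda-compat"}
	fmt.Println(filterHooks(hooks, newDisabledHooks("enable-cuda-compat"))) // prints [update-ldcache]
	fmt.Println(len(filterHooks(hooks, newDisabledHooks(allHooks))))       // prints 0
}
```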
Carlos Eduardo Arango Gutierrez
b4787511d2
Consolidate HookName functionality on internal/discover pkg
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-06-03 15:24:43 +02:00
Evan Lezar
890db82b46
Merge pull request #1127 from ArangoGutierrez/e2e/internal_runner
...
[no-relnote] E2E GitHub action to run with internal runner
2025-06-03 15:18:42 +02:00
Carlos Eduardo Arango Gutierrez
5915328be5
[no-relnote] E2E GitHub action to run with internal runner
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-06-03 12:47:35 +02:00
Evan Lezar
bb3a54f7f4
Merge pull request #1126 from elezar/bump-holodeck-0.2.12
...
Bump NVIDIA/holodeck from 0.2.7 to 0.2.12
2025-06-03 10:55:50 +02:00
dependabot[bot]
a909914cd6
Bump NVIDIA/holodeck from 0.2.7 to 0.2.12
...
Bumps [NVIDIA/holodeck](https://github.com/nvidia/holodeck) from 0.2.7 to 0.2.12.
- [Release notes](https://github.com/nvidia/holodeck/releases)
- [Commits](https://github.com/nvidia/holodeck/compare/v0.2.7...v0.2.12)
---
updated-dependencies:
- dependency-name: NVIDIA/holodeck
dependency-version: 0.2.12
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2025-06-03 10:11:47 +02:00
Evan Lezar
f973271da1
Merge pull request #1119 from elezar/use-public-runners
...
[no-relnote] Switch to public runners
2025-06-02 21:53:02 +02:00
Evan Lezar
535e023828
Merge pull request #1120 from elezar/remove-release-archive
...
[no-relnote] Remove release:archive CI step
2025-06-02 13:44:33 +02:00
Evan Lezar
f2cf3e8deb
[no-relnote] Remove release:archive CI step
...
Remove unneeded release:archive internal CI step.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-30 17:10:03 +02:00
Evan Lezar
03e8b9e0f5
[no-relnote] Switch to public runners
...
Since we now use pr-copy-bot, we are not running image builds
from forks and as such the required secrets for pushing to the
ghcr are present.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-30 16:40:29 +02:00
Evan Lezar
450f73a046
Merge commit from fork
...
Add `NVIDIA_CTK_DEBUG=false` to hook envs
2025-05-30 15:31:26 +02:00
Evan Lezar
479df7134a
Add envvar to control debug logging in CDI hooks
...
This change allows hooks to be configured with debug logging. This
is currently only enabled for the hooks generated from the runtime.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-30 15:27:52 +02:00
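A minimal sketch of how a hook might interpret the `NVIDIA_CTK_DEBUG` variable from commit 479df7134a is shown below. The variable name comes from the merge title above. The parsing rules (boolean parsing via `strconv.ParseBool`, defaulting to off) and the function name `debugEnabled` are assumptions, since the log does not say how the value is actually read.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// debugEnvvar is the variable named in the commit log.
const debugEnvvar = "NVIDIA_CTK_DEBUG"

// debugEnabled scans a list of KEY=VALUE entries (the form used for hook
// environments) and reports whether debug logging is requested.
// ASSUMPTION: unset or unparsable values mean "disabled".
func debugEnabled(env []string) bool {
	for _, kv := range env {
		name, value, ok := strings.Cut(kv, "=")
		if !ok || name != debugEnvvar {
			continue
		}
		enabled, err := strconv.ParseBool(value)
		return err == nil && enabled
	}
	return false
}

func main() {
	fmt.Println(debugEnabled([]string{"NVIDIA_CTK_DEBUG=false"})) // prints false
	fmt.Println(debugEnabled([]string{"NVIDIA_CTK_DEBUG=true"}))  // prints true
	fmt.Println(debugEnabled(nil))                                // prints false
}
```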
Evan Lezar
19a83e3542
Merge pull request #1112 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-6eda4d7
...
Bump third_party/libnvidia-container from `caf057b` to `6eda4d7`
2025-05-28 10:33:44 +02:00
dependabot[bot]
d2344cba34
Bump third_party/libnvidia-container from caf057b to 6eda4d7
...
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `caf057b` to `6eda4d7`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](caf057b009...6eda4d76c8)
---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
dependency-version: 6eda4d76c8c5f8fc174e4abca83e513fb4dd63b0
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
2025-05-28 08:16:30 +00:00
Evan Lezar
c8c22162b7
Merge pull request #958 from ArangoGutierrez/codecov
...
[no-relnote] Enable Coveralls for code coverage
2025-05-27 15:48:54 +02:00
Evan Lezar
ea9b8721c0
Merge pull request #1108 from NVIDIA/dependabot/github_actions/main/NVIDIA/holodeck-0.2.9
...
Bump NVIDIA/holodeck from 0.2.7 to 0.2.9
2025-05-27 10:21:15 +02:00
dependabot[bot]
eaaa8536e4
Bump NVIDIA/holodeck from 0.2.7 to 0.2.9
...
Bumps [NVIDIA/holodeck](https://github.com/nvidia/holodeck) from 0.2.7 to 0.2.9.
- [Release notes](https://github.com/nvidia/holodeck/releases)
- [Commits](https://github.com/nvidia/holodeck/compare/v0.2.7...v0.2.9)
---
updated-dependencies:
- dependency-name: NVIDIA/holodeck
dependency-version: 0.2.9
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2025-05-27 08:19:31 +00:00
Carlos Eduardo Arango Gutierrez
e955f65d8f
[no-relnote] Enable Coveralls
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-23 13:53:45 +02:00
Evan Lezar
b934c68bef
Merge pull request #1103 from elezar/reenable-nvsandboxutils
...
Reenable nvsandboxutils for driver discovery
2025-05-23 11:38:46 +02:00
Evan Lezar
7bd65da91e
Add FeatureFlags to the nvcdi API
...
This change adds support for feature flags to the nvcdi API.
A feature flag to disable nvsandboxutils is also added to allow
more flexibility in cases where this library causes issues.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-22 17:26:41 +02:00
Evan Lezar
872aa2fe1c
Reenable nvsandboxutils for driver discovery
...
This change reenables nvsandboxutils for driver discovery. This
was disabled due to an error in a specific driver version (v565)
so as to not block the release of the DRA driver for ComputeDomains.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-22 17:26:40 +02:00
Carlos Eduardo Arango Gutierrez
be6a36c023
Merge pull request #1102 from ArangoGutierrez/i/1094
...
Edit discover.mounts to have a deterministic output
2025-05-22 15:53:22 +02:00
Carlos Eduardo Arango Gutierrez
aaaa3c6275
Edit discover.mounts to have a deterministic output
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-22 15:25:03 +02:00
Evan Lezar
f93d96a0de
Merge pull request #1090 from ArangoGutierrez/hookcreator
...
Refactor the way we handle Hook Creation
2025-05-21 14:23:51 +02:00
Carlos Eduardo Arango Gutierrez
2a4cf4c0a0
[no-relnote] Update gitignore
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-21 10:19:52 +02:00
Carlos Eduardo Arango Gutierrez
cf3b9317ef
Refactor the way we create CDI Hooks
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-21 10:19:47 +02:00
Evan Lezar
6ba25e7288
Merge pull request #1095 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-caf057b
...
Bump third_party/libnvidia-container from `51a7f20` to `caf057b`
2025-05-20 20:31:49 +02:00
dependabot[bot]
296633d148
Bump third_party/libnvidia-container from 51a7f20 to caf057b
...
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `51a7f20` to `caf057b`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](51a7f20088...caf057b009)
---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
dependency-version: caf057b00987a6d874d519cc80b742c43faa859a
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
2025-05-20 13:20:23 +00:00
Evan Lezar
ac8f190c99
Merge commit from fork
...
Run update-ldcache in isolated namespaces
2025-05-16 15:15:21 +02:00
Evan Lezar
3c1f1a6519
Merge pull request #1086 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-51a7f20
...
Bump third_party/libnvidia-container from `d26524a` to `51a7f20`
2025-05-16 14:18:17 +02:00
dependabot[bot]
3ee5ff0aa2
Bump third_party/libnvidia-container from d26524a to 51a7f20
...
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `d26524a` to `51a7f20`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](d26524ab5d...51a7f20088)
---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
dependency-version: 51a7f20088dc0c3e7ddbb67629bf8e63b9130339
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
2025-05-16 08:27:41 +00:00
Evan Lezar
6dfd63f4a8
Merge pull request #980 from elezar/add-rprivate-to-mount-options
...
Add rprivate to CDI mount options
2025-05-16 07:53:39 +02:00
Evan Lezar
35e583b623
Merge pull request #1000 from elezar/ignore-unknown-hooks
...
Issue warning on unsupported CDI hook
2025-05-16 07:52:25 +02:00
Evan Lezar
7d71932d2a
Merge pull request #1085 from elezar/add-security-md
...
[no-relnote] Add SECURITY.md to repo
2025-05-16 07:51:27 +02:00
Evan Lezar
d3ea72c440
[no-relnote] Add SECURITY.md to repo
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-15 16:38:43 +02:00
Evan Lezar
c0dda358a3
Issue warning on unsupported CDI hook
...
To allow for CDI hooks to be added gradually we provide a generic no-op hook
for unrecognised subcommands. This will log a warning instead of erroring out.
An unsupported hook could be the result of a CDI specification referring to a
new hook that is not yet supported by an older NVIDIA Container Toolkit
version or a hook that has been removed in a newer version.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-15 14:05:19 +02:00
Evan Lezar
ec29b602c3
Run update-ldcache in isolated namespaces
...
This change uses the reexec package to run the ldcache update for a
container in a process with isolated namespaces.
Since the hook is invoked as a createContainer hook, these
namespaces are cloned from the container's namespaces.
In the reexec handler, we further isolate the proc filesystem,
mount the host ldconfig to a tmpfs, and pivot into the container's
root.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-15 12:45:49 +02:00
Carlos Eduardo Arango Gutierrez
241881f12f
Merge pull request #1048 from ArangoGutierrez/updated_e2e
...
[no-relnote] Update E2E test suite
2025-05-14 12:27:01 +02:00
Carlos Eduardo Arango Gutierrez
eb40f240ac
[no-relnote] Update E2E suite
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-14 11:22:14 +02:00
Evan Lezar
72b2ee9ce0
Merge pull request #1055 from elezar/add-cuda-compat-mode
...
Add nvidia-container-cli.compat-mode config option
2025-05-13 21:56:18 +02:00
Evan Lezar
f4981f0876
Add cuda-compat-mode config option
...
This change adds an nvidia-container-runtime.modes.legacy.cuda-compat-mode
config option. This can be set to one of four values:
* ldconfig (default): the --cuda-compat-mode=ldconfig flag is passed to the nvidia-container-cli
* mount: the --cuda-compat-mode=mount flag is passed to the nvidia-container-cli
* disabled: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli
* hook: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli AND the
enable-cuda-compat hook is used to provide forward compatibility.
Note that the disable-cuda-compat-lib-hook feature flag will prevent the enable-cuda-compat
hook from being used. This change also means that the allow-cuda-compat-libs-from-container
feature flag no longer has any effect.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-13 21:49:53 +02:00
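The mapping described in commit f4981f0876 can be summarised in a small sketch. The flag strings and mode names are taken from the commit message. The `compatSettings` type, the `resolveCompatMode` function, and the choice to treat an empty string as the default are assumptions for illustration, not the toolkit's actual code.

```go
package main

import "fmt"

// compatSettings captures what a given mode implies.
// NOTE: hypothetical type for illustration.
type compatSettings struct {
	cliFlag string // flag passed to the nvidia-container-cli
	useHook bool   // whether the enable-cuda-compat hook is used
}

// resolveCompatMode maps a cuda-compat-mode value to its settings.
// hookDisabled models the disable-cuda-compat-lib-hook feature flag, which
// prevents the enable-cuda-compat hook from being used.
// ASSUMPTION: an empty mode means the option is unset and selects the default.
func resolveCompatMode(mode string, hookDisabled bool) (compatSettings, error) {
	switch mode {
	case "", "ldconfig":
		return compatSettings{cliFlag: "--cuda-compat-mode=ldconfig"}, nil
	case "mount":
		return compatSettings{cliFlag: "--cuda-compat-mode=mount"}, nil
	case "disabled":
		return compatSettings{cliFlag: "--cuda-compat-mode=disabled"}, nil
	case "hook":
		return compatSettings{
			cliFlag: "--cuda-compat-mode=disabled",
			useHook: !hookDisabled,
		}, nil
	default:
		return compatSettings{}, fmt.Errorf("invalid cuda-compat-mode %q", mode)
	}
}

func main() {
	for _, mode := range []string{"ldconfig", "mount", "disabled", "hook"} {
		s, err := resolveCompatMode(mode, false)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-8s flag=%s hook=%t\n", mode, s.cliFlag, s.useHook)
	}
}
```

Note that `disabled` and `hook` pass the same flag to the CLI; they differ only in whether the hook provides forward compatibility afterwards.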
Evan Lezar
2ec67033c0
Merge pull request #1081 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-d26524a
...
Bump third_party/libnvidia-container from `a198166` to `d26524a`
2025-05-13 21:49:21 +02:00
dependabot[bot]
f8eda79aaf
Bump third_party/libnvidia-container from a198166 to d26524a
...
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `a198166` to `d26524a`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](a198166e1c...d26524ab5d)
---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
dependency-version: d26524ab5db96a55ae86033f53de50d3794fb547
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
2025-05-13 19:48:20 +00:00