Commit Graph

2361 Commits

Author SHA1 Message Date
Evan Lezar
19a83e3542
Merge pull request #1112 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-6eda4d7
Bump third_party/libnvidia-container from `caf057b` to `6eda4d7`
2025-05-28 10:33:44 +02:00
dependabot[bot]
d2344cba34
Bump third_party/libnvidia-container from caf057b to 6eda4d7
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `caf057b` to `6eda4d7`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](caf057b009...6eda4d76c8)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: 6eda4d76c8c5f8fc174e4abca83e513fb4dd63b0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-28 08:16:30 +00:00
Evan Lezar
c8c22162b7
Merge pull request #958 from ArangoGutierrez/codecov
[no-relnote] Enable Coveralls for code coverage
2025-05-27 15:48:54 +02:00
Evan Lezar
ea9b8721c0
Merge pull request #1108 from NVIDIA/dependabot/github_actions/main/NVIDIA/holodeck-0.2.9
Bump NVIDIA/holodeck from 0.2.7 to 0.2.9
2025-05-27 10:21:15 +02:00
dependabot[bot]
eaaa8536e4
Bump NVIDIA/holodeck from 0.2.7 to 0.2.9
Bumps [NVIDIA/holodeck](https://github.com/nvidia/holodeck) from 0.2.7 to 0.2.9.
- [Release notes](https://github.com/nvidia/holodeck/releases)
- [Commits](https://github.com/nvidia/holodeck/compare/v0.2.7...v0.2.9)

---
updated-dependencies:
- dependency-name: NVIDIA/holodeck
  dependency-version: 0.2.9
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-27 08:19:31 +00:00
Carlos Eduardo Arango Gutierrez
e955f65d8f
[no-relnote] Enable Coveralls
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-23 13:53:45 +02:00
Evan Lezar
b934c68bef
Merge pull request #1103 from elezar/reenable-nvsandboxutils
Reenable nvsandboxutils for driver discovery
2025-05-23 11:38:46 +02:00
Evan Lezar
7bd65da91e
Add FeatureFlags to the nvcdi API
This change adds support for feature flags to the nvcdi API.

A feature flag to disable nvsandboxutils is also added to allow
more flexibility in cases where this library causes issues.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-22 17:26:41 +02:00
Evan Lezar
872aa2fe1c
Reenable nvsandboxutils for driver discovery
This change reenables nvsandboxutils for driver discovery. This
was disabled due to an error in a specific driver version (v565)
so as to not block the release of the DRA driver for ComputeDomains.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-22 17:26:40 +02:00
Carlos Eduardo Arango Gutierrez
be6a36c023
Merge pull request #1102 from ArangoGutierrez/i/1094
Edit discover.mounts to have a deterministic output
2025-05-22 15:53:22 +02:00
Carlos Eduardo Arango Gutierrez
aaaa3c6275
Edit discover.mounts to have a deterministic output
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-22 15:25:03 +02:00
Evan Lezar
f93d96a0de
Merge pull request #1090 from ArangoGutierrez/hookcreator
Refactor the way we handle Hook Creation
2025-05-21 14:23:51 +02:00
Carlos Eduardo Arango Gutierrez
2a4cf4c0a0
[no-relnote] Update gitignore
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-21 10:19:52 +02:00
Carlos Eduardo Arango Gutierrez
cf3b9317ef
Refactor the way we create CDI Hooks
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2025-05-21 10:19:47 +02:00
Evan Lezar
6ba25e7288
Merge pull request #1095 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-caf057b
Bump third_party/libnvidia-container from `51a7f20` to `caf057b`
2025-05-20 20:31:49 +02:00
dependabot[bot]
296633d148
Bump third_party/libnvidia-container from 51a7f20 to caf057b
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `51a7f20` to `caf057b`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](51a7f20088...caf057b009)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: caf057b00987a6d874d519cc80b742c43faa859a
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-20 13:20:23 +00:00
Evan Lezar
ac8f190c99
Merge commit from fork
Run update-ldcache in isolated namespaces
2025-05-16 15:15:21 +02:00
Evan Lezar
3c1f1a6519
Merge pull request #1086 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-51a7f20
Bump third_party/libnvidia-container from `d26524a` to `51a7f20`
2025-05-16 14:18:17 +02:00
dependabot[bot]
3ee5ff0aa2
Bump third_party/libnvidia-container from d26524a to 51a7f20
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `d26524a` to `51a7f20`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](d26524ab5d...51a7f20088)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: 51a7f20088dc0c3e7ddbb67629bf8e63b9130339
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-16 08:27:41 +00:00
Evan Lezar
6dfd63f4a8
Merge pull request #980 from elezar/add-rprivate-to-mount-options
Add rprivate to CDI mount options
2025-05-16 07:53:39 +02:00
Evan Lezar
35e583b623
Merge pull request #1000 from elezar/ignore-unknown-hooks
Issue warning on unsupported CDI hook
2025-05-16 07:52:25 +02:00
Evan Lezar
7d71932d2a
Merge pull request #1085 from elezar/add-security-md
[no-relnote] Add SECURITY.md to repo
2025-05-16 07:51:27 +02:00
Evan Lezar
d3ea72c440
[no-relnote] Add SECURITY.md to repo
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-15 16:38:43 +02:00
Evan Lezar
c0dda358a3
Issue warning on unsupported CDI hook
To allow for CDI hooks to be added gradually we provide a generic no-op hook
for unrecognised subcommands. This will log a warning instead of erroring out.

An unsupported hook could be the result of a CDI specification referring to a
new hook that is not yet supported by an older NVIDIA Container Toolkit
version or a hook that has been removed in a newer version.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-15 14:05:19 +02:00
Evan Lezar
ec29b602c3
Run update-ldcache in isolated namespaces
This change uses the reexec package to run the update of the
ldcache in a container in a process with isolated namespaces.
Since the hook is invoked as a createContainer hook, these
namespaces are cloned from the container's namespaces.

In the reexec handler, we further isolate the proc filesystem,
mount the host ldconfig to a tmpfs, and pivot into the container's
root.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-15 12:45:49 +02:00
Carlos Eduardo Arango Gutierrez
241881f12f
Merge pull request #1048 from ArangoGutierrez/updated_e2e
[no-relnote] Update E2E test suite
2025-05-14 12:27:01 +02:00
Carlos Eduardo Arango Gutierrez
eb40f240ac
[no-relnote] Update E2E suite
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-14 11:22:14 +02:00
Evan Lezar
72b2ee9ce0
Merge pull request #1055 from elezar/add-cuda-compat-mode
Add nvidia-container-cli.compat-mode config option
2025-05-13 21:56:18 +02:00
Evan Lezar
f4981f0876
Add cuda-compat-mode config option
This change adds an nvidia-container-runtime.modes.legacy.cuda-compat-mode
config option. This can be set to one of four values:

* ldconfig (default): the --cuda-compat-mode=ldconfig flag is passed to the nvidia-container-cli
* mount: the --cuda-compat-mode=mount flag is passed to the nvidia-container-cli
* disabled: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli
* hook: the --cuda-compat-mode=disabled flag is passed to the nvidia-container-cli AND the
  enable-cuda-compat hook is used to provide forward compatibility.

Note that the disable-cuda-compat-lib-hook feature flag will prevent the enable-cuda-compat
hook from being used. This change also means that the allow-cuda-compat-libs-from-container
feature flag no longer has any effect.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-13 21:49:53 +02:00
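The four values listed in "Add cuda-compat-mode config option" can be summarised as a small mapping from the configured mode to the CLI flag and to whether the enable-cuda-compat hook is used. The sketch below is illustrative only: `resolveCUDACompatMode` and its parameters are hypothetical names, and treating an empty value as the `ldconfig` default is an assumption based on the "(default)" note in the commit message.

```go
package main

import "fmt"

// resolveCUDACompatMode is a hypothetical helper. It returns the flag to
// pass to nvidia-container-cli and whether the enable-cuda-compat hook
// should be used. hookDisabled models the disable-cuda-compat-lib-hook
// feature flag mentioned in the commit message.
func resolveCUDACompatMode(mode string, hookDisabled bool) (cliFlag string, useHook bool, err error) {
	switch mode {
	case "", "ldconfig": // assumption: an empty value selects the default
		return "--cuda-compat-mode=ldconfig", false, nil
	case "mount":
		return "--cuda-compat-mode=mount", false, nil
	case "disabled":
		return "--cuda-compat-mode=disabled", false, nil
	case "hook":
		// The CLI's own handling is disabled; the hook provides forward
		// compatibility unless the feature flag turns it off.
		return "--cuda-compat-mode=disabled", !hookDisabled, nil
	default:
		return "", false, fmt.Errorf("invalid cuda-compat-mode %q", mode)
	}
}

func main() {
	for _, mode := range []string{"ldconfig", "mount", "disabled", "hook"} {
		flag, useHook, err := resolveCUDACompatMode(mode, false)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-8s flag=%s hook=%v\n", mode, flag, useHook)
	}
}
```

Note that `disabled` and `hook` pass the same flag to the CLI; they differ only in whether the hook is injected.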
Evan Lezar
2ec67033c0
Merge pull request #1081 from NVIDIA/dependabot/submodules/main/third_party/libnvidia-container-d26524a
Bump third_party/libnvidia-container from `a198166` to `d26524a`
2025-05-13 21:49:21 +02:00
dependabot[bot]
f8eda79aaf
Bump third_party/libnvidia-container from a198166 to d26524a
Bumps [third_party/libnvidia-container](https://github.com/NVIDIA/libnvidia-container) from `a198166` to `d26524a`.
- [Release notes](https://github.com/NVIDIA/libnvidia-container/releases)
- [Commits](a198166e1c...d26524ab5d)

---
updated-dependencies:
- dependency-name: third_party/libnvidia-container
  dependency-version: d26524ab5db96a55ae86033f53de50d3794fb547
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-13 19:48:20 +00:00
Evan Lezar
51504097d8
Merge pull request #1078 from elezar/add-thor-support
Fix mode detection on Thor-based systems
2025-05-13 21:33:25 +02:00
Evan Lezar
a4dc28bb3f
Fix mode detection on Thor-based systems
This change updates github.com/NVIDIA/go-nvlib from v0.7.1 to v0.7.2
to allow Thor systems to be detected as Tegra-based. This allows
automatic mode detection to work on these systems.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-13 21:25:11 +02:00
Evan Lezar
d0103aa6a3
Add rprivate to CDI mount options
This ensures that mount propagation is set to rprivate for
mounts from the host into the container. This aligns with the
default in docker.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-09 15:16:13 +02:00
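As a rough illustration of "Add rprivate to CDI mount options", the sketch below appends `rprivate` to a list of mount options unless a propagation option is already set. The helper name `withRPrivate` and the example option list are hypothetical; the option names themselves are the mount propagation options defined by the OCI runtime specification. The commit message does not say how existing propagation options are handled, so leaving them untouched is an assumption.

```go
package main

import "fmt"

// propagationOptions lists the mount propagation options recognised by
// the OCI runtime specification.
var propagationOptions = map[string]bool{
	"private": true, "rprivate": true,
	"shared": true, "rshared": true,
	"slave": true, "rslave": true,
	"unbindable": true, "runbindable": true,
}

// withRPrivate returns a copy of opts with "rprivate" appended, unless
// opts already contains a propagation option (assumed behaviour).
func withRPrivate(opts []string) []string {
	out := append([]string(nil), opts...)
	for _, o := range out {
		if propagationOptions[o] {
			return out
		}
	}
	return append(out, "rprivate")
}

func main() {
	// The base options here are illustrative, not taken from the toolkit.
	fmt.Println(withRPrivate([]string{"ro", "nosuid", "nodev", "bind"}))
}
```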
Evan Lezar
adb5e6719d
Merge pull request #1046 from elezar/resolve-ldcache-libs-on-arm64
Fix resolution of libs in LDCache on ARM
2025-05-09 15:04:00 +02:00
Evan Lezar
0c254711e7
Merge pull request #1066 from NVIDIA/dependabot/docker/deployments/container/main/nvidia/cuda-12.9.0-base-ubuntu20.04
Bump nvidia/cuda from 12.8.1-base-ubuntu20.04 to 12.9.0-base-ubuntu20.04 in /deployments/container
2025-05-09 13:51:10 +02:00
Evan Lezar
27adebaa44
Merge pull request #1065 from elezar/skip-nill-discoverers
Skip nil discoverers in merge
2025-05-09 13:50:44 +02:00
dependabot[bot]
496cdb5463
Bump nvidia/cuda in /deployments/container
Bumps nvidia/cuda from 12.8.1-base-ubuntu20.04 to 12.9.0-base-ubuntu20.04.

---
updated-dependencies:
- dependency-name: nvidia/cuda
  dependency-version: 12.9.0-base-ubuntu20.04
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-08 08:38:12 +00:00
Evan Lezar
132c9afb6c
Merge pull request #1063 from NVIDIA/dependabot/github_actions/main/slackapi/slack-github-action-2.1.0
Bump slackapi/slack-github-action from 2.0.0 to 2.1.0
2025-05-07 17:19:08 +02:00
Evan Lezar
c879fb59c1
Merge pull request #1058 from NVIDIA/dependabot/github_actions/main/golangci/golangci-lint-action-8
Bump golangci/golangci-lint-action from 7 to 8
2025-05-07 16:54:45 +02:00
Evan Lezar
fbff2c4943
Merge pull request #1064 from NVIDIA/dependabot/docker/deployments/devel/main/golang-1.24.3
Bump golang from 1.24.2 to 1.24.3 in /deployments/devel
2025-05-07 16:54:18 +02:00
Evan Lezar
0c765c6536
Skip nil discoverers in merge
When constructing a list of discoverers using discover.Merge we
explicitly skip `nil` discoverers to simplify usage as we don't
have to explicitly check validity when processing the discoverers
in the list.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2025-05-07 12:51:38 +02:00
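The behaviour described in "Skip nil discoverers in merge" can be sketched as follows. The `Discoverer` interface here is a simplified, hypothetical stand-in with a single method, and `static` and `list` are helper types invented for the example; only the name `Merge` comes from the commit message.

```go
package main

import "fmt"

// Discoverer is a hypothetical, reduced stand-in for the toolkit's
// discoverer interface.
type Discoverer interface {
	Mounts() ([]string, error)
}

// static is a trivial Discoverer that returns a fixed list of mounts.
type static []string

func (s static) Mounts() ([]string, error) { return s, nil }

// list aggregates the results of several Discoverers in order.
type list []Discoverer

func (l list) Mounts() ([]string, error) {
	var all []string
	for _, d := range l {
		m, err := d.Mounts()
		if err != nil {
			return nil, err
		}
		all = append(all, m...)
	}
	return all, nil
}

// Merge combines the given discoverers, skipping nil entries so callers
// do not have to filter them first. Note that this check only catches
// nil interface values, not typed nil pointers stored in the interface.
func Merge(ds ...Discoverer) Discoverer {
	var merged list
	for _, d := range ds {
		if d == nil {
			continue
		}
		merged = append(merged, d)
	}
	return merged
}

func main() {
	d := Merge(static{"/dev/nvidia0"}, nil, static{"/dev/nvidiactl"})
	mounts, err := d.Mounts()
	if err != nil {
		panic(err)
	}
	fmt.Println(mounts) // prints [/dev/nvidia0 /dev/nvidiactl]
}
```

Filtering at construction time means the aggregate's methods can call each element unconditionally, which is the simplification the commit message refers to.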
dependabot[bot]
0863749de3
Bump golang from 1.24.2 to 1.24.3 in /deployments/devel
Bumps golang from 1.24.2 to 1.24.3.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-07 08:53:04 +00:00
dependabot[bot]
a8ca8e91f2
Bump slackapi/slack-github-action from 2.0.0 to 2.1.0
Bumps [slackapi/slack-github-action](https://github.com/slackapi/slack-github-action) from 2.0.0 to 2.1.0.
- [Release notes](https://github.com/slackapi/slack-github-action/releases)
- [Commits](https://github.com/slackapi/slack-github-action/compare/v2.0.0...v2.1.0)

---
updated-dependencies:
- dependency-name: slackapi/slack-github-action
  dependency-version: 2.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-07 08:29:10 +00:00
Evan Lezar
cf395e765a
Merge pull request #1061 from NVIDIA/dependabot/go_modules/main/golang.org/x/sys-0.33.0
Bump golang.org/x/sys from 0.32.0 to 0.33.0
2025-05-06 14:32:35 +02:00
dependabot[bot]
f859c9a671
Bump golang.org/x/sys from 0.32.0 to 0.33.0
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.32.0 to 0.33.0.
- [Commits](https://github.com/golang/sys/compare/v0.32.0...v0.33.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.33.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-06 09:31:22 +00:00
Carlos Eduardo Arango Gutierrez
f50e815837
Merge pull request #1062 from NVIDIA/dependabot/go_modules/tests/main/golang.org/x/crypto-0.38.0
Bump golang.org/x/crypto from 0.37.0 to 0.38.0 in /tests
2025-05-06 11:30:11 +02:00
dependabot[bot]
ffcef4f9a8
Bump golang.org/x/crypto from 0.37.0 to 0.38.0 in /tests
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.37.0 to 0.38.0.
- [Commits](https://github.com/golang/crypto/compare/v0.37.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.38.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-06 08:42:33 +00:00
dependabot[bot]
194a1663ab
Bump golangci/golangci-lint-action from 7 to 8
Bumps [golangci/golangci-lint-action](https://github.com/golangci/golangci-lint-action) from 7 to 8.
- [Release notes](https://github.com/golangci/golangci-lint-action/releases)
- [Commits](https://github.com/golangci/golangci-lint-action/compare/v7...v8)

---
updated-dependencies:
- dependency-name: golangci/golangci-lint-action
  dependency-version: '8'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-05 09:11:37 +00:00
Evan Lezar
51d603aec6
Merge pull request #1024 from NVIDIA/dependabot/go_modules/tests/main/golang.org/x/crypto-0.37.0
Bump golang.org/x/crypto from 0.36.0 to 0.37.0 in /tests
2025-05-02 13:14:34 +02:00