Commit Graph

373 Commits

Author SHA1 Message Date
Evan Lezar
a4bfccc3fe Use include-persistenced-socket feature for CDI mode
This change ensures that the internal CDI representation includes
the persistenced socket if the include-persistenced-socket feature
flag is enabled.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:30:27 +02:00
Evan Lezar
ba1ed3232f Skip injection of nvidia-persistenced socket by default
This changes skips the injection of the nvidia-persistenced socket by
default.

An include-persistenced-socket feature flag is added to allow the
injection of this socket to be explicitly requested.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-09-18 22:10:09 +02:00
Evan Lezar
c3c0cdcc89
Merge pull request #660 from elezar/fix-libnvidia-allocator-duplicate-mount
Exclude libnvidia-allocator from graphics mounts
2024-08-22 14:00:25 +02:00
Evan Lezar
beb921fafe Exclude libnvidia-allocator from graphics mounts
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-08-20 16:28:14 +02:00
Evan Lezar
5316c2a618
Merge pull request #657 from yeahdongcn/typo
Rename icp_test.go to ipc_test.go
2024-08-19 11:38:56 +02:00
dependabot[bot]
070b40d62a Bump github.com/golangci/golangci-lint in /deployments/devel
Bumps [github.com/golangci/golangci-lint](https://github.com/golangci/golangci-lint) from 1.59.1 to 1.60.1.
- [Release notes](https://github.com/golangci/golangci-lint/releases)
- [Changelog](https://github.com/golangci/golangci-lint/blob/master/CHANGELOG.md)
- [Commits](https://github.com/golangci/golangci-lint/compare/v1.59.1...v1.60.1)

---
updated-dependencies:
- dependency-name: github.com/golangci/golangci-lint
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-19 11:31:11 +02:00
Xiaodong Ye
13862f3c75 Rename icp_test.go to ipc_test.go
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
2024-08-19 15:39:54 +08:00
Evan Lezar
122136fe92
Merge pull request #627 from elezar/add-mig-cdi-test
[no-relnotes] Add initial unit test for MIG CDI spec generation
2024-08-01 11:00:07 +02:00
Evan Lezar
fa16d83494 [no-relnotes] Add initial unit test for MIG CDI spec generation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-31 11:56:36 +02:00
Christopher Desiniotis
d8ccd50349
[bugfix] Process error type instead of nvml.Return in internal/platform-support/dgpu
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2024-07-22 13:57:14 -07:00
Evan Lezar
448a3853ad
Merge pull request #552 from elezar/refactor-dgpu-discovery
Refactor dGPU device discovery
2024-07-10 11:48:30 +02:00
Evan Lezar
e527cc1ff5 Use relative path to locate driver libraries
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-09 13:54:07 +02:00
Evan Lezar
55ea268829 Add RelativeToRoot function to Driver
This adds a function to return a path as a path relative to
the specified driver root.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-09 13:54:07 +02:00
Seungmin Kim
aae3da88c3 Inject additional libraries for full X11 functionality
This change updates X11 library detection to ensure that this
works for a wider range of driver installations.

Signed-off-by: Seungmin Kim <8457324+ehfd@users.noreply.github.com>
Signed-off-by: tux-rampage <tuxrampage@gmail.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-09 13:53:55 +02:00
Evan Lezar
be11cf428b [no-relnote] Add MIG discoverer to dgpu package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-09 13:34:04 +02:00
Evan Lezar
b42a5d3e3a [no-relnote] Refactor dGPU device discovery
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-09 13:34:04 +02:00
Evan Lezar
3aeba886d4 Reduce logging for the NVIDIA Container runtime
This change moves most of the logging for the NVIDIA Container runtime
from Info to Debug or Trace -- especially in the case where no
modifications are requried.

This removes execessive logging.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-01 12:12:38 +02:00
Evan Lezar
490c7dd599 Add Tracef to logger Interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-01 12:12:38 +02:00
Evan Lezar
8df5e33ef6 Add String function to oci.Runtime interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-01 12:12:38 +02:00
Evan Lezar
2c8431c1f8 Fix bug in argument parsing for logger creation
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-07-01 12:03:53 +02:00
German Maltsev
692dac4cbd [no-relnote] Fixed spelling error
Signed-off-by: German Maltsev <maltsev.german@gmail.com>
2024-06-19 17:40:56 +03:00
Evan Lezar
5f2be72335 Support vulkan ICD files in a driver root
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-06-17 11:33:29 +02:00
Evan Lezar
ef57c07199 Bump github.com/NVIDIA/go-nvlib to v0.5.0
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-28 13:28:28 +02:00
Evan Lezar
edda11d647
Merge pull request #428 from elezar/fix-cdi-mode-resolution
Fix cdi mode resolution
2024-05-21 13:22:10 +02:00
Evan Lezar
3defc6babb Use go-nvlib mode resolution
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-21 12:25:54 +02:00
Avi Deitcher
179d8655f9 Move nvidia-ctk hook command into own binary
This change creates an nvidia-cdi-hook binary for implementing
CDI hooks. This allows for these hooks to be separated from the
nvidia-ctk command which may, for example, require libnvidia-ml
to support other functionality.

The nvidia-ctk hook subcommand is maintained as an alias for the
time being to allow for existing CDI specifications referring to
this path to work as expected.

Signed-off-by: Avi Deitcher <avi@deitcher.net>
2024-05-21 12:19:44 +02:00
Evan Lezar
d5f6e6f868 Use nvml/mock package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-18 14:53:37 +02:00
Evan Lezar
082ce066ed Replace go-nvlib/pkg/nvml with go-nvml/pkg/nvml
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-18 14:53:37 +02:00
Jared Baur
5788e622f4 Use XDG_DATA_DIRS instead of hardcoding /usr/share
When running nvidia-ctk on a system that uses a custom XDG_DATA_DIRS
environment variable value, the configuration files for `glvnd`,
`vulkan`, and `egl` fail to get passed through from the host to the
container. Reading from XDG_DATA_DIRS instead of hardcoding the default
value allows for finding said files so they can be mounted in the
container.

Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-11 17:13:02 +02:00
Evan Lezar
29c0f82ed2
Merge pull request #327 from elezar/add-driver-config
Add config search path option to driver root
2024-04-11 16:58:33 +02:00
Evan Lezar
011c658945 Create root.Driver instance at first usage
This allows for testing through injection of the driver root.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-03 15:03:33 +02:00
Evan Lezar
09341a0934 Add support for feature flags
This change adds a features config that allows
individual features to be toggled at a global level. Each feature can (by default)
be controlled by an environment variable.

The GDS, MOFED, NVSWITCH, and GDRCOPY features are examples of such features.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-03 11:58:37 +02:00
Evan Lezar
2a9e3537ec Add config search paths option to driver root.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-02 23:03:05 +02:00
Evan Lezar
643b89e539 Add driver.Config
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-25 20:18:15 +02:00
Evan Lezar
93763d25f0 Use functional options to construct driver root
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-25 20:13:25 +02:00
Jakub Bujak
44ae31d101 Use D3DKMTEnumAdapters3 for adapter enumeration
D3DKMTEnumAdapters3 is required to enumerate MCDM compute-only adapters

Signed-off-by: Jakub Bujak <jbujak@nvidia.com>
2024-03-12 11:42:08 +01:00
Evan Lezar
6780afbed1
Merge pull request #379 from NVIDIA/exists-fallback
[R550 driver support] add fallback logic to device.Exists(name)
2024-02-27 22:25:33 +02:00
Tariq Ibrahim
f80f4c485d [R550 driver support] add fallback logic to device.Exists(name)
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
2024-02-27 11:59:35 -08:00
Evan Lezar
a2a1a78620
Merge pull request #330 from tariq1890/nvidia-dev-maj-num-lookup
add fallback logic when retrieving major number of the nvidia control device
2024-02-12 17:22:32 +01:00
Evan Lezar
e64b723b71 Add proc.devices.New constructor
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-02-06 11:19:43 +01:00
Tariq Ibrahim
f414ac2865 add fallback logic when retrieving major number of the nvidia control device
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
2024-02-05 22:55:54 -08:00
Evan Lezar
772cf77dcc Fix build and test on darwin
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-02-05 23:58:28 +01:00
Christopher Desiniotis
55097b3d7d Add a new gated modifier for GDRCopy which injects the gdrdrv device node
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2024-01-24 14:25:58 -08:00
Tariq Ibrahim
9c1f0bb08b fix minor typos and rm unused logger param
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
2024-01-22 16:48:11 -08:00
Jared Baur
838493b8b9
Allow for customizing the path to ldconfig
Since the `createContainer` `runc` hook runs with the environment that
the container's config.json specifies, the path to `ldconfig` may not be
easily resolvable if the host environment differs enough from the
container (e.g. on a NixOS host where all binaries are under hashed
paths in /nix/store with an Ubuntu container whose PATH contains
FHS-style paths such as /bin and /usr/bin). This change allows for
specifying exactly where ldconfig comes from.

Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
2024-01-17 21:07:00 -08:00
Evan Lezar
f6c252cbde Add crun as a default low-level runtime.
This change adds crun as a configured low-level runtime.
Note that runc still preferred and will be used if present on the
system.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-01-17 11:31:07 +01:00
Evan Lezar
9c029cac72 Fix bug in determining CLI user on SUSE systems
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-01-11 13:54:40 +01:00
Evan Lezar
c90211e070 Log explicitly requested runtime mode
For users running the nvidia-container-runtime it would be useful
to determine the runtime mode used from the logs directly instead
of relying on other log messages as signals. This change ensures
that an explicitly selected mode is also logged instead of only
when mode=auto.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-12-15 15:35:35 +01:00
Jared Baur
95b8ebc297
Use devRoot for discovering character devices on Tegra platforms
Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
2023-12-14 11:46:21 -08:00
Jared Baur
508438a0c5
Fix using devRoot on Tegra platforms
Using `WithDevRoot` on Tegra platforms was incorrectly setting
`driverRoot`, fix it so that it correctly sets `devRoot`.

Signed-off-by: Jared Baur <jaredbaur@fastmail.com>
2023-12-13 19:56:02 -08:00