86 Commits

Author SHA1 Message Date
Evan Lezar
d60aa34a78 Add function to get the PCI bus ID for a device
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-06-17 14:56:19 +02:00
Kevin Klues
20ba32166d Merge pull request #38 from PiotrProkop/add-vfs-info
feat: add additional SRIOV info to NvidiaPciDevice
2024-06-07 14:23:53 +02:00
PiotrProkop
bf3f431fc8 feat: add additional SRIOV info to NvidiaPciDevice
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2024-06-07 14:12:33 +02:00
Evan Lezar
67ba04ed26 Make nvmllib a required argument to devicelib
In order to ensure consistent usage, we add an explicit
argument for an nvml.Interface implementation to the
device.New constructor.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-27 17:16:14 +02:00
Evan Lezar
21c8f035ca Add ResolvePlatform function to info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-02 14:09:59 +02:00
Evan Lezar
d1e08f17ea Add UsesOnlyNVGPUModule check to PropertyExtractor interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-02 14:09:38 +02:00
Evan Lezar
791d093c62 Refactor info API
This change adds a PropertyExtractor interface to encapsulate functions
that query a system for certain capabilities.

The IsTegraSystem function has been renamed to HasTegraFiles and marked
as Deprecated.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-05-02 14:08:04 +02:00
Kevin Klues
153699bb93 Update to incorporate go-nvml updates to expose interface types
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2024-04-12 21:19:30 +00:00
Carlos Eduardo Arango Gutierrez
48789b76df Address golangci-lint warnings
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2024-04-04 15:24:10 +02:00
Evan Lezar
fb0dc9d525 Add ComputeMode constants
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-12 17:23:12 +02:00
Evan Lezar
2feaa48250 Add SetComputeMode method to Device
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-11 16:25:04 +02:00
Evan Lezar
06cbc571ef Add nvmlDeviceHandle function to Device interface
This change allows the underlying device handle to be returned
without relying on type-casting.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-01-09 14:05:01 +01:00
Christopher Desiniotis
177e4eef6f Add an Identifier type to nvlib/device which implements common parsing of GPU indices and UUIDs
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-06 16:42:51 -08:00
Evan Lezar
9fd385bace Merge pull request #7 from NVIDIA/add-nvlink-functions
Add functions related to NVLink info
2023-11-16 16:09:31 +01:00
Evan Lezar
2d9404b131 Rename go module to github.com/NVIDIA/go-nvlib
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-15 17:58:43 +01:00
Evan Lezar
e2e221a166 Run go fmt on pciids/pciids.go
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-14 13:37:40 +01:00
Evan Lezar
80d61efe5d Add functions related to NVLink info
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-14 12:55:16 +01:00
Evan Lezar
30ca72faaf TOFIX: Allow libname to be specified
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-23 14:09:03 +02:00
Evan Lezar
278851d719 Use GetLibrary().Lookup() in nvml package
This change uses the GetLibrary().Lookup() function in the nvml package
to check whether a particular function is available. This avoids
the need to explicitly open a library, for example.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-20 16:56:56 +02:00
Tariq Ibrahim
e7e9adaebd add make target to update default pciids file
2023-09-22 12:37:41 -07:00
Tariq Ibrahim
4bbcda1940 update default pci_ids db
2023-08-28 16:01:27 -07:00
Christopher Desiniotis
aa1c216841 Add a local logging interface for nvpci
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-08-17 09:57:29 -07:00
Christopher Desiniotis
114da86794 Generate warnings instead of errors for unknown device / class ids in the PCI database.
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-07-31 11:33:00 -07:00
Christopher Desiniotis
1b3ef9bd64 Update pciids interface to return errors for invalid vendor / device ids
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-06-13 11:18:21 -07:00
Christopher Desiniotis
066d8f30bc Allow options to be passed when creating an instance of the nvpci interface
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-06-09 17:27:31 -07:00
Christopher Desiniotis
76018d282e Allow clients of the pciids API to set the pci.ids filepath
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-06-09 16:06:51 -07:00
Evan Lezar
62eb401f91 Check if device is MIG Capable when visiting MIG devices
This change updates Device.VisitMigDevices to align with
Device.VisitMigProfiles in that the function is skipped for
non-MIG-capable devices. This allows the function to always
be a no-op on older drivers where MIG is not supported.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:24:11 +02:00
Kevin Klues
18ad7cd513 Merge branch 'add-brand' into 'main'
Pass device.GetBrand() through from NVML and wrap it to print a string

See merge request nvidia/cloud-native/go-nvlib!37
2023-03-27 17:12:25 +00:00
Kevin Klues
8d1b98baa6 Fix bug where MigProfile.Equals() would not work with wrapper type
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-27 16:43:56 +00:00
Kevin Klues
2b4f40a90b Extract MockNVDeviceLib into helper function in nvdev tests
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-27 16:42:42 +00:00
Kevin Klues
82adde1bf4 Remove redundant tests and fix misleading tests
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-27 10:08:21 +00:00
Kevin Klues
18957773f2 Add function for AssertValidMigProfileFormat
This does not verify that the profile is a valid profile for the current
platform, but rather that it simply adheres to the proper formatting of a MIG
profile string.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-27 10:04:32 +00:00
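
The difference between a format check and a platform check, as described in the commit above, can be illustrated with a small sketch. The pattern and the helper name hasValidMigProfileFormat are assumptions made for illustration, based on commonly seen profile names such as "1g.5gb" or "1c.3g.20gb"; they are not the rules implemented by AssertValidMigProfileFormat.

```go
package main

import (
	"fmt"
	"regexp"
)

// migProfileFormat is an ASSUMED pattern for MIG profile strings:
// an optional "<C>c." prefix, then "<G>g.<GB>gb", then an optional
// "+attr[,attr...]" suffix. It is illustrative only.
var migProfileFormat = regexp.MustCompile(`^(\d+c\.)?\d+g\.\d+gb(\+[a-z0-9]+(,[a-z0-9]+)*)?$`)

// hasValidMigProfileFormat (hypothetical helper) reports whether the string
// is shaped like a MIG profile. It does not check whether any GPU on the
// current platform actually supports that profile.
func hasValidMigProfileFormat(profile string) bool {
	return migProfileFormat.MatchString(profile)
}

func main() {
	for _, p := range []string{"1g.5gb", "1c.3g.20gb", "1g.5gb+me", "5gb.1g", ""} {
		fmt.Printf("%q -> %v\n", p, hasValidMigProfileFormat(p))
	}
}
```

A string such as "99g.999gb" would pass this kind of check even though no device offers such a profile, which is the distinction the commit message draws.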
Kevin Klues
087de4f458 Pass device.GetBrand() through from NVML and wrap it to print a string
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-26 21:15:51 +00:00
Kevin Klues
8c50f9f18f Fix bug in heuristic for which MIG profiles to skip
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-25 22:01:20 +00:00
Kevin Klues
500a464b22 Cache mig profiles in devicelib, not just each device
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-25 18:48:18 +00:00
Kevin Klues
631bde023f Add ability to query device architecture and CUDA compute capability
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-24 14:24:19 +00:00
Kevin Klues
642041d1e0 Update mig-profile parsing / name generation after go-nvml v12.0 bump
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-23 19:29:57 +00:00
Evan Lezar
bcbaf5a0de Add HasDXCore to info package
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 16:04:35 +01:00
Kevin Klues
264c5dab79 Add NewDeviceByUUID() and NewMigDeviceByUUID() calls to nvlib.device
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-12-08 14:53:50 +00:00
Kevin Klues
5d4be6ac55 Regenerate mocks for NVML
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-12-08 14:53:45 +00:00
Kevin Klues
6a4886e49e Add Placement related calls for GPUInstances in nvml wrapper
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-12-08 14:53:39 +00:00
Evan Lezar
7e5501f6a3 Skip DGX Display devices in addition to NVIDIA DGX Display devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-07 11:40:09 +01:00
Evan Lezar
1fc1eee392 Remove WithSelecteDeviceClasses option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 15:47:44 +01:00
Evan Lezar
655eb9795c Skip display devices based on device names
This allows devices to be skipped based on device names and
skips "NVIDIA DGX Display" devices by default.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 15:46:15 +01:00
Evan Lezar
fa5d0408ce Ensure pci bus ID is lower case
The PCI Bus ID returned by NVML is upper case and results in the following error:

error getting PCI device class for device:
failed to construct PCI device:
unable to read PCI device vendor id for 0000:0A:00.0:
open /sys/bus/pci/devices/0000:0A:00.0/vendor:
no such file or directory

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-16 12:12:07 +01:00
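
The failure quoted in the commit above comes from building a sysfs path out of an upper-case bus ID. A minimal sketch of the normalization follows, assuming a hypothetical helper named sysfsVendorPath; it is not the code from the commit itself.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// sysfsVendorPath (hypothetical helper) builds the path of the "vendor" file
// for a PCI device. The bus ID is lower-cased first because the entries under
// /sys/bus/pci/devices use lower-case hex digits, whereas the bus ID in the
// error above ("0000:0A:00.0") is upper case.
func sysfsVendorPath(busID string) string {
	return path.Join("/sys/bus/pci/devices", strings.ToLower(busID), "vendor")
}

func main() {
	// Prints: /sys/bus/pci/devices/0000:0a:00.0/vendor
	fmt.Println(sysfsVendorPath("0000:0A:00.0"))
}
```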
Evan Lezar
e37e145458 Add filtering of devices based on PCI device class
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-16 10:30:49 +01:00
Evan Lezar
f156c34310 Add private constructor for creating a device
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-15 17:42:22 +01:00
Evan Lezar
e96d9c58f1 Add GetGPUByPciBusID to nvpci.Interface
This change adds a GetGPUByPciBusID method to the nvpci Interface.
The existing NewDevice function is moved to nvmdev where it is used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-15 17:42:22 +01:00
Zvonko Kaiser
f3102f8dcb Added PCI IDS support and DPU detection
2022-11-02 03:58:13 -07:00
Evan Lezar
8b5e3d224d Ensure that invalid MIG profiles are skipped
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-14 10:31:50 +02:00