Kevin Klues
153699bb93
Update to incorporate go-nvml updates to expose interface types
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2024-04-12 21:19:30 +00:00
Carlos Eduardo Arango Gutierrez
48789b76df
Address golangci-lint warnings
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2024-04-04 15:24:10 +02:00
Evan Lezar
fb0dc9d525
Add ComputeMode constants
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-12 17:23:12 +02:00
Evan Lezar
2feaa48250
Add SetComputeMode method to Device
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-03-11 16:25:04 +02:00
Evan Lezar
06cbc571ef
Add nvmlDeviceHandle function to Device interface
...
This change allows the underlying device handle to be returned
without relying on type-casting.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-01-09 14:05:01 +01:00
Christopher Desiniotis
177e4eef6f
Add an Identifier type to nvlib/device which implements common parsing of GPU indices and UUIDs
...
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-12-06 16:42:51 -08:00
Evan Lezar
9fd385bace
Merge pull request #7 from NVIDIA/add-nvlink-functions
...
Add functions related to NVLink info
2023-11-16 16:09:31 +01:00
Evan Lezar
2d9404b131
Rename go module to github.com/NVIDIA/go-nvlib
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-15 17:58:43 +01:00
Evan Lezar
e2e221a166
Run go fmt on pciids/pciids.go
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-14 13:37:40 +01:00
Evan Lezar
80d61efe5d
Add functions related to NVLink info
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-14 12:55:16 +01:00
Evan Lezar
30ca72faaf
TOFIX: Allow libname to be specified
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-23 14:09:03 +02:00
Evan Lezar
278851d719
Use GetLibrary().Lookup() in nvml package
...
This change uses the GetLibrary().Lookup() function in the nvml package
to check whether a particular function is available. This avoids
the need to explicitly open a library, for example.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-10-20 16:56:56 +02:00
Tariq Ibrahim
e7e9adaebd
add make target to update default pciids file
2023-09-22 12:37:41 -07:00
Tariq Ibrahim
4bbcda1940
update default pci_ids db
2023-08-28 16:01:27 -07:00
Christopher Desiniotis
aa1c216841
Add a local logging interface for nvpci
...
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-08-17 09:57:29 -07:00
Christopher Desiniotis
114da86794
Generate warnings instead of errors for unknown device / class ids in the PCI database.
...
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-07-31 11:33:00 -07:00
Christopher Desiniotis
1b3ef9bd64
Update pciids interface to return errors for invalid vendor / device ids
...
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-06-13 11:18:21 -07:00
Christopher Desiniotis
066d8f30bc
Allow options to be passed when creating an instance of the nvpci interface
...
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-06-09 17:27:31 -07:00
Christopher Desiniotis
76018d282e
Allow clients of the pciids API to set the pci.ids filepath
...
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2023-06-09 16:06:51 -07:00
Evan Lezar
62eb401f91
Check if device is MIG Capable when visiting MIG devices
...
This change updates Device.VisitMigDevices to align with
Device.VisitMigProfiles in that the function is skipped for
non-MIG-capable devices. This allows the function to always
be a no-op on older drivers where MIG is not supported.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-22 13:24:11 +02:00
Kevin Klues
18ad7cd513
Merge branch 'add-brand' into 'main'
...
Pass device.GetBrand() through from NVML and wrap it to print a string
See merge request nvidia/cloud-native/go-nvlib!37
2023-03-27 17:12:25 +00:00
Kevin Klues
8d1b98baa6
Fix bug where MigProfile.Equals() would not work with wrapper type
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-27 16:43:56 +00:00
Kevin Klues
2b4f40a90b
Extract MockNVDeviceLib into helper function in nvdev tests
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-27 16:42:42 +00:00
Kevin Klues
82adde1bf4
Remove redundant tests and fix misleading tests
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-27 10:08:21 +00:00
Kevin Klues
18957773f2
Add function for AssertValidMigProfileFormat
...
This does not verify that the profile is a valid profile for the current
platform, but rather that it simply adheres to the proper formatting of a MIG
profile string.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-27 10:04:32 +00:00
Kevin Klues
087de4f458
Pass device.GetBrand() through from NVML and wrap it to print a string
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-26 21:15:51 +00:00
Kevin Klues
8c50f9f18f
Fix bug in heuristic for which MIG profiles to skip
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-25 22:01:20 +00:00
Kevin Klues
500a464b22
Cache mig profiles in devicelib, not just each device
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-25 18:48:18 +00:00
Kevin Klues
631bde023f
Add ability to query device architecture and CUDA compute capability
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-24 14:24:19 +00:00
Kevin Klues
642041d1e0
Update mig-profile parsing / name generation after go-nvml v12.0 bump
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-23 19:29:57 +00:00
Evan Lezar
bcbaf5a0de
Add HasDXCore to info package
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-02-08 16:04:35 +01:00
Kevin Klues
264c5dab79
Add NewDeviceByUUID() and NewMigDeviceByUUID() calls to nvlib.device
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-12-08 14:53:50 +00:00
Kevin Klues
5d4be6ac55
Regenerate mocks for NVML
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-12-08 14:53:45 +00:00
Kevin Klues
6a4886e49e
Add Placement related calls for GPUInstances in nvml wrapper
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-12-08 14:53:39 +00:00
Evan Lezar
7e5501f6a3
Skip DGX Display devices in addition to NVIDIA DGX Display devices
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-07 11:40:09 +01:00
Evan Lezar
1fc1eee392
Remove WithSelecteDeviceClasses option
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 15:47:44 +01:00
Evan Lezar
655eb9795c
Skip display devices based on device names
...
This allows devices to be skipped based on device names and
skips "NVIDIA DGX Display" devices by default.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 15:46:15 +01:00
Evan Lezar
fa5d0408ce
Ensure PCI bus ID is lower case
...
The PCI Bus ID returned by NVML is upper case and results in the following error:
error getting PCI device class for device:
failed to construct PCI device:
unable to read PCI device vendor id for 0000:0A:00.0:
open /sys/bus/pci/devices/0000:0A:00.0/vendor:
no such file or directory
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-16 12:12:07 +01:00
Evan Lezar
e37e145458
Add filtering of devices based on PCI device class
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-16 10:30:49 +01:00
Evan Lezar
f156c34310
Add private constructor for creating a device
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-15 17:42:22 +01:00
Evan Lezar
e96d9c58f1
Add GetGPUByPciBusID to nvpci.Interface
...
This change adds a GetGPUByPciBusID method to the nvpci Interface.
The existing NewDevice function is moved to nvmdev where it is used.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-15 17:42:22 +01:00
Zvonko Kaiser
f3102f8dcb
Added PCI IDS support and DPU detection
2022-11-02 03:58:13 -07:00
Evan Lezar
8b5e3d224d
Ensure that invalid MIG profiles are skipped
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-14 10:31:50 +02:00
Evan Lezar
1cb5426db8
Add functions for interacting with Events
...
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-21 15:10:06 +02:00
Kevin Klues
f933892965
Add extended APIs for top-level devices to the device package
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-16 13:34:17 +00:00
Kevin Klues
1d680a93b6
Move MIG apis to device package
...
We decided it makes sense to have top level device and MIG device abstractions
all under one package rather than trying to separate them. It will make it
easier to have them call each other without package dependency loops.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-16 13:09:09 +00:00
Kevin Klues
8e749776c5
Add nvml wrappers for getting GIs and CIs by ID
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-15 17:08:00 +00:00
Kevin Klues
e95e3a5e8b
Add a MIG package as a subpackage to nvlib
...
For now this package only has functions to work with MIG profiles. More
functionality will be added here in the future.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-15 17:08:00 +00:00
Evan Lezar
16ab19d8ae
Merge branch 'add-nvlib-base' into 'main'
...
Add a new nvlib package and move the nvinfo package into it
See merge request nvidia/cloud-native/go-nvlib!16
2022-09-15 11:36:25 +00:00
Kevin Klues
d23f460ad3
Move the nvinfo package into pkg/nvlib/info
...
Also build an interface around the API so that it can more easily be mocked.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-15 11:30:34 +00:00