Commit Graph

130 Commits

Author SHA1 Message Date
Kevin Klues
6a4886e49e Add Placement related calls for GPUInstances in nvml wrapper
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-12-08 14:53:39 +00:00
Kevin Klues
49dcad67c4 Merge branch 'skip-DGX-display' into 'main'
Skip DGX Display devices in addition to NVIDIA DGX Display devices

See merge request nvidia/cloud-native/go-nvlib!28
2022-12-07 11:20:48 +00:00
Evan Lezar
7e5501f6a3 Skip DGX Display devices in addition to NVIDIA DGX Display devices
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-12-07 11:40:09 +01:00
Evan Lezar
a27e593595 Merge branch 'skip-display-devices-on-name' into 'main'
Skip display devices based on model name

See merge request nvidia/cloud-native/go-nvlib!26
2022-11-21 20:39:40 +00:00
Evan Lezar
417a5254a4 Pin moq version to v0.2.7
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 15:50:01 +01:00
Evan Lezar
d69a94ffdd Add .shell target for interactive make
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 15:47:46 +01:00
Evan Lezar
1fc1eee392 Remove WithSelecteDeviceClasses option
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 15:47:44 +01:00
Evan Lezar
655eb9795c Skip display devices based on device names
This allows devices to be skipped based on device names and
skips "NVIDIA DGX Display" devices by default.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-21 15:46:15 +01:00
Evan Lezar
0e10f084d1 Merge branch 'fix-pci-id-case' into 'main'
Ensure pci bus ID is lower case

See merge request nvidia/cloud-native/go-nvlib!25
2022-11-16 11:13:58 +00:00
Evan Lezar
fa5d0408ce Ensure pci bus ID is lower case
The PCI Bus ID returned by NVML is upper case and results in the following error:

error getting PCI device class for device:
failed to construct PCI device:
unable to read PCI device vendor id for 0000:0A:00.0:
open /sys/bus/pci/devices/0000:0A:00.0/vendor:
no such file or directory

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-16 12:12:07 +01:00
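The failure in commit fa5d0408ce comes from a case mismatch: the sysfs directory names under /sys/bus/pci/devices use lower-case hex digits, while the bus ID in the error is upper case. A minimal Go sketch of the normalization follows. The helper name sysfsVendorPath is hypothetical and only illustrates the idea; it is not taken from go-nvlib.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// sysfsVendorPath builds the sysfs path of the vendor file for a PCI device.
// The bus ID is lower-cased first so that an ID such as "0000:0A:00.0"
// matches the lower-case directory name used by sysfs.
// Hypothetical helper for illustration only.
func sysfsVendorPath(busID string) string {
	return filepath.Join("/sys/bus/pci/devices", strings.ToLower(busID), "vendor")
}

func main() {
	// Bus ID copied from the error message in the commit.
	fmt.Println(sysfsVendorPath("0000:0A:00.0"))
	// prints "/sys/bus/pci/devices/0000:0a:00.0/vendor"
}
```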
Kevin Klues
9110850748 Merge branch 'skip-display-devices' into 'main'
Skip devices based on PCI device class

See merge request nvidia/cloud-native/go-nvlib!24
2022-11-16 09:48:48 +00:00
Evan Lezar
4a0fdc2e8a Skip pkg/nvml folder when linting
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-16 10:30:49 +01:00
Evan Lezar
e37e145458 Add filtering of devices based on PCI device class
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-16 10:30:49 +01:00
Evan Lezar
f156c34310 Add private constructor for creating a device
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-15 17:42:22 +01:00
Evan Lezar
e96d9c58f1 Add GetGPUByPciBusID to nvpci.Interface
This change adds a GetGPUByPciBusID method to the nvpci Interface.
The existing NewDevice function is moved to nvmdev where it is used.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-11-15 17:42:22 +01:00
Zvonko Kaiser
0e8a479bd5 Merge branch 'pciids' into 'main'
Added PCI IDS support and DPU detection

See merge request nvidia/cloud-native/go-nvlib!23
2022-11-03 09:57:58 +00:00
Zvonko Kaiser
f3102f8dcb Added PCI IDS support and DPU detection
2022-11-02 03:58:13 -07:00
Evan Lezar
7222fea1a7 Merge branch 'fix-walk-mig-profile' into 'main'
Ensure that invalid MIG profiles are skipped

See merge request nvidia/cloud-native/go-nvlib!21
2022-10-14 10:08:48 +00:00
Evan Lezar
8b5e3d224d Ensure that invalid MIG profiles are skipped
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-10-14 10:31:50 +02:00
Kevin Klues
1049a7fa76 Merge branch 'add-events' into 'main'
Add functions for interacting with Events

See merge request nvidia/cloud-native/go-nvlib!20
2022-09-22 13:34:27 +00:00
Evan Lezar
1cb5426db8 Add functions for interacting with Events
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-21 15:10:06 +02:00
Evan Lezar
01649c65ea Merge branch 'add-device-apis' into 'main'
Move MIG apis to device package and add extended APIs for top-level devices to it

See merge request nvidia/cloud-native/go-nvlib!19
2022-09-16 14:15:32 +00:00
Kevin Klues
f933892965 Add extended APIs for top-level devices to the device package
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-16 13:34:17 +00:00
Kevin Klues
1d680a93b6 Move MIG apis to device package
We decided it makes sense to have top level device and MIG device abstractions
all under one package rather than trying to separate them. It will make it
easier to have them call each other without package dependency loops.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-16 13:09:09 +00:00
Kevin Klues
8719e258a8 Merge branch 'add-mig-pkg' into 'main'
Add MIG package with abstraction for MIG profiles in it

See merge request nvidia/cloud-native/go-nvlib!15
2022-09-16 08:17:41 +00:00
Kevin Klues
8e749776c5 Add nvml wrappers for getting GIs and CIs by ID
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-15 17:08:00 +00:00
Kevin Klues
e95e3a5e8b Add a MIG package as a subpackage to nvlib
For now this package only has functions to work with MIG profiles. More
functionality will be added here in the future.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-15 17:08:00 +00:00
Evan Lezar
16ab19d8ae Merge branch 'add-nvlib-base' into 'main'
Add a new nvlib package and move the nvinfo package into it

See merge request nvidia/cloud-native/go-nvlib!16
2022-09-15 11:36:25 +00:00
Kevin Klues
d23f460ad3 Move the nvinfo package into pkg/nvlib/info
Also build an interface around the API so that it can more easily be mocked.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-09-15 11:30:34 +00:00
Evan Lezar
a880a67681 Merge branch 'add-get-cuda-driver-version' into 'main'
Add additional functions to nvml interfaces

See merge request nvidia/cloud-native/go-nvlib!18
2022-09-05 13:31:28 +00:00
Evan Lezar
211a8eb973 Address minor lint error
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-02 15:12:23 +02:00
Evan Lezar
bf9a4d3476 Sort functions in interface alphabetically
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-02 15:07:39 +02:00
Evan Lezar
a404873b12 Add additional functions to Device interface
Add the following functions to the Device interface:
* GetCudaComputeCapability
* GetAttributes
* GetName

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-02 15:05:12 +02:00
Evan Lezar
da71bc2bff Update go-nvml dependency
This updates go-nvml to include fixes for getting devices by UUID
and to remove the panic from calls to Init and Shutdown.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-01 14:06:25 +02:00
Evan Lezar
9175bde20b Add SystemGetCudaDriverVersion to NVML interface
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-09-01 14:05:16 +02:00
Christopher Desiniotis
7e23240abb Merge branch 'add-getters-for-nvpci-and-nvmdev' into 'main'
Add several 'getters' to nvpci and nvmdev

See merge request nvidia/cloud-native/go-nvlib!17
2022-08-26 16:24:04 +00:00
Christopher Desiniotis
bccac280ca nvmdev: Add GetPhysicalFunction() for both Device and ParentDevice
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2022-08-25 09:35:11 -07:00
Christopher Desiniotis
6ff7845b92 nvpci: Add GetGPUByIndex()
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2022-08-25 09:34:49 -07:00
Christopher Desiniotis
09ae86c8e0 Merge branch 'pf-filtering' into 'main'
Detect if NvidiaPCIDevice is a PF or VF

See merge request nvidia/cloud-native/go-nvlib!13
2022-08-15 17:09:02 +00:00
Kevin Klues
ad3fa31634 Merge branch 'add-nvml-wrapper' into 'main'
Add an interface based wrapper around go-nvml for better mocking

See merge request nvidia/cloud-native/go-nvlib!14
2022-08-11 12:41:20 +00:00
Kevin Klues
8e030df089 Add make target for 'go generate'
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-08-11 12:13:41 +00:00
Kevin Klues
2e1e2e784a Add String() and Error() functions to Return type in nvml package
There is a default implementation for these that is overwritten if the
underlying NVML library ends up being used.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-08-11 12:13:41 +00:00
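Commit 2e1e2e784a describes a default String()/Error() implementation that is replaced when the underlying NVML library is used. One way this pattern can be sketched in Go is with a package-level function variable that both methods delegate to. Everything below (the variable errorStringFunc, the default message format, and the constant values) is an assumption made for illustration, not the wrapper's actual code.

```go
package main

import "fmt"

// Return stands in for an NVML status code in this sketch.
type Return int

// Illustrative status values (assumed for this example).
const (
	statusSuccess       Return = 0
	statusUninitialized Return = 1
)

// errorStringFunc is a hypothetical hook holding the default formatting.
// A real library binding could overwrite it with a richer implementation.
var errorStringFunc = func(r Return) string {
	return fmt.Sprintf("nvml return code %d", int(r))
}

// String makes Return satisfy fmt.Stringer.
func (r Return) String() string { return errorStringFunc(r) }

// Error makes Return satisfy the error interface.
func (r Return) Error() string { return errorStringFunc(r) }

func main() {
	var err error = statusUninitialized
	fmt.Println(err) // uses the default implementation

	// Simulate the underlying library overwriting the default.
	errorStringFunc = func(r Return) string { return "overridden message" }
	fmt.Println(err)
}
```

Delegating both methods to one variable keeps the default and the overridden behaviour consistent between String() and Error().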
Kevin Klues
008aa70bc6 Add an interface based wrapper around go-nvml for better mocking
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-08-10 15:52:16 +00:00
Kevin Klues
2d296edf19 Add go-nvml as a vendored repo
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-08-10 06:57:21 +00:00
Christopher Desiniotis
afdf3edd99 Detect if NvidiaPCIDevice is a PF or VF
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2022-07-28 19:16:39 -07:00
Christopher Desiniotis
c7f47cb02a Merge branch 'iommu-group' into 'main'
Detect iommu_group for PCI and mdev devices

See merge request nvidia/cloud-native/go-nvlib!12
2022-07-25 23:20:03 +00:00
Christopher Desiniotis
f52cd402a1 Detect iommu_group for PCI and mdev devices
2022-07-25 23:20:03 +00:00
Christopher Desiniotis
f281b5e581 Merge branch 'driver-detection' into 'main'
Detect driver bound to an NvidiaPCIDevice and mdev device

See merge request nvidia/cloud-native/go-nvlib!11
2022-07-14 20:39:17 +00:00
Evan Lezar
a965ca0b8f Merge branch 'add-info-package' into 'main'
Add nvinfo package to query system state

See merge request nvidia/cloud-native/go-nvlib!9
2022-07-13 11:50:52 +00:00
Christopher Desiniotis
8209652159 Detect driver bound to mdev devices
Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
2022-07-08 13:11:01 -07:00