Mirror of https://github.com/deepseek-ai/DeepGEMM (synced 2025-06-26 23:15:49 +00:00)
This change introduces the compiler flags and CMake configuration needed to support the NVIDIA Blackwell SM120 architecture.

- Modified deep_gemm/jit/compiler.py to include the sm_120 and compute_120 flags for NVCC and NVRTC.
- Updated CMakeLists.txt to add the new architecture flags to the build process.

Further testing on Blackwell hardware is required to validate MMA instruction compatibility and overall performance.
| Name |
|---|
| __init__.py |
| compiler.py |
| interleave_ffma.py |
| runtime.py |
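
The commit message above mentions adding sm_120 and compute_120 flags for NVCC and NVRTC. As a rough illustration only, the sketch below shows what such flags typically look like when built in Python. The helper name `arch_flags` and its `backend` parameter are hypothetical and are not taken from compiler.py; the actual change in this repository may construct its flags differently.

```python
def arch_flags(sm: int, backend: str = "nvcc") -> list:
    """Build architecture flags for a compute capability such as 120.

    Hypothetical helper for illustration; not DeepGEMM's actual code.
    """
    if backend == "nvcc":
        # NVCC: compile for the virtual arch compute_XX and emit SASS for sm_XX.
        return [f"-gencode=arch=compute_{sm},code=sm_{sm}"]
    if backend == "nvrtc":
        # NVRTC: select the target architecture for runtime compilation.
        return [f"--gpu-architecture=compute_{sm}"]
    raise ValueError(f"unknown backend: {backend!r}")


if __name__ == "__main__":
    print(arch_flags(120, "nvcc"))
    print(arch_flags(120, "nvrtc"))
```

Keeping the architecture number as a single parameter means a new target only needs one value changed, which is the kind of edit the commit describes. Whether the resulting binaries run correctly on SM120 still depends on the hardware validation the commit message calls for.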