Mirror of https://github.com/deepseek-ai/DeepGEMM, synced 2025-05-06 00:04:22 +00:00

Latest commit:

* Add swizzling params
* Add TMA D descriptor
* Always use STSMx2
* Swizzling draft
* Compatible with padding
* Fix bugs
* Optimize swizzle performance
* Optimize expression
* Optimize TMA issues
* Fix README
* Stricter assertions

| File |
|---|
| `__init__.py` |
| `gemm.py` |
| `m_grouped_gemm.py` |
| `tuner.py` |
| `utils.py` |