Several code lints x2

2025-06-26 23:15:49 +00:00 · 2025-04-22 17:24:02 +08:00
parent 902208a17e
commit f4014953ad
3 changed files with 11 additions and 8 deletions
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@ Despite its lightweight design, DeepGEMM's performance matches or exceeds expert
 - [x] Shared memory swizzling for output
 - [ ] Larger block size on N (up to 256)
 - [x] MoE scheduler with TMA multicast compatibility
- [ ] Fix TMA multicast compatibility for indivisible shapes
+- [x] Fix TMA multicast compatibility for indivisible shapes
 - [ ] Weight gradient kernels for dense models
 - [ ] Weight gradient kernels for MoE models
 - [ ] Utility kernels for MoE models (as a pre-built CUDA library)