Update plans

This commit is contained in:
Chenggang Zhao
2025-05-14 15:05:24 +08:00
parent 04278f6dee
commit ebf3d2f916

@@ -25,14 +25,14 @@ Despite its lightweight design, DeepGEMM's performance matches or exceeds expert
 - [ ] Sanitizer for testing
 - [x] Weight gradient kernels for dense models
 - [x] Weight gradient kernels for MoE models
 - [ ] Better `get_best_configs` modeling
 - [ ] Utility kernels for MoE models (maybe with [tile-lang](https://github.com/tile-ai/tilelang))
 - [ ] CUDA PDL support
 - [ ] More scaling granularity support via templates
 - [ ] Larger TMA multicast size for some shapes
 - [x] MMA template refactor with CUTLASS
 - [ ] Optimizations for unaligned shapes
 - [ ] Optimizations for power efficiency
-- [ ] Remove shape limitations on N and K
+- [x] Remove shape limitations on N and K
 - [ ] BF16 kernels
 - [ ] Split/stream-k optimizations