From ebf3d2f916f4da88834bbe3fe44d9ef7cd0d6f93 Mon Sep 17 00:00:00 2001
From: Chenggang Zhao
Date: Wed, 14 May 2025 15:05:24 +0800
Subject: [PATCH] Update plans

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 2aa53ce..170c271 100644
--- a/README.md
+++ b/README.md
@@ -25,14 +25,14 @@ Despite its lightweight design, DeepGEMM's performance matches or exceeds expert
 - [ ] Sanitizer for testing
 - [x] Weight gradient kernels for dense models
 - [x] Weight gradient kernels for MoE models
+- [ ] Better `get_best_configs` modeling
 - [ ] Utility kernels for MoE models (maybe with [tile-lang](https://github.com/tile-ai/tilelang))
 - [ ] CUDA PDL support
 - [ ] More scaling granularity support via templates
 - [ ] Larger TMA multicast size for some shapes
 - [x] MMA template refactor with CUTLASS
-- [ ] Optimizations for unaligned shapes
 - [ ] Optimizations for power efficiency
-- [ ] Remove shape limitations on N and K
+- [x] Remove shape limitations on N and K
 - [ ] BF16 kernels
 - [ ] Split/stream-k optimizations