From 33e0c3ce406fd1b82891501f5354ceacc64ff5c8 Mon Sep 17 00:00:00 2001
From: Chenggang Zhao <chenggangz@deepseek.com>
Date: Thu, 24 Apr 2025 14:37:53 +0800
Subject: [PATCH] Update plans

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 5d925da..dab1f05 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,9 @@ Despite its lightweight design, DeepGEMM's performance matches or exceeds expert
 - [ ] Larger block size on N (up to 256)
 - [x] MoE scheduler with TMA multicast compatibility
 - [x] Fix TMA multicast compatibility for indivisible shapes
+- [ ] Skip useless computation on M
+- [ ] NVRTC as a faster compiler
+- [ ] Sanitizer for testing
 - [ ] Weight gradient kernels for dense models
 - [ ] Weight gradient kernels for MoE models
 - [ ] Utility kernels for MoE models (as a pre-built CUDA library)