mirror of https://github.com/deepseek-ai/DeepGEMM
synced 2025-05-05 23:34:22 +00:00
doc: update README
Signed-off-by: Zihua Wu <13583761+lucifer1004@users.noreply.github.com>
parent d473f594be
commit 69852c465f
@@ -18,7 +18,7 @@ Despite its lightweight design, DeepGEMM's performance matches or exceeds expert
 - [x] MoE scheduler with TMA multicast compatibility
 - [x] Fix TMA multicast compatibility for indivisible shapes
 - [ ] Skip useless computation on M
-- [ ] NVRTC as a faster compiler
+- [x] NVRTC as a faster compiler
 - [ ] Sanitizer for testing
 - [ ] Weight gradient kernels for dense models
 - [ ] Weight gradient kernels for MoE models