Commit Graph

4 Commits

Author SHA1 Message Date
Chenggang Zhao 350989eef3 Unify ceil_divs 2025-05-15 16:48:32 +08:00
Chenggang Zhao 4373af2e82 Add DG_PRINT_CONFIGS 2025-05-15 16:36:40 +08:00
Chenggang Zhao 816b39053a Refactor launch-related structures 2025-05-15 16:14:21 +08:00
Zhean Xu 04278f6dee Weight gradient kernels for dense and MoE models (#95) 2025-05-14 14:47:58 +08:00
* Init weight gradient kernels.

* Support unaligned n,k and gmem stride

* Update docs

* Several cleanups

* Remove restrictions on N

* Add stride(0) assertions

---------

Co-authored-by: Chenggang Zhao <chenggangz@deepseek.com>