Commit Graph

6 Commits

Author SHA1 Message Date
Shengyu Liu
c2067be3ea
Performance Update (2025.04.22) (#71)
* Fix benchmark script

* Performance optimization for compute-bound cases

* Add new testcase (s_k = 16384)

* Update README.md

* Update comment

* Update README.md

* Add the deep-dive blog

* Add background color for MLA Kernel Sched.drawio.svg

* Use relative path for the schedule image

* Move flash_mla.h to kernels/params.h
2025-04-22 17:50:57 +08:00
ljss
e1e9fa98f8 Style fix 2025-02-25 09:18:11 +08:00
Sijia Chen
65fb7732fc support fp16 2025-02-24 01:58:53 -08:00
lancerts
4fbaa9527c minor fix test 2025-02-23 20:12:49 -08:00
sazc
051e40e82b tests: Triton had remove the fast_flush parameter from do_bench (#4485) 2025-02-24 10:59:22 +08:00
Jiashi Li
414a2f3eed Initial commit 2025-02-24 09:20:23 +08:00