Commit Graph

14 Commits

Author SHA1 Message Date
chenhongmin.will
d1689ab64f use mm1's Aregs instead of mma0's Cregs 2025-02-27 11:59:17 +08:00
chenhongmin.will
1757a6db07 try fix 2025-02-27 09:11:17 +08:00
chenhongmin.will
dbd8c307eb fix sV 2025-02-27 01:42:58 +08:00
chenhongmin.will
6dcea4952c add TransV 2025-02-26 18:48:24 +08:00
chenhongmin.will
6a4eb631e2 add transv barrier 2025-02-26 17:57:00 +08:00
chenhongmin.will
59f691763e fix Vt illegal 2025-02-26 17:39:29 +08:00
chenhongmin.will
f6fab1b915 change to use per_tensor 2025-02-26 10:21:09 +08:00
chenhongmin.will
dfe8ffc75a enable fp8 api 2025-02-25 23:02:57 +08:00
chenhongmin.will
c50d29d170 fix compile 2025-02-25 21:52:11 +08:00
chenhongmin.will
7409203f44 enable fp8 compile 2025-02-25 21:12:40 +08:00
chenhongmin.will
fed0499301 fp8 shared mem 2025-02-25 11:26:50 +08:00
chenhongmin.will
b67a18f850 update gmem 2025-02-25 09:45:19 +08:00
chenhongmin.will
d833dbd711 enable fp8 2025-02-25 09:03:02 +08:00
Jiashi Li
414a2f3eed Initial commit 2025-02-24 09:20:23 +08:00