chenhongmin.will | d1689ab64f | use mm1's Aregs instead of mma0's Cregs | 2025-02-27 11:59:17 +08:00
chenhongmin.will | 1757a6db07 | try fix | 2025-02-27 09:11:17 +08:00
chenhongmin.will | dbd8c307eb | fix sV | 2025-02-27 01:42:58 +08:00
chenhongmin.will | 6dcea4952c | add TransV | 2025-02-26 18:48:24 +08:00
chenhongmin.will | 6a4eb631e2 | add transv barrier | 2025-02-26 17:57:00 +08:00
chenhongmin.will | 59f691763e | fix Vt illegal | 2025-02-26 17:39:29 +08:00
chenhongmin.will | 29de9e0c79 | debug mode | 2025-02-26 16:03:17 +08:00
chenhongmin.will | f6fab1b915 | change to use per_tensor | 2025-02-26 10:21:09 +08:00
chenhongmin.will | 4b314cd655 | update fp8 api | 2025-02-26 08:33:25 +08:00
chenhongmin.will | ef644a56e0 | update ut | 2025-02-26 08:20:18 +08:00
chenhongmin.will | 870418802a | add fp8 ut | 2025-02-26 07:57:51 +08:00
chenhongmin.will | dfe8ffc75a | enable fp8 api | 2025-02-25 23:02:57 +08:00
chenhongmin.will | c50d29d170 | fix compile | 2025-02-25 21:52:11 +08:00
chenhongmin.will | 7409203f44 | enable fp8 compile | 2025-02-25 21:12:40 +08:00
chenhongmin.will | fed0499301 | fp8 shared mem | 2025-02-25 11:26:50 +08:00
chenhongmin.will | b67a18f850 | update gmem | 2025-02-25 09:45:19 +08:00
chenhongmin.will | d833dbd711 | enable fp8 | 2025-02-25 09:03:02 +08:00
chenhongmin.will | dae0690055 | init fp8 | 2025-02-24 21:12:36 +08:00
Jiashi Li | bcb90f2afd | Merge pull request #9 from homorunner/main: support Windows build | 2025-02-24 13:21:58 +08:00
Jiashi Li | dd1161e396 | Merge pull request #14 from lancerts/minor-fix: minor fix test | 2025-02-24 13:13:58 +08:00
lancerts | 4fbaa9527c | minor fix test | 2025-02-23 20:12:49 -08:00
Jiashi Li | accc1695ee | Merge pull request #12 from sazczmh/main: tests: Triton 3.2.0 had remove the fast_flush parameter from do_bench | 2025-02-24 11:57:41 +08:00
程元 | e62bdb4d3f | support Windows build | 2025-02-24 11:29:36 +08:00
sazc | 051e40e82b | tests: Triton had remove the fast_flush parameter from do_bench (#4485) | 2025-02-24 10:59:22 +08:00
Jiashi Li | 414a2f3eed | Initial commit | 2025-02-24 09:20:23 +08:00