Commit Graph

  • 9edee0c022 update .gitignore (branch: main) ljss 2025-04-29 12:03:15 +08:00
  • 9c5dfab6d1 update to cutlass 3.9 ljss 2025-04-29 12:02:57 +08:00
  • 01a27728e6 Fix synchronization issues ljss 2025-04-28 18:53:04 +08:00
  • 70b9468520 Fix LaTeX render error (#74) Shengyu Liu 2025-04-23 10:21:14 +08:00
  • 6cff5a73f5 Minor fix to the docs to correct FlashAttention-3's paper link and typos (#73) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-04-23 05:14:05 +03:00
  • a9444cd67d Update README.md (#72) Shengyu Liu 2025-04-22 18:03:14 +08:00
  • c2067be3ea Performance Update (2025.04.22) (#71) Shengyu Liu 2025-04-22 17:50:57 +08:00
  • b31bfe72a8 add missing copyright ljss 2025-03-01 18:24:24 +08:00
  • 3e123bc93c add community support for [AMD] Jiashi Li 2025-03-01 17:55:58 +08:00
  • 1aef31d163 reformat Community Support section hpp 2025-02-27 09:42:09 +08:00
  • 77d9d8d21b add Community Support of [Hygon DCU] [Intellifusion] [Iluvatar Corex] hpp 2025-02-27 09:40:47 +08:00
  • 4430e398d9 add Community Support of [Hygon DCU] [Intellifusion] [Iluvatar Corex] hpp 2025-02-27 09:39:18 +08:00
  • 480405ada9 fix readme Jiashi Li 2025-02-26 20:32:39 +08:00
  • 966eedc2f7 Fix readme Jiashi Li 2025-02-26 20:30:45 +08:00
  • 01d6d40062 Merge pull request #45 from yangsijia-serena/main Jiashi Li 2025-02-26 20:14:40 +08:00
  • 6492cabb28 add Community Support of [MetaX] and [Moore Threads] hpp 2025-02-26 11:26:42 +08:00
  • b67980309b fix(benchmark): store 'compare' and 'one' perf results in csv files and visualize them yangsijia.614 2025-02-25 23:52:54 +08:00
  • 4edea86f9e cuda12.8 recommendation ljss 2025-02-26 00:05:57 +08:00
  • b549289fb4 Merge pull request #32 from sijiac/fp16-support Jiashi Li 2025-02-25 09:19:42 +08:00
  • e1e9fa98f8 Style fix ljss 2025-02-25 09:18:11 +08:00
  • a3b74b8574 add flag to disable FP16 compile Sijia Chen 2025-02-24 10:01:59 -08:00
  • 18e32770cc Merge pull request #35 from KnowingNothing/main Jiashi Li 2025-02-25 00:41:23 +08:00
  • 7d69520ad4 Merge pull request #37 from chunyang-wen/Update-doc-string Jiashi Li 2025-02-25 00:38:31 +08:00
  • 922f63bdaa add gitignore for png and csv files in benchmark zhengsize 2025-02-24 23:58:52 +08:00
  • c4c5912b05 Update docstring chunyang.wen 2025-02-25 00:11:57 +08:00
  • 4da4dbd303 feat: add benchmark for flash_infer vs flash_mla zhengsize 2025-02-24 22:34:22 +08:00
  • 65fb7732fc support fp16 Sijia Chen 2025-02-24 01:58:53 -08:00
  • 15a82b81b8 replace c10 optional with std optional Sijia Chen 2025-02-24 00:25:25 -08:00
  • bcb90f2afd Merge pull request #9 from homorunner/main Jiashi Li 2025-02-24 13:21:58 +08:00
  • dd1161e396 Merge pull request #14 from lancerts/minor-fix Jiashi Li 2025-02-24 13:13:58 +08:00
  • 4fbaa9527c minor fix test lancerts 2025-02-23 20:12:49 -08:00
  • accc1695ee Merge pull request #12 from sazczmh/main Jiashi Li 2025-02-24 11:57:41 +08:00
  • e62bdb4d3f support Windows build 程元 2025-02-24 11:29:36 +08:00
  • 051e40e82b tests: Triton had remove the fast_flush parameter from do_bench (#4485) sazc 2025-02-24 10:59:22 +08:00
  • 414a2f3eed Initial commit Jiashi Li 2025-02-21 14:31:27 +08:00