support fp16

2025-06-26 18:15:54 +00:00 · 2025-02-24 01:58:53 -08:00
parent 15a82b81b8
commit 65fb7732fc
7 changed files with 139 additions and 91 deletions
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@
 FlashMLA is an efficient MLA decoding kernel for Hopper GPUs, optimized for variable-length sequences serving.

 Currently released:
- BF16
+- BF16, FP16
 - Paged kvcache with block size of 64

 ## Quick start