FlashMLA/csrc
Kevin Zhang e0557deb3a Feature:Support flashMLA decoding via flashAttn2(#29)
Changes:
1. Implement flashMLA decoding with the matrix absorption algorithm via flashAttn2
2. Add a golden test on the MXMACA platform
2025-02-24 23:56:05 +08:00
flash_api_mla.cpp Feature:Support flashMLA decoding via flashAttn2(#29) 2025-02-24 23:56:05 +08:00
flash_api_mla.h Feature:Support flashMLA decoding via flashAttn2(#29) 2025-02-24 23:56:05 +08:00
flash_api.cpp Feature:Support flashMLA decoding via flashAttn2(#29) 2025-02-24 23:56:05 +08:00
flash_fwd_mla_kernel.h Initial commit 2025-02-24 09:20:23 +08:00
flash_fwd_split_kernel_k64_V1x8.h Feature:Support flashMLA decoding via flashAttn2(#29) 2025-02-24 23:56:05 +08:00
flash_fwd_splitkv_hdimqk576_hdimv512_m32n16_bf16_True_True_sm80.cu Feature:Support flashMLA decoding via flashAttn2(#29) 2025-02-24 23:56:05 +08:00
flash_fwd_splitkv_hdimqk576_hdimv512_m32n16_bf16_True_True_split_sm80.cu Feature:Support flashMLA decoding via flashAttn2(#29) 2025-02-24 23:56:05 +08:00
flash_fwd_splitkv_hdimqk576_hdimv512_m32n16_fp16_True_True_sm80.cu Feature:Support flashMLA decoding via flashAttn2(#29) 2025-02-24 23:56:05 +08:00
flash_fwd_splitkv_hdimqk576_hdimv512_m32n16_fp16_True_True_split_sm80.cu Feature:Support flashMLA decoding via flashAttn2(#29) 2025-02-24 23:56:05 +08:00
flash_parameter.h Feature:Support flashMLA decoding via flashAttn2(#29) 2025-02-24 23:56:05 +08:00
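
The commit message above mentions a "matrix absorption algorithm". The following is a minimal pure-Python sketch of that idea, not code from this repository. It assumes the usual MLA formulation, in which keys and values are up-projections of a shared latent vector, so the key up-projection can be folded into the query and the value up-projection applied once after attention. The kernel file names (hdimqk576, hdimv512) are consistent with a 512-wide latent plus a 64-wide RoPE part, but that split is an assumption here. All names (w_uk, w_uv, naive_decode, absorbed_decode, demo) are hypothetical, and the dimensions are shrunk for readability.

```python
import math
import random

def vecmat(v, m):
    """Row vector v (length r) times matrix m (r x c) -> list of length c."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    mx = max(xs)
    es = [math.exp(x - mx) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def naive_decode(q_nope, q_rope, latents, k_ropes, w_uk, w_uv, scale):
    # Materialize a full key and value for every cached token.
    keys = [vecmat(c, w_uk) for c in latents]
    values = [vecmat(c, w_uv) for c in latents]
    scores = [scale * (dot(q_nope, k) + dot(q_rope, kr))
              for k, kr in zip(keys, k_ropes)]
    p = softmax(scores)
    return [sum(pi * v[j] for pi, v in zip(p, values))
            for j in range(len(values[0]))]

def absorbed_decode(q_nope, q_rope, latents, k_ropes, w_uk, w_uv, scale):
    # Absorb w_uk into the query: q_abs = w_uk @ q_nope (length d_latent).
    q_abs = [dot(row, q_nope) for row in w_uk]
    # Attention now runs directly on [latent | rope] vectors
    # (QK width = d_latent + d_rope, V width = d_latent).
    q_cat = q_abs + q_rope
    kv_cat = [c + kr for c, kr in zip(latents, k_ropes)]
    p = softmax([scale * dot(q_cat, k) for k in kv_cat])
    out_latent = [sum(pi * c[j] for pi, c in zip(p, latents))
                  for j in range(len(latents[0]))]
    # Apply the value up-projection once, after the weighted sum.
    return vecmat(out_latent, w_uv)

def demo(seed=0):
    rng = random.Random(seed)
    d_latent, d_rope, d_head, d_v, n_tokens = 4, 2, 3, 3, 5
    rand = lambda rows, cols: [[rng.uniform(-1, 1) for _ in range(cols)]
                               for _ in range(rows)]
    w_uk, w_uv = rand(d_latent, d_head), rand(d_latent, d_v)
    latents, k_ropes = rand(n_tokens, d_latent), rand(n_tokens, d_rope)
    q_nope, q_rope = rand(1, d_head)[0], rand(1, d_rope)[0]
    scale = 1.0 / math.sqrt(d_head + d_rope)
    args = (q_nope, q_rope, latents, k_ropes, w_uk, w_uv, scale)
    return naive_decode(*args), absorbed_decode(*args)

if __name__ == "__main__":
    naive, absorbed = demo()
    print(max(abs(a - b) for a, b in zip(naive, absorbed)))
```

Both paths should agree up to floating-point rounding. The absorbed path never builds per-token keys or values, which is why a decoding kernel under these assumptions can attend over the compressed cache directly and leave the value up-projection as a separate step.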