Mirror of https://github.com/deepseek-ai/FlashMLA (synced 2025-06-26 18:15:54 +00:00)
Changes: 1. Implement flashMLA with matrix absorption algorithm via flashAttn2. 2. Add golden test on MXMACA platform.

| File |
|---|
| flash_api_mla.cpp |
| flash_api_mla.h |
| flash_api.cpp |
| flash_fwd_mla_kernel.h |
| flash_fwd_split_kernel_k64_V1x8.h |
| flash_fwd_splitkv_hdimqk576_hdimv512_m32n16_bf16_True_True_sm80.cu |
| flash_fwd_splitkv_hdimqk576_hdimv512_m32n16_bf16_True_True_split_sm80.cu |
| flash_fwd_splitkv_hdimqk576_hdimv512_m32n16_fp16_True_True_sm80.cu |
| flash_fwd_splitkv_hdimqk576_hdimv512_m32n16_fp16_True_True_split_sm80.cu |
| flash_parameter.h |