FlashMLA/mla_combine.h at 15f3897667c7da36f5b71bb5eb2867a4ed215214

DeepSeek/FlashMLA

mirror of https://github.com/deepseek-ai/FlashMLA synced 2025-06-26 18:15:54 +00:00

Shengyu Liu 287061ec34 Performance optimization for compute-bound cases

2025-04-21 17:22:59 +08:00

7 lines

152 B

C++


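#pragma once

#include "flash_mla.h"

// Host-side launcher for the MLA combine kernel, templated on the element type.
// Declaration only: the definition and its instantiations are provided elsewhere.
template<typename ElementT>
void run_flash_mla_combine_kernel(Flash_fwd_mla_params &params, cudaStream_t stream);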
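
The header only declares the function template, so every element type a caller uses needs an instantiation provided by some other translation unit. Below is a minimal sketch of that declare-then-explicitly-instantiate pattern in plain C++. `DemoParams`, `DemoStream`, and `run_demo_combine` are hypothetical stand-ins invented for illustration (no CUDA, no FlashMLA types), and the element types chosen are assumptions, not the ones FlashMLA actually instantiates.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins so the sketch compiles without CUDA or FlashMLA.
struct DemoParams {
    int batch_size = 0;
    std::string last_launch;  // records what the demo "launcher" did
};
using DemoStream = void*;     // stand-in for cudaStream_t

// What a header like the one above provides: a declaration only.
template <typename ElementT>
void run_demo_combine(DemoParams &params, DemoStream stream);

// What a separate source file would provide: the definition.
template <typename ElementT>
void run_demo_combine(DemoParams &params, DemoStream /*stream*/) {
    assert(params.batch_size > 0);
    // A real launcher would configure and launch a kernel here.
    params.last_launch = "element size " + std::to_string(sizeof(ElementT));
}

// Explicit instantiations: these emit the symbols that callers link against
// when they only see the declaration. Types here are illustrative assumptions.
template void run_demo_combine<std::uint16_t>(DemoParams &, DemoStream);
template void run_demo_combine<float>(DemoParams &, DemoStream);
```

With this layout, a call using an element type that was never explicitly instantiated compiles against the declaration but fails at link time, which keeps the set of supported types in one place.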