DeepEP/csrc/kernels
Zhicheng Wu 05df5554ff
Use one qp per sm for internode normal kernels (#181)
let the sender SM use the channel_id, and the receiver SM use channel_id + num_channels
2025-06-13 14:37:59 +08:00
api.cuh Support UE8M0 data format. (#206) 2025-06-12 09:38:19 +08:00
buffer.cuh Initial commit 2025-02-25 09:07:53 +08:00
CMakeLists.txt Support UE8M0 data format. (#206) 2025-06-12 09:38:19 +08:00
configs.cuh Support Ampere architecture (#204) 2025-06-11 15:48:18 +08:00
exception.cuh Initial commit 2025-02-25 09:07:53 +08:00
ibgda_device.cuh Fully remove barrier FIFO designs (#200) 2025-06-10 16:23:20 +08:00
internode_ll.cu Support UE8M0 data format. (#206) 2025-06-12 09:38:19 +08:00
internode.cu Use one qp per sm for internode normal kernels (#181) 2025-06-13 14:37:59 +08:00
intranode.cu Support UE8M0 data format. (#206) 2025-06-12 09:38:19 +08:00
launch.cuh Support Ampere architecture (#204) 2025-06-11 15:48:18 +08:00
layout.cu Support Ampere architecture (#204) 2025-06-11 15:48:18 +08:00
runtime.cu Support Ampere architecture (#204) 2025-06-11 15:48:18 +08:00
utils.cuh Support UE8M0 data format. (#206) 2025-06-12 09:38:19 +08:00
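
The latest commit (#181) describes its QP mapping as: the sender SM uses `channel_id`, and the receiver SM uses `channel_id + num_channels`. A minimal sketch of that index arithmetic follows. The names `SmRole`, `qp_index_for_sm`, and `num_qps_required` are hypothetical and are not taken from `internode.cu`. The sketch also assumes one sender SM and one receiver SM per channel, which the commit message implies but does not state outright.

```cpp
// Hypothetical sketch of the QP mapping described in commit #181.
// None of these names come from DeepEP; they are illustrative only.
#include <cassert>

// Allow the same source to build with nvcc or a plain host C++ compiler.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Assumption: each channel is served by one sender SM and one receiver SM.
enum class SmRole { kSender, kReceiver };

// Under that assumption, giving every SM its own QP needs 2 * num_channels QPs.
__host__ __device__ constexpr int num_qps_required(int num_channels) {
    return 2 * num_channels;
}

// Sender SMs take QP indices [0, num_channels);
// receiver SMs take QP indices [num_channels, 2 * num_channels).
__host__ __device__ constexpr int qp_index_for_sm(int channel_id,
                                                  int num_channels,
                                                  SmRole role) {
    return role == SmRole::kSender ? channel_id : channel_id + num_channels;
}
```

With `num_channels = 4`, for example, the sender SM of channel 3 would map to QP 3 and the receiver SM of the same channel to QP 7, so no two SMs share a QP.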