DeepEP/csrc/kernels
Shangyan Zhou dd133d39bc
Fix warp synchronization. (#215)
* Another fix.
2025-06-16 17:05:11 +08:00
File              Last commit                                                      Date
api.cuh           Remove the low-latency usage flag (#214)                         2025-06-16 13:30:14 +08:00
buffer.cuh        Initial commit                                                   2025-02-25 09:07:53 +08:00
CMakeLists.txt    Support UE8M0 data format. (#206)                                2025-06-12 09:38:19 +08:00
configs.cuh       Support Ampere architecture (#204)                               2025-06-11 15:48:18 +08:00
exception.cuh     Initial commit                                                   2025-02-25 09:07:53 +08:00
ibgda_device.cuh  Fully remove barrier FIFO designs (#200)                         2025-06-10 16:23:20 +08:00
internode_ll.cu   Remove the low-latency usage flag (#214)                         2025-06-16 13:30:14 +08:00
internode.cu      Fix warp synchronization. (#215)                                 2025-06-16 17:05:11 +08:00
intranode.cu      Update intranode.cu (#210)                                       2025-06-16 11:03:58 +08:00
launch.cuh        Support Ampere architecture (#204)                               2025-06-11 15:48:18 +08:00
layout.cu         Support Ampere architecture (#204)                               2025-06-11 15:48:18 +08:00
runtime.cu        Support Ampere architecture (#204)                               2025-06-11 15:48:18 +08:00
utils.cuh         Add automatic warp count control for low-latency kernels (#213)  2025-06-16 11:56:43 +08:00