DeepEP/csrc
Shangyan Zhou 9eb2f84b3e Optimize intranode combine. (#247)
* Increase the test round.

* Add warp synchronization.

* Shuffle the send warps.

* Add time elapsed into bench result.
2025-06-24 09:10:23 +08:00