Commit Graph

7 Commits

Author SHA1 Message Date
Shangyan Zhou
9eb2f84b3e Optimize intranode combine. (#247) 2025-06-24 09:10:23 +08:00
* Increase the test round.
* Add warp synchronization.
* Shuffle the send warps.
* Add time elapsed into bench result.
Chenggang Zhao
9d4f7ef8ee Surpass type checks 2025-06-18 16:04:42 +08:00
Chenggang Zhao
b56f7c2c8c Adjust import order 2025-06-18 15:50:06 +08:00
Shangyan Zhou
cd371d31fc Move import. 2025-06-18 14:52:04 +08:00
Shangyan Zhou
bf4a4a21d2 Set device_id to suppress pytorch warning. 2025-06-18 14:43:38 +08:00
Shifang Xu
21efbe9b48 Support UE8M0 data format. (#206) 2025-06-12 09:38:19 +08:00
Chenggang Zhao
ebfe47e46f Initial commit 2025-02-25 09:07:53 +08:00