Commit Graph

5 Commits

Author SHA1 Message Date
IshanaSabrish
927eebc10f fix: Update named barrier thread count to match actual participating threads
- Changed kNThreads (256) to 128 in NamedBarrier::arrive calls to match the actual number of threads in warp group
- Fixed potential deadlock issue where barrier was waiting for more threads than would arrive
- Updated both SReady and SoftmaxReady barrier synchronizations
2025-03-01 21:18:05 +05:30
Sijia Chen
a3b74b8574 add flag to disable FP16 compile 2025-02-24 10:01:59 -08:00
Sijia Chen
65fb7732fc support fp16 2025-02-24 01:58:53 -08:00
Sijia Chen
15a82b81b8 replace c10 optional with std optional 2025-02-24 00:25:40 -08:00
Jiashi Li
414a2f3eed Initial commit
2025-02-24 09:20:23 +08:00
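
The reasoning in commit 927eebc10f (a barrier that expects more threads than will ever arrive never releases) can be illustrated with a small host-side model. This is a minimal sketch under stated assumptions, not the repository's code: `CountingBarrierModel` and `barrier_releases` are hypothetical names introduced here, the model is plain C++ rather than GPU code, and it does not reproduce the CUTLASS `NamedBarrier` API or the exact hardware semantics. It only captures the counting rule the commit message describes, using the two thread counts (256 and 128) named there.

```cpp
// Simplified host-side model of a counting barrier.
// Hypothetical illustration only: not the CUTLASS NamedBarrier API, not GPU code.
struct CountingBarrierModel {
    int expected_threads;  // thread count the barrier is configured to wait for
    int arrived_threads;   // threads that have reached the barrier so far

    explicit CountingBarrierModel(int expected)
        : expected_threads(expected), arrived_threads(0) {}

    // Record that `n` threads reached the barrier.
    void arrive(int n) { arrived_threads += n; }

    // The barrier releases only once every expected thread has arrived.
    bool released() const { return arrived_threads >= expected_threads; }
};

// Returns true if a barrier configured for `expected` threads releases when
// only `participating` threads ever arrive; false models the deadlock case.
bool barrier_releases(int expected, int participating) {
    CountingBarrierModel barrier(expected);
    barrier.arrive(participating);
    return barrier.released();
}
```

In this model, `barrier_releases(256, 128)` is false (the waiting side would stall indefinitely), while `barrier_releases(128, 128)` is true, which matches the mismatch the commit message says it corrects for the SReady and SoftmaxReady barriers.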