DeepGEMM/deep_gemm
A-transformer 92521df34d comment more clear about Memory Consistency and Barrier Visibility
Memory Consistency and Barrier Visibility: Both __syncthreads() and cute::cluster_sync() are synchronization points: no thread proceeds past the barrier until every thread in the synchronization scope has reached it. All memory operations issued before the barrier, including barrier initialization, are therefore visible to every thread in that scope (the thread block for __syncthreads(), the thread cluster for cute::cluster_sync()).
2025-02-27 22:01:50 +04:00
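The visibility guarantee described in the commit message can be illustrated with a minimal sketch in plain CUDA. It is not taken from DeepGEMM: the names `visibility_demo`, `shared_state`, `kInitValue`, and `run_visibility_demo` are hypothetical, and an ordinary shared-memory variable stands in for the barrier object that one thread initializes. Only the block-scope `__syncthreads()` case is shown; the cluster-scope case would need a cluster launch on hardware that supports thread block clusters.

```cuda
#include <cuda_runtime.h>

// Hypothetical sizes and values, chosen only for illustration.
constexpr int kThreads = 128;
constexpr unsigned kInitValue = 0xC0FFEEu;

__global__ void visibility_demo(unsigned* out) {
    // Stand-in for state (such as a barrier object) that a single
    // thread initializes before the rest of the block uses it.
    __shared__ unsigned shared_state;

    if (threadIdx.x == 0) {
        shared_state = kInitValue;
    }

    // Block-wide synchronization point: no thread continues until all
    // threads of the block arrive, and the write above becomes visible
    // to every thread in the block.
    __syncthreads();

    // Every thread now reads the initialized value.
    out[threadIdx.x] = shared_state;
}

// Launches one block and checks that each thread observed the value
// written by thread 0. Returns false on any CUDA error or mismatch.
bool run_visibility_demo() {
    unsigned* d_out = nullptr;
    if (cudaMalloc(&d_out, kThreads * sizeof(unsigned)) != cudaSuccess) {
        return false;
    }

    visibility_demo<<<1, kThreads>>>(d_out);

    unsigned h_out[kThreads];
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        return false;
    }

    for (int i = 0; i < kThreads; ++i) {
        if (h_out[i] != kInitValue) {
            return false;
        }
    }
    return true;
}
```

Calling `run_visibility_demo()` from a host `main` and checking its return value is enough to exercise the sketch. Without the `__syncthreads()` call, threads other than thread 0 could read `shared_state` before it is written, which is the hazard the commit comment is addressing.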