2 Commits

Author SHA1 Message Date
ademeure
6cbff5778f Correctly flush L2, as reconstructing the tensors on every iteration effectively put them in the L2, and gave the GPU enough idle time to avoid thermal throttling in a potentially unrealistic way.
The previous behaviour is potentially representative of some use cases (e.g. previous kernel filling L2 with the data in a very specific way) but not standard benchmarking practice.
2025-03-15 20:46:24 +00:00
Chenggang Zhao
a6d97a1c1b Initial commit 2025-02-25 22:52:41 +08:00
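
The benchmarking change described in commit 6cbff5778f (build the inputs once, then flush the cache before every timed iteration) can be sketched roughly as follows. This is a minimal host-side illustration, not the repository's code: `bench`, `kernel`, and `flush_cache` are hypothetical names, and the timer is Python's `time.perf_counter`, whereas a real GPU harness would need device-side timing and synchronization.

```python
import time
from typing import Callable, List


def bench(kernel: Callable[[], None],
          flush_cache: Callable[[], None],
          num_iters: int = 10) -> List[float]:
    """Time `kernel` `num_iters` times, flushing the cache before each run.

    Hypothetical helper for illustration only. The inputs used by `kernel`
    are assumed to be created once by the caller, outside this loop, so that
    re-creating them does not warm the cache or add idle time between runs.
    """
    timings: List[float] = []
    for _ in range(num_iters):
        # Untimed step: evict whatever the previous iteration left in cache.
        # On a GPU this would typically mean overwriting a scratch buffer
        # larger than the L2 cache (an assumption, not taken from the commit).
        flush_cache()

        start = time.perf_counter()
        kernel()
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    data = list(range(100_000))  # inputs built once, reused every iteration
    scratch = bytearray(1 << 20)  # stand-in for a cache-sized scratch buffer

    def kernel() -> None:
        sum(data)

    def flush_cache() -> None:
        # Stand-in flush: touch every byte of the scratch buffer.
        scratch[:] = bytes(len(scratch))

    for i, t in enumerate(bench(kernel, flush_cache, num_iters=3)):
        print(f"iteration {i}: {t * 1e3:.3f} ms")
```

Keeping the flush outside the timed region means each measurement starts from a cold cache without the flush itself being counted as kernel time.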