Chenggang Zhao
|
bc118b248a
|
Add the transaction window data structure for RDMA senders (#245)
* Add draft
* Add fast-debugging flags
* Fix several bugs
* Add sender timeout checks
* Fix stuck
* Fix bugs
* Fix bugs
|
2025-06-24 09:12:40 +08:00 |
|
Chenggang Zhao
|
7b0c25f864
|
Support more hidden sizes
|
2025-06-20 16:37:28 +08:00 |
|
Chenggang Zhao
|
b8d90fb753
|
Support Ampere architecture (#204)
* Update README
* Update `setup.py`
* Fix headers
* Add `DISABLE_NVSHMEM` for APIs
* Fix launch
* Fix TMA settings
* Fix TMA usages
* Fix dlink
* Separate layout kernels
* Update version
* Add `is_sm90_compiled`
* Fix tests
* Add NVLink connection checks
* Update README
* Fix tests
* Add some comments
* Minor fix
* Minor fix
* Fix bugs
|
2025-06-11 15:48:18 +08:00 |
|
sleepcoo
|
a107266a4e
|
Support hidden size 4096
Co-authored-by: zhyncs <me@zhyncs.com>
Co-authored-by: yinfan98 <1106310035@qq.com>
|
2025-05-12 16:41:21 +08:00 |
|
Chenggang Zhao
|
ebfe47e46f
|
Initial commit
|
2025-02-25 09:07:53 +08:00 |
|