DeepEP/csrc
Chenggang Zhao b8d90fb753
Support Ampere architecture (#204)
* Update README

* Update `setup.py`

* Fix headers

* Add `DISABLE_NVSHMEM` for APIs

* Fix launch

* Fix TMA settings

* Fix TMA usages

* Fix dlink

* Separate layout kernels

* Update version

* Add `is_sm90_compiled`

* Fix tests

* Add NVLink connection checks

* Update README

* Fix tests

* Add some comments

* Minor fix

* Minor fix

* Fix bugs
2025-06-11 15:48:18 +08:00
Name            Last commit                                                    Date
kernels         Support Ampere architecture (#204)                             2025-06-11 15:48:18 +08:00
CMakeLists.txt  Use TMA instead of LD/ST for intra-node normal kernels (#191)  2025-06-06 15:40:17 +08:00
config.hpp      Support Ampere architecture (#204)                             2025-06-11 15:48:18 +08:00
deep_ep.cpp     Support Ampere architecture (#204)                             2025-06-11 15:48:18 +08:00
deep_ep.hpp     Support CUDA graph for intranode normal kernels (#203)         2025-06-11 11:08:54 +08:00
event.hpp       Initial commit                                                 2025-02-25 09:07:53 +08:00