Commit Graph

23 Commits

Author SHA1 Message Date
Chenggang Zhao
0008c6755e Merge pull request #67 from deepseek-ai/roce-support: Update NVSHMEM to v3.2.5. 2025-03-11 09:30:45 +08:00
Chenggang Zhao
ed7487c15e Support BF16 for low-latency kernels 2025-03-10 17:24:41 +08:00
Chenggang Zhao
1fc40d50f3 Improve AR performance 2025-03-06 21:41:19 +08:00
Chenggang Zhao
41385ba5b3 Merge pull request #45 from deepseek-ai/ar-support: Fix AR bugs for normal kernels 2025-03-06 09:48:17 +08:00
Chenggang Zhao
458cdcb22a Fix AR bugs for normal kernels 2025-03-05 17:13:35 +08:00
Shangyan Zhou
e995aa22db Update NVSHMEM to v3.2.5. 2025-03-05 16:16:52 +08:00
Chenggang Zhao
680e424bdc Bugs fixed 2025-03-05 14:27:45 +08:00
Chenggang Zhao
592296cd45 Add some plans 2025-03-04 15:54:46 +08:00
Chenggang Zhao
1553fc42bf Improve EP2/4 performance 2025-03-04 15:34:33 +08:00
Chenggang Zhao
55cdd9a64f Fix typo 2025-03-04 14:17:58 +08:00
Chenggang Zhao
2a3cac903a Add some docs 2025-03-04 10:19:42 +08:00
Chenggang Zhao
c5b4040502 Enable intranode kernel tests with EP2 and EP4 2025-03-03 15:01:02 +08:00
Chenggang Zhao
6cc3497df8 Remove all raw tensors for better P2P overlapping 2025-03-03 14:25:22 +08:00
Chenggang Zhao
f60306409a Merge pull request #32 from youkaichao/youkaichao-patch-1: Update path 2025-03-03 09:19:28 +08:00
youkaichao
88b1622e7d update path 2025-02-28 17:26:14 +08:00
Shangyan Zhou
231e17ebb7 Merge pull request #29 from youkaichao/youkaichao-patch-1: fix installation 2025-02-28 17:21:47 +08:00
youkaichao
30e2778d18 Update README.md 2025-02-28 16:56:09 +08:00
Chenggang Zhao
77bb07aa20 Update some comments and docs 2025-02-27 10:27:22 +08:00
Chenggang Zhao
3885404ffb Add NVSHMEM_IB_ENABLE_RELAXED_ORDERING 2025-02-26 17:54:12 +08:00
Chenggang Zhao
45f481b87b Update figures 2025-02-26 16:24:59 +08:00
haswelliris
1a0a8bda09 Update prerequisites installation instructions 2025-02-25 17:19:07 +08:00
Chenggang Zhao
84d3d6fdee Update README.md 2025-02-25 10:59:09 +08:00
Chenggang Zhao
ebfe47e46f Initial commit 2025-02-25 09:07:53 +08:00