Fix the performance data.

2025-06-26 18:28:11 +00:00 · 2025-04-22 11:23:42 +08:00 · 3b1045db43
commit 3b1045db43
parent edbb1bc3ff
1 changed file with 4 additions and 4 deletions
--- a/README.md
+++ b/README.md
@@ -17,11 +17,11 @@ We test normal kernels on H800 (~160 GB/s NVLink maximum bandwidth), with each c
 |   Type    | Dispatch #EP | Bottleneck bandwidth | Combine #EP | Bottleneck bandwidth |
 |:---------:|:------------:|:--------------------:|:-----------:|:--------------------:|
 | Intranode |      8       |  153 GB/s (NVLink)   |      8      |  158 GB/s (NVLink)   |
-| Internode |      16      |    47 GB/s (RDMA)    |     16      |    62 GB/s (RDMA)    |
-| Internode |      32      |    59 GB/s (RDMA)    |     32      |    60 GB/s (RDMA)    |
-| Internode |      64      |    49 GB/s (RDMA)    |     64      |    51 GB/s (RDMA)    |
+| Internode |      16      |    43 GB/s (RDMA)    |     16      |    43 GB/s (RDMA)    |
+| Internode |      32      |    58 GB/s (RDMA)    |     32      |    57 GB/s (RDMA)    |
+| Internode |      64      |    51 GB/s (RDMA)    |     64      |    50 GB/s (RDMA)    |

-**News (2025.04.22)**: the performance is optimized by 5-35% by Tencent Network Platform Department, see [#130](https://github.com/deepseek-ai/DeepEP/pull/130) for more details. Thanks for the contribution!
+**News (2025.04.22)**: with optimizations from Tencent Network Platform Department, performance was enhanced by up to 30%, see [#130](https://github.com/deepseek-ai/DeepEP/pull/130) for more details. Thanks for the contribution!

 ### Low-latency kernels with pure RDMA