Bugs fixed

2025-05-05 20:44:48 +00:00 · 2025-03-05 14:27:45 +08:00
commit 680e424bdc
parent 592296cd45
2 changed files with 2 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -282,6 +282,7 @@ For two micro-batch overlapping, you can refer to the following figure. With our

 ## Roadmap

+- [ ] AR support (releasing soon)
 - [ ] A100 support (intranode only)
 - [ ] Support BF16 for the low-latency dispatch kernel
 - [ ] Support NVLink protocol for intranode low-latency kernels
--- a/csrc/kernels/internode.cu
+++ b/csrc/kernels/internode.cu
@@ -383,8 +383,7 @@ notify_dispatch(const int* num_tokens_per_rank, int* moe_recv_counter_mapped, in

        // Calculate prefix sum
        __syncthreads();
-        EP_STATIC_ASSERT(kNumRDMARanks <= 32, "Invalid number of RDMA ranks");
-        if (thread_id < kNumRDMARanks) {
+        if (thread_id == 0) {
            auto prefix_row = rdma_channel_prefix_matrix + dst_rdma_rank * num_channels;
            #pragma unroll
            for (int i = 1; i < num_channels; ++ i)