From 89f5de17a303cd342b64631e06ce401b36b6e1e0 Mon Sep 17 00:00:00 2001
From: RK
Date: Thu, 27 Feb 2025 10:56:13 +0800
Subject: [PATCH] fix the front of readme

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 96ca9c8..67c662c 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 DualPipe is an innovative bidirectional pipeline parallism algorithm introduced in the [DeepSeek-V3 Technical Report](https://arxiv.org/pdf/2412.19437). It achieves full overlap of forward and backward computation-communication phases, also reducing pipeline bubbles. For detailed information on computation-communication overlap, please refer to the [profile data](https://github.com/deepseek-ai/profile-data).
 
-### Schedules
+## Schedules
 
 ![schedules](images/schedules.png)
 
@@ -19,7 +19,7 @@ have mutually overlapped computation and communication
 | ZB1P | (PP-1)(𝐹+𝐵-2𝑊) | 1× | PP |
 | DualPipe | (PP/2-1)(𝐹&𝐵+𝐵-3𝑊) | 2× | PP+1 |
 
-𝐹 denotes the execution time of a forward chunk, 𝐵 denotes the execution time of a
+**𝐹** denotes the execution time of a forward chunk, 𝐵 denotes the execution time of a
 full backward chunk, 𝑊 denotes the execution time of a "backward for weights" chunk, and 𝐹&𝐵
 denotes the execution time of two mutually overlapped forward and backward chunks.
 