mirror of https://github.com/deepseek-ai/DeepEP
synced 2025-06-26 18:28:11 +00:00

update readme

Signed-off-by: youkaichao <youkaichao@gmail.com>

This commit is contained in:
parent 97be5a3873
commit b9b7ce348b

third-party/README.md (vendored), 96 lines changed

## Prerequisites

Hardware requirements:

- GPUs inside one node need to be connected by NVLink (a quick sanity check is sketched below)
- GPUs across different nodes need to be connected by RDMA devices, see the [GPUDirect RDMA Documentation](https://docs.nvidia.com/cuda/gpudirect-rdma/)
- InfiniBand GPUDirect Async (IBGDA) support, see the [IBGDA Overview](https://developer.nvidia.com/blog/improving-network-performance-of-hpc-systems-using-nvidia-magnum-io-nvshmem-and-gpudirect-async/)
- For more detailed requirements, see the [NVSHMEM Hardware Specifications](https://docs.nvidia.com/nvshmem/release-notes-install-guide/install-guide/abstract.html#hardware-requirements)

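A quick way to sanity-check the NVLink and RDMA requirements above; both tools are standard (`nvidia-smi` ships with the NVIDIA driver, `ibv_devinfo` with the ibverbs/rdma-core utilities), but device names and topology output differ per system:

```bash
# NVLink-connected GPU pairs show up as NV1/NV2/... in the topology matrix
nvidia-smi topo -m

# List RDMA-capable NICs visible on this node
ibv_devinfo | grep -E "hca_id|link_layer"
```
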
## Installation procedure

### 1. Acquiring NVSHMEM source code

Download NVSHMEM v3.2.5 from the [NVIDIA NVSHMEM OPEN SOURCE PACKAGES](https://developer.nvidia.com/downloads/assets/secure/nvshmem/nvshmem_src_3.2.5-1.txz).

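A minimal fetch-and-unpack sketch; the archive file name follows the download link above, but the extracted directory name is an assumption, so adjust it to whatever the archive actually contains:

```bash
# Download and unpack the NVSHMEM 3.2.5 source archive
wget https://developer.nvidia.com/downloads/assets/secure/nvshmem/nvshmem_src_3.2.5-1.txz
tar -xf nvshmem_src_3.2.5-1.txz
cd nvshmem_src   # assumed top-level directory name
```
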
### 2. Apply our custom patch

Navigate to your NVSHMEM source directory and apply our provided patch:

```bash
git apply /path/to/deep_ep/dir/third-party/nvshmem.patch
```

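If you want to be cautious, the patch can be dry-run first with `git apply --check` (standard git behaviour, not DeepEP-specific):

```bash
# Verify the patch applies cleanly without modifying any files
git apply --check /path/to/deep_ep/dir/third-party/nvshmem.patch
```
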
### 3. Configure NVIDIA driver (required by inter-node communication)

Enable IBGDA by modifying `/etc/modprobe.d/nvidia.conf`:

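The exact options line is not reproduced here; a typical configuration for enabling IBGDA, following the NVSHMEM installation guide (verify against your driver version), looks like this:

```bash
# /etc/modprobe.d/nvidia.conf
options nvidia NVreg_EnableStreamMemOPs=1 NVreg_RegistryDwords="PeerMappingOverride=1;"
```

After editing the file, regenerate the initramfs (e.g. `sudo update-initramfs -u` on Debian/Ubuntu) and reboot so the driver picks up the new options.
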
For more detailed configurations, please refer to the [NVSHMEM Installation Guide](https://docs.nvidia.com/nvshmem/release-notes-install-guide/install-guide/abstract.html).

### 4. Build and installation

DeepEP uses NVLink for intra-node communication and IBGDA for inter-node communication. All other NVSHMEM features are disabled to reduce dependencies.

```bash
export CUDA_HOME=/path/to/cuda
# disable all features except IBGDA
export NVSHMEM_IBGDA_SUPPORT=1

export NVSHMEM_SHMEM_SUPPORT=0
export NVSHMEM_UCX_SUPPORT=0
export NVSHMEM_USE_NCCL=0
export NVSHMEM_PMIX_SUPPORT=0
export NVSHMEM_TIMEOUT_DEVICE_POLLING=0
export NVSHMEM_USE_GDRCOPY=0
export NVSHMEM_IBRC_SUPPORT=0
export NVSHMEM_BUILD_TESTS=0
export NVSHMEM_BUILD_EXAMPLES=0
export NVSHMEM_MPI_SUPPORT=0
export NVSHMEM_BUILD_HYDRA_LAUNCHER=0
export NVSHMEM_BUILD_TXZ_PACKAGE=0

cmake -G Ninja -S . -B build -DCMAKE_INSTALL_PREFIX=/path/to/your/dir/to/install
cmake --build build/ --target install
```

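A quick sanity check that the install landed where expected (library and header layout can vary between NVSHMEM versions, so treat this as a sketch):

```bash
# The install prefix should now contain NVSHMEM headers and libraries
ls /path/to/your/dir/to/install/include/nvshmem.h
ls /path/to/your/dir/to/install/lib/
```
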
## Post-installation configuration
