commit e2c15caf04 (parent cf47874d8e)

add version

Signed-off-by: simon-mo <simon.mo@hey.com>
@@ -307,7 +307,7 @@ For comprehensive step-by-step instructions on running DeepSeek-V3 with LMDeploy
 
 ### 6.5 Inference with vLLM (recommended)
 
-[vLLM](https://github.com/vllm-project/vllm) supports DeepSeek-V3 inference for FP8 and BF16 modes on both NVIDIA and AMD GPUs. Aside from standard techniques, vLLM offers _pipeline parallelism_, allowing you to run this model on multiple machines connected by a network. For detailed guidance, please refer to the [vLLM instructions](https://docs.vllm.ai/en/latest/serving/distributed_serving.html). Please feel free to follow [the enhancement plan](https://github.com/vllm-project/vllm/issues/11539) as well.
+[vLLM](https://github.com/vllm-project/vllm) v0.6.6 supports DeepSeek-V3 inference for FP8 and BF16 modes on both NVIDIA and AMD GPUs. Aside from standard techniques, vLLM offers _pipeline parallelism_, allowing you to run this model on multiple machines connected by a network. For detailed guidance, please refer to the [vLLM instructions](https://docs.vllm.ai/en/latest/serving/distributed_serving.html). Please feel free to follow [the enhancement plan](https://github.com/vllm-project/vllm/issues/11539) as well.
 
 ### 6.6 Recommended Inference Functionality with AMD GPUs
 
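For readers who want to try the setup the changed line describes, a minimal offline-inference sketch using vLLM's Python API follows. It is not part of the commit: the model ID, the tensor_parallel_size of 8, and the sampling settings are illustrative assumptions for a single 8-GPU node, and a multi-node run would additionally set pipeline_parallel_size as covered in the linked distributed-serving docs.

```python
# Minimal sketch (not from the commit): offline DeepSeek-V3 inference with
# vLLM >= 0.6.6. tensor_parallel_size=8 assumes a single 8-GPU node; for
# multi-node pipeline parallelism, also set pipeline_parallel_size per the
# distributed-serving docs linked in the diff above.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3",  # assumed checkpoint ID for illustration
    tensor_parallel_size=8,           # shard the model across 8 GPUs
    trust_remote_code=True,           # DeepSeek-V3 ships custom modeling code
)

params = SamplingParams(temperature=0.3, max_tokens=256)
outputs = llm.generate(["Briefly explain pipeline parallelism."], params)
print(outputs[0].outputs[0].text)
```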