mirror of
https://github.com/deepseek-ai/3FS
synced 2025-06-26 18:16:45 +00:00
Update README.md (#58)
@@ -43,7 +43,7 @@ The test cluster comprised 25 storage nodes (2 NUMA domains/node, 1 storage serv
### 3. KVCache

KVCache is a technique used to optimize the LLM inference process. It avoids redundant computations by caching the key and value vectors of previous tokens in the decoder layers.
-The top figure demonstrates the read throughput of all KVCache clients, highlighting both peak and average values, with peak throughput reaching up to 40 GiB/s. The bottom figure presents the IOPS of removing ops from garbage collection (GC) during the same time period.
+The top figure demonstrates the read throughput of all KVCache clients (1×400Gbps NIC/node), highlighting both peak and average values, with peak throughput reaching up to 40 GiB/s. The bottom figure presents the IOPS of removing ops from garbage collection (GC) during the same time period.
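
For readers unfamiliar with the technique named in this hunk, the following is a minimal sketch of key/value caching in a single attention head, written in plain Python. It is not taken from 3FS or from any inference engine: the names `KVCache`, `decode_with_cache`, `decode_without_cache`, `project`, and `attend` are hypothetical, and the toy model (one head, no batching, random weights) is an assumption made purely for illustration. It only shows that reusing cached key and value vectors gives the same attention outputs while projecting each token once.

```python
import math
import random


def project(x, w):
    """Multiply vector x (length d_in) by matrix w (d_in rows, d_out columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def attend(q, keys, values):
    """Scaled dot-product attention of one query over the given keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


class KVCache:
    """Hypothetical in-memory cache of per-token key and value vectors."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def decode_with_cache(tokens, wq, wk, wv):
    """Process tokens one by one, projecting K/V only for the newest token."""
    cache, outputs, kv_projections = KVCache(), [], 0
    for x in tokens:
        cache.append(project(x, wk), project(x, wv))
        kv_projections += 1
        outputs.append(attend(project(x, wq), cache.keys, cache.values))
    return outputs, kv_projections


def decode_without_cache(tokens, wq, wk, wv):
    """Same computation, but K/V for the whole prefix are recomputed each step."""
    outputs, kv_projections = [], 0
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(x, wk) for x in prefix]
        values = [project(x, wv) for x in prefix]
        kv_projections += len(prefix)
        outputs.append(attend(project(tokens[t], wq), keys, values))
    return outputs, kv_projections


def random_matrix(rng, rows, cols):
    """Helper for the demo: a rows x cols matrix of values in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rng = random.Random(0)
    d = 4
    wq, wk, wv = (random_matrix(rng, d, d) for _ in range(3))
    tokens = random_matrix(rng, 5, d)  # five toy token embeddings
    cached, n_cached = decode_with_cache(tokens, wq, wk, wv)
    uncached, n_uncached = decode_without_cache(tokens, wq, wk, wv)
    print("K/V projections with cache:", n_cached)
    print("K/V projections without cache:", n_uncached)
```

In this toy setting the uncached path performs n(n+1)/2 key/value projections for n tokens, while the cached path performs n. A real deployment additionally has to store and read the cached vectors somewhere, and that read traffic is what the throughput figure in the hunk above describes.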