mirror of
https://github.com/deepseek-ai/smallpond
synced 2025-05-09 23:21:33 +00:00
28 lines
826 B
ReStructuredText
28 lines
826 B
ReStructuredText
smallpond
|
|
=========
|
|
|
|
Smallpond is a lightweight distributed data processing framework.
|
|
It uses `duckdb`_ as the compute engine and stores data in `parquet`_ format on a distributed file system (e.g. `3FS`_).
|
|
|
|
.. _duckdb: https://duckdb.org/
|
|
.. _parquet: https://parquet.apache.org/
|
|
.. _3FS: https://github.com/deepseek-ai/3fs
|
|
|
|
Why smallpond?
|
|
--------------
|
|
|
|
- **Performance**: Smallpond uses DuckDB to deliver native-level performance for efficient data processing.
|
|
- **Scalability**: Leverages high-performance distributed file systems for intermediate storage, enabling PB-scale data handling without memory bottlenecks.
|
|
- **Simplicity**: No long-running services or complex dependencies, making it easy to deploy and maintain.
|
|
|
|
.. toctree::
|
|
:maxdepth: 1
|
|
|
|
getstarted
|
|
internals
|
|
|
|
.. toctree::
|
|
:maxdepth: 3
|
|
|
|
api
|