Internals
=========

Data Root
---------
Smallpond stores all data in a single directory called the data root.

This directory has the following structure:

.. code-block:: bash

    data_root
    └── 2024-12-11-12-00-28.2cc39990-296f-48a3-8063-78cf6dca460b  # job_time.job_id
        ├── config   # configuration and state
        │   ├── exec_plan.pickle
        │   ├── logical_plan.pickle
        │   └── runtime_ctx.pickle
        ├── log      # logs
        │   ├── graph.png
        │   └── scheduler.log
        ├── queue    # message queue between scheduler and workers
        ├── output   # output data
        ├── staging  # intermediate data
        │   ├── DataSourceTask.000001
        │   ├── EvenlyDistributedPartitionProducerTask.000002
        │   ├── completed_tasks  # output dataset of completed tasks
        │   └── started_tasks    # used for checkpoint
        └── temp     # temporary data
            ├── DataSourceTask.000001
            └── EvenlyDistributedPartitionProducerTask.000002

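
As a rough illustration of this layout, the sketch below lists the job directories under a data root and splits each name into its ``job_time`` and ``job_id`` parts. ``JobDir``, ``parse_job_dir``, and ``list_jobs`` are hypothetical helpers, not part of Smallpond's API, and the timestamp format is an assumption inferred from the example directory name above.

.. code-block:: python

    from dataclasses import dataclass
    from datetime import datetime
    from pathlib import Path
    from typing import List

    # Assumption: inferred from the example directory name, not a documented format.
    JOB_TIME_FORMAT = "%Y-%m-%d-%H-%M-%S"


    @dataclass
    class JobDir:
        path: Path
        job_time: datetime
        job_id: str


    def parse_job_dir(path: Path) -> JobDir:
        """Split a ``job_time.job_id`` directory name into its two parts."""
        job_time, _, job_id = path.name.partition(".")
        if not job_id:
            raise ValueError(f"not a job directory name: {path.name!r}")
        # strptime raises ValueError if the prefix does not match the format.
        return JobDir(path, datetime.strptime(job_time, JOB_TIME_FORMAT), job_id)


    def list_jobs(data_root: Path) -> List[JobDir]:
        """Return the job directories under ``data_root``, oldest first."""
        jobs = []
        for child in Path(data_root).iterdir():
            if not child.is_dir():
                continue
            try:
                jobs.append(parse_job_dir(child))
            except ValueError:
                continue  # skip entries that do not follow the assumed naming
        return sorted(jobs, key=lambda job: job.job_time)

Under these assumptions, ``list_jobs(Path("data_root"))[-1].path / "log" / "scheduler.log"`` would point at the scheduler log of the most recent job, provided at least one job directory exists.
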

Failure Recovery
----------------

Smallpond can recover from a failure and resume execution from the last checkpoint.
Checkpointing is done at the task level. A few tasks, such as ``ArrowBatchTask``, also support checkpointing at the batch level.
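
The idea of task-level checkpointing can be sketched with completion markers. This is a generic illustration, not Smallpond's actual implementation: ``run_with_checkpoints``, the ``.done`` marker files, and the checkpoint directory are all assumptions made for the example.

.. code-block:: python

    from pathlib import Path
    from typing import Callable, Dict, List


    def run_with_checkpoints(
        tasks: Dict[str, Callable[[], None]], checkpoint_dir: Path
    ) -> List[str]:
        """Run tasks in insertion order, skipping those already marked complete.

        Returns the names of the tasks executed during this call.
        """
        checkpoint_dir = Path(checkpoint_dir)
        checkpoint_dir.mkdir(parents=True, exist_ok=True)
        executed = []
        for name, task in tasks.items():
            marker = checkpoint_dir / f"{name}.done"  # hypothetical marker file
            if marker.exists():
                continue  # completed in an earlier run, so it is not repeated
            task()  # an exception here leaves no marker, so the task is retried
            marker.touch()
            executed.append(name)
        return executed

If a task raises, calling ``run_with_checkpoints`` again with the same ``checkpoint_dir`` resumes from the first task that has no marker. Batch-level checkpointing would apply the same idea inside a single task, recording progress after each batch instead of only at the end.
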