awesome-deepseek-integration/docs/curator
..
README_cn.md
README.md

image

Curator

Curator is an open-source tool to curate large scale datasets for post-training LLMs.

Curator was used to curate Bespoke-Stratos-17k, a reasoning dataset to train a fully open reasoning model Bespoke-Stratos.

Curator supports:

  • Calling Deepseek API for scalable synthetic data curation
  • Easy structured data extraction
  • Caching and automatic recovery
  • Dataset visualization
  • Saving $ using batch mode

Call Deepseek API with Curator easily:

image

Get Started here