---
title: Folder Sync
---

This example shows how to use the *clearml-data* folder sync function.

*clearml-data* folder sync mode is useful when a folder serves as a single point of truth that is updated 
from time to time. When the folder changes, call `clearml-data sync` and the 
changes (file additions, modifications, or removals) are reflected in ClearML.

## Prerequisites
First, make sure that you have cloned the [clearml](https://github.com/allegroai/clearml) repository, which contains all
the needed files.
1. Open a terminal and change directory to the cloned repository's `examples/reporting` folder:
    `cd clearml/examples/reporting`

## Syncing a Folder
Create a dataset and sync the `data_samples` folder from the repo to ClearML:
```bash
clearml-data sync --project datasets --name sync_folder --folder data_samples
```

Expected response:

```
clearml-data - Dataset Management & Versioning CLI
Creating a new dataset:
New dataset created id=0d8f5f3e5ebd4f849bfb218021be1ede
Syncing dataset id 0d8f5f3e5ebd4f849bfb218021be1ede to local folder data_samples
Generating SHA2 hash for 5 files
Hash generation completed
Sync completed: 0 files removed, 5 added / modified
Finalizing dataset
Pending uploads, starting dataset upload to https://files.community.clear.ml
Uploading compressed dataset changes (5 files, total 222.17 KB) to https://files.community.clear.ml
Upload completed (222.17 KB)
2021-05-04 09:57:56,809 - clearml.Task - INFO - Waiting to finish uploads
2021-05-04 09:57:57,581 - clearml.Task - INFO - Finished uploading
Dataset closed and finalized
```

As can be seen, the `clearml-data sync` command creates the dataset, then uploads the files, and closes the dataset.
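
The dataset ID printed on the `New dataset created` line identifies this version later, for example when it is passed to `--parents`. The following is a minimal sketch of pulling that ID out of the command output in a script. It assumes the output keeps the `New dataset created id=<32 hex characters>` format shown above. The log is pasted into a variable so the snippet runs without a ClearML server, and the variable names `sync_log` and `dataset_id` are illustrative.

```bash
# Sample output copied from the expected response above. In real use it
# would be captured from the command itself, for example:
#   sync_log="$(clearml-data sync --project datasets --name sync_folder --folder data_samples 2>&1)"
# (2>&1 is used because whether the line goes to stdout or stderr is an assumption.)
sync_log='clearml-data - Dataset Management & Versioning CLI
Creating a new dataset:
New dataset created id=0d8f5f3e5ebd4f849bfb218021be1ede
Syncing dataset id 0d8f5f3e5ebd4f849bfb218021be1ede to local folder data_samples'

# Keep only the 32-character hex ID from the "New dataset created" line.
dataset_id="$(printf '%s\n' "$sync_log" \
    | sed -n 's/^New dataset created id=\([0-9a-f]\{32\}\).*$/\1/p')"

echo "dataset id: $dataset_id"
```

If the output format changes, `dataset_id` ends up empty, so a script relying on this should check for that before using the value.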


## Modifying Synced Folder

Now we'll modify the folder:
1. Add another line to one of the files in the `data_samples` folder.
1. Add a file to the `data_samples` folder.<br/>
   Run `echo "data data data" > data_samples/new_data.txt` (this creates the file `new_data.txt` in the `data_samples` folder).


Next, repeat the process: create a new dataset with the previous one as its parent, and sync the folder again. Pass the ID of the first dataset (printed in the `New dataset created` line of the first run) to `--parents`.

```bash
clearml-data sync --project datasets --name second_ds --parents 0d8f5f3e5ebd4f849bfb218021be1ede --folder data_samples
```

Expected response:
```
clearml-data - Dataset Management & Versioning CLI
Creating a new dataset:
New dataset created id=0992dd6bae6144388e0f2ef131d9724a
Syncing dataset id 0992dd6bae6144388e0f2ef131d9724a to local folder data_samples
Generating SHA2 hash for 6 files
Hash generation completed
Sync completed: 0 files removed, 2 added / modified
Finalizing dataset
Pending uploads, starting dataset upload to https://files.community.clear.ml
Uploading compressed dataset changes (2 files, total 742 bytes) to https://files.community.clear.ml
Upload completed (742 bytes)
2021-05-04 10:05:42,353 - clearml.Task - INFO - Waiting to finish uploads
2021-05-04 10:05:43,106 - clearml.Task - INFO - Finished uploading
Dataset closed and finalized
```

The output shows that 6 files were hashed (the original 5 plus `new_data.txt`) and that 2 files were added or modified, as expected: the edited file and the new one.
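
The log lines `Generating SHA2 hash for 6 files` and `2 added / modified` suggest that sync decides what changed by comparing file hashes. The sketch below illustrates that idea locally; it is not ClearML's implementation. It assumes `sha256sum` and `mktemp` are available (as on most Linux systems), and it uses a scratch folder with made-up files (`sample.csv`, `notes.txt`) as a stand-in for `data_samples` so that it can run anywhere.

```bash
set -eu
LC_ALL=C; export LC_ALL   # byte-wise ordering so sort and comm agree

# Scratch folder standing in for data_samples (file names are hypothetical).
work_dir="$(mktemp -d)"
folder="$work_dir/data_samples"
mkdir -p "$folder"
printf 'a,b\n1,2\n' > "$folder/sample.csv"
printf 'first line\n' > "$folder/notes.txt"

# Print one sorted "<sha256>  <path>" line per file in the given folder.
snapshot() {
    (cd "$1" && find . -type f -exec sha256sum {} + | sort)
}

snapshot "$folder" > "$work_dir/before.txt"

# The two modifications from this section: edit one file, add another.
echo "another line" >> "$folder/notes.txt"
echo "data data data" > "$folder/new_data.txt"

snapshot "$folder" > "$work_dir/after.txt"

# Hash+path lines found only in the new snapshot: added or modified files.
changed="$(comm -13 "$work_dir/before.txt" "$work_dir/after.txt" | wc -l | tr -d ' ')"

# Paths found only in the old snapshot: removed files.
# (sha256sum prints 64 hex characters and two separators before the path.)
cut -c67- "$work_dir/before.txt" | sort > "$work_dir/before_paths.txt"
cut -c67- "$work_dir/after.txt" | sort > "$work_dir/after_paths.txt"
removed="$(comm -23 "$work_dir/before_paths.txt" "$work_dir/after_paths.txt" | wc -l | tr -d ' ')"

echo "$removed files removed, $changed added / modified"
rm -rf "$work_dir"
```

Running it prints `0 files removed, 2 added / modified`: the unchanged file keeps its hash and is skipped, which is also why the second upload above is much smaller than the first.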