Add dataset reporting info (#740)

pollfly 2023-12-26 15:50:12 +02:00 committed by GitHub
parent 4456da4019
commit 054eb2ad54


@@ -216,6 +216,34 @@ For example:
dataset.remove_files(dataset_path="*.csv", recursive=True)
```
## Dataset Preview
Attach metrics, plots, or media to a dataset to make its contents easier to inspect. Call [`Dataset.get_logger()`](../references/sdk/dataset.md#get_logger)
to get the dataset's logger object, then report the information with any of the methods
available on a [logger](../references/sdk/logger.md) object.
For instance, report dataset summaries (such as a [table](../references/sdk/logger.md#report_table)) to create a preview
of the stored data, or attach statistics generated by the data ingestion process.
For example:
```python
# Attach a table to the dataset
dataset.get_logger().report_table(
    title="Raw Dataset Metadata", series="Raw Dataset Metadata", csv="path/to/csv"
)
# Attach a histogram to the dataset
# (histogram_data is assumed to be a pandas Series of per-class sample counts)
dataset.get_logger().report_histogram(
    title="Class distribution",
    series="Class distribution",
    values=histogram_data,
    iteration=0,
    xlabels=histogram_data.index.tolist(),
    yaxis="Number of samples",
)
```
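The example above does not show where `histogram_data` comes from. The following is a minimal sketch of one way to prepare it, assuming the class labels are available as a plain Python list. The helpers `class_distribution` and `report_class_distribution` are hypothetical names introduced for illustration and are not part of the ClearML SDK; only `get_logger()` and `report_histogram()` come from the documentation above.

```python
from collections import Counter


def class_distribution(labels):
    """Count samples per class (hypothetical helper, not a ClearML API).

    Returns the class names in sorted order and the matching sample counts.
    """
    counts = Counter(labels)
    xlabels = sorted(counts)
    values = [counts[name] for name in xlabels]
    return xlabels, values


def report_class_distribution(dataset, labels):
    """Report the class distribution as a histogram on the dataset's logger.

    `dataset` is assumed to be an existing ClearML Dataset object.
    """
    xlabels, values = class_distribution(labels)
    dataset.get_logger().report_histogram(
        title="Class distribution",
        series="Class distribution",
        values=values,
        iteration=0,
        xlabels=xlabels,
        yaxis="Number of samples",
    )
```

Passing plain lists keeps the sketch free of extra dependencies. If the labels already live in a pandas DataFrame, a Series of per-class counts can be passed as `values` directly, as in the original example.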
## Uploading Files
To upload the dataset files to network storage, use the [`Dataset.upload`](../references/sdk/dataset.md#upload) method.