diff --git a/docs/clearml_data/clearml_data_sdk.md b/docs/clearml_data/clearml_data_sdk.md index 5e471a99..978906f3 100644 --- a/docs/clearml_data/clearml_data_sdk.md +++ b/docs/clearml_data/clearml_data_sdk.md @@ -28,6 +28,9 @@ ClearML Data supports multiple ways to create datasets programmatically, which p will inherit its data * [`Dataset.squash()`](#datasetsquash) - Generate a new dataset from by squashing together a set of related datasets +You can add metadata to your datasets using the `Dataset.set_metadata` method, and access the metadata using the +`Dataset.get_metadata` method. See [`set_metadata`](../references/sdk/dataset.md#set_metadata) and [`get_metadata`](../references/sdk/dataset.md#get_metadata). + ### Dataset.create() Use the [`Dataset.create`](../references/sdk/dataset.md#datasetcreate) class method to create a dataset. diff --git a/docs/clearml_sdk/model_sdk.md b/docs/clearml_sdk/model_sdk.md index a80f87c4..3f61e7b8 100644 --- a/docs/clearml_sdk/model_sdk.md +++ b/docs/clearml_sdk/model_sdk.md @@ -121,7 +121,9 @@ model_list = Model.query_models( # If `True`, include archived models include_archived=True, # Maximum number of models returned - max_results=5 + max_results=5, + # Only models with matching metadata + metadata={"key":"value"} ) ``` diff --git a/docs/clearml_sdk/task_sdk.md b/docs/clearml_sdk/task_sdk.md index e1bad108..768bc977 100644 --- a/docs/clearml_sdk/task_sdk.md +++ b/docs/clearml_sdk/task_sdk.md @@ -231,9 +231,10 @@ The task's outputs, such as artifacts and models, can also be retrieved. ## Querying / Searching Tasks -Searching and filtering tasks can be done via the [web UI](../webapp/webapp_overview.md) and programmatically. -Input search parameters into the [`Task.get_tasks`](../references/sdk/task.md#taskget_tasks) method, which returns a -list of task objects that match the search. +Search and filter tasks programmatically. 
Input search parameters into the [`Task.get_tasks`](../references/sdk/task.md#taskget_tasks) +method, which returns a list of task objects that match the search. Pass `allow_archived=False` to filter out archived +tasks. + For example: ```python @@ -241,6 +242,7 @@ task_list = Task.get_tasks( task_ids=None, # type Optional[Sequence[str]] project_name=None, # Optional[str] task_name=None, # Optional[str] + allow_archived=True, # [bool] task_filter=None, # Optional[Dict]# # tasks with tag `included_tag` and without tag `excluded_tag` tags=['included_tag', '-excluded_tag'] diff --git a/docs/configs/clearml_conf.md b/docs/configs/clearml_conf.md index d4dc32dd..39c94886 100644 --- a/docs/configs/clearml_conf.md +++ b/docs/configs/clearml_conf.md @@ -1191,6 +1191,11 @@ will not exceed the value of `matplotlib_untitled_history_size`
#### sdk.network + +**`sdk.network.file_upload_retries`** (*int*) +* Number of retries before failing to upload a file + +--- **`sdk.network.iteration`** (*dict*) @@ -1266,6 +1271,18 @@ will not exceed the value of `matplotlib_untitled_history_size` * Specify a list of direct access objects using glob patterns which matches sets of files using wildcards. Direct access objects are not downloaded or cached, and any download request will return a direct reference. +##### sdk.storage.log + +**`sdk.storage.log.report_download_chunk_size_mb`** (*int*) +* Specify how often in MB the `StorageManager` reports its download progress to the console. By default, it reports +every 5MB + +--- + +**`sdk.storage.log.report_upload_chunk_size_mb`** (*int*) +* Specify how often in MB the `StorageManager` reports its upload progress to the console. By default, it reports every +5MB + ## Configuration Vault :::note Enterprise Feature diff --git a/docs/guides/storage/examples_storagehelper.md b/docs/guides/storage/examples_storagehelper.md index 90c98f7b..5d4fc51a 100644 --- a/docs/guides/storage/examples_storagehelper.md +++ b/docs/guides/storage/examples_storagehelper.md @@ -38,6 +38,9 @@ To download a non-compressed file, set the `extract_archive` argument to `False` manager.get_local_copy(remote_url="s3://MyBucket/MyFolder/file.ext", extract_archive=False) +By default, the `StorageManager` reports its download progress to the console every 5MB. You can change this using the +[`StorageManager.set_report_download_chunk_size`](../../references/sdk/storage.md#storagemanagerset_report_download_chunk_size) +class method, and specifying the chunk size in MB (not supported for Azure and GCP storage). ### Uploading a File @@ -47,6 +50,12 @@ argument. manager.upload_file(local_file="/mnt/data/also_file.ext", remote_url="s3://MyBucket/MyFolder") +Use the `retries` parameter to set the number of times a file upload should be retried in case of failure.
+ +By default, the `StorageManager` reports its upload progress to the console every 5MB. You can change this using the +[`StorageManager.set_report_upload_chunk_size`](../../references/sdk/storage.md#storagemanagerset_report_upload_chunk_size) +class method, and specifying the chunk size in MB (not supported for Azure and GCP storage). + ### Setting Cache Limits