Edit docstrings (#1013)

This commit is contained in:
pollfly 2023-05-28 08:48:49 +03:00 committed by GitHub
parent 449a4cc42d
commit db2f899d95
5 changed files with 145 additions and 137 deletions

View File

@ -470,7 +470,7 @@ class PipelineController(object):
pass
:param post_execute_callback: Callback function, called when a step (Task) is completed
and it other jobs are executed. Allows a user to modify the Task status after completion.
and other jobs are executed. Allows a user to modify the Task status after completion.
.. code-block:: py
@ -738,7 +738,7 @@ class PipelineController(object):
pass
:param post_execute_callback: Callback function, called when a step (Task) is completed
and it other jobs are executed. Allows a user to modify the Task status after completion.
and other jobs are executed. Allows a user to modify the Task status after completion.
.. code-block:: py
@ -862,7 +862,7 @@ class PipelineController(object):
pass
:param Callable step_task_completed_callback: Callback function, called when a step (Task) is completed
and it other jobs are executed. Allows a user to modify the Task status after completion.
and other jobs are executed. Allows a user to modify the Task status after completion.
.. code-block:: py
@ -951,7 +951,7 @@ class PipelineController(object):
def connect_configuration(self, configuration, name=None, description=None):
# type: (Union[Mapping, list, Path, str], Optional[str], Optional[str]) -> Union[dict, Path, str]
"""
Connect a configuration dictionary or configuration file (pathlib.Path / str) to a the PipelineController object.
Connect a configuration dictionary or configuration file (pathlib.Path / str) to the PipelineController object.
This method should be called before reading the configuration file.
For example, a local file:
@ -1373,7 +1373,7 @@ class PipelineController(object):
pass
:param Callable step_task_completed_callback: Callback function, called when a step (Task) is completed
and it other jobs are executed. Allows a user to modify the Task status after completion.
and other jobs are executed. Allows a user to modify the Task status after completion.
.. code-block:: py
@ -1895,7 +1895,7 @@ class PipelineController(object):
pass
:param post_execute_callback: Callback function, called when a step (Task) is completed
and it other jobs are executed. Allows a user to modify the Task status after completion.
and other jobs are executed. Allows a user to modify the Task status after completion.
.. code-block:: py
@ -3644,7 +3644,7 @@ class PipelineDecorator(PipelineController):
pass
:param post_execute_callback: Callback function, called when a step (Task) is completed
and it other jobs are executed. Allows a user to modify the Task status after completion.
and other jobs are executed. Allows a user to modify the Task status after completion.
.. code-block:: py
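    # Illustrative sketch, not part of this commit: the callback described in the
    # docstrings above is assumed to take the pipeline controller and the completed
    # step node, matching the two-argument form implied by the truncated snippets.
    def step_completed_callback(pipeline, node):
        # node.name / node.job are assumed Node attributes; adjust the Task here
        # if the step's status should change after completion.
        if node.job is None:
            return
        print("pipeline step '{}' completed".format(node.name))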

View File

@ -229,14 +229,15 @@ class OptimizerBOHB(SearchStrategy, RandomSeed):
year = {2018},
}
:param eta : float (3)
:param eta: float (3)
In each iteration, a complete run of sequential halving is executed. In it,
after evaluating each configuration on the same subset size, only a fraction of
1/eta of them 'advances' to the next round.
Must be greater or equal to 2.
:param min_budget : float (0.01)
:param min_budget: float (0.01)
The smallest budget to consider. Needs to be positive!
:param max_budget : float (1)
:param max_budget: float (1)
The largest budget to consider. Needs to be larger than min_budget!
The budgets will be geometrically distributed
:math:`\sim \eta^k` for :math:`k \in [0, 1, ..., num\_subsets - 1]`.
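As a rough, generic illustration (a HyperBand-style sketch, not code from this repository) of how ``eta``, ``min_budget`` and ``max_budget`` produce the geometrically spaced budgets described above:

.. code-block:: py

    import math

    eta, min_budget, max_budget = 3.0, 0.01, 1.0

    # number of geometric steps of size eta that fit between min_budget and max_budget
    num_subsets = int(math.floor(math.log(max_budget / min_budget) / math.log(eta))) + 1

    # budgets spaced ~ eta**k, ending at max_budget
    budgets = [max_budget * eta ** -(num_subsets - 1 - k) for k in range(num_subsets)]
    print(budgets)  # approx. [0.012, 0.037, 0.111, 0.333, 1.0]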

View File

@ -432,7 +432,7 @@ class SearchStrategy(object):
Helper function; implementation is not required. Used by the default ``process_step`` implementation.
Check if the job needs to be aborted or already completed.
If returns ``False``, the job was aborted / completed, and should be taken off the current job list
If returns ``False``, the job was aborted / completed, and should be taken off the current job list.
If there is a budget limitation, this call should update
``self.budget.compute_time.update`` / ``self.budget.iterations.update``
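For context, a minimal sketch of a custom strategy honoring that budget contract; the ``update(<job key>, <latest value>)`` call shape and the ``ClearmlJob`` helpers used here are assumptions, not shown in this diff:

.. code-block:: py

    from clearml.automation.optimization import SearchStrategy

    class MySearchStrategy(SearchStrategy):
        def monitor_job(self, job):
            # False means the job ended (aborted / completed) and should be removed
            if job.is_stopped():
                return False
            # budget accounting, as described above (call shapes are assumptions)
            self.budget.compute_time.update(job.task_id(), job.elapsed())
            self.budget.iterations.update(job.task_id(), job.iterations())
            return True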
@ -534,6 +534,8 @@ class SearchStrategy(object):
where index 0 is the best performing Task.
Example w/ all_metrics=False:
.. code-block:: py
[
('0593b76dc7234c65a13a301f731958fa',
{
@ -550,6 +552,8 @@ class SearchStrategy(object):
Example w/ all_metrics=True:
.. code-block:: py
[
('0593b76dc7234c65a13a301f731958fa',
{
@ -599,9 +603,8 @@ class SearchStrategy(object):
# type: (int, bool, bool, bool) -> Sequence[(str, dict)]
"""
Return a list of dictionaries of the top performing experiments.
Example: [
{'task_id': Task-ID, 'metrics': scalar-metric-dict, 'hyper_parameters': Hyper-Parameters},
]
Example: ``[{'task_id': Task-ID, 'metrics': scalar-metric-dict, 'hyper_parameters': Hyper-Parameters},]``
Order is based on the controller ``Objective`` object.
:param int top_k: The number of Tasks (experiments) to return.
@ -614,46 +617,50 @@ class SearchStrategy(object):
where index 0 is the best performing Task.
Example w/ all_metrics=False:
[
{
task_id: '0593b76dc7234c65a13a301f731958fa',
hyper_parameters: {'General/lr': '0.03', 'General/batch_size': '32'},
metrics: {
'accuracy per class/cat': {
'metric': 'accuracy per class',
'variant': 'cat',
'value': 0.119,
'min_value': 0.119,
'max_value': 0.782
},
}
},
]
.. code-block:: py
Example w/ all_metrics=True:
[
{
task_id: '0593b76dc7234c65a13a301f731958fa',
hyper_parameters: {'General/lr': '0.03', 'General/batch_size': '32'},
metrics: {
'accuracy per class/cat': {
'metric': 'accuracy per class',
'variant': 'cat',
'value': 0.119,
'min_value': 0.119,
'max_value': 0.782
},
}
},
]
[
{
task_id: '0593b76dc7234c65a13a301f731958fa',
hyper_parameters: {'General/lr': '0.03', 'General/batch_size': '32'},
metrics: {
'accuracy per class/cat': {
'metric': 'accuracy per class',
'variant': 'cat',
'value': 0.119,
'min_value': 0.119,
'max_value': 0.782
},
'accuracy per class/deer': {
'metric': 'accuracy per class',
'variant': 'deer',
'value': 0.219,
'min_value': 0.219,
'max_value': 0.282
},
}
},
]
Example w/ all_metrics=True:
.. code-block:: py
[
{
task_id: '0593b76dc7234c65a13a301f731958fa',
hyper_parameters: {'General/lr': '0.03', 'General/batch_size': '32'},
metrics: {
'accuracy per class/cat': {
'metric': 'accuracy per class',
'variant': 'cat',
'value': 0.119,
'min_value': 0.119,
'max_value': 0.782
},
'accuracy per class/deer': {
'metric': 'accuracy per class',
'variant': 'deer',
'value': 0.219,
'min_value': 0.219,
'max_value': 0.282
},
}
},
]
"""
additional_filters = dict(page_size=int(top_k), page=0)
if only_completed:
@ -761,7 +768,8 @@ class SearchStrategy(object):
"""
Set the function used to name a newly created job.
:param callable naming_function:
:param callable naming_function: Callable function for naming a newly created job.
Use the following format:
.. code-block:: py
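    # Illustrative sketch, not part of this commit: the naming function is assumed
    # to map (base_task_name, argument_dict) -> str, i.e. the base experiment name
    # plus the sampled hyperparameter values for the new job.
    def naming_function(base_task_name, argument_dict):
        return "{}_lr_{}".format(base_task_name, argument_dict.get("General/lr"))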
@ -1072,7 +1080,7 @@ class RandomSearch(SearchStrategy):
class HyperParameterOptimizer(object):
"""
Hyper-parameter search controller. Clones the base experiment, changes arguments and tries to maximize/minimize
Hyperparameter search controller. Clones the base experiment, changes arguments and tries to maximize/minimize
the defined objective.
"""
_tag = 'optimization'
@ -1105,13 +1113,12 @@ class HyperParameterOptimizer(object):
``validation``).
:param str objective_metric_series: The Objective metric series to maximize / minimize (for example, ``loss``).
:param str objective_metric_sign: The objective to maximize / minimize.
The values are:
- ``min`` - Minimize the last reported value for the specified title/series scalar.
- ``max`` - Maximize the last reported value for the specified title/series scalar.
- ``min_global`` - Minimize the min value of *all* reported values for the specific title/series scalar.
- ``max_global`` - Maximize the max value of *all* reported values for the specific title/series scalar.
- ``min`` - Minimize the last reported value for the specified title/series scalar.
- ``max`` - Maximize the last reported value for the specified title/series scalar.
- ``min_global`` - Minimize the min value of *all* reported values for the specific title/series scalar.
- ``max_global`` - Maximize the max value of *all* reported values for the specific title/series scalar.
:param class.SearchStrategy optimizer_class: The SearchStrategy optimizer to use for the hyper-parameter search
:param int max_number_of_concurrent_tasks: The maximum number of concurrent Tasks (experiments) running at the
@ -1121,24 +1128,21 @@ class HyperParameterOptimizer(object):
default is ``None``, indicating no time limit.
:param float compute_time_limit: The maximum compute time in minutes. When time limit is exceeded,
all jobs are aborted. (Optional)
:param bool auto_connect_task: Store optimization arguments and configuration in the Task
:param bool auto_connect_task: Store optimization arguments and configuration in the Task.
The values are:
- ``True`` - The optimization argument and configuration will be stored in the Task. All arguments will
be under the hyper-parameter section ``opt``, and the optimization hyper_parameters space will
- ``True`` - The optimization argument and configuration will be stored in the Task. All arguments will
be under the hyperparameter section ``opt``, and the optimization hyper_parameters space will be
stored in the Task configuration object section.
- ``False`` - Do not store with Task.
- ``Task`` - A specific Task object to connect the optimization process with.
- ``False`` - Do not store with Task.
- ``Task`` - A specific Task object to connect the optimization process with.
:param bool always_create_task: Always create a new Task
:param bool always_create_task: Always create a new Task.
The values are:
- ``True`` - No current Task initialized. Create a new task named ``optimization`` in the ``base_task_id``
- ``True`` - No current Task initialized. Create a new task named ``optimization`` in the ``base_task_id``
project.
- ``False`` - Use the :py:meth:`task.Task.current_task` (if exists) to report statistics.
- ``False`` - Use the :py:meth:`task.Task.current_task` (if exists) to report statistics.
:param str spawn_project: If project name is specified, create all optimization Jobs (Tasks) in the
specified project instead of the original base_task_id project.
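Putting the constructor arguments documented above together, a minimal construction sketch might look as follows; ``execution_queue`` and ``total_max_jobs`` are assumed from the wider API and do not appear in the visible diff:

.. code-block:: py

    from clearml.automation import (
        HyperParameterOptimizer, RandomSearch,
        UniformParameterRange, UniformIntegerParameterRange,
    )

    optimizer = HyperParameterOptimizer(
        base_task_id="<base-task-id>",
        hyper_parameters=[
            UniformParameterRange("General/lr", min_value=0.001, max_value=0.1),
            UniformIntegerParameterRange("General/batch_size", 16, 128, step_size=16),
        ],
        objective_metric_title="validation",
        objective_metric_series="loss",
        objective_metric_sign="min",
        optimizer_class=RandomSearch,
        max_number_of_concurrent_tasks=2,
        spawn_project="HPO experiments",
        execution_queue="default",  # assumed argument, not shown in this diff
        total_max_jobs=10,          # assumed argument, not shown in this diff
    )
    optimizer.start()
    optimizer.wait()
    optimizer.stop()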
@ -1505,9 +1509,8 @@ class HyperParameterOptimizer(object):
# type: (int, bool, bool, bool) -> Sequence[(str, dict)]
"""
Return a list of dictionaries of the top performing experiments.
Example: [
{'task_id': Task-ID, 'metrics': scalar-metric-dict, 'hyper_parameters': Hyper-Parameters},
]
Example: ``[{'task_id': Task-ID, 'metrics': scalar-metric-dict, 'hyper_parameters': Hyper-Parameters},]``
Order is based on the controller ``Objective`` object.
:param int top_k: The number of Tasks (experiments) to return.
@ -1520,46 +1523,50 @@ class HyperParameterOptimizer(object):
where index 0 is the best performing Task.
Example w/ all_metrics=False:
[
{
task_id: '0593b76dc7234c65a13a301f731958fa',
hyper_parameters: {'General/lr': '0.03', 'General/batch_size': '32'},
metrics: {
'accuracy per class/cat': {
'metric': 'accuracy per class',
'variant': 'cat',
'value': 0.119,
'min_value': 0.119,
'max_value': 0.782
},
}
},
]
.. code-block:: py
Example w/ all_metrics=True:
[
{
task_id: '0593b76dc7234c65a13a301f731958fa',
hyper_parameters: {'General/lr': '0.03', 'General/batch_size': '32'},
metrics: {
'accuracy per class/cat': {
'metric': 'accuracy per class',
'variant': 'cat',
'value': 0.119,
'min_value': 0.119,
'max_value': 0.782
},
}
},
]
[
{
task_id: '0593b76dc7234c65a13a301f731958fa',
hyper_parameters: {'General/lr': '0.03', 'General/batch_size': '32'},
metrics: {
'accuracy per class/cat': {
'metric': 'accuracy per class',
'variant': 'cat',
'value': 0.119,
'min_value': 0.119,
'max_value': 0.782
},
'accuracy per class/deer': {
'metric': 'accuracy per class',
'variant': 'deer',
'value': 0.219,
'min_value': 0.219,
'max_value': 0.282
},
}
},
]
Example w/ all_metrics=True:
.. code-block:: py
[
{
task_id: '0593b76dc7234c65a13a301f731958fa',
hyper_parameters: {'General/lr': '0.03', 'General/batch_size': '32'},
metrics: {
'accuracy per class/cat': {
'metric': 'accuracy per class',
'variant': 'cat',
'value': 0.119,
'min_value': 0.119,
'max_value': 0.782
},
'accuracy per class/deer': {
'metric': 'accuracy per class',
'variant': 'deer',
'value': 0.219,
'min_value': 0.219,
'max_value': 0.282
},
}
},
]
"""
if not self.optimizer:
return []
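A usage sketch for the method documented above; the method name is not visible in this diff, so ``get_top_experiments_details`` is assumed here from the wider ``HyperParameterOptimizer`` API, and the dict keys follow the example in the docstring:

.. code-block:: py

    top = optimizer.get_top_experiments_details(top_k=3, all_metrics=True)
    for entry in top:
        print(entry.get("task_id"), entry.get("hyper_parameters"), entry.get("metrics"))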
@ -1615,13 +1622,12 @@ class HyperParameterOptimizer(object):
``validation``).
:param str objective_metric_series: The Objective metric series to maximize / minimize (for example, ``loss``).
:param str objective_metric_sign: The objective to maximize / minimize.
The values are:
- ``min`` - Minimize the last reported value for the specified title/series scalar.
- ``max`` - Maximize the last reported value for the specified title/series scalar.
- ``min_global`` - Minimize the min value of *all* reported values for the specific title/series scalar.
- ``max_global`` - Maximize the max value of *all* reported values for the specific title/series scalar.
- ``min`` - Minimize the last reported value for the specified title/series scalar.
- ``max`` - Maximize the last reported value for the specified title/series scalar.
- ``min_global`` - Minimize the min value of *all* reported values for the specific title/series scalar.
- ``max_global`` - Maximize the max value of *all* reported values for the specific title/series scalar.
:param str optimizer_task_id: Parent optimizer Task ID
:param top_k: The number of Tasks (experiments) to return.
:return: A list of Task objects, ordered by performance, where index 0 is the best performing Task.

View File

@ -110,7 +110,7 @@ class Parameter(RandomSeed):
class UniformParameterRange(Parameter):
"""
Uniform randomly sampled hyper-parameter object.
Uniform randomly sampled hyperparameter object.
"""
def __init__(
@ -129,12 +129,11 @@ class UniformParameterRange(Parameter):
:param float min_value: The minimum sample to use for uniform random sampling.
:param float max_value: The maximum sample to use for uniform random sampling.
:param float step_size: If not ``None``, set step size (quantization) for value sampling.
:param bool include_max_value: Range includes the ``max_value``
:param bool include_max_value: Range includes the ``max_value``.
The values are:
- ``True`` - The range includes the ``max_value`` (Default)
- ``False`` - Does not include.
- ``True`` - The range includes the ``max_value`` (Default)
- ``False`` - Does not include.
"""
super(UniformParameterRange, self).__init__(name=name)
@ -221,7 +220,7 @@ class LogUniformParameterRange(UniformParameterRange):
class UniformIntegerParameterRange(Parameter):
"""
Uniform randomly sampled integer Hyper-Parameter object.
Uniform randomly sampled integer Hyperparameter object.
"""
def __init__(self, name, min_value, max_value, step_size=1, include_max_value=True):
@ -233,12 +232,11 @@ class UniformIntegerParameterRange(Parameter):
:param int min_value: The minimum sample to use for uniform random sampling.
:param int max_value: The maximum sample to use for uniform random sampling.
:param int step_size: The default step size is ``1``.
:param bool include_max_value: Range includes the ``max_value``
:param bool include_max_value: Range includes the ``max_value``.
The values are:
- ``True`` - Includes the ``max_value`` (Default)
- ``False`` - Does not include.
- ``True`` - Includes the ``max_value`` (Default)
- ``False`` - Does not include.
"""
super(UniformIntegerParameterRange, self).__init__(name=name)
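A small sketch of the ``include_max_value`` flag described above; ``to_list()`` is assumed from the base ``Parameter`` API and its exact return format is not part of this diff:

.. code-block:: py

    from clearml.automation import UniformIntegerParameterRange

    with_max = UniformIntegerParameterRange("General/batch_size", 16, 64, step_size=16, include_max_value=True)
    without_max = UniformIntegerParameterRange("General/batch_size", 16, 64, step_size=16, include_max_value=False)

    # to_list() is assumed to enumerate the discrete values the range can produce
    print(with_max.to_list())     # expected to include 64 (endpoint kept)
    print(without_max.to_list())  # expected to stop at 48 (endpoint dropped)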

View File

@ -324,6 +324,7 @@ class Dataset(object):
# type: () -> Mapping[str, LinkEntry]
"""
Notice: this call returns an internal representation; do not modify it!
:return: dict with relative file path as key, and LinkEntry as value
"""
return self._dataset_link_entries
@ -643,8 +644,9 @@ class Dataset(object):
If -1 is provided, use a single zip artifact for the entire dataset change-set (old behaviour)
:param max_workers: Number of threads to be spawned when zipping and uploading the files.
If None (default) it will be set to:
- 1: if the upload destination is a cloud provider ('s3', 'gs', 'azure')
- number of logical cores: otherwise
- 1: if the upload destination is a cloud provider ('s3', 'gs', 'azure')
- number of logical cores: otherwise
:param int retries: Number of retries before failing to upload each zip. If 0, the upload is not retried.
:raise: If the upload failed (i.e. at least one zip failed to upload), raise a `ValueError`
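For orientation, a short end-to-end sketch using the ``max_workers`` and ``retries`` arguments documented above; ``Dataset.create`` / ``add_files`` / ``finalize`` are assumed from the wider Dataset API:

.. code-block:: py

    from clearml import Dataset

    ds = Dataset.create(dataset_name="images-v2", dataset_project="datasets")
    ds.add_files(path="/data/images")
    # a single worker is the assumed default for cloud destinations ('s3', 'gs', 'azure')
    ds.upload(max_workers=1, retries=3)
    ds.finalize()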
@ -839,7 +841,7 @@ class Dataset(object):
# type: (Union[numpy.array, pd.DataFrame, Dict[str, Any]], str, bool) -> () # noqa: F821
"""
Attach a user-defined metadata to the dataset. Check `Task.upload_artifact` for supported types.
If type is Optionally make it visible as a table in the UI.
If type is Pandas Dataframes, optionally make it visible as a table in the UI.
"""
if metadata_name.startswith(self.__data_entry_name_prefix):
raise ValueError("metadata_name can not start with '{}'".format(self.__data_entry_name_prefix))
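A brief sketch of attaching a Pandas DataFrame as dataset metadata, per the note above; ``Dataset.get`` and the writable-dataset assumption come from the wider API, while ``metadata_name`` is the keyword checked in the code shown here:

.. code-block:: py

    import pandas as pd
    from clearml import Dataset

    ds = Dataset.get(dataset_id="<dataset-id>")  # assumed: a dataset that is still writable
    labels = pd.DataFrame({"file": ["a.jpg", "b.jpg"], "label": [0, 1]})
    ds.set_metadata(labels, metadata_name="labels")  # shown as a table in the UI for DataFrames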
@ -954,7 +956,7 @@ class Dataset(object):
# type: (Union[Path, _Path, str], bool, Optional[int], Optional[int], bool, Optional[int]) -> Optional[str]
"""
return a base folder with a writable (mutable) local copy of the entire dataset
download and copy / soft-link, files from all the parent dataset versions
download and copy / soft-link, files from all the parent dataset versions
:param target_folder: Target folder for the writable copy
:param overwrite: If True, recursively delete the target folder before creating a copy.
@ -1223,11 +1225,11 @@ class Dataset(object):
:param output_uri: Location to upload the datasets file to, including preview samples.
The following are examples of ``output_uri`` values for the supported locations:
- A shared folder: ``/mnt/share/folder``
- S3: ``s3://bucket/folder``
- Google Cloud Storage: ``gs://bucket-name/folder``
- Azure Storage: ``azure://company.blob.core.windows.net/folder/``
- Default file server: None
- A shared folder: ``/mnt/share/folder``
- S3: ``s3://bucket/folder``
- Google Cloud Storage: ``gs://bucket-name/folder``
- Azure Storage: ``azure://company.blob.core.windows.net/folder/``
- Default file server: None
:param description: Description of the dataset
@ -1786,6 +1788,7 @@ class Dataset(object):
"""
Return a Logger object for the Dataset, allowing users to report statistics metrics
and debug samples on the Dataset itself
:return: Logger object
"""
return self._task.get_logger()
@ -1797,8 +1800,8 @@ class Dataset(object):
(it does not reflect the number of chunks parent versions store)
:param include_parents: If True (default),
return the total number of chunks from this version and all parent versions.
If False, only return the number of chunks we stored on this specific version.
return the total number of chunks from this version and all parent versions.
If False, only return the number of chunks we stored on this specific version.
:return: Number of chunks stored on the dataset.
"""