---
title: Hyperparameter Optimization
---
The [hyper_parameter_optimizer.py ](https://github.com/allegroai/clearml/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py )
example script demonstrates hyperparameter optimization (HPO), which is automated by using ClearML.
## Set the Search Strategy for Optimization
The optimization requires a search strategy, as well as an optimizer class that implements that strategy.
The following search strategies can be used:
* Optuna hyperparameter optimization - [`automation.optuna.OptimizerOptuna` ](../../../references/sdk/hpo_optuna_optuna_optimizeroptuna.md ).
For more information about Optuna, see the [Optuna ](https://optuna.org/ ) documentation.
* BOHB - [`automation.hpbandster.OptimizerBOHB` ](../../../references/sdk/hpo_hpbandster_bandster_optimizerbohb.md ).
BOHB performs robust and efficient hyperparameter optimization at scale by combining the speed of Hyperband searches
with the guidance and guarantees of convergence of Bayesian Optimization.
ClearML implements BOHB for automation with HpBandSter's [bohb.py ](https://github.com/automl/HpBandSter/blob/master/hpbandster/optimizers/bohb.py ).
For more information about HpBandSter BOHB, see the [HpBandSter ](https://automl.github.io/HpBandSter/build/html/index.html )
documentation.
* Random uniform sampling of hyperparameters - [`automation.RandomSearch`](../../../references/sdk/hpo_optimization_randomsearch.md).
* Full grid sampling of every hyperparameter combination - [`automation.GridSearch`](../../../references/sdk/hpo_optimization_gridsearch.md).
* Custom - Use a custom class that inherits from the ClearML automation base strategy class, [`SearchStrategy`](https://github.com/allegroai/clearml/blob/master/clearml/automation/optimization.py#L310).
The search strategy class that is chosen will be passed to the [`automation.HyperParameterOptimizer` ](../../../references/sdk/hpo_optimization_hyperparameteroptimizer.md )
object later.
The example code attempts to import `OptimizerOptuna` for the search strategy. If `clearml.automation.optuna` is not
installed, it attempts to import `OptimizerBOHB`. If `clearml.automation.hpbandster` is not installed, it uses
`RandomSearch` as the search strategy.
```python
# imports used by this snippet (they appear at the top of the full example script)
import logging

from clearml.automation import RandomSearch

try:
    from clearml.automation.optuna import OptimizerOptuna  # noqa
    aSearchStrategy = OptimizerOptuna
except ImportError as ex:
    try:
        from clearml.automation.hpbandster import OptimizerBOHB  # noqa
        aSearchStrategy = OptimizerBOHB
    except ImportError as ex:
        logging.getLogger().warning(
            'Apologies, it seems you do not have \'optuna\' or \'hpbandster\' installed, '
            'we will be using RandomSearch strategy instead')
        aSearchStrategy = RandomSearch
```
## Define a Callback
The optimizer can be provided with a callback, which is called every time one of the cloned experiments completes. In the script,
the `job_complete_callback` function prints the completed job's details and announces when that job is also the best performing one so far (`top_performance_job_id`).
```python
def job_complete_callback(
    job_id,                 # type: str
    objective_value,        # type: float
    objective_iteration,    # type: int
    job_parameters,         # type: dict
    top_performance_job_id  # type: str
):
    print('Job completed!', job_id, objective_value, objective_iteration, job_parameters)
    if job_id == top_performance_job_id:
        print('WOOT WOOT we broke the record! Objective reached {}'.format(objective_value))
```
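
Because the callback is a plain Python function, it can be adapted as needed. The following is a minimal sketch (not part of the example script) of a variant that also keeps a local record of every completed job; the `results_log` list is an illustrative addition:

```python
results_log = []  # illustrative only: local record of every completed optimization job

def job_complete_callback(job_id, objective_value, objective_iteration, job_parameters, top_performance_job_id):
    # store the completed job so the results can be reviewed after the optimization finishes
    results_log.append({'job_id': job_id, 'objective': objective_value, 'parameters': job_parameters})
    if job_id == top_performance_job_id:
        print('New best objective: {}'.format(objective_value))
```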
## Initialize the Optimization Task
Initialize the Task, which will be stored in ClearML Server when the code runs. After the code runs at least once, it
can be [reproduced ](../../../webapp/webapp_exp_reproducing.md ) and [tuned ](../../../webapp/webapp_exp_tuning.md ).
Set the Task type to `optimizer`, and create a new experiment (and Task object) each time the optimizer runs (`reuse_last_task_id=False`).
When the code runs, it creates an experiment named **Automatic Hyper-Parameter Optimization** in
the **Hyper-Parameter Optimization** project, which can be seen in the **ClearML Web UI** .
```python
from clearml import Task

# Connecting CLEARML
task = Task.init(
    project_name='Hyper-Parameter Optimization',
    task_name='Automatic Hyper-Parameter Optimization',
    task_type=Task.TaskTypes.optimizer,
    reuse_last_task_id=False
)
```
## Set Up the Arguments
Create an arguments dictionary that contains the ID of the Task to optimize, and a Boolean indicating whether the
optimizer will run as a service (see [Running as a Service](#running-as-a-service)).
In this example, an experiment named **Keras HP optimization base** is being optimized. The experiment must have run at
least once so that it is stored in ClearML Server and can therefore be cloned.
Since the arguments dictionary is connected to the Task, after the code runs once, the `template_task_id` can be changed
to optimize a different experiment.
```python
# experiment template to optimize in the hyperparameter optimization
args = {
    'template_task_id': None,
    'run_as_service': False,
}
args = task.connect(args)

# Get the template task experiment that we want to optimize
if not args['template_task_id']:
    args['template_task_id'] = Task.get_task(
        project_name='examples', task_name='Keras HP optimization base').id
```
## Creating the Optimizer Object
Initialize an [`automation.HyperParameterOptimizer` ](../../../references/sdk/hpo_optimization_hyperparameteroptimizer.md )
object, setting the following optimization parameters:
* ID of a ClearML task to optimize. This task will be cloned, and each clone will sample a different set of hyperparameter values:
```python
an_optimizer = HyperParameterOptimizer(
# This is the experiment we want to optimize
base_task_id=args['template_task_id'],
```
* Hyperparameter ranges to sample, instantiating them as ClearML automation objects using [`automation.UniformIntegerParameterRange`](../../../references/sdk/hpo_parameters_uniformintegerparameterrange.md)
  and [`automation.DiscreteParameterRange`](../../../references/sdk/hpo_parameters_discreteparameterrange.md) (a continuous float-range variant is sketched after this list):
```python
hyper_parameters=[
UniformIntegerParameterRange('layer_1', min_value=128, max_value=512, step_size=128),
UniformIntegerParameterRange('layer_2', min_value=128, max_value=512, step_size=128),
DiscreteParameterRange('batch_size', values=[96, 128, 160]),
DiscreteParameterRange('epochs', values=[30]),
],
```
* Metric to optimize and the optimization objective:
```python
objective_metric_title='val_acc',
objective_metric_series='val_acc',
objective_metric_sign='max',
```
:::tip Multi-objective Optimization
If you are using the Optuna framework (see [Set the Search Strategy for Optimization ](#set-the-search-strategy-for-optimization )),
you can list multiple optimization objectives. When doing so, make sure the `objective_metric_title`,
`objective_metric_series`, and `objective_metric_sign` lists are
the same length. Each title will be matched to its respective series and sign.
For example, the code below sets two objectives: to minimize the `validation/loss` metric and to maximize the
`validation/accuracy` metric:
```python
objective_metric_title=["validation", "validation"]
objective_metric_series=["loss", "accuracy"]
objective_metric_sign=["min", "max"]
```
:::
* Number of concurrent Tasks:
```python
max_number_of_concurrent_tasks=2,
```
* Optimization strategy (see [Set the search strategy for optimization ](#set-the-search-strategy-for-optimization )):
```python
optimizer_class=aSearchStrategy,
```
* Queue to use for remote execution. This is overridden if the optimizer runs as a service.
```python
execution_queue='1xGPU',
```
* Remaining parameters, including the time limit per Task (minutes), period for checking the optimization (minutes),
maximum number of jobs to launch, minimum and maximum number of iterations for each Task:
```python
# Optional: Limit the execution time of a single experiment, in minutes.
# (this is optional, and if using OptimizerBOHB, it is ignored)
time_limit_per_job=10.,
# Checking the experiments every 6 seconds is way too often; we should probably set it to 5 min,
# assuming a single experiment usually takes hours...
pool_period_min=0.1,
# set the maximum number of jobs to launch for the optimization, default (None) unlimited
# If OptimizerBOHB is used, it defines the maximum budget in terms of full jobs
# basically the cumulative number of iterations will not exceed total_max_jobs * max_iteration_per_job
total_max_jobs=10,
# This is only applicable for OptimizerBOHB and ignored by the rest
# set the minimum number of iterations for an experiment, before early stopping
min_iteration_per_job=10,
# Set the maximum number of iterations for an experiment to execute
# (This is optional, unless using OptimizerBOHB where this is a must)
max_iteration_per_job=30,
) # done creating HyperParameterOptimizer
```
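
The ranges above sample integer and discrete values. For a continuous float hyperparameter, ClearML also provides `automation.UniformParameterRange`, which can be added to the same `hyper_parameters` list. The snippet below is a minimal sketch only; the `learning_rate` parameter name is hypothetical and is not part of the **Keras HP optimization base** experiment:

```python
from clearml.automation import UniformParameterRange

# hypothetical continuous range: sample 'learning_rate' uniformly between 0.0001 and 0.01
learning_rate_range = UniformParameterRange('learning_rate', min_value=0.0001, max_value=0.01)
```

Such a range object would simply be appended to the `hyper_parameters` list passed to `HyperParameterOptimizer`.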
## Running as a Service
To run the optimization as a service, set the `run_as_service` argument to `True`. For more information about
running as a service, see [Services Mode ](../../../clearml_agent/clearml_agent_services_mode.md ).
```python
# if we are running as a service, just enqueue ourselves into the services queue and let it run the optimization
if args['run_as_service']:
    # if this code is executed by `clearml-agent` the function call does nothing.
    # if executed locally, the local process will be terminated, and a remote copy will be executed instead
    task.execute_remotely(queue_name='services', exit_process=True)
```
## Optimize
The optimizer is ready. Set the report period and [start ](../../../references/sdk/hpo_optimization_hyperparameteroptimizer.md#start )
it, providing the callback method to report the best performance:
```python
# report every 12 seconds, this is way too often, but we are testing here
an_optimizer.set_report_period(0.2)
# start the optimization process, callback function to be called every time an experiment is completed
# this function returns immediately
an_optimizer.start(job_complete_callback=job_complete_callback)
```
Now that it is running:
1. Set a time limit for optimization
1. Wait
1. Get the best performance
1. Print the best performance
1. Stop the optimizer.
```python
# set the time limit for the optimization process (1.5 hours)
an_optimizer.set_time_limit(in_minutes=90.0)
# wait until process is done (notice we are controlling the optimization process in the background)
an_optimizer.wait()
# optimization is completed, print the IDs of the top performing experiments
top_exp = an_optimizer.get_top_experiments(top_k=3)
print([t.id for t in top_exp])
# make sure background optimization stopped
an_optimizer.stop()
print('We are done, good bye')
```
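
Once the optimizer has stopped, you may also want to inspect the winning configuration. The snippet below is a minimal sketch (not part of the example script) that reads the parameters of the best returned experiment; `get_top_experiments()` returns Task objects, so the standard `Task.get_parameters()` call can be used on them:

```python
# inspect the best of the top performing experiments returned above
best_task = top_exp[0]
print('Best experiment: {}'.format(best_task.id))
print(best_task.get_parameters())  # dict of the parameters recorded for that experiment
```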