---
title: XGBoost
---

:::tip
If you are not already using ClearML, see [Getting Started](../getting_started/ds/ds_first_steps.md) for setup
instructions.
:::

ClearML integrates seamlessly with [XGBoost](https://xgboost.readthedocs.io/en/stable/), automatically logging its models
and scalars.

All you have to do is add two lines of code to your XGBoost script:

```python
from clearml import Task

task = Task.init(task_name="<task_name>", project_name="<project_name>")
```

And that's it! This creates a [ClearML Task](../fundamentals/task.md) which captures:

* Source code and uncommitted changes
* Installed packages
* XGBoost model files
* Scalars (loss, learning rates)
* Console output
* General details such as machine details, runtime, creation date, etc.
* Hyperparameters created with standard Python packages (e.g. argparse, click, Python Fire, etc.)
* And more

:::tip Logging Plots
ClearML automatically logs plots displayed using Matplotlib. To automatically log XGBoost plots, like tree and
feature importance plots, call `matplotlib.pyplot.show()` after the plot creation method:

```python
import matplotlib.pyplot as plt
import xgboost as xgb
from xgboost import plot_tree

# model training
# ...

xgb.plot_importance(model)
plt.show()

try:
    plot_tree(model)
    plt.show()
except ImportError:
    print('Skipping tree plot: you must install graphviz to plot trees')
```
:::

You can view all the task details in the [WebApp](../webapp/webapp_overview.md).

![Experiment scalars](../img/examples_xgboost_metric_scalars.png)

## Automatic Logging Control

By default, when ClearML is integrated into your XGBoost script, it captures models and
scalars. However, you may want more control over what your experiment logs.

To control a task's framework logging, use the `auto_connect_frameworks` parameter of [`Task.init()`](../references/sdk/task.md#taskinit).
Completely disable all automatic logging by setting the parameter to `False`. For finer-grained control of logged
frameworks, input a dictionary with framework-boolean pairs.

For example:

```python
auto_connect_frameworks={
    'xgboost': False, 'catboost': False, 'tensorflow': False, 'tensorboard': False,
    'pytorch': True, 'scikit': True, 'fastai': True, 'lightgbm': False,
    'hydra': True, 'detect_repository': True, 'tfdefines': True, 'joblib': True,
    'megengine': True
}
```

You can also input wildcards as dictionary values, so ClearML will log a model created by a framework only if its local
path matches at least one wildcard.

For example, in the code below, ClearML will log XGBoost models only if their paths have the `.pt` extension. The
unspecified frameworks' values default to `True`, so all their models are automatically logged.

```python
auto_connect_frameworks={'xgboost': '*.pt'}
```

## Manual Logging

To augment its automatic logging, ClearML also provides an explicit logging interface.

See more information about explicitly logging information to a ClearML Task:

* [Models](../clearml_sdk/model_sdk.md#manually-logging-models)
* [Configuration](../clearml_sdk/task_sdk.md#configuration) (e.g. parameters, configuration files)
* [Artifacts](../clearml_sdk/task_sdk.md#artifacts) (e.g. output files or Python objects created by a task)
* [Scalars](../clearml_sdk/task_sdk.md#scalars)
* [Text/Plots/Debug Samples](../fundamentals/logger.md#manual-reporting)

See the [Explicit Reporting Tutorial](../guides/reporting/explicit_reporting.md).

## Examples

Take a look at ClearML's XGBoost examples. The examples use XGBoost and ClearML in different configurations with
additional tools, like Matplotlib and scikit-learn:

* [XGBoost Metric](../guides/frameworks/xgboost/xgboost_metrics.md) - Demonstrates ClearML automatic logging of XGBoost models and plots
* [XGBoost and scikit-learn](../guides/frameworks/xgboost/xgboost_sample.md) - Demonstrates ClearML automatic logging of XGBoost scalars and models

## Remote Execution

ClearML logs all the information required to reproduce an experiment on a different machine (installed packages,
uncommitted changes, etc.). The [ClearML Agent](../clearml_agent) listens to designated queues and, when a task is enqueued,
the agent pulls it, recreates its execution environment, and runs it, reporting its scalars, plots, etc. to the
experiment manager.

Deploy a ClearML Agent onto any machine (e.g. a cloud VM, a local GPU machine, your own laptop) by simply running the
following command on it:

```commandline
clearml-agent daemon --queue <queues_to_listen_to> [--docker]
```

Use the ClearML [Autoscalers](../cloud_autoscaling/autoscaling_overview.md) to help you manage workloads in the
cloud of your choice (AWS, GCP, Azure) and automatically deploy ClearML agents: the autoscaler automatically spins up
and shuts down instances as needed, according to a resource budget that you set.

### Cloning, Editing, and Enqueuing

![Cloning, editing, enqueuing gif](../img/gif/integrations_yolov5.gif)

Use ClearML's web interface to edit task details, like configuration parameters or input models, then execute the task
with the new configuration on a remote machine:

* Clone the experiment
* Edit the hyperparameters and/or other details
* Enqueue the task

The ClearML Agent executing the task will use the new values to [override any hard-coded values](../clearml_agent).

### Executing a Task Remotely

You can programmatically set a task to be executed remotely by adding [`Task.execute_remotely()`](../references/sdk/task.md#execute_remotely)
to your script. This method stops the current local execution of the task, then enqueues it to a specified queue to
re-run it on a remote machine.

```python
# If executed locally, the process will terminate, and a copy will be executed by an agent instead
task.execute_remotely(queue_name='default', exit_process=True)
```

## Hyperparameter Optimization

Use ClearML's [`HyperParameterOptimizer`](../references/sdk/hpo_optimization_hyperparameteroptimizer.md) class to find
the hyperparameter values that yield the best-performing models. See [Hyperparameter Optimization](../fundamentals/hpo.md)
for more information.