Mirror of https://github.com/clearml/clearml-docs, synced 2025-06-26 18:17:44 +00:00
add advanced to ToC, add image, add more info to execute_remotely.md
@@ -7,13 +7,60 @@ script demonstrates the use of the [`execute_remotely`](../../references/sdk/tas

The script does the following:
* Trains a simple deep neural network on the PyTorch built-in MNIST dataset.
* Uses ClearML automatic logging.
* Uses ClearML's automatic and explicit logging.
* Creates an experiment named `remote_execution pytorch mnist train`, which is associated with the `examples` project.

When the code is executed, the training runs for one epoch to make sure nothing crashes, then the code passes the `execute_remotely` method
which terminates the local execution of the code. Execution will switch to remote execution by the agent listening to
queue specified in the `queue_name` parameter of the method.
## Execution Flow

This feature is especially helpful if you want to run the first epoch locally on your machine to debug and to
make sure code doesn't crash, and then move to a stronger machine for the entire training.
The following describes the code's execution flow:
1. The training runs for one epoch.
1. The code passes the `execute_remotely` method which terminates the local execution of the code.
1. Execution switches to remote execution by the agent listening to queue specified in the `queue_name` parameter of the method.

The `execute_remotely` method is especially helpful when running code on a development machine for a few iterations
to debug and to make sure the code doesn't crash, or setting up an environment. After that, the training can be
moved to be executed by a stronger machine.
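
The flow described above can be sketched roughly as follows. This is an illustrative outline, not the example script itself: `FakeTask` is a hypothetical stand-in for a ClearML `Task` so the snippet runs without a server, `run_epoch` is a placeholder for the training and test steps, and the queue name `default` is an assumption. With the real SDK, the task object would typically come from `Task.init(...)`, and the `execute_remotely` call would end the local process.

```python
class FakeTask:
    """Hypothetical stand-in for a ClearML Task; records the call instead of enqueuing."""

    def __init__(self):
        self.remote_calls = []

    def execute_remotely(self, queue_name=None):
        # The real method enqueues the task and terminates local execution.
        self.remote_calls.append(queue_name)


def run_epoch(epoch):
    """Placeholder for one epoch of training and testing."""
    return f"epoch {epoch} done"


def train(task, epochs, queue_name="default"):
    """Run the first epoch, then request remote execution before the rest."""
    log = []
    for epoch in range(1, epochs + 1):
        log.append(run_epoch(epoch))
        if epoch == 1:
            # Locally, the real call would stop the run here; the agent
            # listening to `queue_name` then executes the whole script.
            task.execute_remotely(queue_name=queue_name)
    return log


task = FakeTask()
history = train(task, epochs=3)
print(task.remote_calls, history)
```

Because the stand-in only records the request, the loop continues past the first epoch here, which resembles what the agent-side run would do once the remote-execution request no longer applies.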

## Scalars

In the example script's `train` function, the following code explicitly reports scalars to **ClearML**:

```python
Logger.current_logger().report_scalar(
    "train", "loss", iteration=(epoch * len(train_loader) + batch_idx), value=loss.item())
```
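
The `iteration` argument in the call above folds the epoch and the batch index into a single step counter. A minimal sketch of that arithmetic, using a hypothetical helper `global_iteration` and made-up sizes (neither appears in the example script):

```python
def global_iteration(epoch, batches_per_epoch, batch_idx):
    """Mirror the expression epoch * len(train_loader) + batch_idx."""
    return epoch * batches_per_epoch + batch_idx


# Assumed values for illustration: 100 batches per epoch.
first = global_iteration(epoch=1, batches_per_epoch=100, batch_idx=0)
later = global_iteration(epoch=2, batches_per_epoch=100, batch_idx=37)
print(first, later)  # 100 237
```

Since `batch_idx` stays below the number of batches per epoch, each epoch's values occupy their own increasing range on the plot's x-axis.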

In the `test` method, the code explicitly reports `loss` and `accuracy` scalars.

```python
Logger.current_logger().report_scalar(
    "test", "loss", iteration=epoch, value=test_loss)
Logger.current_logger().report_scalar(
    "test", "accuracy", iteration=epoch, value=(correct / len(test_loader.dataset)))
```

These scalars can be visualized in plots, which appear in the ClearML web UI, in the experiment's
page **>** **RESULTS** **>** **SCALARS**.



## Hyperparameters

ClearML automatically logs command line options defined with `argparse`. They appear in **CONFIGURATIONS** **>** **HYPER PARAMETERS** **>** **Args**.
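
As a rough illustration of the kind of options that get picked up, the sketch below defines a few of them with `argparse`. The option names and defaults are assumptions chosen for this example, not necessarily those of the script. Task initialization is left out so the snippet runs standalone; the automatic logging is assumed to depend on a ClearML task having been initialized in the same process.

```python
import argparse


def build_parser():
    """Build a parser with a few illustrative training options."""
    parser = argparse.ArgumentParser(description="Illustrative training options")
    parser.add_argument("--batch-size", type=int, default=64)
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--lr", type=float, default=0.01)
    return parser


# Parse an explicit list so the sketch does not depend on the real command line.
args = build_parser().parse_args(["--epochs", "3"])
print(vars(args))
```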



## Console

Text printed to the console for training progress, as well as all other console output, appear in **RESULTS** **>** **CONSOLE**.



## Artifacts

Model artifacts associated with the experiment appear in the info panel of the **EXPERIMENTS** tab and in
the info panel of the **MODELS** tab.



BIN docs/img/examples_remote_execution_artifacts.png (normal file)
Binary file not shown. Size after this commit: 40 KiB
@@ -57,6 +57,7 @@ module.exports = {
    ],
    guidesSidebar: [
        'guides/guidemain',
        {'Advanced': ['guides/advanced/execute_remotely', 'guides/advanced/multiple_tasks_single_process']},
        {'Automation': ['guides/automation/manual_random_param_search_example', 'guides/automation/task_piping']},
        {'Data Management': ['guides/data management/data_man_simple', 'guides/data management/data_man_folder_sync', 'guides/data management/data_man_cifar_classification']},
        {'ClearML Task': ['guides/clearml-task/clearml_task_tutorial']},