Small edits (#595)

pollfly 2023-06-15 11:22:50 +03:00 committed by GitHub
parent c256f46993
commit fdffc9c271
29 changed files with 62 additions and 62 deletions

View File

@@ -138,7 +138,7 @@ clearml-agent execute [-h] --id TASK_ID [--log-file LOG_FILE] [--disable-monitor
|`--log-file`| The file to which Task execution output (stdout / stderr) is written.|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
|`--log-level`| SDK log level. The values are:<ul><li>`DEBUG`</li><li>`INFO`</li><li>`WARN`</li><li>`WARNING`</li><li>`ERROR`</li><li>`CRITICAL`</li></ul>|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
|`-O`| Compile optimized pyc code (see the Python documentation). Repeat for more optimization.|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
|`--require-queue`| If the specified task is not queued, the execution will fail. (Used for 3rd party scheduler integration, e.g. K8s, SLURM, etc.)|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
|`--standalone-mode`| Do not use any network connections; assume everything is pre-installed.|<img src="/docs/latest/icons/ico-optional-yes.svg" alt="Yes" className="icon size-md center-md" />|
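For example, a minimal invocation might look like this (the task ID and log path are placeholders; the flags are from the synopsis above):

```console
clearml-agent execute --id aa11bb22cc33dd44 --log-file /tmp/task_execution.log
```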
## list

View File

@@ -156,7 +156,7 @@ dataset = Dataset.create(dataset_name="my dataset", dataset_project="example pro
dataset.add_files(path="path/to/folder_or_file")
```
You can add a set of files based on wildcard matching of a single string or a list of strings, using the
`wildcard` parameter. Specify whether to match the wildcard files recursively using the `recursive` parameter.
For example:
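A minimal sketch (the folder path and wildcard pattern here are placeholders):

```python
from clearml import Dataset

dataset = Dataset.create(dataset_name="my dataset", dataset_project="example project")

# Add only CSV files from the folder, matching the wildcard recursively through sub-folders
dataset.add_files(path="path/to/folder", wildcard="*.csv", recursive=True)
```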
@@ -207,7 +207,7 @@ To remove files from a current dataset, use the [`Dataset.remove_files`](../refe
Input the path to the folder or file to be removed in the `dataset_path` parameter. The path is relative to the dataset.
To remove links, specify their URL (e.g. `s3://bucket/file`).
You can also input a wildcard into `dataset_path` in order to remove a set of files matching the wildcard.
Set the `recursive` parameter to `True` in order to match all wildcard files recursively.
For example:
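A hedged sketch, reusing the `dataset` object from above (the wildcard pattern is a placeholder):

```python
# Remove every JSON file in the dataset, matching the wildcard recursively
dataset.remove_files(dataset_path="*.json", recursive=True)
```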

View File

@@ -927,10 +927,10 @@ This call is not cached. If the Task has many reported scalars, it might take a
#### Get Single Value Scalars
To get the value of a reported single-value scalar, use [`Task.get_reported_single_value()`](../references/sdk/task.md#get_reported_single_value)
and specify the scalar's `name`.
To get all reported single-value scalars, use [`Task.get_reported_single_values()`](../references/sdk/task.md#get_reported_single_values),
which returns a dictionary of scalar name and value pairs:
```console

View File

@@ -95,7 +95,7 @@ Go to a specific app's documentation page to view all configuration options
* [GCP Autoscaler](../webapp/applications/apps_gcp_autoscaler.md)
## Kubernetes
You can install `clearml-agent` through a Helm chart.
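As a rough sketch of the install flow (the Helm repository URL and chart name below are assumptions based on the `clearml-helm-charts` project, not stated on this page; verify them against the current charts):

```console
helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
helm repo update
helm install clearml-agent allegroai/clearml-agent
```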
The ClearML Agent deployment is set to service one or more queues. When tasks are added to the queues, the agent pulls the task
and creates a pod to execute the task. Kubernetes handles resource management. Your task pod will remain pending until

View File

@@ -61,7 +61,7 @@ help maintainers reproduce the problem:
a [gist](https://gist.github.com) (and provide a link to that gist).
* **Describe the behavior you observed after following the steps** and the exact problem with that behavior.
* **Explain which behavior you expected to see and why.**
* **For WebApp (UI) issues, please include screenshots and animated GIFs** that recreate the described steps and clearly demonstrate
the problem. You can use [LICEcap](https://www.cockos.com/licecap) to record GIFs on macOS and Windows, and [silentcast](https://github.com/colinkeenan/silentcast)
or [byzanz](https://github.com/threedaymonk/byzanz) on Linux.
@@ -85,9 +85,9 @@ Enhancement suggestions are tracked as GitHub issues. After you determine which
Before you submit a new PR:
* Verify that the work you plan to merge addresses an existing [issue](https://github.com/allegroai/clearml/issues) (if not, open a new one)
* Check related discussions in the [ClearML Slack community](https://joinslack.clear.ml)
(or start your own discussion on the ``#clearml-dev`` channel)
* Make sure your code conforms to the ClearML coding standards by running:

    flake8 --max-line-length=120 --statistics --show-source --extend-ignore=E501 ./clearml*

View File

@@ -45,7 +45,7 @@ Once deployed, ClearML Server exposes the following services:
1. Go to the AWS EC2 Console.
1. In the **Details** tab, **Public DNS (IPv4)** shows the ClearML Server address.
**To access ClearML Server WebApp (UI):**
* Direct your browser to its web server URL: `http://<Server Address>:8080`

View File

@@ -91,12 +91,12 @@ title: FAQ
**ClearML Server Troubleshooting**
* [I did a reinstall. Why can't I create credentials in the WebApp (UI)?](#clearml-server-reinstall-cookies)
* [How do I fix Docker upgrade errors?](#common-docker-upgrade-errors)
* [Why is web login authentication not working?](#port-conflict)
* [How do I bypass a proxy configuration to access my local ClearML Server?](#proxy-localhost)
* [Trains is failing to update ClearML Server. I get an error 500 (or 400). How do I fix this?](#elastic_watermark)
* [Why is my Trains WebApp (UI) not showing any data?](#web-ui-empty)
* [Why can't I access my ClearML Server when I run my code in a virtual machine?](#vm_server)
**ClearML Agent**
@@ -321,7 +321,7 @@ task = Task.init(project_name, task_name, Task.TaskTypes.testing)
**Sometimes I see experiments as running when in fact they are not. What's going on?** <a id="experiment-running-but-stopped"></a>
ClearML monitors your Python process. When the process exits properly, ClearML closes the experiment. When the process crashes and terminates abnormally, it sometimes misses the stop signal. In this case, you can safely right-click the experiment in the WebApp and abort it.
<br/>
@@ -919,7 +919,7 @@ on the "Configuring Your Own ClearML Server" page.
**Can I add web login authentication to ClearML Server?** <a id="web-auth"></a>
By default, anyone can log in to the ClearML Server WebApp. You can configure the ClearML Server to allow only a specific set of users to access the system.
For detailed instructions, see [Web Login Authentication](deploying_clearml/clearml_server_config.md#web-login-authentication)
on the "Configuring Your Own ClearML Server" page in the "Deploying ClearML" section.
@@ -940,7 +940,7 @@ For detailed instructions, see [Modifying non-responsive Task watchdog settings]
## ClearML Server Troubleshooting
**I did a reinstall. Why can't I create credentials in the WebApp (UI)?** <a id="clearml-server-reinstall-cookies"></a>
The issue is likely your browser cookies for ClearML Server. Clearing them is recommended.
For example:
@@ -1089,9 +1089,9 @@ A likely indication of this situation can be determined by searching your clearm
<br/>
**Why is my ClearML WebApp (UI) not showing any data?** <a className="tr_top_negative" id="web-ui-empty"></a>
If your ClearML WebApp (UI) does not show anything, it may be an error authenticating with the server. Try clearing the application cookies for the site in your browser's developer tools.
**Why can't I access my ClearML Server when I run my code in a virtual machine?** <a id="vm_server"></a>

View File

@@ -7,7 +7,7 @@ title: Tasks
A Task is a single code execution session, which can represent an experiment, a step in a workflow, a workflow controller,
or any custom implementation you choose.
To transform an existing script into a **ClearML Task**, call the [`Task.init()`](../references/sdk/task.md#taskinit) method
and specify a task name and its project. This creates a Task object that automatically captures code execution
information as well as execution outputs.
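A minimal sketch (the project and task names are placeholders):

```python
from clearml import Task

# Creates a Task and starts automatic capture of code, outputs, and environment
task = Task.init(project_name="examples", task_name="my experiment")
```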

View File

@@ -84,7 +84,7 @@ preprocess_task = Task.get_task(task_id='preprocessing_task_id')
local_csv = preprocess_task.artifacts['data'].get_local_copy()
```
`task.artifacts` is a dictionary where the keys are the artifact names and the values are the artifact objects.
Calling `get_local_copy()` returns a local cached copy of the artifact. Therefore, next time we execute the code, we don't
need to download the artifact again.
Calling `get()` gets a deserialized pickled object.
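For instance, a short sketch continuing the snippet above:

```python
# Retrieve the artifact's deserialized Python object directly, instead of a local file path
data = preprocess_task.artifacts['data'].get()
```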

View File

@@ -30,11 +30,11 @@ So let's start with the inputs: hyperparameters. Hyperparameters are the confi
Let's take this simple code as an example. First of all, we start the script with the 2 magic lines of code that we covered before. Next to that, we have a mix of command line arguments and some additional parameters in a dictionary here.
The command line arguments will be captured automatically, and for the dict (or really any Python object) we can use the `Task.connect()` function to report our dict values as ClearML hyperparameters.
As you can see, when we run the script, all hyperparameters are captured and parsed by the server, giving you a clean overview in the UI.
Configuration objects, however, work slightly differently and are mostly used for more complex configurations, like a nested dict or a YAML file, for example. They're logged using the `Task.connect_configuration()` function instead, which saves the configuration as a whole, without parsing it.
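As a rough sketch, the code in question might look like this (the parameter values and config path are hypothetical):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="hyperparameters demo")

# A dict of additional parameters, reported as individual ClearML hyperparameters
params = {"batch_size": 64, "learning_rate": 0.001}
params = task.connect(params)

# A more complex configuration, saved as a whole without parsing
config = task.connect_configuration("path/to/config.yaml")
```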
We have now logged our task with all of its inputs, but if we wanted to, we could rerun our code with different parameters and this is where the magic happens.

View File

@@ -119,7 +119,7 @@ abort task 0. So one way of doing that would be to go to the current experiment,
ClearML will actually bring me to the original experiment view, the experiment manager, remember everything is
integrated here. The experiment manager of that example task. So what I can do here if I look at the console, I have a
bunch of output here. I can actually abort it as well. And if I abort it, what will happen is this task will stop
executing. Essentially, it will send a `ctrl c`, so a quit command or a terminate command, to the original task on the
remote machine. So the remote machine will say okay, I'm done here. I will just quit it right here. If, for example,
your model is not performing very well, or you see like oh, something is definitely wrong here, you can always just
abort it. And the cool thing is if we go back to the **Workers and Queues**, we'll see that the `Beast 0` has given up working
@@ -212,13 +212,13 @@ you would have made yourself, and now you want to get it into the queue. Now one
you could do a `Task.init` which essentially tracks the run of your code as an experiment in the experiment manager, and
then you could go and clone the experiment and then enqueue it. This is something that we saw in the Getting Started videos before.
Now, another way of doing this is to actually use what you can see here, which is `Task.execute_remotely()`. What this line
specifically will do, is when you run the file right here. Let me just do that real quick. So if we do
`python setup/example_task_CPU.py` what will happen is ClearML will do the `Task.init` like it would always do, but then
it would encounter the `Task.execute_remotely()` and what that will tell ClearML is say okay, take all of this code, take
all of the packages that are installed, take all of the things that you would normally take as part of the experiment
manager, but stop executing right here and then send the rest, send everything through to a ClearML agent or to the queue
so that a ClearML agent can start working on it. So one way of doing this is to add a `Task.execute_remotely()` just all
the way at the top and then once you run it, you will see here `clearml WARNING - Terminating local execution process`,
and so if we're seeing here if we're going to take a look we can see that Model Training currently running, and if we go
and take a look at our queues here, we have `any-remote-machine` running Model Training right here. And if we go and
@@ -246,7 +246,7 @@ our Model Training GPU. But remember again that we also have the autoscaler. So
autoscaler, you'll see here that we indeed have one task in the GPU queue. And we also see that the `GPU_machines`
Running Instances is one as well. So we can follow along with the logs here. And it actually detected that there is a
task in a GPU queue, and it's now spinning up a new machine, a new GPU machine to be running that specific task, and then
it will shut that back down again when it's done. So this is just one example of how you can use `Task.execute_remotely()`
to very efficiently get your tasks into the queue. Actually, it could also be the first time. So if you don't want to
use the experiment manager for example, you don't actually have to use a task that is already in the system, you can
just say it does not execute remotely, and it will just put it into the system for you and immediately launch it remotely.
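A minimal sketch of that pattern (the project, task, and queue names are placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="model training")

# Stop executing locally right here; enqueue everything for a ClearML agent to pick up
task.execute_remotely(queue_name="default")
```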

View File

@@ -146,7 +146,7 @@ there and open it up, we first get the status of the task, just to be sure. Reme
something else might have happened in the meantime. If the status is not `completed`, we want to say this is the
status, it isn't completed, this should not happen. If it is completed, we are going to create a table with these
functions that I won't go deeper into. Basically, they format the dictionary of the state of the task scalars into
markdown that we can actually use. Let me just go into this though one quick time. So we can basically do `Task.get_last_scalar_metrics()`,
and this function is built into ClearML, which basically gives you a dictionary with all the metrics on your task.
We'll just get that formatted into a table, make it into a pandas DataFrame, and then tabulate it with this cool package
that turns it into Markdown. So now that we have the Markdown table, we then want to return the results table. You can

View File

@@ -30,7 +30,7 @@ Yeah, yeah we can, it's called hyperparameter optimization. And we can do all of
If you don't know what hyperparameter optimization is yet, you can find a link to our blog post on the topic in the description below. But in its most basic form, hyperparameter optimization tries to optimize a certain output by changing a set of inputs.
Let's say we've been working on this model here, and we were tracking our experiments with it anyway. We can see we have some hyperparameters to work with in the **Hyperparameters** tab of the web UI. They are logged by using the `Task.connect` function in our code. These are our inputs. We also have a scalar called `validation/epoch_accuracy` that we want to get as high as possible. This is our output. We could also choose to minimize the `epoch_loss`, for example; that is something you can decide yourself.
We can see that no code was used to log the scalar. It's done automatically because we are using TensorBoard.
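To make that concrete, here's a rough sketch of such an optimization using ClearML's `HyperParameterOptimizer` (the base task ID, queue name, and hyperparameter name are placeholders, not from this video; it maximizes the `validation/epoch_accuracy` scalar mentioned above):

```python
from clearml.automation import HyperParameterOptimizer, UniformParameterRange, RandomSearch

optimizer = HyperParameterOptimizer(
    base_task_id="your_base_task_id",  # the tracked experiment to optimize (placeholder)
    hyper_parameters=[
        # hypothetical input: a connected hyperparameter named "General/learning_rate"
        UniformParameterRange("General/learning_rate", min_value=0.0001, max_value=0.1),
    ],
    objective_metric_title="validation",       # the output scalar to optimize
    objective_metric_series="epoch_accuracy",
    objective_metric_sign="max",               # we want it as high as possible
    optimizer_class=RandomSearch,
    execution_queue="default",                 # queue the trials are enqueued to (placeholder)
)
optimizer.start()
optimizer.wait()   # block until the optimization finishes
optimizer.stop()
```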

View File

@@ -103,7 +103,7 @@ In `slack_alerts.py`, the class `SlackMonitor` inherits from the `Monitor` class
* Builds the Slack message, which includes the most recent output to the console (retrieved by calling [`Task.get_reported_console_output`](../../references/sdk/task.md#get_reported_console_output)),
and the URL of the Task's output log in the ClearML Web UI (retrieved by calling [`Task.get_output_log_web_page`](../../references/sdk/task.md#get_output_log_web_page)).
You can run the example remotely by calling the [`Task.execute_remotely`](../../references/sdk/task.md#execute_remotely)
method.
To interface with Slack, the example uses `slack_sdk.WebClient` and `slack_sdk.errors.SlackApiError`.
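As a hedged illustration of that interface (the token and channel are placeholders, and error handling is minimal):

```python
from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError

client = WebClient(token="xoxb-your-bot-token")
try:
    # Post an alert message to a channel (placeholder channel name)
    client.chat_postMessage(channel="#clearml-alerts", text="Task completed")
except SlackApiError as e:
    print(f"Slack API error: {e.response['error']}")
```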

View File

@@ -22,7 +22,7 @@ In the `examples/frameworks/pytorch` directory, run the experiment script:
Clone the experiment to create an editable copy for tuning.
1. In the ClearML WebApp (UI), on the Projects page, click the `examples` project card.
1. In the experiments table, right-click the experiment `pytorch mnist train`.
@@ -82,7 +82,7 @@ Run the worker daemon on the local development machine.
Enqueue the tuned experiment.
1. In the ClearML WebApp > experiments table, right-click the experiment `Clone Of pytorch mnist train`.
1. In the context menu, click **Enqueue**.
@@ -95,7 +95,7 @@ Enqueue the tuned experiment.
## Step 6: Compare the Experiments
To compare the original and tuned experiments:
1. In the ClearML WebApp (UI), on the Projects page, click the `examples` project.
1. In the experiments table, select the checkboxes for the two experiments: `pytorch mnist train` and `Clone Of pytorch mnist train`.
1. On the menu bar at the bottom of the experiments table, click **COMPARE**. The experiment comparison window appears.
All differences appear with a different background color to highlight them.

View File

@@ -10,7 +10,7 @@ Hyper-Datasets are supported by the `allegroai` python package.
### Connecting Dataviews to a Task
Use [`Task.connect()`](../references/sdk/task.md#connect) to connect a Dataview object to a Task:
```python
from allegroai import DataView, Task

View File

@@ -102,7 +102,7 @@ def step_one(pickle_data_url: str, extra: int = 43):
instead of rerunning the step.
* `packages` - A list of required packages or a local requirements.txt file. Example: `["tqdm>=2.1", "scikit-learn"]` or
  `"./requirements.txt"`. If not provided, packages are automatically added based on the imports used inside the function.
* `execution_queue` (optional) - Queue in which to enqueue the specific step. This overrides the queue set with the
  [`PipelineDecorator.set_default_execution_queue`](../references/sdk/automation_controller_pipelinecontroller.md#pipelinedecoratorset_default_execution_queue)
  method.
* `continue_on_fail` - If `True`, a failed step does not cause the pipeline to stop (or be marked as failed). Note that
@@ -115,11 +115,11 @@ def step_one(pickle_data_url: str, extra: int = 43):
* Examples:
  * remote url: `"https://github.com/user/repo.git"`
  * local repo copy: `"./repo"` -> will automatically store the remote repo url and commit ID based on the locally cloned copy
* `repo_branch` (optional) - Specify the remote repository branch (ignored if a local repo path is used)
* `repo_commit` (optional) - Specify the repository commit ID (ignored if a local repo path is used)
* `helper_functions` (optional) - A list of helper functions to make available for the standalone pipeline step. By default, the pipeline step function has no access to any of the other functions; by specifying additional functions here, the remote pipeline step can call them.
  For example, assuming you have two functions, `parse_data()` and `load_data()`: `[parse_data, load_data]`
* `parents` (optional) - A list of parent steps in the pipeline. The current step in the pipeline will be sent for execution only after all the parent steps have been executed successfully.
* `retry_on_failure` - Number of times to retry a step in case of failure. You can also input a callable function in the
  following format:
@@ -153,12 +153,12 @@ def step_one(pickle_data_url: str, extra: int = 43):
Additionally, you can enable automatic logging of a step's metrics / artifacts / models to the pipeline task using the
following arguments:
* `monitor_metrics` (optional) - Automatically log the step's reported metrics also on the pipeline Task. The expected
  format is one of the following:
  * A list of metric (title, series) pairs to log: `[(step_metric_title, step_metric_series), ]`. Example: `[('test', 'accuracy'), ]`
  * A list of tuple pairs, to specify a different target metric to use on the pipeline Task: `[((step_metric_title, step_metric_series), (target_metric_title, target_metric_series)), ]`.
    Example: `[[('test', 'accuracy'), ('model', 'accuracy')], ]`
* `monitor_artifacts` (optional) - Automatically log the step's artifacts on the pipeline Task.
  * Provided a list of artifact names created by the step function, these artifacts will be logged automatically also on the Pipeline Task
    itself. Example: `['processed_data', ]` (the target artifact name on the Pipeline Task will have the same name as the original
@@ -166,7 +166,7 @@ following arguments:
  * Alternatively, provide a list of pairs (source_artifact_name, target_artifact_name), where the first string is the
    artifact name as it appears on the component Task, and the second is the target artifact name to put on the Pipeline
    Task. Example: `[('processed_data', 'final_processed_data'), ]`
* `monitor_models` (optional) - Automatically log the step's output models on the pipeline Task.
  * Provided a list of model names created by the step's Task, they will also appear on the Pipeline itself. Example: `['model_weights', ]`
  * To select the latest (lexicographic) model, use `model_*`, or the last created model with just `*`. Example: `['model_weights_*', ]`
  * Alternatively, provide a list of pairs (source_model_name, target_model_name), where the first string is the model

View File

@@ -2,7 +2,7 @@
title: Comparing Experiments
---
It is always useful to investigate what causes an experiment to succeed.
The ClearML Web UI provides experiment comparison features, allowing you to locate, visualize, and analyze differences, including:
* [Details](#details)
  - Artifacts - Input model, output model, and model design.
@@ -123,7 +123,7 @@ Visualize the comparison of scalars, which includes metrics and monitored resour
### Compare Scalar Series
Compare scalar series in plots and analyze differences using plot tools.
**To compare scalar series:**

View File

@@ -7,7 +7,7 @@ The **ClearML Web UI** is the graphical user interface for the ClearML platform,
* Browsing
* Resource utilization monitoring
* Profile management
* Direct access to the ClearML community (Slack channel, YouTube, and GitHub).
![WebApp screenshots gif](../img/gif/webapp_screenshots.gif)

View File

@@ -24,7 +24,7 @@ be sent to the experiment's page.
Every project has a `description` field. The UI provides a Markdown editor to edit this field.
In the Markdown document, you can write and share reports and add links to ClearML experiments
or any network resource, such as an issue tracker, web repository, etc.
### Editing the Description