Small edits (#595)

This commit is contained in:
pollfly
2023-06-15 11:22:50 +03:00
committed by GitHub
parent c256f46993
commit fdffc9c271
29 changed files with 62 additions and 62 deletions

View File

@@ -84,7 +84,7 @@ preprocess_task = Task.get_task(task_id='preprocessing_task_id')
local_csv = preprocess_task.artifacts['data'].get_local_copy()
```
- The `task.artifacts` is a dictionary where the keys are the artifact names, and the returned object is the artifact object.
+ `task.artifacts` is a dictionary where the keys are the artifact names, and the returned object is the artifact object.
Calling `get_local_copy()` returns a local cached copy of the artifact. Therefore, next time we execute the code, we don't
need to download the artifact again.
Calling `get()` gets a deserialized pickled object.
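For reference, a minimal sketch of both retrieval patterns described above, reusing the placeholder task ID and the `'data'` artifact name from the snippet:
```
from clearml import Task

# Fetch the task that produced the artifact (placeholder task ID from the example above)
preprocess_task = Task.get_task(task_id='preprocessing_task_id')

# get_local_copy() downloads the artifact once and returns a cached local file path
local_csv = preprocess_task.artifacts['data'].get_local_copy()

# get() returns the deserialized object itself (e.g. an unpickled Python object)
data = preprocess_task.artifacts['data'].get()
```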

View File

@@ -30,11 +30,11 @@ So lets start with the inputs: hyperparameters. Hyperparameters are the confi
Let's take this simple code as an example. First of all, we start the script with the 2 magic lines of code that we covered before. Next to that, we have a mix of command line arguments and some additional parameters in a dictionary here.
- The command line arguments will be captured automatically, and for the dict (or really any Python object) we can use the `task.connect()` function to report our dict values as ClearML hyperparameters.
+ The command line arguments will be captured automatically, and for the dict (or really any Python object) we can use the `Task.connect()` function to report our dict values as ClearML hyperparameters.
As you can see, when we run the script, all hyperparameters are captured and parsed by the server, giving you a clean overview in the UI.
- Configuration objects, however, work slightly differently and are mostly used for more complex configurations, like a nested dict or a yaml file for example. They're logged by using the `task.connect_configuration()` function instead and will save the configuration as a whole, without parsing it.
+ Configuration objects, however, work slightly differently and are mostly used for more complex configurations, like a nested dict or a yaml file for example. They're logged by using the `Task.connect_configuration()` function instead and will save the configuration as a whole, without parsing it.
We have now logged our task with all of its inputs, but if we wanted to, we could rerun our code with different parameters and this is where the magic happens.
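As a rough sketch of how these pieces fit together (the project, task, and parameter names are made up for illustration):
```
import argparse
from clearml import Task

# The "two magic lines": import ClearML and initialize the task
task = Task.init(project_name='examples', task_name='logging inputs')

# Command line arguments are captured automatically
parser = argparse.ArgumentParser()
parser.add_argument('--epochs', type=int, default=10)
args = parser.parse_args()

# A dict (or really any Python object) is reported as hyperparameters,
# parsed into individual values in the UI
params = {'batch_size': 32, 'learning_rate': 1e-3}
params = task.connect(params)

# A more complex, nested configuration is saved as a whole, without parsing
model_config = {'backbone': {'layers': 50, 'pretrained': True}}
model_config = task.connect_configuration(model_config, name='model config')
```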

View File

@@ -188,7 +188,7 @@ your machine usage and your GPU usage and stuff like that, and then the learning
give you a really, really quick overview of the most important metrics that you're trying to solve. And keep in mind
this F1 score because this is the thing that we're trying to optimize here.
- Then plots. I can, for example, plot a confusion matrix every X iterations. So in this case ,for example, after a few
+ Then plots. I can, for example, plot a confusion matrix every X iterations. So in this case, for example, after a few
iterations, I plot the confusion matrix again just so I can see over time how well the model starts performing. So as
you can see here, a perfect confusion matrix will be a diagonal line because every true label will be combined with the
exact same predicted label. And in this case, it's horribly wrong. But then over time it starts getting closer and
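For reference, reporting a scalar and a confusion matrix every few iterations could look roughly like this; the metric names and values are invented, and the exact `report_confusion_matrix` arguments may vary slightly between SDK versions:
```
import numpy as np
from clearml import Task

task = Task.init(project_name='examples', task_name='reporting sketch')
logger = task.get_logger()

for iteration in range(0, 100, 10):
    # Headline metric, e.g. the F1 score mentioned above (dummy value here)
    logger.report_scalar(title='validation', series='f1_score',
                         value=0.5 + iteration / 250, iteration=iteration)

    # Plot a confusion matrix every X iterations to watch it converge
    matrix = np.random.randint(0, 10, size=(5, 5))
    logger.report_confusion_matrix(title='confusion matrix', series='validation',
                                   matrix=matrix, iteration=iteration)
```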

View File

@@ -119,7 +119,7 @@ abort task 0. So one way of doing that would be to go to the current experiment,
ClearML will actually bring me to the original experiment view, the experiment manager, remember everything is
integrated here. The experiment manager of that example task. So what I can do here if I look at the console, I have a
bunch of output here. I can actually abort it as well. And if I abort it, what will happen is this task will stop
- executing. Essentially, it will send a `ctrl c`, so a quit command or a terminate command, to the original task on the \
+ executing. Essentially, it will send a `ctrl c`, so a quit command or a terminate command, to the original task on the
remote machine. So the remote machine will say okay, I'm done here. I will just quit it right here. If, for example,
your model is not performing very well, or you see like oh, something is definitely wrong here, you can always just
abort it. And the cool thing is if we go back to the **Workers and Queues**, we'll see that the `Beast 0` has given up working
@@ -212,13 +212,13 @@ you would have made yourself, and now you want to get it into the queue. Now one
you could do a `Task.init` which essentially tracks the run of your code as an experiment in the experiment manager, and
then you could go and clone the experiment and then enqueue it. This is something that we saw in the Getting Started videos before.
- Now, another way of doing this is to actually use what you can see here, which is `task.execute_remotely`. What this line
+ Now, another way of doing this is to actually use what you can see here, which is `Task.execute_remotely()`. What this line
specifically will do, is when you run the file right here. Let me just do that real quick. So if we do
`python setup/example_task_CPU.py` what will happen is ClearML will do the `Task.init` like it would always do, but then
- it would encounter the `task.execute_remotely` and what that will tell ClearML is say okay, take all of this code, take
+ it would encounter the `Task.execute_remotely()` and what that will tell ClearML is say okay, take all of this code, take
all of the packages that are installed, take all of the things that you would normally take as part of the experiment
manager, but stop executing right here and then send the rest, send everything through to a ClearML agent or to the queue
- so that a ClearML agent can start working on it. So one way of doing this is to add a `task.execute_remotely` just all
+ so that a ClearML agent can start working on it. So one way of doing this is to add a `Task.execute_remotely()` just all
the way at the top and then once you run it, you will see here `clearml WARNING - Terminating local execution process`,
and so if we go and take a look here, we can see that Model Training is currently running, and if we go
and take a look at our queues here, we have `any-remote-machine` running Model Training right here. And if we go and
@@ -246,7 +246,7 @@ our Model Training GPU. But remember again that we also have the autoscaler. So
autoscaler, you'll see here that we indeed have one task in the GPU queue. And we also see that the `GPU_machines`
Running Instances is one as well. So we can follow along with the logs here. And it actually detected that there is a
task in a GPU queue, and it's now spinning up a new machine, a new GPU machine to be running that specific task, and then
- it will shut that back down again when it's done. So this is just one example of how you can use `task.execute_remotely`
+ it will shut that back down again when it's done. So this is just one example of how you can use `Task.execute_remotely()`
to very efficiently get your tasks into the queue. Actually, it could also be the very first run. So if you don't want to
use the experiment manager first, for example, you don't actually have to use a task that is already in the system; you can
just call `Task.execute_remotely()` directly, and it will put the task into the system for you and immediately launch it remotely.
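A minimal sketch of the pattern described in this section; the project, task, and queue names are placeholders:
```
from clearml import Task

# Track the run as an experiment, exactly as in a normal local run
task = Task.init(project_name='examples', task_name='Model Training')

# Stop executing locally right here and send the code, installed packages,
# and configuration to a queue so a ClearML agent can pick it up
task.execute_remotely(queue_name='default')

# Everything below this line only runs on the remote machine
print("Training starts here, on the remote machine")
```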

View File

@@ -146,7 +146,7 @@ there and open it up, we first get the status of the task, just to be sure. Reme
something else might have happened in the meantime. If the status is not `completed`, we want to say this is the
status, it isn't completed, this should not happen. But if it is completed, we are going to create a table with these
functions that I won't go deeper into. Basically, they format the dictionary of the state of the task scalars into
- markdown that we can actually use. Let me just go into this though one quick time. So we can basically do `task.get_last_scalar_metrics`,
+ markdown that we can actually use. Let me just go into this though one quick time. So we can basically do `Task.get_last_scalar_metrics()`,
and this function is built into ClearML, which basically gives you a dictionary with all the metrics on your task.
We'll just get that formatted into a table, make it into a pandas DataFrame, and then tabulate it with this cool package
that turns it into Markdown. So now that we have the table in Markdown, we then want to return the results table. You can
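Sketched out, the flow described here could look something like the following; the task ID is a placeholder, and the exact structure of the dictionary returned by `get_last_scalar_metrics()` should be checked against your SDK version:
```
import pandas as pd
from clearml import Task
from tabulate import tabulate

task = Task.get_task(task_id='completed_task_id')  # placeholder ID

status = task.get_status()
if status != 'completed':
    print(f"Task status is '{status}', not 'completed' - this should not happen")
else:
    # Nested dict along the lines of {title: {series: {'last': ..., 'min': ..., 'max': ...}}}
    metrics = task.get_last_scalar_metrics()
    rows = [
        {'metric': title, 'series': series, **values}
        for title, series_dict in metrics.items()
        for series, values in series_dict.items()
    ]
    # Make it into a pandas DataFrame and tabulate it into Markdown
    results_table = tabulate(pd.DataFrame(rows), headers='keys',
                             tablefmt='github', showindex=False)
    print(results_table)
```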

View File

@@ -30,7 +30,7 @@ Yeah, yeah we can, it's called hyperparameter optimization. And we can do all of
If you don't know what Hyperparameter Optimization is yet, you can find a link to our blog post on the topic in the description below. But in its most basic form, hyperparameter optimization tries to optimize a certain output by changing a set of inputs.
- Let's say we've been working on this model here, and we were tracking our experiments with it anyway. We can see we have some hyperparameters to work with in the **Hyperparameters** tab of the web UI. They are logged by using the `task.connect` function in our code. These are our inputs. We also have a scalar called `validation/epoch_accuracy` that we want to get as high as possible. This is our output. We could also choose to minimize the `epoch_loss`, for example; that is something you can decide yourself.
+ Let's say we've been working on this model here, and we were tracking our experiments with it anyway. We can see we have some hyperparameters to work with in the **Hyperparameters** tab of the web UI. They are logged by using the `Task.connect` function in our code. These are our inputs. We also have a scalar called `validation/epoch_accuracy` that we want to get as high as possible. This is our output. We could also choose to minimize the `epoch_loss`, for example; that is something you can decide yourself.
We can see that no code was used to log the scalar. It's done automatically because we are using TensorBoard.
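To make this concrete, here is a hedged sketch of driving such an optimization from the SDK with `HyperParameterOptimizer`; the base task lookup, the parameter names (which depend on what was passed to `Task.connect`), the objective title/series split, and the queue name are all assumptions for illustration:
```
from clearml import Task
from clearml.automation import (
    HyperParameterOptimizer, RandomSearch,
    UniformIntegerParameterRange, DiscreteParameterRange,
)

# The experiment we were already tracking becomes the template task
base_task = Task.get_task(project_name='examples', task_name='model training')

optimizer = HyperParameterOptimizer(
    base_task_id=base_task.id,
    # Inputs: hyperparameters connected with Task.connect() usually appear
    # under the 'General/' section; the names here are illustrative
    hyper_parameters=[
        UniformIntegerParameterRange('General/batch_size', min_value=16, max_value=128, step_size=16),
        DiscreteParameterRange('General/dropout', values=[0.1, 0.25, 0.5]),
    ],
    # Output: the scalar we want to get as high as possible; check the Scalars
    # tab for the exact title/series split of validation/epoch_accuracy
    objective_metric_title='epoch_accuracy',
    objective_metric_series='validation',
    objective_metric_sign='max',
    optimizer_class=RandomSearch,
    execution_queue='default',  # placeholder queue name
    total_max_jobs=10,
)

optimizer.start()
optimizer.wait()
optimizer.stop()
```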