mirror of
https://github.com/clearml/clearml
synced 2025-01-31 17:17:00 +00:00
459 lines
18 KiB
Plaintext
459 lines
18 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "B7Fs0CZeFPVM"
|
|
},
|
|
"source": [
|
|
"<div align=\"center\">\n",
|
|
"\n",
|
|
" <a href=\"https://clear.ml\" target=\"_blank\">\n",
|
|
" <img width=\"512\", src=\"https://github.com/allegroai/clearml/raw/master/docs/clearml-logo.svg\"></a>\n",
|
|
"\n",
|
|
"\n",
|
|
"<br>\n",
|
|
"\n",
|
|
"<h1>Notebook 3: Remote Task Execution</h1>\n",
|
|
"\n",
|
|
"<br>\n",
|
|
"\n",
|
|
"Hi there! This is the third notebook in the ClearML getting started notebook series, meant to teach you the ropes. In the <a href=\"https://colab.research.google.com/github/allegroai/clearml/blob/master/docs/tutorials/Getting_Started_2_Setting_Up_Agent.ipynb\" target=\"_blank\">last notebook</a> we set up an agent within the Google Colab environment. Now it's time to make that agent run our task! We're going to clone our first, exisiting, experiment, change some parameters and then run the resulting task remotely using our colab agent.\n",
|
|
"\n",
|
|
"You can find out more details about the other ClearML modules and the technical specifics of each in <a href=\"https://clear.ml/docs\" target=\"_blank\">our documentation.</a>\n",
|
|
"\n",
|
|
"<table>\n",
|
|
"<tbody>\n",
|
|
" <tr>\n",
|
|
" <td>Step 1: Experiment Management</td>\n",
|
|
" <td><a target=\"_blank\" href=\"https://colab.research.google.com/github/allegroai/clearml/blob/master/docs/tutorials/Getting_Started_1_Experiment_Management.ipynb\">\n",
|
|
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
|
|
"</a></td>\n",
|
|
" </tr>\n",
|
|
" <tr>\n",
|
|
" <td>Step 2: Remote Agent</td>\n",
|
|
" <td><a target=\"_blank\" href=\"https://colab.research.google.com/github/allegroai/clearml/blob/master/docs/tutorials/Getting_Started_2_Setting_Up_Agent.ipynb\">\n",
|
|
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
|
|
"</a></td>\n",
|
|
" </tr>\n",
|
|
" <tr>\n",
|
|
" <td><b>Step 3: Remote Task Execution</b></td>\n",
|
|
" <td><a target=\"_blank\" href=\"https://colab.research.google.com/github/allegroai/clearml/blob/master/docs/tutorials/Getting_Started_3_Remote_Execution.ipynb\">\n",
|
|
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
|
|
"</a></td>\n",
|
|
" </tr>\n",
|
|
"</tbody>\n",
|
|
"</table>\n",
|
|
"\n",
|
|
"</div>"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "U7jM_FiPLnvi"
|
|
},
|
|
"source": [
|
|
"# 📦 Setup\n",
|
|
"\n",
|
|
"Just like in the other notebooks, we will need to install and setup ClearML.\n",
|
|
"\n",
|
|
"**NOTE: Make sure the agent is still running in the other notebook, when running this one!**"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"id": "H1Otow-YE9ks"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%pip install --upgrade clearml\n",
|
|
"import clearml\n",
|
|
"clearml.browser_login()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "vneMjf39Z3Uh"
|
|
},
|
|
"source": [
|
|
"# 🔎 Querying our Task\n",
|
|
"\n",
|
|
"If we want to remotely run a task, we first need to know which one!\n",
|
|
"Getting the Task ID from the webUI is quite easy, just navigate to your project, select the task you're interested in and click this button to copy the Task ID to your clipboard:\n",
|
|
"\n",
|
|
"![](https://i.imgur.com/W7ZnEnX.png)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/"
|
|
},
|
|
"id": "0TazW5dSZ2w1",
|
|
"outputId": "0b238e0d-32b8-4a9d-e4bf-141b798e6806"
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"('TB Logging', '0bfefb8b86ba44798d9abe34ba0a6ab4')"
|
|
]
|
|
},
|
|
"execution_count": 7,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# from clearml import Task\n",
|
|
"\n",
|
|
"# task = Task.get_task(task_id=\"YOUR_TASK_ID\")\n",
|
|
"# task.name, task.id"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "hgRieZmvgIPS"
|
|
},
|
|
"source": [
|
|
"However, we can also query ClearML using the Python SDK. Let's search your ClearML history for any task in the project `Getting Started` with the name `XGBoost Training` (both of which we used in tutorial notebook 1). ClearML should then give us a list of Task IDs that could fit the bill."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/"
|
|
},
|
|
"id": "uQSAOb_AgF78",
|
|
"outputId": "81a2d3bc-b4aa-4af3-b3fe-d9f3fcec6a7a"
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"XGBoost Training 1c7f708a976b413da147b341c54f862d\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"from clearml import Task\n",
|
|
"\n",
|
|
"# This will get the single most recent task fitting the description\n",
|
|
"# ⚠️ NOTE: Make sure you ran the XGBoost Training cell from notebook 1 and that the project and task name exist!\n",
|
|
"task = Task.get_task(project_name=\"Getting Started\", task_name=\"XGBoost Training\")\n",
|
|
"\n",
|
|
"if not task:\n",
|
|
" print(\"⚠️ WARNING: In order to make this work, you will need the XGBoost Training task from Notebook 1. Make sure to run the cell linked below in the same ClearML account!\")\n",
|
|
" print(\"https://colab.research.google.com/drive/1oHiW1qwLVvazk3qFZWBULfpciPEQp8kc#scrollTo=CSaL3XTqhYAy&line=5&uniqifier=1\")\n",
|
|
"else:\n",
|
|
" print(task.name, task.id)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/"
|
|
},
|
|
"id": "gRz3FQFdjKcC",
|
|
"outputId": "63148b61-136c-4cd5-fc61-5829c713c831"
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"[('XGBoost Training', '959c353c183245d5b85bfb2958e49f5d'),\n",
|
|
" ('XGBoost Training', '6810dd7c2c944099a4bc544a4c5079ff'),\n",
|
|
" ('XGBoost Training', '818894e032a44e7c9bafd758ee9fae7b'),\n",
|
|
" ('XGBoost Training', '762d13fc58ad4c5e8ae3695195b00d97'),\n",
|
|
" ('XGBoost Training', '3bf2b49800c24fd7b16f64f8c70cef83'),\n",
|
|
" ('XGBoost Training', '95ead34d9ce94955b2998b1de37f4892'),\n",
|
|
" ('XGBoost Training', '1c7f708a976b413da147b341c54f862d')]"
|
|
]
|
|
},
|
|
"execution_count": 9,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# This will get all tasks fitting the description\n",
|
|
"tasks = Task.get_tasks(project_name=\"Getting Started\", task_name=\"XGBoost Training\")\n",
|
|
"[(task.name, task.id) for task in tasks]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/",
|
|
"height": 35
|
|
},
|
|
"id": "nyTBIPxUjutj",
|
|
"outputId": "36f3808b-4386-4702-898c-8bb3fee0246a"
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"application/vnd.google.colaboratory.intrinsic+json": {
|
|
"type": "string"
|
|
},
|
|
"text/plain": [
|
|
"'1c7f708a976b413da147b341c54f862d'"
|
|
]
|
|
},
|
|
"execution_count": 10,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Let's set our task ID based on the results here so we can use it below\n",
|
|
"task_id = task.id\n",
|
|
"task_id"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "FLmjZG8gj2WO"
|
|
},
|
|
"source": [
|
|
"# 👥 Clone and Enqueue Task\n",
|
|
"\n",
|
|
"Now that we know which task we want to run, we can start by cloning it. Cloning a task makes it editable. Now we can override any of the parameters or runtime configurations. For example, here we can set a different `max_depth` by specifying the parameter name as well as the section.\n",
|
|
"\n",
|
|
"![](https://i.imgur.com/IOL20Sg.png)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/"
|
|
},
|
|
"id": "OtYJyyY7jzpl",
|
|
"outputId": "3dbd436b-648c-4c73-bbfd-1eeba10e47e2"
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"<tasks.EnqueueResponse: {\n",
|
|
" \"updated\": 1,\n",
|
|
" \"fields\": {\n",
|
|
" \"status\": \"queued\",\n",
|
|
" \"status_reason\": \"\",\n",
|
|
" \"status_message\": \"\",\n",
|
|
" \"status_changed\": \"2023-02-27T15:53:37.611445+00:00\",\n",
|
|
" \"last_update\": \"2023-02-27T15:53:37.611445+00:00\",\n",
|
|
" \"last_change\": \"2023-02-27T15:53:37.611445+00:00\",\n",
|
|
" \"last_changed_by\": \"fed1d02f8f5c40c79f84733ccc8ec065\",\n",
|
|
" \"enqueue_status\": \"created\",\n",
|
|
" \"execution.queue\": \"2d9d510d7f8546ca939555dffe45deb1\"\n",
|
|
" },\n",
|
|
" \"queued\": 1\n",
|
|
"}>"
|
|
]
|
|
},
|
|
"execution_count": 14,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Clone the original task and keep track of the new one\n",
|
|
"new_task = Task.clone(source_task=task)\n",
|
|
"# Override any of the parameters!\n",
|
|
"new_task.update_parameters({\"General/max_depth\": 3})\n",
|
|
"# We can even rename it if we wanted\n",
|
|
"new_task.rename(f\"Cloned Task\")\n",
|
|
"# Make sure that the diff does not contain Colab invocation!\n",
|
|
"# cf. https://github.com/allegroai/clearml/issues/1204\n",
|
|
"new_task.set_script(diff=\"pass\")\n",
|
|
"# Now enqueue it for the colab worker to start working on it!\n",
|
|
"Task.enqueue(task=new_task, queue_name=\"default\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/",
|
|
"height": 35
|
|
},
|
|
"id": "3vxyEaJNk_KV",
|
|
"outputId": "ddf31b3e-1cf9-498c-9440-5fbf18802be8"
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"application/vnd.google.colaboratory.intrinsic+json": {
|
|
"type": "string"
|
|
},
|
|
"text/plain": [
|
|
"'in_progress'"
|
|
]
|
|
},
|
|
"execution_count": 15,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Let's take a look at the status\n",
|
|
"new_task.status"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"id": "hx4WqzfDIPTl"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from time import sleep\n",
|
|
"# Now we can set up a loop that waits until our task is done!\n",
|
|
"# If you have enabled notifications on Colab, it will even let you know\n",
|
|
"# when the ClearML task is done!\n",
|
|
"while new_task.status not in [\"completed\", \"failed\"]:\n",
|
|
" if new_task.status == \"draft\":\n",
|
|
" print(\"Task is still in draft mode! You have to enqueue it before the agent can run it.\")\n",
|
|
"\n",
|
|
" print(f\"Task is not done yet! Current status: {new_task.status}\")\n",
|
|
" sleep(5)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "mefZhEBVlOx-"
|
|
},
|
|
"source": [
|
|
"When a task is enqueued, it has 4 possible states:\n",
|
|
"\n",
|
|
"- Pending (the task is waiting to be pulled by a worker)\n",
|
|
"- In Progress (the task is pulled and being executed)\n",
|
|
"- Completed\n",
|
|
"- Failed\n",
|
|
"\n",
|
|
"You can keep track of this status by using the `.status` property of a clearml Task instance. If you only have the Task ID, you can always get the task object by running `task = Task.get_task(task_id=\"your_task_id\")`\n",
|
|
"\n",
|
|
"In the webUI the status of a task is pretty visible:\n",
|
|
"![](https://i.imgur.com/Ep7v6AN.png)\n",
|
|
"\n",
|
|
"**NOTE: The first time running a task can take quite a long time because the agent has to install all of the original packages and get the original code. Everything is cached however, so it should be faster next time!**\n",
|
|
"\n",
|
|
"If everything went well, you should now see the colab worker start working on the task:\n",
|
|
"\n",
|
|
"![](https://i.imgur.com/eI0ui5J.png)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "Ejr6wqZxmqil"
|
|
},
|
|
"source": [
|
|
"# 🧐 Inspecting the results\n",
|
|
"\n",
|
|
"Now that the task has completed successfully, you can now view its results just like any other experiment in the experiment manager!\n",
|
|
"\n",
|
|
"![](https://i.imgur.com/zfkuKlO.png)\n",
|
|
"\n",
|
|
"You can of course also select both the original and the cloned experiment and compare them directly in ClearML!\n",
|
|
"\n",
|
|
"![](https://i.imgur.com/Gd56Swd.png)\n",
|
|
"\n",
|
|
"Remember that the SDK is quite powerful too, so almost any piece of information that you can see in the webUI, you can also get from the SDK to build automations onto.\n",
|
|
"\n",
|
|
"For example, we could get the reported scalars right into our python environment with just a single function."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/"
|
|
},
|
|
"id": "C86NZAopnsJE",
|
|
"outputId": "3e7e539a-7069-45a5-d534-d97debb4dff8"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"new_task.get_reported_scalars()"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "yu1X2tJNoIQ-"
|
|
},
|
|
"source": [
|
|
"# 🥾 Next steps\n",
|
|
"\n",
|
|
"Now that you know the basics of ClearML, you can get started with running your own experiments!\n",
|
|
"\n",
|
|
"## 📺 Check out our [Youtube Channel](https://www.youtube.com/@ClearML)!\n",
|
|
"It holds [detailed tutorial videos](https://www.youtube.com/watch?v=ZxgfHhPi8Gk&list=PLMdIlCuMqSTnoC45ME5_JnsJX0zWqDdlO), as well as [cool use cases and walkthroughs of real-life projects](https://www.youtube.com/watch?v=quSGXvuK1IM&list=PLMdIlCuMqSTmUfsAWXrK8zibwvxfduFPy)!\n",
|
|
"\n",
|
|
"[![Getting Started](http://img.youtube.com/vi/s3k9ntmQmD4/0.jpg)](http://www.youtube.com/watch?v=s3k9ntmQmD4 \"Getting Started\")\n",
|
|
"\n",
|
|
"[![Day in The Life Of](http://img.youtube.com/vi/quSGXvuK1IM/0.jpg)](http://www.youtube.com/watch?v=quSGXvuK1IM \"Day in The Life Of\")\n",
|
|
"\n",
|
|
"[![Training a Sarcasm Detector](http://img.youtube.com/vi/0wmwnNnN8ow/0.jpg)](http://www.youtube.com/watch?v=0wmwnNnN8ow \"Training a Sarcasm Detector\")\n",
|
|
"\n",
|
|
"\n",
|
|
"\n",
|
|
"## 💪 Check out our examples\n",
|
|
"\n",
|
|
"We have very [specific feature examples](https://clear.ml/docs/latest/docs/clearml_sdk/task_sdk) as well as a repo full of [example scripts](https://github.com/allegroai/clearml/tree/master/examples)!\n",
|
|
"\n",
|
|
"## 📚 Check out our [documentation](https://clear.ml/docs/latest/docs)\n",
|
|
"\n",
|
|
"## 🔥 Check out our [blog](https://clear.ml/blog/)\n",
|
|
"\n"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"colab": {
|
|
"collapsed_sections": [
|
|
"vneMjf39Z3Uh",
|
|
"FLmjZG8gj2WO",
|
|
"Ejr6wqZxmqil"
|
|
],
|
|
"provenance": [],
|
|
"toc_visible": true
|
|
},
|
|
"kernelspec": {
|
|
"display_name": "Python 3",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"name": "python"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 0
|
|
}
|
|
|