From d8fc34290cd1bb1f242749c4bede8f6ab4b9388b Mon Sep 17 00:00:00 2001 From: YunLiu <55491388+KumoLiu@users.noreply.github.com> Date: Fri, 19 Jan 2024 23:35:57 +0800 Subject: [PATCH] Add `torch.compile` benchmark tutorial (#1607) Fixes # . ### Description Add a tutorial about how to use `torch.compile`. ### Checks - [ ] Avoid including large-size files in the PR. - [ ] Clean up long text outputs from code cells in the notebook. - [ ] For security purposes, please check the contents and remove any sensitive info such as user names and private key. - [ ] Ensure (1) hyperlinks and markdown anchors are working (2) use relative paths for tutorial repo files (3) put figure and graphs in the `./figure` folder - [ ] Notebook runs automatically `./runner.sh -t ` --------- Signed-off-by: YunLiu <55491388+KumoLiu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> --- modules/torch_compile.ipynb | 607 ++++++++++++++++++++++++++++++++++++ 1 file changed, 607 insertions(+) create mode 100644 modules/torch_compile.ipynb diff --git a/modules/torch_compile.ipynb b/modules/torch_compile.ipynb new file mode 100644 index 0000000000..ac85eaa652 --- /dev/null +++ b/modules/torch_compile.ipynb @@ -0,0 +1,607 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Copyright (c) MONAI Consortium \n", + "Licensed under the Apache License, Version 2.0 (the \"License\"); \n", + "you may not use this file except in compliance with the License. \n", + "You may obtain a copy of the License at \n", + "    http://www.apache.org/licenses/LICENSE-2.0 \n", + "Unless required by applicable law or agreed to in writing, software \n", + "distributed under the License is distributed on an \"AS IS\" BASIS, \n", + "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. \n", + "See the License for the specific language governing permissions and \n", + "limitations under the License." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# MONAI pipeline with PyTorch 2.0 Features\n", + "\n", + "This notebook shows how to use `torch.compile` in a MONAI pipeline. `torch.compile` is the main API of PyTorch 2.0: it wraps your model and returns a compiled model. It is a fully additive (and optional) feature, so PyTorch 2.0 is 100% backward compatible by definition. We also ran an end-to-end pipeline based on [\"fast_training_tutorial.ipynb\"](https://github.com/Project-MONAI/tutorials/blob/main/acceleration/fast_training_tutorial.ipynb), where `torch.compile` gave a 1.16x speedup in total training time." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup environment" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "!python -c \"import monai\" || pip install -q \"monai-weekly[nibabel, matplotlib]\"\n", + "# %pip install -q torch==2.1.0" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup imports" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import torch\n", + "import shutil\n", + "import tempfile\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "\n", + "import monai\n", + "import monai.transforms as mt\n", + "from monai.config import print_config\n", + "from monai.utils import set_determinism\n", + "from monai.bundle import download, create_workflow\n", + "from monai.engines import SupervisedTrainer\n", + "\n", + "print_config()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup data directory\n", + "\n", + "You can specify a directory with the `MONAI_DATA_DIRECTORY` environment variable. \n", + "This allows you to save results and reuse downloads. \n", + "If not specified, a temporary directory will be used."
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/workspace/data\n" + ] + } + ], + "source": [ + "directory = os.environ.get(\"MONAI_DATA_DIRECTORY\")\n", + "root_dir = tempfile.mkdtemp() if directory is None else directory\n", + "print(root_dir)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Download dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2024-01-19 06:02:08,535 - INFO - Expected md5 is None, skip md5 check for file samples.zip.\n", + "2024-01-19 06:02:08,536 - INFO - File exists: samples.zip, skipped downloading.\n", + "2024-01-19 06:02:08,537 - INFO - Writing into directory: /workspace/data.\n" + ] + } + ], + "source": [ + "sample_url = \"https://github.com/Project-MONAI/MONAI-extra-test-data/releases\"\n", + "sample_url += \"/download/0.8.1/totalSegmentator_mergedLabel_samples.zip\"\n", + "monai.apps.download_and_extract(sample_url, output_dir=root_dir, filepath=\"samples.zip\")\n", + "\n", + "base_name = os.path.join(root_dir, \"totalSegmentator_mergedLabel_samples\")\n", + "input_data = []\n", + "for filename in os.listdir(os.path.join(base_name, \"imagesTr\")):\n", + " input_data.append(\n", + " {\n", + " \"image\": os.path.join(base_name, \"imagesTr\", filename),\n", + " \"label\": os.path.join(base_name, \"labelsTr\", filename),\n", + " }\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Set deterministic training for reproducibility" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "set_determinism(seed=0)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Set up timing and training functions\n", + "\n", + "For accurate timing, we use CUDA events and synchronization to
measure the forward and backward propagations in training." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "def timed(fn):\n", + " start = torch.cuda.Event(enable_timing=True)\n", + " end = torch.cuda.Event(enable_timing=True)\n", + " start.record()\n", + " result = fn()\n", + " end.record()\n", + " torch.cuda.synchronize()\n", + " return result, start.elapsed_time(end) / 1000\n", + "\n", + "\n", + "def train(model, inputs, labels):\n", + " outputs = model(inputs)\n", + " loss_function = monai.losses.DiceCELoss(to_onehot_y=True, softmax=True)\n", + " loss = loss_function(outputs, labels)\n", + " loss.backward()\n", + " return loss" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Set up model\n", + "\n", + "Here we used `create_workflow` to get the network instance from the bundle. You can also initialize your own network." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2024-01-19 06:02:11,869 - INFO - --- input summary of monai.bundle.scripts.download ---\n", + "2024-01-19 06:02:11,870 - INFO - > name: 'wholeBody_ct_segmentation'\n", + "2024-01-19 06:02:11,872 - INFO - > bundle_dir: './bundle'\n", + "2024-01-19 06:02:11,872 - INFO - > source: 'monaihosting'\n", + "2024-01-19 06:02:11,873 - INFO - > remove_prefix: 'monai_'\n", + "2024-01-19 06:02:11,874 - INFO - > progress: True\n", + "2024-01-19 06:02:11,875 - INFO - ---\n", + "\n", + "\n", + "2024-01-19 06:02:12,098 - INFO - Expected md5 is None, skip md5 check for file bundle/wholeBody_ct_segmentation_v0.2.1.zip.\n", + "2024-01-19 06:02:12,099 - INFO - File exists: bundle/wholeBody_ct_segmentation_v0.2.1.zip, skipped downloading.\n", + "2024-01-19 06:02:12,100 - INFO - Writing into directory: bundle.\n", + "2024-01-19 06:02:13,092 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2024-01-19 
06:02:13,095 - INFO - > config_file: './bundle/wholeBody_ct_segmentation/configs/train.json'\n", + "2024-01-19 06:02:13,095 - INFO - > workflow_type: 'train'\n", + "2024-01-19 06:02:13,096 - INFO - ---\n", + "\n", + "\n", + "2024-01-19 06:02:13,097 - INFO - Setting logging properties based on config: bundle/wholeBody_ct_segmentation/configs/logging.conf.\n" + ] + } + ], + "source": [ + "device = \"cuda:0\" if torch.cuda.is_available() else \"cpu\"\n", + "bundle_dir = \"./bundle\"\n", + "os.makedirs(bundle_dir, exist_ok=True)\n", + "\n", + "bundle = download(\"wholeBody_ct_segmentation\", bundle_dir=bundle_dir)\n", + "config_file = os.path.join(bundle_dir, \"wholeBody_ct_segmentation/configs/train.json\")\n", + "train_workflow = create_workflow(config_file=str(config_file), workflow_type=\"train\")\n", + "\n", + "\n", + "def init_model(device):\n", + " return train_workflow.network_def.to(device)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Set up DataLoader and train transforms" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Loading dataset: 0%| | 0/20 [00:00" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "plt.figure(\"train\", (12, 6))\n", + "plt.subplot(1, 2, 1)\n", + "plt.title(\"Total Train Time(300 epochs)\")\n", + "plt.bar(\"Compile\", sum(compile_time), 1, label=\"Compile training\", color=\"red\")\n", + "plt.bar(\"Fast\", sum(eager_time), 1, label=\"Fast training\", color=\"green\")\n", + "plt.ylabel(\"secs\")\n", + "plt.yscale(\"log\")\n", + "plt.grid(alpha=0.4, linestyle=\":\")\n", + "plt.legend(loc=\"best\")\n", + "\n", + "plt.subplot(1, 2, 2)\n", + "plt.title(\"Epoch Time\")\n", + "x = [i + 1 for i in range(int(len(compile_time) / step))]\n", + "plt.xlabel(\"epoch\")\n", + "plt.ylabel(\"secs\")\n", + "plt.plot(\n", + " x,\n", + " [sum(compile_time[i * step : (i + 1) * step]) for i in range(int(len(compile_time) / step))],\n", + " label=\"Compile training\",\n", + " color=\"red\",\n", + ")\n", + "plt.plot(\n", + " x,\n", + " [sum(eager_time[i * step : (i + 1) * step]) for i in range(int(len(eager_time) / step))],\n", + " label=\"Fast training\",\n", + " color=\"green\",\n", + ")\n", + "plt.yscale(\"log\")\n", + "plt.grid(alpha=0.4, linestyle=\":\")\n", + "plt.legend(loc=\"best\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We also tried `torch.compile` in [fast_training_tutorial.ipynb](https://github.com/Project-MONAI/tutorials/blob/main/acceleration/fast_training_tutorial.ipynb).\n", + "The total training time for fast and compile is as follows: 354.9534s and 305.6460s, speedup: 1.16x." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Use `torch.compile` with Engine Classes\n", + "\n", + "We can simply set `compile=True` in the `SupervisedTrainer` and `SupervisedEvaluator`. Here we convert data to `torch.Tensor` internally if set `compile=True`. Here is the [ticket](https://github.com/pytorch/pytorch/issues/117026) we can track." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "loss_function = monai.losses.DiceCELoss(to_onehot_y=True, softmax=True)\n", + "trainer = SupervisedTrainer(\n", + " device=device,\n", + " max_epochs=epoch_num,\n", + " train_data_loader=data_loader,\n", + " network=model,\n", + " optimizer=optimizer,\n", + " loss_function=loss_function,\n", + " # postprocessing=post_transform,\n", + " # amp=args.amp,\n", + " # key_train_metric={\n", + " # \"train_dice\": MeanDice(\n", + " # include_background=False,\n", + " # output_transform=from_engine([\"pred\", \"label\"]),\n", + " # )\n", + " # },\n", + " compile=True,\n", + " # optionally pass a `compile_kwargs` dict of arguments for the `torch.compile()` API\n", + " compile_kwargs={},\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Cleanup data directory\n", + "Remove the directory if a temporary one was used." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "if directory is None:\n", + " shutil.rmtree(root_dir)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}
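
Note on the benchmark numbers: the plotting cell in this patch sums per-iteration timings into per-epoch totals using `step`-sized slices, and the tutorial reports a 1.16x speedup from total training times of 354.9534s (fast) and 305.6460s (compile). The same arithmetic can be sketched in plain Python as below. This is only an illustration: the helper names `epoch_totals` and `speedup` are hypothetical and are not part of the notebook or of MONAI, and `step` is assumed to be the number of iterations per epoch.

```python
from typing import List, Sequence


def epoch_totals(times: Sequence[float], step: int) -> List[float]:
    """Sum per-iteration timings into per-epoch totals.

    Mirrors the slicing used in the plotting cell. Trailing iterations that
    do not fill a whole epoch are dropped, as int(len(times) / step) does.
    """
    if step <= 0:
        raise ValueError("step must be a positive number of iterations")
    n_epochs = len(times) // step
    return [sum(times[i * step : (i + 1) * step]) for i in range(n_epochs)]


def speedup(baseline_secs: float, compiled_secs: float) -> float:
    """Return how many times faster the compiled run is than the baseline."""
    if compiled_secs <= 0:
        raise ValueError("compiled_secs must be positive")
    return baseline_secs / compiled_secs


# Total training times reported in the tutorial (fast vs. compile).
print(f"{speedup(354.9534, 305.6460):.2f}x")
```

Because `torch.compile` compiles the model on its first call, the first per-epoch total of a compiled run typically includes one-off compilation time, so it can be useful to look at the per-epoch totals alongside the overall sum.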