
Commit

Cleaned documentation, added thumbnails to image-less examples, standardised model docstrings
florencejt committed Nov 30, 2023
1 parent 0a99bcf commit 27ac475
Showing 90 changed files with 1,840 additions and 1,234 deletions.
7 changes: 7 additions & 0 deletions docs/_static/florencestheme.css
@@ -266,6 +266,13 @@ a:active {
display: inline-block;
}

.wy-menu-vertical li.toctree-l4.current li.toctree-l5 > a {
/* background: #cdf8be; */
/*background: #ffe7fb;*/
background: var(--lightpink);
display: inline-block;
}

.wy-menu-vertical {
overflow-x: scroll;
/*background-color: #ffe7fb; !* Background color for the side scroll *!*/
Binary file added docs/_static/fusilli_pipeline_diagram.png
Binary file added docs/_static/modify_thumbnail.png
Binary file modified docs/auto_examples/auto_examples_jupyter.zip
Binary file not shown.
Binary file modified docs/auto_examples/auto_examples_python.zip
Binary file not shown.
@@ -6,6 +6,17 @@
"source": [
"\n# How to customise the training in Fusilli\n\nThis tutorial will show you how to customise the training of your fusion model.\n\nWe will cover the following topics:\n\n* Early stopping\n* Batch size\n* Number of epochs\n* Checkpoint suffix modification\n\n## Early stopping\n\nEarly stopping is implemented in Fusilli using the PyTorch Lightning\n[EarlyStopping](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.EarlyStopping.html#lightning.pytorch.callbacks.EarlyStopping)\ncallback. This callback can be passed to the\n:func:`~fusilli.model_utils.train_and_save_models` function using the\n``early_stopping_callback`` argument. For example:\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\nfrom lightning.pytorch.callbacks import EarlyStopping\n\nmodified_early_stopping_callback = EarlyStopping(\n monitor=\"val_loss\",\n min_delta=0.00,\n patience=3,\n verbose=True,\n mode=\"min\",\n)\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n own_early_stopping_callback=modified_early_stopping_callback,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n )\n```\nNote that you only need to pass the callback to the :func:`~.fusilli.data.get_data_module` and **not** to the :func:`~.fusilli.train.train_and_save_models` function. The new early stopping measure will be saved within the data module and accessed during training.\n\n\n-----\n\n## Batch size\n\nThe batch size can be set using the ``batch_size`` argument in the :func:`~.fusilli.data.get_data_module` function. By default, the batch size is 8.\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n batch_size=32,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n batch_size=32,\n )\n```\n-----\n\n## Number of epochs\n\nYou can change the maximum number of epochs using the ``max_epochs`` argument in the :func:`~.fusilli.data.get_data_module` and :func:`~.fusilli.train.train_and_save_models` functions. By default, the maximum number of epochs is 1000.\n\nYou also pass it to the :func:`~.fusilli.data.get_data_module` function because some of the fusion models require pre-training.\n\nChanging the ``max_epochs`` parameter is especially useful when wanting to run a quick test of your model. 
For example, you can set ``max_epochs=5`` to run a quick test of your model.\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n max_epochs=5,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n max_epochs=5,\n )\n```\nSetting ``max_epochs`` to -1 will train the model until early stopping is triggered.\n\n-----\n\n## Checkpoint suffix modification\n\nBy default, Fusilli saves the model checkpoints in the following format:\n\n ``{fusion_model.__name__}_epoch={epoch_n}.ckpt``\n\nIf the checkpoint is for a pre-trained model, then the following format is used:\n\n ``subspace_{fusion_model.__name__}_{pretrained_model.__name__}.ckpt``\n\nYou can add suffixes to the checkpoint names by passing a string to the ``extra_log_string_dict`` argument in the :func:`~.fusilli.data.get_data_module` and :func:`~.fusilli.train.train_and_save_models` functions. For example, I could add a suffix to denote that I've changed the batch size for this particular run:\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\nextra_suffix_dict = {\"batchsize\": 32}\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n batch_size=32,\n extra_log_string_dict=extra_suffix_dict,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n batch_size=32,\n extra_log_string_dict=extra_suffix_dict,\n )\n```\nThe checkpoint name would then be (if the model trained for 100 epochs):\n\n ``ExampleModel_epoch=100_batchsize_32.ckpt``\n\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>The ``extra_log_string_dict`` argument is also used to modify the logging behaviour of the model. For more information, see `wandb`.</p></div>\n"
]
},
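A combined sketch of the options described in the cell above — a custom early stopping callback, a larger batch size, and ``max_epochs=-1`` — assuming the same ``example_model`` and ``params`` placeholders the tutorial uses; the specific values are illustrative, not prescribed by Fusilli:

```python
# Sketch combining the training options described above; `example_model` and
# `params` are the tutorial's placeholders, and the values are illustrative.
from lightning.pytorch.callbacks import EarlyStopping

from fusilli.data import get_data_module
from fusilli.train import train_and_save_models

custom_early_stopping = EarlyStopping(
    monitor="val_loss",
    min_delta=0.00,
    patience=5,
    verbose=True,
    mode="min",
)

# The custom callback is passed to get_data_module only,
# not to train_and_save_models.
datamodule = get_data_module(
    fusion_model=example_model,
    params=params,
    batch_size=32,
    own_early_stopping_callback=custom_early_stopping,
)

# max_epochs=-1 trains until the early stopping callback triggers.
trained_model_list = train_and_save_models(
    data_module=datamodule,
    params=params,
    fusion_model=example_model,
    batch_size=32,
    max_epochs=-1,
)
```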
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# sphinx_gallery_thumbnail_path = '_static/pink_pasta_logo.png'"
]
}
],
"metadata": {
@@ -153,3 +153,4 @@
The ``extra_log_string_dict`` argument is also used to modify the logging behaviour of the model. For more information, see :ref:`wandb`.
"""
# sphinx_gallery_thumbnail_path = '_static/pink_pasta_logo.png'
@@ -172,6 +172,12 @@ The checkpoint name would then be (if the model trained for 100 epochs):

The ``extra_log_string_dict`` argument is also used to modify the logging behaviour of the model. For more information, see :ref:`wandb`.

.. GENERATED FROM PYTHON SOURCE LINES 156-157

.. code-block:: Python

    # sphinx_gallery_thumbnail_path = '_static/pink_pasta_logo.png'

.. _sphx_glr_download_auto_examples_customising_behaviour_customising_training_parameters.py:

@@ -11,7 +11,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up the experiment\n\nFirst, we will set up the experiment by importing the necessary packages, creating the simulated data, and setting the parameters for the experiment.\n\nFor a more detailed explanation of this process, please see the `train_test_examples` tutorials.\n\n\n"
"## Setting up the experiment\n\nFirst, we will set up the experiment by importing the necessary packages, creating the simulated data, and setting the parameters for the experiment.\n\nFor a more detailed explanation of this process, please see the example tutorials.\n\n\n"
]
},
{
@@ -22,14 +22,14 @@
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\nimport os\nimport torch.nn as nn\nfrom torch_geometric.nn import GCNConv, ChebConv\n\nfrom docs.examples import generate_sklearn_simulated_data\nfrom fusilli.data import get_data_module\nfrom fusilli.eval import RealsVsPreds\nfrom fusilli.train import train_and_save_models\n\nfrom fusilli.fusionmodels.tabularfusion.attention_weighted_GNN import AttentionWeightedGNN\n\nparams = {\n \"test_size\": 0.2,\n \"kfold_flag\": False,\n \"log\": False,\n \"pred_type\": \"regression\",\n \"loss_log_dir\": \"loss_logs/modify_layers\", # where the csv of the loss is saved for plotting later\n \"checkpoint_dir\": \"checkpoints\",\n \"loss_fig_path\": \"loss_figures\",\n}\n\n# empty the loss log directory (only needed for this tutorial)\nfor dir in os.listdir(params[\"loss_log_dir\"]):\n for file in os.listdir(os.path.join(params[\"loss_log_dir\"], dir)):\n os.remove(os.path.join(params[\"loss_log_dir\"], dir, file))\n # remove dir\n os.rmdir(os.path.join(params[\"loss_log_dir\"], dir))\n\nparams = generate_sklearn_simulated_data(\n num_samples=100,\n num_tab1_features=10,\n num_tab2_features=15,\n img_dims=(1, 100, 100),\n params=params,\n)"
"# sphinx_gallery_thumbnail_path = '_static/modify_thumbnail.png'\nimport matplotlib.pyplot as plt\nimport os\nimport torch.nn as nn\nfrom torch_geometric.nn import GCNConv, ChebConv\n\nfrom docs.examples import generate_sklearn_simulated_data\nfrom fusilli.data import get_data_module\nfrom fusilli.eval import RealsVsPreds\nfrom fusilli.train import train_and_save_models\n\nfrom fusilli.fusionmodels.tabularfusion.attention_weighted_GNN import AttentionWeightedGNN\n\nparams = {\n \"test_size\": 0.2,\n \"kfold_flag\": False,\n \"log\": False,\n \"pred_type\": \"regression\",\n \"loss_log_dir\": \"loss_logs/modify_layers\", # where the csv of the loss is saved for plotting later\n \"checkpoint_dir\": \"checkpoints\",\n \"loss_fig_path\": \"loss_figures\",\n}\n\n# empty the loss log directory (only needed for this tutorial)\nfor dir in os.listdir(params[\"loss_log_dir\"]):\n for file in os.listdir(os.path.join(params[\"loss_log_dir\"], dir)):\n os.remove(os.path.join(params[\"loss_log_dir\"], dir, file))\n # remove dir\n os.rmdir(os.path.join(params[\"loss_log_dir\"], dir))\n\nparams = generate_sklearn_simulated_data(\n num_samples=100,\n num_tab1_features=10,\n num_tab2_features=15,\n img_dims=(1, 100, 100),\n params=params,\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Specifying the model modifications\n\nNow, we will specify the modifications we want to make to the model.\n\nWe are using the :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGNN` model for this example.\nThis is a graph-based model which has a pretrained MLP (multi-layer perceptron) to get attention weights, and a graph neural network that uses the attention weights to perform the fusion.\n\nThe following modifications can be made to the method that makes the graph structure: :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGraphMaker`:\n\n\n.. list-table::\n :widths: 40 60\n :header-rows: 1\n :stub-columns: 0\n\n * - Attribute\n - Guidance\n * - :attr:`~.AttentionWeightedGraphMaker.early_stop_callback`\n - ``EarlyStopping`` object from ``from lightning.pytorch.callbacks import EarlyStopping``\n * - :attr:`~.AttentionWeightedGraphMaker.edge_probability_threshold`\n - Integer between 0 and 100.\n * - :attr:`~.AttentionWeightedGraphMaker.attention_MLP_test_size`\n - Float between 0 and 1.\n * - :attr:`~.AttentionWeightedGraphMaker.AttentionWeightingMLPInstance.weighting_layers`\n - ``nn.ModuleDict``: final layer output size must be the same as the input layer input size.\n * - :attr:`~.AttentionWeightedGraphMaker.AttentionWeightingMLPInstance.fused_layers`\n - ``nn.Sequential``\n\n\nThe following modifications can be made to the **fusion** model :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGNN`:\n\n.. list-table::\n :widths: 40 60\n :header-rows: 1\n :stub-columns: 0\n\n * - Attribute\n - Guidance\n * - :attr:`~.AttentionWeightedGNN.graph_conv_layers`\n - ``nn.Sequential`` of ``torch_geometric.nn` Layers.\n * - :attr:`~.AttentionWeightedGNN.dropout_prob`\n - Float between (not including) 0 and 1.\n\nLet's modify the model! More info about how to do this can be found in `modifying-models`.\n\n"
"## Specifying the model modifications\n\nNow, we will specify the modifications we want to make to the model.\n\nWe are using the :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGNN` model for this example.\nThis is a graph-based model which has a pretrained MLP (multi-layer perceptron) to get attention weights, and a graph neural network that uses the attention weights to perform the fusion.\n\nThe following modifications can be made to the method that makes the graph structure: :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGraphMaker`:\n\n\n.. list-table::\n :widths: 40 60\n :header-rows: 1\n :stub-columns: 0\n\n * - Attribute\n - Guidance\n * - :attr:`~.AttentionWeightedGraphMaker.early_stop_callback`\n - ``EarlyStopping`` object from ``from lightning.pytorch.callbacks import EarlyStopping``\n * - :attr:`~.AttentionWeightedGraphMaker.edge_probability_threshold`\n - Integer between 0 and 100.\n * - :attr:`~.AttentionWeightedGraphMaker.attention_MLP_test_size`\n - Float between 0 and 1.\n * - :attr:`~.AttentionWeightedGraphMaker.AttentionWeightingMLPInstance.weighting_layers`\n - ``nn.ModuleDict``: final layer output size must be the same as the input layer input size.\n * - :attr:`~.AttentionWeightedGraphMaker.AttentionWeightingMLPInstance.fused_layers`\n - ``nn.Sequential``\n\n\nThe following modifications can be made to the **fusion** model :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGNN`:\n\n.. list-table::\n :widths: 40 60\n :header-rows: 1\n :stub-columns: 0\n\n * - Attribute\n - Guidance\n * - :attr:`~.AttentionWeightedGNN.graph_conv_layers`\n - ``nn.Sequential`` of ``torch_geometric.nn`` Layers.\n * - :attr:`~.AttentionWeightedGNN.dropout_prob`\n - Float between (not including) 0 and 1.\n\nLet's modify the model! More info about how to do this can be found in `modifying-models`.\n\n"
]
},
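As a concrete illustration of the guidance tables above, a ``layer_mods`` dictionary for this model could look like the sketch below. The dictionary structure follows the ``ConcatTabularFeatureMaps`` example shown later in this diff; the layer sizes, threshold, and dropout value are illustrative assumptions rather than values from the tutorial.

```python
# Hedged sketch of a layer_mods dict for the attention-weighted GNN;
# all sizes and values are illustrative assumptions.
import torch.nn as nn
from torch_geometric.nn import GCNConv

layer_mods = {
    "AttentionWeightedGraphMaker": {
        "edge_probability_threshold": 80,  # integer between 0 and 100
        "attention_MLP_test_size": 0.3,    # float between 0 and 1
    },
    "AttentionWeightedGNN": {
        # nn.Sequential of torch_geometric.nn layers, per the guidance table
        "graph_conv_layers": nn.Sequential(
            GCNConv(25, 50),
            GCNConv(50, 100),
        ),
        "dropout_prob": 0.4,  # float strictly between 0 and 1
    },
}
```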
{
@@ -83,7 +83,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## What happens when the modifications are incorrect?\n\nLet's see what happens when we try to modify an **attribute that doesn't exist**.\n\n\n"
"You can see that the input features to the ``final_prediction`` layer changed to fit with our modification to the ``graph_conv_layers`` output features!\n\n## What happens when the modifications are incorrect?\n\nLet's see what happens when we try to modify an **attribute that doesn't exist**.\n\n\n"
]
},
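The code cell demonstrating this is collapsed in the diff. A minimal sketch of the kind of mistake being described — the attribute name ``fake_layers`` is deliberately made up for illustration — would be:

```python
# "fake_layers" is not a real attribute of AttentionWeightedGNN; it stands in
# for a typo or a nonexistent attribute name in the layer_mods dict.
layer_mods = {
    "AttentionWeightedGNN": {
        "fake_layers": nn.Sequential(
            nn.Linear(10, 10),
            nn.ReLU(),
        ),
    },
}
```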
{
@@ -130,7 +130,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## What about modifying multiple attributes with the **conflicting modifications**?\n\n"
"## What about modifying multiple attributes with the **conflicting modifications**?\n\n\nFor this, let's switch to looking at the :class:`~fusilli.fusionmodels.tabularfusion.concat_feature_maps.ConcatTabularFeatureMaps` model.\nThis model concatenates the feature maps of the two modalities and then passes them through a prediction layer.\n\nWe can modify the layers that each tabular modality goes through before being concatenated, as well as the layers that come after the concatenation.\n\nThe output features of our modified ``mod1_layers`` and ``mod2_layers`` are 100 and 128, so the input features of the ``fused_layers`` should be 228. However, we've set the input features of the ``fused_layers`` to be 25.\n\nLet's see what happens when we try to modify the model in this way. It should throw an error when the data is passed through the model.\n\n"
]
},
{
@@ -141,7 +141,7 @@
},
"outputs": [],
"source": [
"#\n# For this, let's switch to looking at the :class:`~fusilli.fusionmodels.tabularfusion.concat_feature_maps.ConcatTabularFeatureMaps` model.\n# This model concatenates the feature maps of the two modalities and then passes them through a prediction layer.\n#\n# We can modify the layers that each tabular modality goes through before being concatenated, as well as the layers that come after the concatenation.\n#\n# The output features of our modified ``mod1_layers`` and ``mod2_layers`` are 100 and 128, so the input features of the ``fused_layers`` should be 228. However, we've set the input features of the ``fused_layers`` to be 25.\n#\n# Let's see what happens when we try to modify the model in this way. It should throw an error when the data is passed through the model.\n\nlayer_mods = {\n \"ConcatTabularFeatureMaps\": {\n \"mod1_layers\": nn.ModuleDict(\n {\n \"layer 1\": nn.Sequential(\n nn.Linear(10, 32),\n nn.ReLU(),\n ),\n \"layer 2\": nn.Sequential(\n nn.Linear(32, 66),\n nn.ReLU(),\n ),\n \"layer 3\": nn.Sequential(\n nn.Linear(66, 128),\n nn.ReLU(),\n ),\n }\n ),\n \"mod2_layers\": nn.ModuleDict(\n {\n \"layer 1\": nn.Sequential(\n nn.Linear(15, 45),\n nn.ReLU(),\n ),\n \"layer 2\": nn.Sequential(\n nn.Linear(45, 70),\n nn.ReLU(),\n ),\n \"layer 3\": nn.Sequential(\n nn.Linear(70, 100),\n nn.ReLU(),\n ),\n }\n ),\n \"fused_layers\": nn.Sequential(\n nn.Linear(25, 150),\n nn.ReLU(),\n nn.Linear(150, 75),\n nn.ReLU(),\n nn.Linear(75, 50),\n nn.ReLU(),\n ),\n },\n}\n\n# get the data and train the model\n\nfrom fusilli.fusionmodels.tabularfusion.concat_feature_maps import ConcatTabularFeatureMaps\n\ndatamodule = get_data_module(ConcatTabularFeatureMaps, params, layer_mods=layer_mods)\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=ConcatTabularFeatureMaps,\n layer_mods=layer_mods,\n max_epochs=5,\n)"
"layer_mods = {\n \"ConcatTabularFeatureMaps\": {\n \"mod1_layers\": nn.ModuleDict(\n {\n \"layer 1\": nn.Sequential(\n nn.Linear(10, 32),\n nn.ReLU(),\n ),\n \"layer 2\": nn.Sequential(\n nn.Linear(32, 66),\n nn.ReLU(),\n ),\n \"layer 3\": nn.Sequential(\n nn.Linear(66, 128),\n nn.ReLU(),\n ),\n }\n ),\n \"mod2_layers\": nn.ModuleDict(\n {\n \"layer 1\": nn.Sequential(\n nn.Linear(15, 45),\n nn.ReLU(),\n ),\n \"layer 2\": nn.Sequential(\n nn.Linear(45, 70),\n nn.ReLU(),\n ),\n \"layer 3\": nn.Sequential(\n nn.Linear(70, 100),\n nn.ReLU(),\n ),\n }\n ),\n \"fused_layers\": nn.Sequential(\n nn.Linear(25, 150),\n nn.ReLU(),\n nn.Linear(150, 75),\n nn.ReLU(),\n nn.Linear(75, 50),\n nn.ReLU(),\n ),\n },\n}\n\n# get the data and train the model\n\nfrom fusilli.fusionmodels.tabularfusion.concat_feature_maps import ConcatTabularFeatureMaps\n\ndatamodule = get_data_module(ConcatTabularFeatureMaps, params, layer_mods=layer_mods)\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=ConcatTabularFeatureMaps,\n layer_mods=layer_mods,\n max_epochs=5,\n)"
]
},
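If you wanted to resolve the mismatch rather than demonstrate the error, a consistent ``fused_layers`` block would take 228 input features (128 from the modified ``mod1_layers`` plus 100 from ``mod2_layers``) — a sketch, assuming the rest of the ``layer_mods`` dictionary stays as above:

```python
# Corrected fused_layers sketch: 128 + 100 = 228 concatenated input features.
layer_mods["ConcatTabularFeatureMaps"]["fused_layers"] = nn.Sequential(
    nn.Linear(228, 150),
    nn.ReLU(),
    nn.Linear(150, 75),
    nn.ReLU(),
    nn.Linear(75, 50),
    nn.ReLU(),
)
```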
{
