
Commit

Cleaned documentation, added thumbnails to image-less examples, standardised model docstrings
florencejt committed Nov 30, 2023
1 parent 0a99bcf commit 27ac475
Showing 90 changed files with 1,840 additions and 1,234 deletions.
7 changes: 7 additions & 0 deletions docs/_static/florencestheme.css
@@ -266,6 +266,13 @@ a:active {
display: inline-block;
}

.wy-menu-vertical li.toctree-l4.current li.toctree-l5 > a {
/* background: #cdf8be; */
/*background: #ffe7fb;*/
background: var(--lightpink);
display: inline-block;
}

.wy-menu-vertical {
overflow-x: scroll;
/*background-color: #ffe7fb; !* Background color for the side scroll *!*/
Binary file added docs/_static/fusilli_pipeline_diagram.png
Binary file added docs/_static/modify_thumbnail.png
Binary file modified docs/auto_examples/auto_examples_jupyter.zip
Binary file not shown.
Binary file modified docs/auto_examples/auto_examples_python.zip
Binary file not shown.
@@ -6,6 +6,17 @@
"source": [
"\n# How to customise the training in Fusilli\n\nThis tutorial will show you how to customise the training of your fusion model.\n\nWe will cover the following topics:\n\n* Early stopping\n* Batch size\n* Number of epochs\n* Checkpoint suffix modification\n\n## Early stopping\n\nEarly stopping is implemented in Fusilli using the PyTorch Lightning\n[EarlyStopping](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.EarlyStopping.html#lightning.pytorch.callbacks.EarlyStopping)\ncallback. This callback can be passed to the\n:func:`~fusilli.model_utils.train_and_save_models` function using the\n``early_stopping_callback`` argument. For example:\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\nfrom lightning.pytorch.callbacks import EarlyStopping\n\nmodified_early_stopping_callback = EarlyStopping(\n monitor=\"val_loss\",\n min_delta=0.00,\n patience=3,\n verbose=True,\n mode=\"min\",\n)\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n own_early_stopping_callback=modified_early_stopping_callback,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n )\n```\nNote that you only need to pass the callback to the :func:`~.fusilli.data.get_data_module` and **not** to the :func:`~.fusilli.train.train_and_save_models` function. The new early stopping measure will be saved within the data module and accessed during training.\n\n\n-----\n\n## Batch size\n\nThe batch size can be set using the ``batch_size`` argument in the :func:`~.fusilli.data.get_data_module` function. By default, the batch size is 8.\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n batch_size=32,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n batch_size=32,\n )\n```\n-----\n\n## Number of epochs\n\nYou can change the maximum number of epochs using the ``max_epochs`` argument in the :func:`~.fusilli.data.get_data_module` and :func:`~.fusilli.train.train_and_save_models` functions. By default, the maximum number of epochs is 1000.\n\nYou also pass it to the :func:`~.fusilli.data.get_data_module` function because some of the fusion models require pre-training.\n\nChanging the ``max_epochs`` parameter is especially useful when wanting to run a quick test of your model. 
For example, you can set ``max_epochs=5`` to run a quick test of your model.\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n max_epochs=5,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n max_epochs=5,\n )\n```\nSetting ``max_epochs`` to -1 will train the model until early stopping is triggered.\n\n-----\n\n## Checkpoint suffix modification\n\nBy default, Fusilli saves the model checkpoints in the following format:\n\n ``{fusion_model.__name__}_epoch={epoch_n}.ckpt``\n\nIf the checkpoint is for a pre-trained model, then the following format is used:\n\n ``subspace_{fusion_model.__name__}_{pretrained_model.__name__}.ckpt``\n\nYou can add suffixes to the checkpoint names by passing a string to the ``extra_log_string_dict`` argument in the :func:`~.fusilli.data.get_data_module` and :func:`~.fusilli.train.train_and_save_models` functions. For example, I could add a suffix to denote that I've changed the batch size for this particular run:\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\nextra_suffix_dict = {\"batchsize\": 32}\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n batch_size=32,\n extra_log_string_dict=extra_suffix_dict,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n batch_size=32,\n extra_log_string_dict=extra_suffix_dict,\n )\n```\nThe checkpoint name would then be (if the model trained for 100 epochs):\n\n ``ExampleModel_epoch=100_batchsize_32.ckpt``\n\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>The ``extra_log_string_dict`` argument is also used to modify the logging behaviour of the model. For more information, see `wandb`.</p></div>\n"
]
},
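A combined sketch of the options described in the cell above — a custom early stopping callback, a larger batch size, and ``max_epochs=-1`` — assuming the same ``example_model`` and ``params`` placeholders the tutorial uses; the specific values are illustrative, not prescribed by Fusilli:

```python
# Sketch combining the training options described above; `example_model` and
# `params` are the tutorial's placeholders, and the values are illustrative.
from lightning.pytorch.callbacks import EarlyStopping

from fusilli.data import get_data_module
from fusilli.train import train_and_save_models

custom_early_stopping = EarlyStopping(
    monitor="val_loss",
    min_delta=0.00,
    patience=5,
    verbose=True,
    mode="min",
)

# The custom callback is passed to get_data_module only,
# not to train_and_save_models.
datamodule = get_data_module(
    fusion_model=example_model,
    params=params,
    batch_size=32,
    own_early_stopping_callback=custom_early_stopping,
)

# max_epochs=-1 trains until the early stopping callback triggers.
trained_model_list = train_and_save_models(
    data_module=datamodule,
    params=params,
    fusion_model=example_model,
    batch_size=32,
    max_epochs=-1,
)
```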
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# sphinx_gallery_thumbnail_path = '_static/pink_pasta_logo.png'"
]
}
],
"metadata": {
@@ -153,3 +153,4 @@
The ``extra_log_string_dict`` argument is also used to modify the logging behaviour of the model. For more information, see :ref:`wandb`.
"""
# sphinx_gallery_thumbnail_path = '_static/pink_pasta_logo.png'
@@ -172,6 +172,12 @@ The checkpoint name would then be (if the model trained for 100 epochs):

The ``extra_log_string_dict`` argument is also used to modify the logging behaviour of the model. For more information, see :ref:`wandb`.

.. GENERATED FROM PYTHON SOURCE LINES 156-157

.. code-block:: Python

    # sphinx_gallery_thumbnail_path = '_static/pink_pasta_logo.png'

.. _sphx_glr_download_auto_examples_customising_behaviour_customising_training_parameters.py:

@@ -11,7 +11,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up the experiment\n\nFirst, we will set up the experiment by importing the necessary packages, creating the simulated data, and setting the parameters for the experiment.\n\nFor a more detailed explanation of this process, please see the `train_test_examples` tutorials.\n\n\n"
"## Setting up the experiment\n\nFirst, we will set up the experiment by importing the necessary packages, creating the simulated data, and setting the parameters for the experiment.\n\nFor a more detailed explanation of this process, please see the example tutorials.\n\n\n"
]
},
{
@@ -22,14 +22,14 @@
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\nimport os\nimport torch.nn as nn\nfrom torch_geometric.nn import GCNConv, ChebConv\n\nfrom docs.examples import generate_sklearn_simulated_data\nfrom fusilli.data import get_data_module\nfrom fusilli.eval import RealsVsPreds\nfrom fusilli.train import train_and_save_models\n\nfrom fusilli.fusionmodels.tabularfusion.attention_weighted_GNN import AttentionWeightedGNN\n\nparams = {\n \"test_size\": 0.2,\n \"kfold_flag\": False,\n \"log\": False,\n \"pred_type\": \"regression\",\n \"loss_log_dir\": \"loss_logs/modify_layers\", # where the csv of the loss is saved for plotting later\n \"checkpoint_dir\": \"checkpoints\",\n \"loss_fig_path\": \"loss_figures\",\n}\n\n# empty the loss log directory (only needed for this tutorial)\nfor dir in os.listdir(params[\"loss_log_dir\"]):\n for file in os.listdir(os.path.join(params[\"loss_log_dir\"], dir)):\n os.remove(os.path.join(params[\"loss_log_dir\"], dir, file))\n # remove dir\n os.rmdir(os.path.join(params[\"loss_log_dir\"], dir))\n\nparams = generate_sklearn_simulated_data(\n num_samples=100,\n num_tab1_features=10,\n num_tab2_features=15,\n img_dims=(1, 100, 100),\n params=params,\n)"
"# sphinx_gallery_thumbnail_path = '_static/modify_thumbnail.png'\nimport matplotlib.pyplot as plt\nimport os\nimport torch.nn as nn\nfrom torch_geometric.nn import GCNConv, ChebConv\n\nfrom docs.examples import generate_sklearn_simulated_data\nfrom fusilli.data import get_data_module\nfrom fusilli.eval import RealsVsPreds\nfrom fusilli.train import train_and_save_models\n\nfrom fusilli.fusionmodels.tabularfusion.attention_weighted_GNN import AttentionWeightedGNN\n\nparams = {\n \"test_size\": 0.2,\n \"kfold_flag\": False,\n \"log\": False,\n \"pred_type\": \"regression\",\n \"loss_log_dir\": \"loss_logs/modify_layers\", # where the csv of the loss is saved for plotting later\n \"checkpoint_dir\": \"checkpoints\",\n \"loss_fig_path\": \"loss_figures\",\n}\n\n# empty the loss log directory (only needed for this tutorial)\nfor dir in os.listdir(params[\"loss_log_dir\"]):\n for file in os.listdir(os.path.join(params[\"loss_log_dir\"], dir)):\n os.remove(os.path.join(params[\"loss_log_dir\"], dir, file))\n # remove dir\n os.rmdir(os.path.join(params[\"loss_log_dir\"], dir))\n\nparams = generate_sklearn_simulated_data(\n num_samples=100,\n num_tab1_features=10,\n num_tab2_features=15,\n img_dims=(1, 100, 100),\n params=params,\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Specifying the model modifications\n\nNow, we will specify the modifications we want to make to the model.\n\nWe are using the :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGNN` model for this example.\nThis is a graph-based model which has a pretrained MLP (multi-layer perceptron) to get attention weights, and a graph neural network that uses the attention weights to perform the fusion.\n\nThe following modifications can be made to the method that makes the graph structure: :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGraphMaker`:\n\n\n.. list-table::\n :widths: 40 60\n :header-rows: 1\n :stub-columns: 0\n\n * - Attribute\n - Guidance\n * - :attr:`~.AttentionWeightedGraphMaker.early_stop_callback`\n - ``EarlyStopping`` object from ``from lightning.pytorch.callbacks import EarlyStopping``\n * - :attr:`~.AttentionWeightedGraphMaker.edge_probability_threshold`\n - Integer between 0 and 100.\n * - :attr:`~.AttentionWeightedGraphMaker.attention_MLP_test_size`\n - Float between 0 and 1.\n * - :attr:`~.AttentionWeightedGraphMaker.AttentionWeightingMLPInstance.weighting_layers`\n - ``nn.ModuleDict``: final layer output size must be the same as the input layer input size.\n * - :attr:`~.AttentionWeightedGraphMaker.AttentionWeightingMLPInstance.fused_layers`\n - ``nn.Sequential``\n\n\nThe following modifications can be made to the **fusion** model :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGNN`:\n\n.. list-table::\n :widths: 40 60\n :header-rows: 1\n :stub-columns: 0\n\n * - Attribute\n - Guidance\n * - :attr:`~.AttentionWeightedGNN.graph_conv_layers`\n - ``nn.Sequential`` of ``torch_geometric.nn` Layers.\n * - :attr:`~.AttentionWeightedGNN.dropout_prob`\n - Float between (not including) 0 and 1.\n\nLet's modify the model! More info about how to do this can be found in `modifying-models`.\n\n"
"## Specifying the model modifications\n\nNow, we will specify the modifications we want to make to the model.\n\nWe are using the :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGNN` model for this example.\nThis is a graph-based model which has a pretrained MLP (multi-layer perceptron) to get attention weights, and a graph neural network that uses the attention weights to perform the fusion.\n\nThe following modifications can be made to the method that makes the graph structure: :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGraphMaker`:\n\n\n.. list-table::\n :widths: 40 60\n :header-rows: 1\n :stub-columns: 0\n\n * - Attribute\n - Guidance\n * - :attr:`~.AttentionWeightedGraphMaker.early_stop_callback`\n - ``EarlyStopping`` object from ``from lightning.pytorch.callbacks import EarlyStopping``\n * - :attr:`~.AttentionWeightedGraphMaker.edge_probability_threshold`\n - Integer between 0 and 100.\n * - :attr:`~.AttentionWeightedGraphMaker.attention_MLP_test_size`\n - Float between 0 and 1.\n * - :attr:`~.AttentionWeightedGraphMaker.AttentionWeightingMLPInstance.weighting_layers`\n - ``nn.ModuleDict``: final layer output size must be the same as the input layer input size.\n * - :attr:`~.AttentionWeightedGraphMaker.AttentionWeightingMLPInstance.fused_layers`\n - ``nn.Sequential``\n\n\nThe following modifications can be made to the **fusion** model :class:`~fusilli.fusionmodels.tabularfusion.attention_weighted_GNN.AttentionWeightedGNN`:\n\n.. list-table::\n :widths: 40 60\n :header-rows: 1\n :stub-columns: 0\n\n * - Attribute\n - Guidance\n * - :attr:`~.AttentionWeightedGNN.graph_conv_layers`\n - ``nn.Sequential`` of ``torch_geometric.nn`` Layers.\n * - :attr:`~.AttentionWeightedGNN.dropout_prob`\n - Float between (not including) 0 and 1.\n\nLet's modify the model! More info about how to do this can be found in `modifying-models`.\n\n"
]
},
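As a concrete illustration of the guidance tables above, a ``layer_mods`` dictionary for this model could look like the sketch below. The dictionary structure follows the ``ConcatTabularFeatureMaps`` example shown later in this diff; the layer sizes, threshold, and dropout value are illustrative assumptions rather than values from the tutorial.

```python
# Hedged sketch of a layer_mods dict for the attention-weighted GNN;
# all sizes and values are illustrative assumptions.
import torch.nn as nn
from torch_geometric.nn import GCNConv

layer_mods = {
    "AttentionWeightedGraphMaker": {
        "edge_probability_threshold": 80,  # integer between 0 and 100
        "attention_MLP_test_size": 0.3,    # float between 0 and 1
    },
    "AttentionWeightedGNN": {
        # nn.Sequential of torch_geometric.nn layers, per the guidance table
        "graph_conv_layers": nn.Sequential(
            GCNConv(25, 50),
            GCNConv(50, 100),
        ),
        "dropout_prob": 0.4,  # float strictly between 0 and 1
    },
}
```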
{
@@ -83,7 +83,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## What happens when the modifications are incorrect?\n\nLet's see what happens when we try to modify an **attribute that doesn't exist**.\n\n\n"
"You can see that the input features to the ``final_prediction`` layer changed to fit with our modification to the ``graph_conv_layers`` output features!\n\n## What happens when the modifications are incorrect?\n\nLet's see what happens when we try to modify an **attribute that doesn't exist**.\n\n\n"
]
},
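The code cell demonstrating this is collapsed in the diff. A minimal sketch of the kind of mistake being described — the attribute name ``fake_layers`` is deliberately made up for illustration — would be:

```python
# "fake_layers" is not a real attribute of AttentionWeightedGNN; it stands in
# for a typo or a nonexistent attribute name in the layer_mods dict.
layer_mods = {
    "AttentionWeightedGNN": {
        "fake_layers": nn.Sequential(
            nn.Linear(10, 10),
            nn.ReLU(),
        ),
    },
}
```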
{
@@ -130,7 +130,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## What about modifying multiple attributes with the **conflicting modifications**?\n\n"
"## What about modifying multiple attributes with the **conflicting modifications**?\n\n\nFor this, let's switch to looking at the :class:`~fusilli.fusionmodels.tabularfusion.concat_feature_maps.ConcatTabularFeatureMaps` model.\nThis model concatenates the feature maps of the two modalities and then passes them through a prediction layer.\n\nWe can modify the layers that each tabular modality goes through before being concatenated, as well as the layers that come after the concatenation.\n\nThe output features of our modified ``mod1_layers`` and ``mod2_layers`` are 100 and 128, so the input features of the ``fused_layers`` should be 228. However, we've set the input features of the ``fused_layers`` to be 25.\n\nLet's see what happens when we try to modify the model in this way. It should throw an error when the data is passed through the model.\n\n"
]
},
{
@@ -141,7 +141,7 @@
},
"outputs": [],
"source": [
"#\n# For this, let's switch to looking at the :class:`~fusilli.fusionmodels.tabularfusion.concat_feature_maps.ConcatTabularFeatureMaps` model.\n# This model concatenates the feature maps of the two modalities and then passes them through a prediction layer.\n#\n# We can modify the layers that each tabular modality goes through before being concatenated, as well as the layers that come after the concatenation.\n#\n# The output features of our modified ``mod1_layers`` and ``mod2_layers`` are 100 and 128, so the input features of the ``fused_layers`` should be 228. However, we've set the input features of the ``fused_layers`` to be 25.\n#\n# Let's see what happens when we try to modify the model in this way. It should throw an error when the data is passed through the model.\n\nlayer_mods = {\n \"ConcatTabularFeatureMaps\": {\n \"mod1_layers\": nn.ModuleDict(\n {\n \"layer 1\": nn.Sequential(\n nn.Linear(10, 32),\n nn.ReLU(),\n ),\n \"layer 2\": nn.Sequential(\n nn.Linear(32, 66),\n nn.ReLU(),\n ),\n \"layer 3\": nn.Sequential(\n nn.Linear(66, 128),\n nn.ReLU(),\n ),\n }\n ),\n \"mod2_layers\": nn.ModuleDict(\n {\n \"layer 1\": nn.Sequential(\n nn.Linear(15, 45),\n nn.ReLU(),\n ),\n \"layer 2\": nn.Sequential(\n nn.Linear(45, 70),\n nn.ReLU(),\n ),\n \"layer 3\": nn.Sequential(\n nn.Linear(70, 100),\n nn.ReLU(),\n ),\n }\n ),\n \"fused_layers\": nn.Sequential(\n nn.Linear(25, 150),\n nn.ReLU(),\n nn.Linear(150, 75),\n nn.ReLU(),\n nn.Linear(75, 50),\n nn.ReLU(),\n ),\n },\n}\n\n# get the data and train the model\n\nfrom fusilli.fusionmodels.tabularfusion.concat_feature_maps import ConcatTabularFeatureMaps\n\ndatamodule = get_data_module(ConcatTabularFeatureMaps, params, layer_mods=layer_mods)\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=ConcatTabularFeatureMaps,\n layer_mods=layer_mods,\n max_epochs=5,\n)"
"layer_mods = {\n \"ConcatTabularFeatureMaps\": {\n \"mod1_layers\": nn.ModuleDict(\n {\n \"layer 1\": nn.Sequential(\n nn.Linear(10, 32),\n nn.ReLU(),\n ),\n \"layer 2\": nn.Sequential(\n nn.Linear(32, 66),\n nn.ReLU(),\n ),\n \"layer 3\": nn.Sequential(\n nn.Linear(66, 128),\n nn.ReLU(),\n ),\n }\n ),\n \"mod2_layers\": nn.ModuleDict(\n {\n \"layer 1\": nn.Sequential(\n nn.Linear(15, 45),\n nn.ReLU(),\n ),\n \"layer 2\": nn.Sequential(\n nn.Linear(45, 70),\n nn.ReLU(),\n ),\n \"layer 3\": nn.Sequential(\n nn.Linear(70, 100),\n nn.ReLU(),\n ),\n }\n ),\n \"fused_layers\": nn.Sequential(\n nn.Linear(25, 150),\n nn.ReLU(),\n nn.Linear(150, 75),\n nn.ReLU(),\n nn.Linear(75, 50),\n nn.ReLU(),\n ),\n },\n}\n\n# get the data and train the model\n\nfrom fusilli.fusionmodels.tabularfusion.concat_feature_maps import ConcatTabularFeatureMaps\n\ndatamodule = get_data_module(ConcatTabularFeatureMaps, params, layer_mods=layer_mods)\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=ConcatTabularFeatureMaps,\n layer_mods=layer_mods,\n max_epochs=5,\n)"
]
},
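If you wanted to resolve the mismatch rather than demonstrate the error, a consistent ``fused_layers`` block would take 228 input features (128 from the modified ``mod1_layers`` plus 100 from ``mod2_layers``) — a sketch, assuming the rest of the ``layer_mods`` dictionary stays as above:

```python
# Corrected fused_layers sketch: 128 + 100 = 228 concatenated input features.
layer_mods["ConcatTabularFeatureMaps"]["fused_layers"] = nn.Sequential(
    nn.Linear(228, 150),
    nn.ReLU(),
    nn.Linear(150, 75),
    nn.ReLU(),
    nn.Linear(75, 50),
    nn.ReLU(),
)
```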
{
