diff --git a/models/brats_mri_generative_diffusion/docs/README.md b/models/brats_mri_generative_diffusion/docs/README.md
index 1c01d861..444db62e 100644
--- a/models/brats_mri_generative_diffusion/docs/README.md
+++ b/models/brats_mri_generative_diffusion/docs/README.md
@@ -1,11 +1,11 @@
 # Model Overview
 A pre-trained model for volumetric (3D) Brats MRI 3D Latent Diffusion Generative Model.
-This model is trained on BraTS 2016 and 2017 data from [Medical Decathlon](http://medicaldecathlon.com/), using the Latent diffusion model [1].
+This model is trained based on BraTS 2018 data from [Multimodal Brain Tumor Segmentation Challenge (BraTS) 2018](https://www.med.upenn.edu/sbia/brats2018.html), using the Latent diffusion model [1].
 ![model workflow](https://developer.download.nvidia.com/assets/Clara/Images/monai_brain_image_gen_ldm3d_network.png)
-This model is a generator for creating images like the Flair MRIs based on BraTS 2016 and 2017 data. It was trained as a 3d latent diffusion model and accepts Gaussian random noise as inputs to produce an image output. The `train_autoencoder.json` file describes the training process of the variational autoencoder with GAN loss. The `train_diffusion.json` file describes the training process of the 3D latent diffusion model.
+This model is a generator for creating images like the T1CE MRIs based on BraTS 2018 data. It was trained as a 3d latent diffusion model and accepts Gaussian random noise as inputs to produce an image output. The `train_autoencoder.json` file describes the training process of the variational autoencoder with GAN loss. The `train_diffusion.json` file describes the training process of the 3D latent diffusion model.
 In this bundle, the autoencoder uses perceptual loss, which is based on ResNet50 with pre-trained weights (the network is frozen and will not be trained in the bundle). In default, the `pretrained` parameter is specified as `False` in `train_autoencoder.json`.
 To ensure correct training, changing the default settings is necessary. There are two ways to utilize pretrained weights:
 1. if set `pretrained` to `True`, ImageNet pretrained weights from [torchvision](https://pytorch.org/vision/stable/_modules/torchvision/models/resnet.html#ResNet50_Weights) will be used. However, the weights are for non-commercial use only.
@@ -20,12 +20,21 @@ An example result from inference is shown below:
 **This is a demonstration network meant to just show the training process for this sort of network with MONAI. To achieve better performance, users need to use larger dataset like [Brats 2021](https://www.synapse.org/#!Synapse:syn25829067/wiki/610865) and have GPU with memory larger than 32G to enable larger networks and attention layers.**
 ## Data
-The training data is BraTS 2016 and 2017 from the Medical Segmentation Decathalon. Users can find more details on the dataset (`Task01_BrainTumour`) at http://medicaldecathlon.com/.
+The training data is from the [Multimodal Brain Tumor Segmentation Challenge (BraTS) 2018](https://www.med.upenn.edu/sbia/brats2018.html).
 - Target: Image Generation
 - Task: Synthesis
 - Modality: MRI
-- Size: 388 3D volumes (1 channel used)
+- Size: 285 3D volumes (1 channel used)
+
+The provided labelled data was partitioned, based on our own split, into training (200 studies), validation (42 studies) and testing (43 studies) datasets.
+
+### Preprocessing
+The data list/split can be created with the script `scripts/prepare_datalist.py`.
+
+```
+python scripts/prepare_datalist.py --path your-brats18-dataset-path
+```
 ## Training Configuration
 If you have a GPU with less than 32G of memory, you may need to decrease the batch size when training. To do so, modify the `train_batch_size` parameter in the [configs/train_autoencoder.json](../configs/train_autoencoder.json) and [configs/train_diffusion.json](../configs/train_diffusion.json) configuration files.
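A minimal sketch of the batch-size change described in the context paragraph above. It assumes `train_batch_size` is a top-level key in each JSON config; the patch does not show the config files, so that layout is an assumption, and `set_train_batch_size` is a hypothetical helper that is not part of the bundle.

```python
import json
from pathlib import Path


def set_train_batch_size(config_path, batch_size):
    """Set `train_batch_size` in a JSON config and return the previous value.

    Assumption: the key lives at the top level of the JSON object.
    Returns None if the key was not present before the update.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    path = Path(config_path)
    config = json.loads(path.read_text())
    previous = config.get("train_batch_size")
    config["train_batch_size"] = batch_size
    # Re-serialising normalises the file's indentation and spacing.
    path.write_text(json.dumps(config, indent=4) + "\n")
    return previous
```

For example, `set_train_batch_size("configs/train_autoencoder.json", 1)` would lower the autoencoder batch size to 1, and the same call on `configs/train_diffusion.json` would do so for the diffusion model. Because the file is rewritten through `json.dumps`, any hand-formatted layout in the original config is not preserved.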
@@ -34,46 +43,42 @@ If you have a GPU with less than 32G of memory, you may need to decrease the bat
 The autoencoder was trained using the following configuration:
 - GPU: at least 32GB GPU memory
-- Actual Model Input: 112 x 128 x 80
+- Actual Model Input: 128 x 128 x 128
 - AMP: False
 - Optimizer: Adam
-- Learning Rate: 1e-5
+- Learning Rate: 2e-5
 - Loss: L1 loss, perceptual loss, KL divergence loss, adversarial loss, GAN BCE loss
 #### Input
-1 channel 3D MRI Flair patches
+1 channel 3D MRI T1CE patches
 #### Output
 - 1 channel 3D MRI reconstructed patches
-- 8 channel mean of latent features
-- 8 channel standard deviation of latent features
+- 4 channel mean of latent features
+- 4 channel standard deviation of latent features
 ### Training Configuration of Diffusion Model
 The latent diffusion model was trained using the following configuration:
 - GPU: at least 32GB GPU memory
-- Actual Model Input: 36 x 44 x 28
+- Actual Model Input: 48 x 48 x 32
 - AMP: False
 - Optimizer: Adam
-- Learning Rate: 1e-5
+- Learning Rate: 2e-5
 - Loss: MSE loss
 #### Training Input
-- 8 channel noisy latent features
+- 4 channel noisy latent features
 - a long int that indicates the time step
 #### Training Output
-8 channel predicted added noise
+4 channel predicted added noise
 #### Inference Input
-8 channel noise
+4 channel noise
 #### Inference Output
-8 channel denoised latent features
-
-### Memory Consumption Warning
-
-If you face memory issues with data loading, you can lower the caching rate `cache_rate` in the configurations within range [0, 1] to minimize the System RAM requirements.
+4 channel denoised latent features
 ## Performance