The reference code for the paper Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation.
Supported models:
- CNN-RNN-RNN (Liu et al., 2019)
- Knowing When to Look (Lu et al., 2017)
- Meshed-Memory Transformer (Cornia et al., 2020)
- Show, Attend and Tell (Xu et al., 2015)
- TieNet (Wang et al., 2018)
Supported radiology report datasets:
- MIMIC-CXR-JPG (Johnson et al., 2019)
- Open-i (Demner-Fushman et al., 2012)
NOTE: We are working to make the radiology NLI dataset publicly available.
Requirements:
- A Linux OS (tested on Ubuntu 16.04)
- More than 24 GB of memory
- A GPU with more than 12 GB of memory (tested on NVIDIA Titan X and NVIDIA Titan Xp)
Create a conda environment
$ conda env create -f environment.yml
NOTE: environment.yml is set up for CUDA 10.1 and cuDNN 7.6.3. This may need to be changed depending on your runtime environment.
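After the environment is created, activate it before running any of the commands below (the environment name comes from environment.yml; ifcc here is an assumption):
$ conda activate ifcc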
- Download MIMIC-CXR-JPG
- Make a resized copy of MIMIC-CXR-JPG using resize_mimic-cxr-jpg.py (MIMIC_CXR_ROOT is a dataset directory containing mimic-cxr)
$ python resize_mimic-cxr-jpg.py MIMIC_CXR_ROOT
- Create the sections file of MIMIC-CXR (mimic_cxr_sectioned.csv.gz) with create_sections_file.py
- Move mimic_cxr_sectioned.csv.gz to MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/ (example below)
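The exact arguments of create_sections_file.py may vary with its version, so check its usage before running it. Once mimic_cxr_sectioned.csv.gz has been written to the working directory, the move step is simply:
$ mv mimic_cxr_sectioned.csv.gz MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/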
Pre-calculate document frequencies that will be used in CIDEr by:
$ python cider-df.py MIMIC_CXR_ROOT mimic-cxr_train-df.bin.gz
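cider-df.py only precomputes corpus statistics for CIDEr: for every n-gram, the number of training reports that contain it. A minimal sketch of the idea (not the repo's implementation):

from collections import Counter

def ngram_document_frequencies(reports, max_n=4):
    # For each 1..max_n-gram, count how many reports contain it at least once.
    df = Counter()
    for report in reports:
        tokens = report.lower().split()
        seen = {tuple(tokens[i:i + n])
                for n in range(1, max_n + 1)
                for i in range(len(tokens) - n + 1)}
        df.update(seen)  # each n-gram counted at most once per report
    return df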
Pre-recognize named entities in MIMIC-CXR by:
$ python ner_reports.py --stanza-download MIMIC_CXR_ROOT mimic-cxr_ner.txt.gz
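ner_reports.py runs Stanza's clinical NER over all reports. A minimal sketch for a single report, assuming Stanza's 'mimic'/'radiology' model names (the script itself selects the actual model and handles the download via --stanza-download):

import stanza

# Assumed model/package names; ner_reports.py picks the actual clinical NER model.
stanza.download('en', package='mimic', processors={'ner': 'radiology'})
nlp = stanza.Pipeline('en', package='mimic', processors={'ner': 'radiology'})
doc = nlp('The heart is mildly enlarged. No pleural effusion or pneumothorax.')
print([(ent.text, ent.type) for ent in doc.entities])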
- Download CheXpert Dataset v1.0
- Train a CheXpert classification model by:
$ python train_image.py --cuda --epochs 12 --batch-size 16 --eval-interval 65000 --cache-data cache CheXpert-v1.0-small densenet chexpert_densenet
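train_image.py trains an image classifier over the 14 CheXpert observations; its weights are later reused to initialize the report generator's image encoder (see --img-pretrained below). A rough sketch of that kind of model (illustrative, not the repo's code):

import torch.nn as nn
from torchvision import models

# DenseNet-121 backbone with a 14-way head, one logit per CheXpert observation.
model = models.densenet121(pretrained=True)
model.classifier = nn.Linear(model.classifier.in_features, 14)
criterion = nn.BCEWithLogitsLoss()  # multi-label supervision, one binary label per observation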
Download pre-trained radiology NLI weights and GloVe embeddings
$ cd resources
$ ./download.sh
First, train the Meshed-Memory Transformer model with an NLL loss.
# NLL
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --cider-df mimic-cxr_train-df.bin.gz --entity-match mimic-cxr_ner.txt.gz --img-model densenet --img-pretrained chexpert_densenet/model_auc14.dict.gz --bert-score distilbert-base-uncased --lr-scheduler trans MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll
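The NLL objective is standard teacher-forced cross-entropy over report tokens; a minimal sketch of that loss (illustrative, not the repo's implementation):

import torch.nn.functional as F

def report_nll(logits, targets, pad_id=0):
    # logits: (batch, seq_len, vocab) decoder outputs; targets: gold token ids, padding ignored.
    return F.cross_entropy(logits.transpose(1, 2), targets, ignore_index=pad_id)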
Second, further train the model with a joint loss using self-critical RL to achieve better performance.
# RL with NLL + BERTScore + EntityMatchExact
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchExact --rl-weights 0.01,0.495,0.495 --entity-match resources/mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained chexpert_densenet/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emexact
# RL with NLL + BERTScore + EntityMatchNLI
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchNLI --rl-weights 0.01,0.495,0.495 --entity-match resources/mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained chexpert_densenet/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emnli
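--rl-weights gives the mixing weights of the joint objective (0.01 for NLL, 0.495 for each reward metric). A minimal sketch of the self-critical combination, with illustrative names rather than the repo's API:

def joint_loss(nll, sample_logprob, sample_scores, greedy_scores,
               weights=(0.01, 0.495, 0.495)):
    # Self-critical reward: weighted sum of each metric's gain of the sampled report
    # over the greedy baseline; the NLL term keeps the model anchored to the references.
    w_nll, *w_metrics = weights
    reward = sum(w * (s - g) for w, s, g in zip(w_metrics, sample_scores, greedy_scores))
    return w_nll * nll - reward * sample_logprob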
Training results can be checked with TensorBoard:
$ tensorboard --logdir out_m2trans_nll-bs-emnli/log
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.0.0 at http://localhost:6006/ (Press CTRL+C to quit)
In this project we modified the code for two reasons. First, we want the model to support a customized MIMIC-CXR-JPG dataset, since we do not yet have access to the official dataset. Second, we want to fit the model to the IU X-Ray dataset. The modified Python files are listed below:
train.py
train_image.py
clinicgen/eval.py
clinicgen/utils.py
clinicgen/data/mimiccxr.py
clinicgen/data/iuxray.py
Following the original IFCC author, we hard-code the number of images used per study at the beginning of clinicgen/data/mimiccxr.py and clinicgen/data/iuxray.py. If you would like to use a different dataset, modify these numbers (see the sketch below).
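For instance, the constant might look like the line below at the top of clinicgen/data/iuxray.py (the actual variable name in the repo may differ):

# clinicgen/data/iuxray.py, top of file (illustrative; check the actual constant name)
NUM_IMAGES = 2  # images used per study; change this when fitting a different dataset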
The project should be able to work with the datasets from the repository below:
https://github.com/cuhksz-nlp/R2Gen
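The R2Gen releases of IU X-Ray and MIMIC-CXR come with an annotation.json holding train/val/test splits; a minimal sketch of reading it (field names follow R2Gen's released files and should be verified against the download):

import json

# R2Gen-style annotation.json: {'train': [...], 'val': [...], 'test': [...]},
# each entry with an id, one or more image paths, and the report text.
with open('annotation.json') as f:
    ann = json.load(f)
for entry in ann['train'][:3]:
    print(entry['id'], entry['image_path'], entry['report'][:60])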
See LICENSE and clinicgen/external/LICENSE_bleu-cider-rouge-spice for details.