This is the official GitHub repository for the paper "Benchmarking a foundational cell model for post-perturbation RNAseq prediction". It is a fork of the scGPT repository.
- notebooks/
  - bulk_models.ipynb: trains RF and EN and compares their performance to scGPT and "Train Mean"
  - data_analysis.ipynb: runs the data analysis and generates Figure 2
  - scgpt_mean.ipynb: runs the mean model and compares it with scGPT
  - Tutorial_PerturbationAdamson.ipynb: trains scGPT on the Adamson et al. dataset
  - Tutorial_PerturbationNorman.ipynb: trains scGPT on the Norman et al. dataset
  - Tutorial_PerturbationReplogle.ipynb: trains scGPT on the Replogle et al. dataset
To reproduce the results of the paper, follow these steps:
- Run `git lfs pull` to download the required data from Git Large File Storage (LFS). If Git LFS is not installed, please refer to this guide
- Run `make setup` to create the conda environment, install the IPython kernel, and unzip the Replogle dataset
- Select the scgpt_yml conda environment as the Python kernel for the notebooks
- Run data_analysis.ipynb
- Run the Tutorial notebooks to get the results of scGPT
- Run scgpt_mean.ipynb
- Run bulk_models.ipynb
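
The steps above can also be sketched as a small dry-run script that only prints the commands in order, without executing anything. This is an illustration and not part of the repository: it assumes the notebooks live under notebooks/ as listed above, that Jupyter's `nbconvert` is available in the environment, and that the IPython kernel is registered under the name scgpt_yml (the kernel name created by `make setup` may differ). The order of the three Tutorial notebooks is also an assumption, since the steps do not specify it.

```python
import shlex
import shutil
from pathlib import PurePosixPath

# Assumed location of the notebooks, taken from the listing in this README.
NOTEBOOK_DIR = PurePosixPath("notebooks")

# Execution order from the steps above. The relative order of the three
# Tutorial notebooks is an assumption.
NOTEBOOK_ORDER = [
    "data_analysis.ipynb",
    "Tutorial_PerturbationAdamson.ipynb",
    "Tutorial_PerturbationNorman.ipynb",
    "Tutorial_PerturbationReplogle.ipynb",
    "scgpt_mean.ipynb",
    "bulk_models.ipynb",
]


def git_lfs_available():
    """Return True if a git-lfs executable is found on PATH."""
    return shutil.which("git-lfs") is not None


def reproduction_commands(kernel="scgpt_yml"):
    """Build the ordered list of commands; nothing is executed here.

    `kernel` is a hypothetical kernel name; adjust it to whatever name
    `make setup` registers on your machine.
    """
    commands = [["git", "lfs", "pull"], ["make", "setup"]]
    for notebook in NOTEBOOK_ORDER:
        commands.append([
            "jupyter", "nbconvert",
            "--to", "notebook",
            "--execute", "--inplace",
            f"--ExecutePreprocessor.kernel_name={kernel}",
            str(NOTEBOOK_DIR / notebook),
        ])
    return commands


if __name__ == "__main__":
    if not git_lfs_available():
        print("# git-lfs not found on PATH; install it before running the steps")
    for command in reproduction_commands():
        print(shlex.join(command))
```

Printing the commands instead of running them keeps the sketch safe to try. Each printed line can be pasted into a shell, or the notebooks can still be run interactively as described above.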