updating README
jlause committed Dec 3, 2020
1 parent 7727e24 commit b444856
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions README.md
@@ -1,21 +1,21 @@
# Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data
-###### Jan Lause, Philipp Berens, Dmitry Kobak
+###### Jan Lause, Philipp Berens & Dmitry Kobak


### How to use this repository

-This repository contains the code to reproduce the analysis presented in our preprint on UMI data normalization (https://www.biorxiv.org/content/10.1101/2020.12.01.405886v1).
+This repository contains the code to reproduce the analysis presented in our paper on UMI data normalization (Lause, Berens & Kobak (2020), https://www.biorxiv.org/content/10.1101/2020.12.01.405886v1).

To start, follow these steps:

- install the required software listed below
- clone this repository to your system
-- go to `tools.py`, and adapt the three import paths as needed
+- go to `tools.py` and adapt the three import paths as needed
- follow the dataset download instructions below

Then, you can step through our analysis by following the sequence of the notebooks. There are four independent analyses:

-- Reproduction and investigation of the NB regression model by Hafemeister & Satija (Notebooks `01` & `02`, producing Figure 1 from our paper)
+- Reproduction of the NB regression model by Hafemeister & Satija (2019) and investigation of alternative models (Notebooks `01` & `02`, producing Figure 1 from our paper)
- Estimation of technical overdispersion from negative control datasets (Notebooks `01` & `03`, producing Figure S1)
- Benchmarking normalization by Analytical Pearson residuals vs. GLM-PCA vs. standard methods:
  - on the PBMC dataset (Notebooks `01`, `04`, `05`, producing Figures 2, S3, S4, S6, S7, S8, and additional figures)
@@ -25,6 +25,8 @@ Each of the analyses will first preprocess and filter the datasets. Next, comput

We recommend running the code on a powerful machine with at least 150 GB RAM.

+For questions or feedback, feel free to use the issue system or email us.
+
### Pre-requisites

- Python (version used in the paper: `3.6.9`)