
ARuS (Automated RNAseq analysis using Snakemake)

This is a fully automated Snakemake pipeline for end-to-end RNAseq analysis, from FASTQ reads to differential expression (DE) analysis.

To run the pipeline in a Docker container:

git clone https://github.com/JasonCharamis/ARuS.git
cd ARuS/workflow/ && sudo docker build -t automated_rnaseq_analysis:latest .
sudo docker run -it -v $(pwd):/mnt/workdir -w /mnt/workdir automated_rnaseq_analysis:latest snakemake --snakefile ARuS/workflow/Snakefile --cores 1 --use-conda --conda-frontend mamba

Usage: Input files must be named "{sample}_1.fastq.gz" and "{sample}_2.fastq.gz", where the {sample} wildcard identifies each sample and the _1/_2 suffix marks the two mates of a read pair.
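
To check in advance which samples the naming convention above would pick up, a small standalone script can help. The following is only a sketch: the function name `find_paired_samples` and the regular expression are illustrative assumptions and are not taken from the ARuS Snakefile, which may resolve the wildcard differently.

```python
import re
from pathlib import Path

# Assumed pattern: "<sample>_1.fastq.gz" or "<sample>_2.fastq.gz".
# The sample name is everything before the final "_1" / "_2".
PATTERN = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.fastq\.gz$")


def find_paired_samples(filenames):
    """Return the sorted sample names that have both a _1 and a _2 file."""
    mates = {}
    for name in filenames:
        match = PATTERN.match(Path(name).name)
        if match:
            mates.setdefault(match["sample"], set()).add(match["mate"])
    # Keep only samples for which both mates were found.
    return sorted(s for s, found in mates.items() if found == {"1", "2"})


if __name__ == "__main__":
    # List the paired samples in the current directory.
    files = [p.name for p in Path(".").glob("*.fastq.gz")]
    print(find_paired_samples(files))
```

Run from the directory holding the reads, this prints the sample names for which both mate files exist. A sample with only one mate file is left out, which makes naming mistakes easy to spot before starting the pipeline.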

The pipeline is designed for 150-bp paired-end Illumina reads and includes the following steps:

Read quality control (QC) and adapter-trimming
Mapping of reads against provided genome sequence
Assignment of mapped reads to genes; this step also computes TPM values and uses them to produce a PCA plot
Differential expression (DE) analysis using edgeR
Post-DE annotation of DE genes, optionally combined with orthology results
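
For readers unfamiliar with the TPM values mentioned in the read-assignment step, the standard definition can be sketched as follows. This is not the pipeline's own code: the function name `tpm` and its arguments are hypothetical, and ARuS may compute gene lengths differently (for example, using effective rather than annotated lengths).

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM, given gene lengths in base pairs."""
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: normalise each count by gene length in kilobases (reads per kilobase).
    rpk = [count / (length / 1000) for count, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        # No reads assigned to any gene: every TPM is zero.
        return [0.0] * len(rpk)
    # Step 2: rescale so that the values of one sample sum to one million.
    return [value / total * 1e6 for value in rpk]


if __name__ == "__main__":
    # Three genes whose counts are proportional to their lengths
    # receive identical TPM values.
    print(tpm([100, 200, 300], [1000, 2000, 3000]))
```

Because length normalisation happens before the per-sample rescaling, TPM values always sum to one million within a sample, which makes them comparable across samples in exploratory plots such as a PCA.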

Dependencies:

FastQC https://github.com/s-andrews/FastQC
Trimmomatic https://github.com/usadellab/Trimmomatic
STAR https://github.com/alexdobin/STAR
featureCounts https://github.com/torkian/subread-1.6.1
edgeR https://bioconductor.org/packages/release/bioc/html/edgeR.html
Trinity-bundled Perl scripts for DE analysis using edgeR https://github.com/trinityrnaseq/trinityrnaseq

If you run the pipeline in hisat2 mapping mode, you will also need:

hisat2 https://github.com/DaehwanKimLab/hisat2
samtools https://github.com/samtools/samtools

Every dependency is automatically installed through conda.