- About the Pipeline
- Activate the Environment on HPC
- Examine the Sample Sheet
- Run the Example Data
- Configure Pipeline for Your Data
- Run Script
- Expected Outputs
The pipeline runs Bowtie2 alignment, quality trimming of reads with trimgalore, SEACR peak calling, and optionally MACS2 peak calling. MACS2 requires an effective genome size to call peaks, which you can provide directly or calculate on the fly by calling `unique-kmers.py`. Coverage tracks are produced for visualization in IGV.
It also computes general QC statistics on the fastqs with fastqc, and on the alignment, peak calling, and sample similarity with deeptools. Finally, the QC reports are collected into a single file using multiQC.
A DAG (directed acyclic graph) of the default workflow is shown below:
The directions to set up the Nextflow workflow requirements are found in the README.md. Ensure that you have followed the steps to fork and clone the repository and have created the conda nextflow environment before starting with this document.
Optional but recommended: use `tmux` on the Cybertron login nodes. Name the session nextflow, request an interactive session, and then activate the nextflow conda environment. The project codes can be found with the `project info` command. Change the `$QUEUE` and `$NAME` variables in the code chunk below to be accurate for your Cybertron projects.
tmux new-session -s nextflow
project info
NAME="RSC_adhoc"
QUEUE="sceaq"
qsub -I -q $QUEUE -P $(project code $NAME) -l select=1:ncpus=1:mem=8g -l walltime=8:00:00
Navigate to where you placed the cloned (copied) cutandrun_nf directory, and then check out the latest release branch.
cd /path/to/my/cutandrun_nf
git fetch
# this will list all branches of the repository. Find the release branch with the latest version in red text, which means you don't yet have a local copy of that branch.
git branch -a
# then select the latest version, for example 2.0.0. This downloads the stable version of pipeline locally.
git checkout release/2.0.0
# Now the * indicates that you're on the release branch and it's no longer red text.
git branch
Activate the Nextflow conda environment.
conda env create -f env/nextflow.yaml
conda activate nextflow
A sample sheet in csv (comma separated values) format is used as input to the pipeline. This sample sheet must have the following column names in any order:
- “sample”
- “sample_id”
- “target_or_control”
- “single_end”
- “read1”
- “read2”
column_name | column_description |
---|---|
sample | Any alphanumeric string identifying each biological sample in the dataset. The value is repeated for every antibody used on that sample. For example, if SAMPLE_1 has both H3K27me3 and IgG control CUT&RUN data, then SAMPLE_1 appears in one row with the H3K27me3 files and in a second row with the IgG files. |
sample_id | Any alphanumeric string identifying each unique sample+condition. No duplicates allowed. For example, if SAMPLE_1 has both H3K27me3 and IgG control CUT&RUN data, the H3K27me3 row has SAMPLE_1 in sample and “SAMPLE_1_H3K27me3” in sample_id, and the IgG row has SAMPLE_1 in sample and “SAMPLE_1_IgG” in sample_id. |
target_or_control | Must contain the value [target] or [control], case-sensitive. Use target for antibodies that immunoprecipitate the proteins of interest, such as transcription factors or histone modifications like H3K27me3, and control for the isotype control (e.g. IgG). |
read1 | Contains the absolute filepath to read 1 of the paired-end fastqs. |
read2 | Contains the absolute filepath to read 2 of the paired-end fastqs. |
single_end | For CUT&RUN data this should always be [false], case-sensitive. |
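The column rules above can be checked before launching a run. The following is a minimal sketch and not part of the pipeline: `validate_sample_sheet` and the example rows are hypothetical, and it only checks the rules stated in the table (required columns, unique `sample_id`, `target`/`control` values, `single_end` set to `false`, absolute read paths).

```python
import csv
import io
import posixpath

REQUIRED_COLUMNS = {"sample", "sample_id", "target_or_control", "single_end", "read1", "read2"}


def validate_sample_sheet(csv_text):
    """Return a list of human-readable problems found in the sample sheet text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        # Without the required columns the row checks below are meaningless.
        return ["missing columns: " + ", ".join(sorted(missing))]
    seen_ids = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        sample_id = row["sample_id"]
        if sample_id in seen_ids:
            problems.append(f"line {line_no}: duplicate sample_id '{sample_id}'")
        seen_ids.add(sample_id)
        if row["target_or_control"] not in ("target", "control"):
            problems.append(f"line {line_no}: target_or_control must be 'target' or 'control'")
        if row["single_end"] != "false":
            problems.append(f"line {line_no}: single_end must be 'false'")
        for column in ("read1", "read2"):
            if not posixpath.isabs(row[column] or ""):
                problems.append(f"line {line_no}: {column} is not an absolute path")
    return problems


# Hypothetical two-row sheet: one target and its IgG control.
example = """sample,sample_id,single_end,target_or_control,read1,read2
S1,S1_H3K27me3,false,target,/data/S1_H3K27me3_R1.fastq.gz,/data/S1_H3K27me3_R2.fastq.gz
S1,S1_IgG,false,control,/data/S1_IgG_R1.fastq.gz,/data/S1_IgG_R2.fastq.gz
"""
print(validate_sample_sheet(example) or "sample sheet looks OK")
```

To check a real file, read its contents into a string and pass that to the function.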
1) Below is an example of a complete sample sheet for use in the pipeline, which can be edited for your own samples, in `test_data/test_dataset_sample_sheet.csv`.
- It contains IgG control samples for peak calling.
- This sample sheet is OK to use even if you elect to skip IgG normalization in SEACR or the use of IgG background in MACS2. The pipeline will simply not use the controls.
- Use the `threshold` and `no_control_macs2` parameters in `nextflow.config` to change this. Details are found in Configure Pipeline for Your Data.
sample | sample_id | single_end | target_or_control | read1 | read2 |
---|---|---|---|---|---|
M1 | M1_H3K27_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_H3K27_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_H3K27_NK_chr17_R2_ecoli.fastq.gz |
M1 | M1_H3K4_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_H3K4_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_H3K4_NK_chr17_R2_ecoli.fastq.gz |
M1 | M1_IgG_NK | false | control | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_IgG_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_IgG_NK_chr17_R2_ecoli.fastq.gz |
M2 | M2_H3K27_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_H3K27_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_H3K27_NK_chr17_R2_ecoli.fastq.gz |
M2 | M2_H3K4_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_H3K4_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_H3K4_NK_chr17_R2_ecoli.fastq.gz |
M2 | M2_IgG_NK | false | control | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_IgG_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_IgG_NK_chr17_R2_ecoli.fastq.gz |
2) Below is another example of a complete sample sheet for use in the pipeline.
- It lacks IgG control samples for peak calling.
- This sample sheet is OK to use only if you modify the parameters to skip using IgG controls.
- Use the `threshold` and `no_control_macs2` parameters in `nextflow.config` to modify this. Details are found in Configure Pipeline for Your Data.
sample | sample_id | single_end | target_or_control | read1 | read2 |
---|---|---|---|---|---|
M1 | M1_H3K27_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_H3K27_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_H3K27_NK_chr17_R2_ecoli.fastq.gz |
M1 | M1_H3K4_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_H3K4_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_H3K4_NK_chr17_R2_ecoli.fastq.gz |
M1 | M1_IgG_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_IgG_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M1_IgG_NK_chr17_R2_ecoli.fastq.gz |
M2 | M2_H3K27_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_H3K27_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_H3K27_NK_chr17_R2_ecoli.fastq.gz |
M2 | M2_H3K4_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_H3K4_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_H3K4_NK_chr17_R2_ecoli.fastq.gz |
M2 | M2_IgG_NK | false | target | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_IgG_NK_chr17_R1_ecoli.fastq.gz | /gpfs/shared_data/demo_data/mus_musculus/cutandrun/fastqs/M2_IgG_NK_chr17_R2_ecoli.fastq.gz |
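To see which kind of sample sheet you have, it can help to list the targets that have no control row sharing their `sample` value. This is a sketch under the assumption that controls are matched to targets through the `sample` column, as the column descriptions suggest; `targets_without_control` is a hypothetical helper, not pipeline code.

```python
from collections import defaultdict


def targets_without_control(rows):
    """Return sample_ids of targets whose `sample` group has no control row."""
    controls = defaultdict(list)
    for row in rows:
        if row["target_or_control"] == "control":
            controls[row["sample"]].append(row["sample_id"])
    return [
        row["sample_id"]
        for row in rows
        if row["target_or_control"] == "target" and not controls[row["sample"]]
    ]


# Abbreviated rows modelled on the two example sheets (read paths omitted).
with_igg = [
    {"sample": "M1", "sample_id": "M1_H3K27_NK", "target_or_control": "target"},
    {"sample": "M1", "sample_id": "M1_IgG_NK", "target_or_control": "control"},
]
without_igg = [
    {"sample": "M1", "sample_id": "M1_H3K27_NK", "target_or_control": "target"},
    {"sample": "M1", "sample_id": "M1_IgG_NK", "target_or_control": "target"},
]
print(targets_without_control(with_igg))
print(targets_without_control(without_igg))
```

A non-empty result suggests the run needs the IgG-free settings for those targets.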
To ensure that the pipeline works, first run the test data set. This example will run using the data found in `test_data/test_dataset_sample_sheet.csv`.
./main_run.sh "test_dataset"
Open the configuration file `nextflow.config` and edit the necessary parameters for building the index, and/or running the alignment or peak calling steps.
## //working directory for temporary/intermediate files produced in the workflow processes
## workDir = "$HOME/temp"
##
## //global parameters
## params {
## // general options
## sample_sheet = "./test_data/test_dataset_sample_sheet.csv"
## queue = 'paidq'
## project = '207f23bf-acb6-4835-8bfe-142436acb58c'
## outdir = "./results/mouse"
## peaks_outdir = "${params.outdir}/peak_calls"
## publish_dir_mode = 'copy'
##
## //Bowtie params for target genome
## build_index = false
## fasta = '/gpfs/shared_data/Bowtie2/mm39.fa' // required
## index = '/gpfs/shared_data/Bowtie2/mm39_index/' // bowtie2 index path is required unless `build_index = true`
## save_unaligned = false
##
## // Bowtie params for spike-in genome
## <...>
Be sure to change the following lines for the global parameters:
- sample_sheet
- queue
- project
- outdir
- peaks_outdir
## //global parameters
## params {
## // general options
## sample_sheet = "./test_data/test_dataset_sample_sheet.csv"
## queue = 'paidq'
## project = '207f23bf-acb6-4835-8bfe-142436acb58c'
## outdir = "./results/mouse"
## peaks_outdir = "${params.outdir}/peak_calls"
## publish_dir_mode = 'copy'
Additionally, determine if you require a new Bowtie2 index to be built for the target genome and/or the spike-in genome. The pipeline requires either a fasta filepath OR a Bowtie2 index filepath. This is also required for the spike-in, with E. coli provided as a default.
E. coli is the default since it is carry-over DNA from the Cut&Run library prep methodology and is expected to be present in all Cut&Run experiments, regardless of whether an exogenous spike-in such as yeast is used. Please see here for more information on spike-in normalization.
Change the following lines for alignment reference files when needed:
- build_index
- fasta
- index
- build_spike_index
- spike_fasta
- spike_index
## //Bowtie params for target genome
## build_index = false
## fasta = '/gpfs/shared_data/Bowtie2/mm39.fa' // required
## index = '/gpfs/shared_data/Bowtie2/mm39_index/' // bowtie2 index path is required unless `build_index = true`
## save_unaligned = false
##
## // Bowtie params for spike-in genome
SEACR defaults to using IgG control normalization and stringent peak calling, e.g. `SEACR_1.3.sh target_bedgraph igg_bedgraph norm stringent`.
If skipping the use of the IgG control altogether, set `threshold` to any value > 0.
If you would like to use spike-in normalization, with either E. coli or an exogenous spike-in like Drosophila, set `spike_norm` to true. See Advanced Options for details on turning off IgG normalization.
- threshold
- spike_norm
- chrom_sizes
- scale_factor_constant
## //SEACR params
## threshold = 0 //any value > 0 will use threshold, even if IgG is available in sample sheet
## spike_norm = false
## chrom_sizes = '/gpfs/shared_data/Bowtie2/mm39.chrom.sizes'
## scale_factor_constant = 10000 //scientific notation NOT allowed
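How `threshold` interacts with the IgG control can be pictured as choosing the second argument of the SEACR call. The sketch below is an illustration under assumptions, not the pipeline's actual logic: `seacr_v13_command` is a hypothetical helper, it assumes the SEACR v1.3 positional form shown above with a numeric threshold taking the place of the IgG bedgraph, and it adds a trailing output prefix argument as an assumption about the script's interface.

```python
def seacr_v13_command(target_bedgraph, igg_bedgraph, output_prefix,
                      threshold=0, normalization="norm", mode="stringent"):
    """Build an argument list for SEACR_1.3.sh (illustrative only)."""
    if normalization not in ("norm", "non"):
        raise ValueError("normalization must be 'norm' or 'non'")
    if mode not in ("stringent", "relaxed"):
        raise ValueError("mode must be 'stringent' or 'relaxed'")
    # Any threshold > 0 wins over the IgG control, mirroring the config comment.
    if threshold > 0:
        background = str(threshold)
    elif igg_bedgraph is not None:
        background = igg_bedgraph
    else:
        raise ValueError("need an IgG bedgraph or a threshold > 0")
    return ["SEACR_1.3.sh", target_bedgraph, background, normalization, mode, output_prefix]


# File names and the 0.01 threshold are placeholders.
print(" ".join(seacr_v13_command("target.bg", "igg.bg", "M1_H3K27_NK")))
print(" ".join(seacr_v13_command("target.bg", "igg.bg", "M1_H3K27_NK",
                                 threshold=0.01, normalization="non")))
```

The point of the sketch is the precedence rule: with `threshold = 0` the IgG bedgraph is used, and with any positive value it is ignored even when present.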
Finally, decide whether to run MACS2 calls along with the SEACR peak calling algorithm (default = true). MACS2 will use the effective genome size value provided in the `gsize` parameter.
If you are using a non-model organism or simply don’t want to use the effective genome size provided in the literature or the MACS2 documentation, you can set `calc_effective_gsize = true` to calculate an effective genome size from the target genome `fasta` filepath and the `read_length`.
- run_macs2
- no_control_macs2
- gsize
- calc_effective_gsize
- read_length
## //MACS2 params
## run_macs2 = true
## no_control_macs2 = false // if true, do not use IgG control in peak calling
##
## gsize = 1.87e9 //default effective genome size for mouse from MACS2
## calc_effective_gsize = false //if true, will override the value in gsize parameter
## read_length = 150 //if calc_effective_gsize, provide illumina read-length in base pairs
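The idea of deriving an effective genome size from a fasta and a read length can be illustrated by counting distinct k-mers, with k equal to the read length. The following is a toy sketch under assumptions: `count_unique_kmers` is a hypothetical name, it counts k-mers exactly on a short in-memory sequence, and it does not reproduce how `unique-kmers.py` works on whole genomes.

```python
def count_unique_kmers(sequence, k):
    """Count distinct k-mers in `sequence`, skipping any k-mer that contains N."""
    sequence = sequence.upper()
    kmers = set()
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if "N" not in kmer:
            kmers.add(kmer)
    return len(kmers)


# Made-up sequence with a repeat and a run of N bases.
toy_genome = "ACGTACGTACGTNNNNTTGACCA"
for read_length in (4, 8):
    print(read_length, count_unique_kmers(toy_genome, read_length))
```

Repeats and N bases both lower the count, which is why the effective size depends on the read length and is smaller than the raw genome length.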
In `nextflow.config`, you can define additional command line arguments to the scientific software under the `process` scope. You may use the advanced options to change the computational resources requested for different processes. The CPUs and memory parameters can be updated to request more resources if files are large. You may also edit the command line parameters for processes in the workflow using the `ext.args` directive.
Please be aware the default command line parameters for Bowtie2
processes are already provided for both target and spike-in alignment,
but can be edited.
The most commonly modified and important process parameters are listed toward the top of the process scope in the `nextflow.config` file.
You can edit the command line parameters for SEACR and MACS2, which often need to be re-run multiple times when deciding on the appropriate peak set to use. For example, the MACS2 broad and narrow peak calling parameters for different histone modifications can be modified using the `ext.args` parameter.
## // Computational resource allocation for the processes run in the workflow
## process {
## //Bowtie2 aligner process specific parameters
## withName: BOWTIE2_ALIGN {
## cpus = { 2 * task.attempt }
## memory = { 32.GB * task.attempt }
## ext.prefix = { "${meta.id}.sort" }
## ext.args = '--local --very-sensitive-local --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700'
## ext.args2 = '' //command line arguments for `samtools sort`
## }
## //SEACR peak calling resources
## withName: SEACR_CALLPEAK {
## cpus = { 1 * task.attempt }
## memory = { 16.GB * task.attempt }
## ext.version = '1.4' //version 1.3 and 1.4 supported
## ext.args = '--normalize norm --mode stringent --remove yes'
## publishDir = [...]
##
## }
## //MACS2 peak calling resources
## withName: MACS2_CALLPEAK {
## cpus = { 1 * task.attempt }
## memory = { 16.GB * task.attempt }
## ext.args = '-q 0.01 --keep-dup all --bdg'
## publishDir = [...]
##
## }
## //BAMCOVERAGE bigwig file parameters
## withName: DEEPTOOLS_BAMCOVERAGE {
## cpus = { 4 * task.attempt }
## memory = { 16.GB * task.attempt }
## ext.args = '--normalizeUsing CPM --centerReads --verbose'
## }
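When switching MACS2 between narrow and broad peak calling, the only change needed in `ext.args` may be the `--broad` flag. The helper below is a hypothetical sketch for composing that string, not something the pipeline provides; the default arguments are copied from the excerpt above, and the per-mark choices in the loop are illustrative assumptions.

```python
DEFAULT_ARGS = "-q 0.01 --keep-dup all --bdg"  # value shown in the config excerpt


def macs2_ext_args(broad=False, base_args=DEFAULT_ARGS):
    """Return an ext.args string, appending MACS2's --broad flag when requested."""
    tokens = base_args.split()
    if broad and "--broad" not in tokens:
        tokens.append("--broad")
    return " ".join(tokens)


# Hypothetical per-mark choices; adjust to your own targets.
for mark, broad in (("H3K4me3", False), ("H3K27me3", True)):
    print(f"{mark}: ext.args = '{macs2_ext_args(broad)}'")
```

The printed string can then be pasted into the `MACS2_CALLPEAK` block of `nextflow.config` before re-running the peak calling.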
SEACR can be set to v1.4 or v1.3. The two versions have notably different command line interfaces and different methods for normalization to IgG, and v1.4 can optionally remove peaks found in IgG. Please see here for the full changelog.
For SEACR v1.3, you will often need to change the SEACR normalization argument from “non” to “norm”, depending on whether you’re using IgG normalization or spike-in normalization. The example below demonstrates how to change the command line parameters and version by editing `ext.version` and `ext.args`.
## //SEACR peak calling resources
## withName: SEACR_CALLPEAK {
## cpus = { 1 * task.attempt }
## memory = { 16.GB * task.attempt }
## ext.version = '1.3' //version 1.3 and 1.4 supported
## ext.args = 'norm stringent'
## publishDir = [...]
##
## }
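The difference between the two `ext.args` styles shown in this document (flagged options for v1.4, positional words for v1.3) can be made concrete with a small translation. This is a sketch only: `seacr_args_v14_to_v13` is a hypothetical helper, it assumes the v1.4 string consists of `--flag value` pairs like the default shown earlier, and it drops `--remove` because the v1.3 example has no equivalent.

```python
import shlex


def seacr_args_v14_to_v13(ext_args_v14):
    """Translate a v1.4-style ext.args string into the v1.3 positional form."""
    tokens = shlex.split(ext_args_v14)
    options = dict(zip(tokens[0::2], tokens[1::2]))  # assumes "--flag value" pairs
    normalization = options.get("--normalize", "norm")
    mode = options.get("--mode", "stringent")
    # --remove is dropped: it has no positional counterpart in the v1.3 example.
    return f"{normalization} {mode}"


print(seacr_args_v14_to_v13("--normalize norm --mode stringent --remove yes"))
```

Remember to change `ext.version` together with `ext.args`, since the two styles are not interchangeable.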
Decide on the `NFX_PROFILE`, which allows you to run the processes either locally or using the PBS job scheduler on Cybertron, and determine if you’d like to use Singularity or Apptainer containers.
- `PBS_singularity` [DEFAULT, recommended]
  - you can submit a PBS job that will use singularity containers on Cybertron
  - this takes care of requesting the appropriate resources using PBS
- `local_singularity`
  - runs locally on an interactive session on Cybertron with singularity
  - requires appropriate computational resources be requested using `qsub -I -q <queue_name> -P <project_code> -l select=1:ncpus=4:mem=32GB`
Edit the script `main_run.sh` and change the value of the `NFX_PROFILE` variable if desired.
## #Options: 'PBS_apptainer','local_apptainer','local_singularity', 'PBS_singularity'
## NFX_PROFILE='PBS_singularity'
Edit the variables in the `main_run.sh` script for the entry-point of the workflow. The default option “align_call_peaks” for `NFX_ENTRY` will run the full pipeline (QC, alignment, peak calling, coverage tracks).
## #Options: 'bowtie2_index_only', 'align_call_peaks', 'call_peaks'
## NFX_ENTRY='align_call_peaks'
If you already have aligned BAM files, see `test_data/test_dataset_bams_sample_sheet.csv` for an example of calling peaks only, using the entry `call_peaks`.
## #Options: 'bowtie2_index_only', 'align_call_peaks', 'call_peaks'
## NFX_ENTRY='call_peaks'
Then, execute the `main_run.sh` script in order to complete the peak calling on the samples. Provide a short descriptive prefix for the pipeline run.
./main_run.sh "my_analysis"
You can also change the entry-point of the workflow by setting the `NFX_ENTRY` variable in the `main_run.sh` script to `bowtie2_index_only`. This will allow the pipeline to run only the Bowtie2 build process and exit upon completion of the index building step.
## #Options: 'bowtie2_index_only', 'align_call_peaks', 'call_peaks'
## NFX_ENTRY='bowtie2_index_only'
./main_run.sh "bowtie2_index"
Under the path provided in the nextflow config for params “outdir”, you will find directories named for each of the modules.
results/{params.outdir}
- samtools_view/
  - aligned, coordinate sorted, marked duplicates, and optionally quality filtered bam file
  - {sample_id}.markedDup.filter.bam
- samtools_index/
  - {sample_id}.markedDup.filter.sort.bam.bai
- deeptools_bamcoverage/
  - Counts per million normalized coverage track (bigwig) file.
  - {sample_id}_CPM.bigWig
- {params.peaks_outdir}/seacr_callpeak/
  - If using IgG normalization, the {sample_id} of the IgG control used is appended to the target
  - {sample_id}_[norm,non].[stringent,relaxed].bed
- {params.peaks_outdir}/macs2_callpeak/
  - Optional output if `run_macs2 = true`
  - {sample_id}_peaks.[narrowPeak,broadPeak]
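After a run, the per-sample file names listed above can be checked programmatically. The sketch below is a hypothetical helper, not part of the pipeline: `missing_outputs` only knows the three patterns copied from the list above, and the demonstration uses a throwaway directory rather than real results.

```python
import tempfile
from pathlib import Path

# Per-sample file patterns taken from the output list above.
PATTERNS = (
    "samtools_view/{sample_id}.markedDup.filter.bam",
    "samtools_index/{sample_id}.markedDup.filter.sort.bam.bai",
    "deeptools_bamcoverage/{sample_id}_CPM.bigWig",
)


def missing_outputs(outdir, sample_ids):
    """Return expected per-sample files that do not exist under `outdir`."""
    outdir = Path(outdir)
    expected = [outdir / p.format(sample_id=s) for s in sample_ids for p in PATTERNS]
    return [path for path in expected if not path.is_file()]


# Demonstration on a throwaway directory holding one of the three expected files.
with tempfile.TemporaryDirectory() as tmp:
    bam = Path(tmp) / "samtools_view" / "M1_H3K27_NK.markedDup.filter.bam"
    bam.parent.mkdir(parents=True)
    bam.touch()
    for path in missing_outputs(tmp, ["M1_H3K27_NK"]):
        print("missing:", path.relative_to(tmp))
```

For real results, pass the `outdir` value from `nextflow.config` and the `sample_id` column of your sample sheet.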
In addition, there will be an HTML report with information on where the temp data is stored in the `workDir` path, and general run statistics such as resources used versus requested, which helps with optimization. It will also provide information on how much walltime was used per sample, total CPU hours, etc.
The HTML file is found in the `reports` directory and will have the prefix defined on the command line when `./main_run.sh "my_analysis"` was invoked, so in this example it would be named “my_analysis_{DATE}.html”.
There will also be a detailed Nextflow log file that is useful for debugging; in this example it would be named “my_analysis_{DATE}_nextflow.log”.
Finally, the pipeline will produce a DAG (directed acyclic graph), which describes the workflow channels (inputs) and the modules. The DAG image will be saved under the `dag/` directory with the name “my_analysis_{DATE}_dag.pdf”.
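The naming scheme for the report, log, and DAG can be summarized as a small function of the run prefix. This is a sketch under assumptions: `expected_run_files` is a hypothetical helper, the format of `{DATE}` is not specified in this document so it is passed in as a plain string, and the log is given without a directory because its location is not stated.

```python
from pathlib import PurePosixPath


def expected_run_files(prefix, date_stamp):
    """Return the report, log, and DAG names described above for one run."""
    stem = f"{prefix}_{date_stamp}"
    return {
        "report": PurePosixPath("reports") / f"{stem}.html",
        "log": PurePosixPath(f"{stem}_nextflow.log"),  # directory not stated above
        "dag": PurePosixPath("dag") / f"{stem}_dag.pdf",
    }


# "2024-01-31" is a placeholder; the real {DATE} format is not documented here.
for kind, path in expected_run_files("my_analysis", "2024-01-31").items():
    print(kind, path)
```

Using a distinct prefix per run keeps these three files grouped together when several analyses share one repository clone.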
There will be the following file structure:
## ../results/mouse
## ├── bamtobedgraph
## │ ├── M1_H3K27_NK_aligned.bed
## │ ├── M1_H3K27_NK_aligned.clean.bed
## │ ├── M1_H3K27_NK_aligned.fragments.bed
## │ ├── M1_H3K27_NK_aligned_fragments.bg
## │ ├── M1_H3K4_NK_aligned.bed
## │ ├── M1_H3K4_NK_aligned.clean.bed
## │ ├── M1_H3K4_NK_aligned.fragments.bed
## │ ├── M1_H3K4_NK_aligned_fragments.bg
## │ ├── M1_IgG_NK_aligned.bed
## │ ├── M1_IgG_NK_aligned.clean.bed
## │ ├── M1_IgG_NK_aligned.fragments.bed
## │ ├── M1_IgG_NK_aligned_fragments.bg
## │ ├── M2_H3K27_NK_aligned.bed
## │ ├── M2_H3K27_NK_aligned.clean.bed
## │ ├── M2_H3K27_NK_aligned.fragments.bed
## │ ├── M2_H3K27_NK_aligned_fragments.bg
## │ ├── M2_H3K4_NK_aligned.bed
## │ ├── M2_H3K4_NK_aligned.clean.bed
## │ ├── M2_H3K4_NK_aligned.fragments.bed
## │ ├── M2_H3K4_NK_aligned_fragments.bg
## │ ├── M2_IgG_NK_aligned.bed
## │ ├── M2_IgG_NK_aligned.clean.bed
## │ ├── M2_IgG_NK_aligned.fragments.bed
## │ └── M2_IgG_NK_aligned_fragments.bg
## ├── bowtie2_align
## │ ├── M1_H3K27_NK.sort.bam
## │ ├── M1_H3K27_NK.sort.bowtie2.log
## │ ├── M1_H3K4_NK.sort.bam
## │ ├── M1_H3K4_NK.sort.bowtie2.log
## │ ├── M1_IgG_NK.sort.bam
## │ ├── M1_IgG_NK.sort.bowtie2.log
## │ ├── M2_H3K27_NK.sort.bam
## │ ├── M2_H3K27_NK.sort.bowtie2.log
## │ ├── M2_H3K4_NK.sort.bam
## │ ├── M2_H3K4_NK.sort.bowtie2.log
## │ ├── M2_IgG_NK.sort.bam
## │ └── M2_IgG_NK.sort.bowtie2.log
## ├── deeptools_bamcoverage
## │ ├── M1_H3K27_NK_CPM.bigWig
## │ ├── M1_H3K4_NK_CPM.bigWig
## │ ├── M1_IgG_NK_CPM.bigWig
## │ ├── M2_H3K27_NK_CPM.bigWig
## │ ├── M2_H3K4_NK_CPM.bigWig
## │ └── M2_IgG_NK_CPM.bigWig
## ├── deeptools_multibigwigsummary
## │ └── test_dataset_sample_sheet_scores_per_bin.npz
## ├── deeptools_plotcorrelation
## │ ├── test_dataset_sample_sheet.plotCorrelation.mat.tab
## │ └── test_dataset_sample_sheet.plotCorrelation.pdf
## ├── deeptools_plotfingerprint
## │ ├── M1_H3K27_NK.plotFingerprint.pdf
## │ ├── M1_H3K27_NK.plotFingerprint.qcmetrics.txt
## │ ├── M1_H3K27_NK.plotFingerprint.raw.txt
## │ ├── M1_H3K4_NK.plotFingerprint.pdf
## │ ├── M1_H3K4_NK.plotFingerprint.qcmetrics.txt
## │ ├── M1_H3K4_NK.plotFingerprint.raw.txt
## │ ├── M1_IgG_NK.plotFingerprint.pdf
## │ ├── M1_IgG_NK.plotFingerprint.qcmetrics.txt
## │ ├── M1_IgG_NK.plotFingerprint.raw.txt
## │ ├── M2_H3K27_NK.plotFingerprint.pdf
## │ ├── M2_H3K27_NK.plotFingerprint.qcmetrics.txt
## │ ├── M2_H3K27_NK.plotFingerprint.raw.txt
## │ ├── M2_H3K4_NK.plotFingerprint.pdf
## │ ├── M2_H3K4_NK.plotFingerprint.qcmetrics.txt
## │ ├── M2_H3K4_NK.plotFingerprint.raw.txt
## │ ├── M2_IgG_NK.plotFingerprint.pdf
## │ ├── M2_IgG_NK.plotFingerprint.qcmetrics.txt
## │ ├── M2_IgG_NK.plotFingerprint.raw.txt
## │ ├── test_dataset_sample_sheet.plotFingerprint.pdf
## │ ├── test_dataset_sample_sheet.plotFingerprint.qcmetrics.txt
## │ └── test_dataset_sample_sheet.plotFingerprint.raw.txt
## ├── deeptools_plotpca
## │ ├── test_dataset_sample_sheet.plotPCA.pdf
## │ └── test_dataset_sample_sheet.plotPCA.tab
## ├── fastqc
## │ ├── M1_H3K27_NK_FASTQC
## │ ├── M1_H3K4_NK_FASTQC
## │ ├── M1_IgG_NK_FASTQC
## │ ├── M2_H3K27_NK_FASTQC
## │ ├── M2_H3K4_NK_FASTQC
## │ └── M2_IgG_NK_FASTQC
## ├── fastqc_trim
## │ ├── M1_H3K27_NK_FASTQC_TRIM
## │ ├── M1_H3K4_NK_FASTQC_TRIM
## │ ├── M1_IgG_NK_FASTQC_TRIM
## │ ├── M2_H3K27_NK_FASTQC_TRIM
## │ ├── M2_H3K4_NK_FASTQC_TRIM
## │ └── M2_IgG_NK_FASTQC_TRIM
## ├── multiqc
## │ ├── test_dataset_sample_sheet_multiqc_report.html
## │ └── test_dataset_sample_sheet_multiqc_report_data
## ├── peak_calls
## │ ├── macs2_callpeak
## │ ├── macs2_plotenrichment
## │ ├── macspeakstobed
## │ ├── seacr_callpeak
## │ └── seacr_plotenrichment
## ├── picard_markduplicates
## │ ├── M1_H3K27_NK.markedDup.MarkDuplicates.metrics.txt
## │ ├── M1_H3K27_NK.markedDup.bai
## │ ├── M1_H3K27_NK.markedDup.bam
## │ ├── M1_H3K27_NK.markedDup.bam.md5
## │ ├── M1_H3K4_NK.markedDup.MarkDuplicates.metrics.txt
## │ ├── M1_H3K4_NK.markedDup.bai
## │ ├── M1_H3K4_NK.markedDup.bam
## │ ├── M1_H3K4_NK.markedDup.bam.md5
## │ ├── M1_IgG_NK.markedDup.MarkDuplicates.metrics.txt
## │ ├── M1_IgG_NK.markedDup.bai
## │ ├── M1_IgG_NK.markedDup.bam
## │ ├── M1_IgG_NK.markedDup.bam.md5
## │ ├── M2_H3K27_NK.markedDup.MarkDuplicates.metrics.txt
## │ ├── M2_H3K27_NK.markedDup.bai
## │ ├── M2_H3K27_NK.markedDup.bam
## │ ├── M2_H3K27_NK.markedDup.bam.md5
## │ ├── M2_H3K4_NK.markedDup.MarkDuplicates.metrics.txt
## │ ├── M2_H3K4_NK.markedDup.bai
## │ ├── M2_H3K4_NK.markedDup.bam
## │ ├── M2_H3K4_NK.markedDup.bam.md5
## │ ├── M2_IgG_NK.markedDup.MarkDuplicates.metrics.txt
## │ ├── M2_IgG_NK.markedDup.bai
## │ ├── M2_IgG_NK.markedDup.bam
## │ └── M2_IgG_NK.markedDup.bam.md5
## ├── samtools_faidx
## │ └── mm39.fa.fai
## ├── samtools_index
## │ ├── M1_H3K27_NK.markedDup.filter.sort.bam.bai
## │ ├── M1_H3K4_NK.markedDup.filter.sort.bam.bai
## │ ├── M1_IgG_NK.markedDup.filter.sort.bam.bai
## │ ├── M2_H3K27_NK.markedDup.filter.sort.bam.bai
## │ ├── M2_H3K4_NK.markedDup.filter.sort.bam.bai
## │ └── M2_IgG_NK.markedDup.filter.sort.bam.bai
## ├── samtools_nsort
## │ ├── M1_H3K27_NK.markedDup.filter.nsort.bam
## │ ├── M1_H3K4_NK.markedDup.filter.nsort.bam
## │ ├── M1_IgG_NK.markedDup.filter.nsort.bam
## │ ├── M2_H3K27_NK.markedDup.filter.nsort.bam
## │ ├── M2_H3K4_NK.markedDup.filter.nsort.bam
## │ └── M2_IgG_NK.markedDup.filter.nsort.bam
## ├── samtools_sort
## │ ├── M1_H3K27_NK.markedDup.filter.sort.bam
## │ ├── M1_H3K4_NK.markedDup.filter.sort.bam
## │ ├── M1_IgG_NK.markedDup.filter.sort.bam
## │ ├── M2_H3K27_NK.markedDup.filter.sort.bam
## │ ├── M2_H3K4_NK.markedDup.filter.sort.bam
## │ └── M2_IgG_NK.markedDup.filter.sort.bam
## ├── samtools_stats
## │ ├── M1_H3K27_NK.markedDup.stats
## │ ├── M1_H3K4_NK.markedDup.stats
## │ ├── M1_IgG_NK.markedDup.stats
## │ ├── M2_H3K27_NK.markedDup.stats
## │ ├── M2_H3K4_NK.markedDup.stats
## │ └── M2_IgG_NK.markedDup.stats
## ├── samtools_view
## │ ├── M1_H3K27_NK.markedDup.filter.bam
## │ ├── M1_H3K4_NK.markedDup.filter.bam
## │ ├── M1_IgG_NK.markedDup.filter.bam
## │ ├── M2_H3K27_NK.markedDup.filter.bam
## │ ├── M2_H3K4_NK.markedDup.filter.bam
## │ └── M2_IgG_NK.markedDup.filter.bam
## └── trimgalore
##   ├── M1_H3K27_NK_1.fastq.gz_trimming_report.txt
##   ├── M1_H3K27_NK_1_val_1.fq.gz
##   ├── M1_H3K27_NK_2.fastq.gz_trimming_report.txt
##   ├── M1_H3K27_NK_2_val_2.fq.gz
##   ├── M1_H3K4_NK_1.fastq.gz_trimming_report.txt
##   ├── M1_H3K4_NK_1_val_1.fq.gz
##   ├── M1_H3K4_NK_2.fastq.gz_trimming_report.txt
##   ├── M1_H3K4_NK_2_val_2.fq.gz
##   ├── M1_IgG_NK_1.fastq.gz_trimming_report.txt
##   ├── M1_IgG_NK_1_val_1.fq.gz
##   ├── M1_IgG_NK_2.fastq.gz_trimming_report.txt
##   ├── M1_IgG_NK_2_val_2.fq.gz
##   ├── M2_H3K27_NK_1.fastq.gz_trimming_report.txt
##   ├── M2_H3K27_NK_1_val_1.fq.gz
##   ├── M2_H3K27_NK_2.fastq.gz_trimming_report.txt
##   ├── M2_H3K27_NK_2_val_2.fq.gz
##   ├── M2_H3K4_NK_1.fastq.gz_trimming_report.txt
##   ├── M2_H3K4_NK_1_val_1.fq.gz
##   ├── M2_H3K4_NK_2.fastq.gz_trimming_report.txt
##   ├── M2_H3K4_NK_2_val_2.fq.gz
##   ├── M2_IgG_NK_1.fastq.gz_trimming_report.txt
##   ├── M2_IgG_NK_1_val_1.fq.gz
##   ├── M2_IgG_NK_2.fastq.gz_trimming_report.txt
##   └── M2_IgG_NK_2_val_2.fq.gz
Within each directory you will find the following files (top 5 files per directory are shown):
path | type | process | filename |
---|---|---|---|
../results/mouse/bamtobedgraph | directory | /bamtobedgraph | |
../results/mouse/bamtobedgraph/M1_H3K27_NK_aligned.bed | file | /bamtobedgraph | M1_H3K27_NK_aligned.bed |
../results/mouse/bamtobedgraph/M1_H3K27_NK_aligned.clean.bed | file | /bamtobedgraph | M1_H3K27_NK_aligned.clean.bed |
../results/mouse/bamtobedgraph/M1_H3K27_NK_aligned.fragments.bed | file | /bamtobedgraph | M1_H3K27_NK_aligned.fragments.bed |
../results/mouse/bamtobedgraph/M1_H3K27_NK_aligned_fragments.bg | file | /bamtobedgraph | M1_H3K27_NK_aligned_fragments.bg |
../results/mouse/bowtie2_align | directory | /bowtie2_align | |
../results/mouse/bowtie2_align/M1_H3K27_NK.sort.bam | file | /bowtie2_align | M1_H3K27_NK.sort.bam |
../results/mouse/bowtie2_align/M1_H3K27_NK.sort.bowtie2.log | file | /bowtie2_align | M1_H3K27_NK.sort.bowtie2.log |
../results/mouse/bowtie2_align/M1_H3K4_NK.sort.bam | file | /bowtie2_align | M1_H3K4_NK.sort.bam |
../results/mouse/bowtie2_align/M1_H3K4_NK.sort.bowtie2.log | file | /bowtie2_align | M1_H3K4_NK.sort.bowtie2.log |
../results/mouse/deeptools_bamcoverage | directory | /deeptools_bamcoverage | |
../results/mouse/deeptools_bamcoverage/M1_H3K27_NK_CPM.bigWig | file | /deeptools_bamcoverage | M1_H3K27_NK_CPM.bigWig |
../results/mouse/deeptools_bamcoverage/M1_H3K4_NK_CPM.bigWig | file | /deeptools_bamcoverage | M1_H3K4_NK_CPM.bigWig |
../results/mouse/deeptools_bamcoverage/M1_IgG_NK_CPM.bigWig | file | /deeptools_bamcoverage | M1_IgG_NK_CPM.bigWig |
../results/mouse/deeptools_bamcoverage/M2_H3K27_NK_CPM.bigWig | file | /deeptools_bamcoverage | M2_H3K27_NK_CPM.bigWig |
../results/mouse/deeptools_multibigwigsummary | directory | /deeptools_multibigwigsummary | |
../results/mouse/deeptools_multibigwigsummary/test_dataset_sample_sheet_scores_per_bin.npz | file | /deeptools_multibigwigsummary | test_dataset_sample_sheet_scores_per_bin.npz |
../results/mouse/deeptools_plotcorrelation | directory | /deeptools_plotcorrelation | |
../results/mouse/deeptools_plotcorrelation/test_dataset_sample_sheet.plotCorrelation.mat.tab | file | /deeptools_plotcorrelation | test_dataset_sample_sheet.plotCorrelation.mat.tab |
../results/mouse/deeptools_plotcorrelation/test_dataset_sample_sheet.plotCorrelation.pdf | file | /deeptools_plotcorrelation | test_dataset_sample_sheet.plotCorrelation.pdf |
../results/mouse/deeptools_plotfingerprint | directory | /deeptools_plotfingerprint | |
../results/mouse/deeptools_plotfingerprint/M1_H3K27_NK.plotFingerprint.pdf | file | /deeptools_plotfingerprint | M1_H3K27_NK.plotFingerprint.pdf |
../results/mouse/deeptools_plotfingerprint/M1_H3K27_NK.plotFingerprint.qcmetrics.txt | file | /deeptools_plotfingerprint | M1_H3K27_NK.plotFingerprint.qcmetrics.txt |
../results/mouse/deeptools_plotfingerprint/M1_H3K27_NK.plotFingerprint.raw.txt | file | /deeptools_plotfingerprint | M1_H3K27_NK.plotFingerprint.raw.txt |
../results/mouse/deeptools_plotfingerprint/M1_H3K4_NK.plotFingerprint.pdf | file | /deeptools_plotfingerprint | M1_H3K4_NK.plotFingerprint.pdf |
../results/mouse/deeptools_plotpca | directory | /deeptools_plotpca | |
../results/mouse/deeptools_plotpca/test_dataset_sample_sheet.plotPCA.pdf | file | /deeptools_plotpca | test_dataset_sample_sheet.plotPCA.pdf |
../results/mouse/deeptools_plotpca/test_dataset_sample_sheet.plotPCA.tab | file | /deeptools_plotpca | test_dataset_sample_sheet.plotPCA.tab |
../results/mouse/fastqc | directory | /fastqc | |
../results/mouse/fastqc/M1_H3K27_NK_FASTQC | directory | /fastqc | |
../results/mouse/fastqc/M1_H3K27_NK_FASTQC/M1_H3K27_NK_1_fastqc.html | file | /fastqc | M1_H3K27_NK_1_fastqc.html |
../results/mouse/fastqc/M1_H3K27_NK_FASTQC/M1_H3K27_NK_1_fastqc.zip | file | /fastqc | M1_H3K27_NK_1_fastqc.zip |
../results/mouse/fastqc/M1_H3K27_NK_FASTQC/M1_H3K27_NK_2_fastqc.html | file | /fastqc | M1_H3K27_NK_2_fastqc.html |
../results/mouse/fastqc_trim | directory | /fastqc_trim | |
../results/mouse/fastqc_trim/M1_H3K27_NK_FASTQC_TRIM | directory | /fastqc_trim | |
../results/mouse/fastqc_trim/M1_H3K27_NK_FASTQC_TRIM/M1_H3K27_NK_1_fastqc.html | file | /fastqc_trim | M1_H3K27_NK_1_fastqc.html |
../results/mouse/fastqc_trim/M1_H3K27_NK_FASTQC_TRIM/M1_H3K27_NK_1_fastqc.zip | file | /fastqc_trim | M1_H3K27_NK_1_fastqc.zip |
../results/mouse/fastqc_trim/M1_H3K27_NK_FASTQC_TRIM/M1_H3K27_NK_2_fastqc.html | file | /fastqc_trim | M1_H3K27_NK_2_fastqc.html |
../results/mouse/multiqc | directory | /multiqc | |
../results/mouse/peak_calls | directory | /peak_calls | |
../results/mouse/peak_calls/macs2_callpeak | directory | /peak_calls/macs2_callpeak | |
../results/mouse/peak_calls/macs2_callpeak/M1_H3K27_NK_control_lambda.bdg | file | /peak_calls/macs2_callpeak | M1_H3K27_NK_control_lambda.bdg |
../results/mouse/peak_calls/macs2_callpeak/M1_H3K27_NK_peaks.narrowPeak | file | /peak_calls/macs2_callpeak | M1_H3K27_NK_peaks.narrowPeak |
../results/mouse/peak_calls/macs2_callpeak/M1_H3K27_NK_peaks.xls | file | /peak_calls/macs2_callpeak | M1_H3K27_NK_peaks.xls |
../results/mouse/peak_calls/macs2_callpeak/M1_H3K27_NK_summits.bed | file | /peak_calls/macs2_callpeak | M1_H3K27_NK_summits.bed |
../results/mouse/peak_calls/macs2_plotenrichment | directory | /peak_calls/macs2_plotenrichment | |
../results/mouse/peak_calls/macs2_plotenrichment/M1_H3K27_NK.plotEnrichment.pdf | file | /peak_calls/macs2_plotenrichment | M1_H3K27_NK.plotEnrichment.pdf |
../results/mouse/peak_calls/macs2_plotenrichment/M1_H3K27_NK.plotEnrichment.txt | file | /peak_calls/macs2_plotenrichment | M1_H3K27_NK.plotEnrichment.txt |
../results/mouse/peak_calls/macs2_plotenrichment/M1_H3K4_NK.plotEnrichment.pdf | file | /peak_calls/macs2_plotenrichment | M1_H3K4_NK.plotEnrichment.pdf |
../results/mouse/peak_calls/macs2_plotenrichment/M1_H3K4_NK.plotEnrichment.txt | file | /peak_calls/macs2_plotenrichment | M1_H3K4_NK.plotEnrichment.txt |
../results/mouse/peak_calls/macspeakstobed | directory | /peak_calls/macspeakstobed | |
../results/mouse/peak_calls/macspeakstobed/M1_H3K27_NK_peaks.bed | file | /peak_calls/macspeakstobed | M1_H3K27_NK_peaks.bed |
../results/mouse/peak_calls/macspeakstobed/M1_H3K4_NK_peaks.bed | file | /peak_calls/macspeakstobed | M1_H3K4_NK_peaks.bed |
../results/mouse/peak_calls/macspeakstobed/M2_H3K27_NK_peaks.bed | file | /peak_calls/macspeakstobed | M2_H3K27_NK_peaks.bed |
../results/mouse/peak_calls/macspeakstobed/M2_H3K4_NK_peaks.bed | file | /peak_calls/macspeakstobed | M2_H3K4_NK_peaks.bed |
../results/mouse/peak_calls/seacr_callpeak | directory | /peak_calls/seacr_callpeak | |
../results/mouse/peak_calls/seacr_callpeak/M1_H3K27_NK_vs_M1_IgG_NK_norm.stringent.bed | file | /peak_calls/seacr_callpeak | M1_H3K27_NK_vs_M1_IgG_NK_norm.stringent.bed |
../results/mouse/peak_calls/seacr_callpeak/M1_H3K4_NK_vs_M1_IgG_NK_norm.stringent.bed | file | /peak_calls/seacr_callpeak | M1_H3K4_NK_vs_M1_IgG_NK_norm.stringent.bed |
../results/mouse/peak_calls/seacr_callpeak/M2_H3K27_NK_vs_M2_IgG_NK_norm.stringent.bed | file | /peak_calls/seacr_callpeak | M2_H3K27_NK_vs_M2_IgG_NK_norm.stringent.bed |
../results/mouse/peak_calls/seacr_callpeak/M2_H3K4_NK_vs_M2_IgG_NK_norm.stringent.bed | file | /peak_calls/seacr_callpeak | M2_H3K4_NK_vs_M2_IgG_NK_norm.stringent.bed |
../results/mouse/peak_calls/seacr_plotenrichment | directory | /peak_calls/seacr_plotenrichment | |
../results/mouse/peak_calls/seacr_plotenrichment/M1_H3K27_NK.plotEnrichment.pdf | file | /peak_calls/seacr_plotenrichment | M1_H3K27_NK.plotEnrichment.pdf |
../results/mouse/peak_calls/seacr_plotenrichment/M1_H3K27_NK.plotEnrichment.txt | file | /peak_calls/seacr_plotenrichment | M1_H3K27_NK.plotEnrichment.txt |
../results/mouse/peak_calls/seacr_plotenrichment/M1_H3K4_NK.plotEnrichment.pdf | file | /peak_calls/seacr_plotenrichment | M1_H3K4_NK.plotEnrichment.pdf |
../results/mouse/peak_calls/seacr_plotenrichment/M1_H3K4_NK.plotEnrichment.txt | file | /peak_calls/seacr_plotenrichment | M1_H3K4_NK.plotEnrichment.txt |
../results/mouse/picard_markduplicates | directory | /picard_markduplicates | |
../results/mouse/picard_markduplicates/M1_H3K27_NK.markedDup.MarkDuplicates.metrics.txt | file | /picard_markduplicates | M1_H3K27_NK.markedDup.MarkDuplicates.metrics.txt |
../results/mouse/picard_markduplicates/M1_H3K27_NK.markedDup.bai | file | /picard_markduplicates | M1_H3K27_NK.markedDup.bai |
../results/mouse/picard_markduplicates/M1_H3K27_NK.markedDup.bam | file | /picard_markduplicates | M1_H3K27_NK.markedDup.bam |
../results/mouse/picard_markduplicates/M1_H3K27_NK.markedDup.bam.md5 | file | /picard_markduplicates | M1_H3K27_NK.markedDup.bam.md5 |
../results/mouse/samtools_faidx | directory | /samtools_faidx | |
../results/mouse/samtools_faidx/mm39.fa.fai | file | /samtools_faidx | mm39.fa.fai |
../results/mouse/samtools_index | directory | /samtools_index | |
../results/mouse/samtools_index/M1_H3K27_NK.markedDup.filter.sort.bam.bai | file | /samtools_index | M1_H3K27_NK.markedDup.filter.sort.bam.bai |
../results/mouse/samtools_index/M1_H3K4_NK.markedDup.filter.sort.bam.bai | file | /samtools_index | M1_H3K4_NK.markedDup.filter.sort.bam.bai |
../results/mouse/samtools_index/M1_IgG_NK.markedDup.filter.sort.bam.bai | file | /samtools_index | M1_IgG_NK.markedDup.filter.sort.bam.bai |
../results/mouse/samtools_index/M2_H3K27_NK.markedDup.filter.sort.bam.bai | file | /samtools_index | M2_H3K27_NK.markedDup.filter.sort.bam.bai |
../results/mouse/samtools_nsort | directory | /samtools_nsort | |
../results/mouse/samtools_nsort/M1_H3K27_NK.markedDup.filter.nsort.bam | file | /samtools_nsort | M1_H3K27_NK.markedDup.filter.nsort.bam |
../results/mouse/samtools_nsort/M1_H3K4_NK.markedDup.filter.nsort.bam | file | /samtools_nsort | M1_H3K4_NK.markedDup.filter.nsort.bam |
../results/mouse/samtools_nsort/M1_IgG_NK.markedDup.filter.nsort.bam | file | /samtools_nsort | M1_IgG_NK.markedDup.filter.nsort.bam |
../results/mouse/samtools_nsort/M2_H3K27_NK.markedDup.filter.nsort.bam | file | /samtools_nsort | M2_H3K27_NK.markedDup.filter.nsort.bam |
../results/mouse/samtools_sort | directory | /samtools_sort | |
../results/mouse/samtools_sort/M1_H3K27_NK.markedDup.filter.sort.bam | file | /samtools_sort | M1_H3K27_NK.markedDup.filter.sort.bam |
../results/mouse/samtools_sort/M1_H3K4_NK.markedDup.filter.sort.bam | file | /samtools_sort | M1_H3K4_NK.markedDup.filter.sort.bam |
../results/mouse/samtools_sort/M1_IgG_NK.markedDup.filter.sort.bam | file | /samtools_sort | M1_IgG_NK.markedDup.filter.sort.bam |
../results/mouse/samtools_sort/M2_H3K27_NK.markedDup.filter.sort.bam | file | /samtools_sort | M2_H3K27_NK.markedDup.filter.sort.bam |
../results/mouse/samtools_stats | directory | /samtools_stats | |
../results/mouse/samtools_stats/M1_H3K27_NK.markedDup.stats | file | /samtools_stats | M1_H3K27_NK.markedDup.stats |
../results/mouse/samtools_stats/M1_H3K4_NK.markedDup.stats | file | /samtools_stats | M1_H3K4_NK.markedDup.stats |
../results/mouse/samtools_stats/M1_IgG_NK.markedDup.stats | file | /samtools_stats | M1_IgG_NK.markedDup.stats |
../results/mouse/samtools_stats/M2_H3K27_NK.markedDup.stats | file | /samtools_stats | M2_H3K27_NK.markedDup.stats |
../results/mouse/samtools_view | directory | /samtools_view | |
../results/mouse/samtools_view/M1_H3K27_NK.markedDup.filter.bam | file | /samtools_view | M1_H3K27_NK.markedDup.filter.bam |
../results/mouse/samtools_view/M1_H3K4_NK.markedDup.filter.bam | file | /samtools_view | M1_H3K4_NK.markedDup.filter.bam |
../results/mouse/samtools_view/M1_IgG_NK.markedDup.filter.bam | file | /samtools_view | M1_IgG_NK.markedDup.filter.bam |
../results/mouse/samtools_view/M2_H3K27_NK.markedDup.filter.bam | file | /samtools_view | M2_H3K27_NK.markedDup.filter.bam |
../results/mouse/trimgalore | directory | /trimgalore | |
../results/mouse/trimgalore/M1_H3K27_NK_1.fastq.gz_trimming_report.txt | file | /trimgalore | M1_H3K27_NK_1.fastq.gz_trimming_report.txt |
../results/mouse/trimgalore/M1_H3K27_NK_1_val_1.fq.gz | file | /trimgalore | M1_H3K27_NK_1_val_1.fq.gz |
../results/mouse/trimgalore/M1_H3K27_NK_2.fastq.gz_trimming_report.txt | file | /trimgalore | M1_H3K27_NK_2.fastq.gz_trimming_report.txt |
../results/mouse/trimgalore/M1_H3K27_NK_2_val_2.fq.gz | file | /trimgalore | M1_H3K27_NK_2_val_2.fq.gz |