FruitsC: A Snakemake Pipeline for Hi-C Data Analysis

This Snakemake pipeline was developed for genome-wide Hi-C data and uses Juicer, a popular Hi-C analysis tool. It was developed by the CCBR team for use on the NIH's Biowulf HPC cluster and is currently under active development.

More details about Juicer

The Juicer pipeline was installed and optimized for the NIH's Biowulf HPC cluster by its HPC team. In addition, the CCBR team has tuned some of the memory settings so that the pipeline runs smoothly on very large samples. The largest sample run so far had about 135-140 GB of fastq.gz input in total (the forward and reverse fastq.gz files were each close to 70 GB).

About this FruitsC Snakemake Pipeline

This pipeline includes the following steps:

  • Quality check of the raw fastq reads
  • Trimming of low-quality reads
  • Quality check of the trimmed reads
  • Generation of a QC HTML report
  • Running the Juicer Hi-C tool, which involves the following steps:
    • Generate Hi-C contact maps
    • Normalize the .hic files
    • Call the Arrowhead tool to identify TADs
    • Call the HiCCUPS tool to identify loops
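
The ordering of the steps above can be sketched as a small dependency graph. This is only an illustration, not the pipeline's actual Snakefile: the step names are hypothetical, and the dependencies (for example, that Juicer consumes the trimmed reads and that the QC report combines both quality checks) are assumptions based on the list, not confirmed details of the implementation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step names; the real rule names in the FruitsC Snakefile may differ.
# Each key maps to the set of steps assumed to run before it.
STEP_DEPENDENCIES = {
    "raw_fastq_qc": set(),
    "trim_reads": set(),
    "trimmed_fastq_qc": {"trim_reads"},
    "qc_html_report": {"raw_fastq_qc", "trimmed_fastq_qc"},
    "juicer_contact_maps": {"trim_reads"},  # assumption: Juicer runs on trimmed reads
    "normalize_hic": {"juicer_contact_maps"},
    "arrowhead_tads": {"normalize_hic"},
    "hiccups_loops": {"normalize_hic"},
}


def execution_order(dependencies):
    """Return one valid order in which the steps could run."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for position, step in enumerate(execution_order(STEP_DEPENDENCIES), start=1):
        print(position, step)
```

Snakemake resolves this kind of ordering on its own from the input and output files declared by each rule, so a graph like this is useful mainly as a mental model of which steps can run in parallel (for example, TAD and loop calling) and which must wait.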

How to set up the Snakemake pipeline

Declarations

This work has been developed and tested solely on NIH HPC Biowulf.

Author contributions

The following members contributed to the development of this pipeline, including its source code and logic: