TheJacksonLaboratory/haplotype_reconstruction_qtl-nf

haplotype_reconstruction_qtl-nf: A slim Nextflow pipeline for mouse cross haplotype reconstruction and quality control

JAX users need access to the Sumner cluster and must have Nextflow installed in their home directory. Setup for external users will require additional support; anyone wishing to share these workflows is encouraged to contact the maintainers of this repository.

This pipeline is implemented in Nextflow, a scalable and increasingly common workflow language for developing and maintaining reproducible bioinformatics workflows. The workflow is modular: each step runs in a software container (such as Docker or Singularity) that bundles all of the software that step requires. Because each container pins specific combinations and versions of software, analyses remain reproducible over time as long as the source data are unchanged.

Execution:

Clone the repository using the standard procedure. On the JAX HPC, from within the cloned haplotype_reconstruction_qtl-nf directory:

sbatch run_scripts/run_HR_QC.sh
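
The clone-and-submit steps above can be sketched as a small pre-flight script. This is only an illustration: the clone URL is an assumption inferred from the repository name, and the `missing_tools` helper is hypothetical, not part of the pipeline. It checks that `git` and `sbatch` are on the PATH and submits the run script only when both are found.

```shell
#!/usr/bin/env bash
# Pre-flight sketch: confirm the required tools exist before submitting.
# REPO_URL is an assumption inferred from the repository name.
REPO_URL="https://github.com/TheJacksonLaboratory/haplotype_reconstruction_qtl-nf.git"
REPO_DIR="haplotype_reconstruction_qtl-nf"

# Hypothetical helper: print the name of each command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing="$(missing_tools git sbatch)"
if [ -n "$missing" ]; then
    # Report what is missing and skip submission.
    echo "Missing required tools:" $missing >&2
else
    # Clone only if the directory is not already present, then submit
    # from inside the cloned directory, as the instructions describe.
    [ -d "$REPO_DIR" ] || git clone "$REPO_URL" "$REPO_DIR"
    (cd "$REPO_DIR" && sbatch run_scripts/run_HR_QC.sh)
fi
```

On a machine without Slurm (for example a laptop), the script reports the missing tool and submits nothing.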

Overview:

The pipeline reads raw genotypes from GigaMUGA FinalReport files and converts them into files suitable for analysis with R/qtl2. These include cross files, genotype probabilities, allele probabilities, and genotypes imputed from the probabilities (maxmarg output).

Files used for sample and genotype quality control are also generated, such as inferred genotyping errors, poorly performing markers, and a markdown document outlining results from sex checks and calculations of sample duplication.

flowchart TD
    p0((FinalReport.zip))
    p1((R/qtl2 Covar File))
    p2((GigaMUGA Reference Files))
    p3[GS_TO_QTL2]:::process
    p4[WRITE_CROSS]:::process
    p5[GENOPROBS]:::process
    p6[CONCAT_GENOPROBS]:::process
    p7[CONCAT_INTENSITIES]:::process
    o1((Sex Chromosome Marker Intensities)):::output
    o2((All Marker Intensities)):::output
    o3((Excluded File List)):::output
    o4((36-state/Genotype Probabilities)):::output
    o5((8-state/Allele Probabilities)):::output
    o6((Cross Object)):::output
    o7((Imputed Genotype States)):::output
    o8((Genotyping Error LOD Scores)):::output
    o9((Sample Quality Control Flag Summary)):::output
    o10((Bad Markers List)):::output
    o11((Sample Quality Control Summary Markdown)):::output

    p0 --> p3
    p1 --> p3

    p3 --> p4
    p3 --> p7
    p3 --> o1
    p3 --> o2

    p2 --> p4
    p4 --> p5
    p5 --> p6

    p6 --> o3
    p6 --> o4
    p6 --> o5
    p6 --> o6
    p6 --> o7
    p6 --> o8

    p7 --> o9
    p7 --> o10
    p7 --> o11

classDef output fill:#99e4ff,stroke:#000000,stroke-width:5px,color:#000000
classDef process fill:#00A2DC,stroke:#000000,stroke-width:2px,color:#000000

The run script run_HR_QC.sh specifies a single user-generated, comma-separated sample manifest with four named columns: finalreport_file, project_id, covar_file, and cross_type (see the README within the sample_sheets subdirectory).
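
As a rough illustration of the manifest layout, the sketch below writes a one-row example file and checks that its header matches the four column names listed above. The file name, paths, and row values are hypothetical placeholders, and the column order simply follows the order in which the columns are named here; the README in sample_sheets remains the authority on accepted values.

```shell
#!/usr/bin/env bash
# Write an example manifest. All values in the data row are placeholders.
manifest="example_manifest.csv"
cat > "$manifest" <<'EOF'
finalreport_file,project_id,covar_file,cross_type
/path/to/Example_FinalReport.zip,example_project,/path/to/example_covar.csv,example_cross
EOF

# The four named columns, in the order they are listed in this README.
expected="finalreport_file,project_id,covar_file,cross_type"
header="$(head -n 1 "$manifest")"

if [ "$header" = "$expected" ]; then
    echo "Header OK"
else
    echo "Unexpected header: $header" >&2
fi

# Collect the line numbers of data rows that do not have exactly four fields.
bad_rows="$(awk -F',' 'NR > 1 && NF != 4 { print NR }' "$manifest")"
if [ -n "$bad_rows" ]; then
    echo "Rows with the wrong number of fields: $bad_rows" >&2
fi
```

The field count is a simple comma split, so it would miscount quoted fields that contain commas.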
