Home

Welcome to the PathoPore wiki!

PathoPore - longer sequences and faster analysis with Nextflow

This project aims to:

streamline the downstream analysis of Nanopore sequencing data using high-performance computing
provide a scalable workflow package built on Nextflow

Expected Nanopore input data sizes -:

The workflow is expected to handle input data sets of up to 10 TB in size.

Workflow description -:

PathoPore accepts Nanopore output formats (FAST5 and FASTQ) and demultiplexes individual samples. Sample reads are then quality checked and filtered for de novo assembly. The filtered reads are mapped back to the quality-assessed de novo assembly to call consensus bases, a step referred to as polishing that improves the quality of the assembly. Finally, the reads are mapped to the polished assembly to call base variants and detect methylated bases.
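The dependencies between the stages described above can be sketched as a small directed graph. This is only an illustration in Python, not the project's Nextflow code: the stage names are hypothetical labels chosen for this example, and the dependency edges are one reading of the description.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stage names (not taken from the PathoPore code base).
# Each stage maps to the set of stages whose output it needs.
STAGES = {
    "demultiplex": set(),
    "quality_check_and_filter": {"demultiplex"},
    "de_novo_assembly": {"quality_check_and_filter"},
    "map_reads_to_assembly": {"quality_check_and_filter", "de_novo_assembly"},
    "polish_assembly": {"map_reads_to_assembly"},
    "map_reads_to_polished_assembly": {"quality_check_and_filter", "polish_assembly"},
    "call_variants": {"map_reads_to_polished_assembly"},
    "detect_methylation": {"map_reads_to_polished_assembly"},
}


def execution_order(stages):
    """Return one valid order in which the stages can run."""
    return list(TopologicalSorter(stages).static_order())


if __name__ == "__main__":
    for step, name in enumerate(execution_order(STAGES), start=1):
        print(step, name)
```

Variant calling and methylation detection both depend only on the reads mapped to the polished assembly, so in this reading they could run in parallel.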

Outline of workflow tasks - Software tools -:

Long read sequence base calls - guppy
Demultiplexing (optional) - qcat
Read and quality score distribution - nanoplot
Read filter (based on score and length) - filtlong
De novo assembly - pomoxis mini_assemble/canu
Assess assembly quality - QUAST
Map filtered reads to the assembly - pomoxis mini_align/minimap2/bwa
Generate alignment stats - WUB
Index filtered reads - nanopolish index
Call consensus bases - nanopolish variants-consensus/racon/pilon
Map filtered reads to the polished assembly - pomoxis mini_align/minimap2/bwa (possibly optional)
Generate polished assembly - nanopolish vcf2fasta
Call variant bases - nanopolish variants
Detect methylated bases - nanopolish call-methylation
Calculate methylation frequency - nanopolish calculate_methylation_frequency.py
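
To make the "read filter (based on score and length)" task concrete, here is a minimal Python sketch of the idea. It is not how filtlong works internally and is not part of PathoPore: the function names and default thresholds are assumptions made for illustration, the input is assumed to be well-formed four-line FASTQ records with Phred+33 quality encoding, and the quality measure is a plain arithmetic mean of the Phred scores.

```python
def mean_phred(quality_string, offset=33):
    """Arithmetic mean of the Phred scores in a FASTQ quality string."""
    if not quality_string:
        return 0.0
    return sum(ord(ch) - offset for ch in quality_string) / len(quality_string)


def parse_fastq(lines):
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    # Pull four lines at a time; an incomplete trailing record is dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def filter_reads(records, min_length=1000, min_mean_quality=7.0):
    """Keep reads that meet both thresholds (illustrative defaults)."""
    for header, seq, qual in records:
        if len(seq) >= min_length and mean_phred(qual) >= min_mean_quality:
            yield header, seq, qual


if __name__ == "__main__":
    example = [
        "@read1\n", "ACGTACGT\n", "+\n", "IIIIIIII\n",  # long enough, high quality
        "@read2\n", "ACG\n", "+\n", "III\n",            # too short
        "@read3\n", "ACGTACGT\n", "+\n", "!!!!!!!!\n",  # quality too low
    ]
    for header, seq, qual in filter_reads(parse_fastq(example), min_length=5, min_mean_quality=10):
        print(header, len(seq), mean_phred(qual))
```

In the workflow itself this step is delegated to filtlong; the sketch only shows the kind of decision being made per read.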
