PacBio ReadMe
MikeWLloyd edited this page Apr 22, 2024
For each input sample, the pipeline runs the following steps:
• PBMM2 Mapping to reference genome
• PBSV SV calling
• SNIFFLES SV calling
• SURVIVOR SV merging
• SURVIVOR Annotation of results based on intersection with previously identified mouse SVs, genic and exonic regions
flowchart TD
p00([PACBIO READS\nFASTQ])
p001([REFERENCE_GENOME\nGRCm39])
p002[PBMM2_INDEX]
p003[PRE-ALIGNED BAM]
p01[PBMM2_CALL]
p02[PBSV_DISCOVER]
p03[PBSV_CALL]
p04[SNIFFLES]
p05[SURVIVOR_MERGE]
p06[SURVIVOR_SUMMARY]
p07[SURVIVOR_VCF_TO_TABLE]
p08[SURVIVOR_TO_BED]
p09[SURVIVOR_BED_INTERSECT]
p10[SURVIVOR_ANNOTATION]
p11[SURVIVOR_ANNOTATION_WITH_EXONS]
o1([Genomic BAM]):::output
o2([PB SV Calls]):::output
o3([SNIFFLES SV Calls]):::output
o4([Merged VCF]):::output
o5([Annotated SV Calls]):::output
o6([SV Joined Results]):::output
o7([Intersect BEDS]):::output
p00 --> p01
p001 -..-> |Generate Reference Index if Necessary| p002
p002 --> p01
p01 -->o1
o1 --> p02
p02 --> p03
p001 --> p03
o1 --> p04
p003 -..-> |If Pre-Aligned Bam Provided| p02
p003 -..-> |If Pre-Aligned Bam Provided| p04
p03 --> o2
o2 --> p05
p04 --> o3
o3 --> p05
p05 --> o4
o4 --> p06
o4 --> p07
p06 --> p08
p06 --> p10
p07 --> p10
p07 --> p08
p08 --> p09
p08 --> p10
p09 --> o7
o7 --> p10
o4 --> p11
o7 --> p11
p10 --> o6
p11 --> o5
classDef output fill:#90aaff,stroke:#6c8eff,stroke-width:2px,color:#000000
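The dependencies in the flowchart above can be read as a directed acyclic graph. The following is a minimal Python sketch, not part of the pipeline, that transcribes the flowchart edges and derives one valid execution order with the standard-library `graphlib` module. The names `EDGES` and `execution_order` are illustrative, and the sketch treats the FASTQ and pre-aligned BAM inputs as if both were present, whereas the flowchart marks the BAM route as conditional.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (upstream, downstream) pairs transcribed from the flowchart.
EDGES = [
    ("PACBIO READS", "PBMM2_CALL"),
    ("REFERENCE_GENOME", "PBMM2_INDEX"),      # only if an index is needed
    ("PBMM2_INDEX", "PBMM2_CALL"),
    ("PBMM2_CALL", "Genomic BAM"),
    ("Genomic BAM", "PBSV_DISCOVER"),
    ("PBSV_DISCOVER", "PBSV_CALL"),
    ("REFERENCE_GENOME", "PBSV_CALL"),
    ("Genomic BAM", "SNIFFLES"),
    ("PRE-ALIGNED BAM", "PBSV_DISCOVER"),     # only if a BAM is provided
    ("PRE-ALIGNED BAM", "SNIFFLES"),          # only if a BAM is provided
    ("PBSV_CALL", "PB SV Calls"),
    ("PB SV Calls", "SURVIVOR_MERGE"),
    ("SNIFFLES", "SNIFFLES SV Calls"),
    ("SNIFFLES SV Calls", "SURVIVOR_MERGE"),
    ("SURVIVOR_MERGE", "Merged VCF"),
    ("Merged VCF", "SURVIVOR_SUMMARY"),
    ("Merged VCF", "SURVIVOR_VCF_TO_TABLE"),
    ("SURVIVOR_SUMMARY", "SURVIVOR_TO_BED"),
    ("SURVIVOR_SUMMARY", "SURVIVOR_ANNOTATION"),
    ("SURVIVOR_VCF_TO_TABLE", "SURVIVOR_ANNOTATION"),
    ("SURVIVOR_VCF_TO_TABLE", "SURVIVOR_TO_BED"),
    ("SURVIVOR_TO_BED", "SURVIVOR_BED_INTERSECT"),
    ("SURVIVOR_TO_BED", "SURVIVOR_ANNOTATION"),
    ("SURVIVOR_BED_INTERSECT", "Intersect BEDS"),
    ("Intersect BEDS", "SURVIVOR_ANNOTATION"),
    ("Merged VCF", "SURVIVOR_ANNOTATION_WITH_EXONS"),
    ("Intersect BEDS", "SURVIVOR_ANNOTATION_WITH_EXONS"),
    ("SURVIVOR_ANNOTATION", "SV Joined Results"),
    ("SURVIVOR_ANNOTATION_WITH_EXONS", "Annotated SV Calls"),
]


def execution_order(edges):
    """Return one topological ordering of the nodes named in `edges`."""
    sorter = TopologicalSorter()
    for upstream, downstream in edges:
        # add(node, *predecessors): downstream depends on upstream.
        sorter.add(downstream, upstream)
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(sorter.static_order())


order = execution_order(EDGES)
for step in order:
    print(step)
```

Any ordering it prints places alignment before both SV callers and both callers before `SURVIVOR_MERGE`; the exact order among independent steps is not significant.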
-
--sampleID
- Default: <STRING>
- Comment: The sample ID for the input data (required).
-
--pubdir
- Default: /<PATH>
- Comment: The directory where the saved outputs will be stored.
-
--organize_by
- Default: sample
- Comment: How to organize the output folder structure. Options: sample or analysis.
-
--cacheDir
- Default: '/projects/omics_share/meta/containers'
- Comment: The directory that contains cached Singularity containers. JAX users should not change this parameter.
-
-w
- Default: /<PATH>
- Comment: The working directory used for all intermediate files and Nextflow processes. This directory can become quite large, so it should be a location on /fastscratch or another directory with ample storage.
-
--data_type
- Selected: pacbio
- Comment: The germline SV workflow runs in PacBio mode when this option is selected.
-
--pbmode
- Selected: null
- Comment: Options: CCS or CLR. Specifies whether the input data are PacBio CCS or CLR reads.
-
--fastq1
- Default: null
- Comment: The path to a single FASTQ file, or one of a pair of FASTQs for paired-end data.
-
--fastq2
- Default: null
- Comment: The path to the second of a pair of FASTQs for paired-end data.
-
--bam
- Default: null
- Comment: The path to a BAM input file, if alignment has already been performed outside this pipeline.
-
--fasta
- Default: /<PATH>
- Comment: The path to the reference genome in FASTA format.
-
--fasta_index
- Default: /<PATH>
- Comment: Optional parameter to specify an index for the reference genome. If not provided, the pipeline will generate an index.
-
--genome_build
- Default: GRCm38
- Comment: Mouse-specific. Options: GRCm38 or GRCm39. Controls the reference data used for alignment and annotation.
-
--tandem_repeats
- Default: '/ref_data/ucsc_mm10_trf_chr_sorted.bed'
- Comment: BED file that lists the coordinates of centromeres and telomeres to exclude as alignment targets. Note: the default path refers to a location within the containers quay.io/jaxcompsci/pbsv-td_refs:2.8.0--refv0.2.0 and quay.io/jaxcompsci/sniffles-td_refs:2.0.7--refv0.2.0, which require this file.
-
--sv_ins_ref
- Default: '/ref_data/variants_freeze5_sv_INS_mm39_to_mm10_sorted.bed.gz'
- Comment: BED file that lists previously identified insertion SVs. Note: the default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
-
--sv_del_ref
- Default: '/ref_data/variants_freeze5_sv_DEL_mm39_to_mm10_sorted.bed.gz'
- Comment: BED file that lists previously identified deletion SVs. Note: the default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
-
--sv_inv_ref
- Default: '/ref_data/variants_freeze5_sv_INV_mm39_to_mm10_sorted.bed.gz'
- Comment: BED file that lists previously identified inversion SVs. Note: the default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
-
--reg_ref
- Default: '/ref_data/mus_musculus.GRCm38.Regulatory_Build.regulatory_features.20180516.gff.gz'
- Comment: BED file that lists regulatory features. Note: the default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
-
--genes_bed
- Default: '/ref_data/Mus_musculus.GRCm38.102.gene_symbol.bed'
- Comment: BED file that lists gene symbol IDs and coordinates. Note: the default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
-
--exons_bed
- Default: '/ref_data/Mus_musculus.GRCm38.102.exons.bed'
- Comment: BED file that lists exons and coordinates. Note: the default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
-
--surv_dist
- Default: 1000
- Comment: Maximum distance between breakpoints for merging SVs.
-
--surv_supp
- Default: 1
- Comment: The number of callers (out of 4) required to support an SV.
-
--surv_type
- Default: 1
- Comment: Boolean (0/1) that requires SVs to be the same type for merging.
-
--surv_strand
- Default: 1
- Comment: Boolean (0/1) that requires SVs to be on the same strand for merging.
-
--surv_min
- Default: 30
- Comment: Minimum length (bp) to output SVs.
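The five `--surv_*` parameters correspond to the positional arguments of the `SURVIVOR merge` command. As an illustration only, the Python sketch below shows how they could be assembled into an argument list. This page does not show how the pipeline itself invokes SURVIVOR, so the function name `survivor_merge_args`, the file names, and the fixed `0` in the distance-estimation slot are assumptions rather than the pipeline's actual code.

```python
def survivor_merge_args(vcf_list, out_vcf, surv_dist=1000, surv_supp=1,
                        surv_type=1, surv_strand=1, surv_min=30):
    """Build an argument list for `SURVIVOR merge` (hypothetical helper).

    Defaults mirror the --surv_* defaults documented above.
    """
    # surv_type and surv_strand are documented as 0/1 booleans.
    for name, flag in (("surv_type", surv_type), ("surv_strand", surv_strand)):
        if flag not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {flag!r}")

    return [
        "SURVIVOR", "merge",
        str(vcf_list),     # text file listing the VCFs to merge
        str(surv_dist),    # max distance between breakpoints
        str(surv_supp),    # min number of supporting callers
        str(surv_type),    # require same SV type (0/1)
        str(surv_strand),  # require same strand (0/1)
        "0",               # distance-estimation slot; assumed left off here
        str(surv_min),     # min SV length (bp)
        str(out_vcf),      # merged output VCF
    ]


# Hypothetical file names, used only for this example.
args = survivor_merge_args("vcf_list.txt", "merged.vcf")
print(" ".join(args))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting.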
Naming Convention | Description |
---|---|
germline_sv_report.html | Nextflow autogenerated report |
trace/trace.txt | Nextflow trace of processes |
${sampleID}/${sampleID}_PACBIO_PS_struct_var.vcf | VCF output combining merged PBSV and Sniffles calls annotated for overlap with exonic regions |
${sampleID}/${sampleID}_survivor_joined_results.csv | Table of SVs annotated with overlaps of previously identified SVs (beck), genes, exons, regulatory regions |
${sampleID}/alignments/${sampleID}.pbmm2.aligned.bam | Analysis-ready alignment of reads |
${sampleID}/alignments/${sampleID}.pbmm2.aligned.bam.bai | Index for analysis-ready alignment of reads |
${sampleID}/unmerged_calls/${sampleID}.pbsv_calls.vcf | SV calls from PBSV |
${sampleID}/unmerged_calls/${sampleID}.sniffles_sorted_prefix.vcf | SV calls from Sniffles |
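The naming conventions above can be expanded into concrete paths for a given sample, for example to check that a run produced everything expected. The Python sketch below is a hedged illustration: it assumes the listed paths are relative to `--pubdir` and that `--organize_by sample` is in effect, neither of which this page states explicitly. The function name `expected_outputs` and the dictionary keys are hypothetical.

```python
from pathlib import PurePosixPath


def expected_outputs(pubdir, sample_id):
    """Map short labels to the output paths listed in the table above.

    Assumes paths are relative to `pubdir` with per-sample organization.
    """
    root = PurePosixPath(pubdir)
    sample = root / sample_id
    return {
        "report": root / "germline_sv_report.html",
        "trace": root / "trace" / "trace.txt",
        "annotated_vcf": sample / f"{sample_id}_PACBIO_PS_struct_var.vcf",
        "joined_results": sample / f"{sample_id}_survivor_joined_results.csv",
        "bam": sample / "alignments" / f"{sample_id}.pbmm2.aligned.bam",
        "bam_index": sample / "alignments" / f"{sample_id}.pbmm2.aligned.bam.bai",
        "pbsv_vcf": sample / "unmerged_calls" / f"{sample_id}.pbsv_calls.vcf",
        "sniffles_vcf": sample / "unmerged_calls" / f"{sample_id}.sniffles_sorted_prefix.vcf",
    }


# "/results" and "sampleA" are placeholder values for this example.
outputs = expected_outputs("/results", "sampleA")
for label, path in outputs.items():
    print(f"{label}: {path}")
```

`PurePosixPath` only builds path strings and never touches the file system; swapping in `pathlib.Path` and calling `.exists()` on each value would turn this into a completeness check.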