Mapping and preparing sequences for analysis

I trimmed adapter sequences from the raw read files. I used a very large adapter file because smaller subsets of it were leaving adapter contamination in my sequences. I set the minimum read length to 36 bases.

trimming script

I mapped the trimmed reads with BWA to the Drosophila melanogaster v6.23 genome.

mapping script
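As a rough illustration of the mapping step, the sketch below builds a `bwa mem` command line in Python. The linked script is not reproduced here, so the choice of the `mem` algorithm, paired-end input, the thread count, and every file name are assumptions made for the example.

```python
import shlex

def bwa_mem_command(reference, reads_1, reads_2, threads=4):
    """Build an argv list for paired-end mapping with `bwa mem`.

    All paths are hypothetical placeholders. The reference is assumed to
    have been indexed beforehand with `bwa index`.
    """
    return ["bwa", "mem", "-t", str(threads), reference, reads_1, reads_2]

cmd = bwa_mem_command("dmel-r6.23.fasta", "sample_R1.fq.gz", "sample_R2.fq.gz")
# bwa writes SAM to stdout, so the shell form redirects it to a file.
print(shlex.join(cmd) + " > sample.sam")
```

Building the command as a list keeps it easy to pass to `subprocess.run` once the output file is opened for writing.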

I then merged reads from the same biological sample. Samples were sequenced twice to increase read depth.

merging biological replicates

Following mapping, I converted the SAM files to compressed BAM files using SAMtools. This step also sorts the file and filters out reads with a mapping quality below 20.

convert to BAM and sort
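To make the mapping quality filter concrete, here is a minimal Python sketch of the rule applied to plain SAM text. With SAMtools the same filter is typically written as `samtools view -b -q 20` followed by `samtools sort`; the linked script may do it differently. The records below are toy data, not real alignments.

```python
def passes_mapq(sam_line, min_mapq=20):
    """Return True for header lines and for alignments with MAPQ >= min_mapq."""
    if sam_line.startswith("@"):
        return True  # header lines are always kept
    fields = sam_line.rstrip("\n").split("\t")
    return int(fields[4]) >= min_mapq  # column 5 of a SAM record is MAPQ

# Toy SAM content: one header line and two alignments (MAPQ 60 and 7).
records = [
    "@SQ\tSN:chrToy\tLN:1000",
    "read1\t0\tchrToy\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read2\t0\tchrToy\t200\t7\t5M\t*\t0\t0\tACGTA\tIIIII",
]
kept = [r for r in records if passes_mapq(r)]
print(len(kept))  # 2: the header and read1
```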

Realigning around indels using GATK

First, I have to add read group information. In the future, I should do this earlier in the pipeline.

add read groups
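One common way to add read groups is Picard's `AddOrReplaceReadGroups`. The sketch below assumes that tool, since the notes do not name one for this step. The jar path, file names, and the default library, platform, and unit values are hypothetical placeholders.

```python
def add_read_groups_command(in_bam, out_bam, sample, library="lib1",
                            platform="illumina", unit="unit1",
                            picard_jar="picard.jar"):
    """Build an argv list for Picard AddOrReplaceReadGroups.

    The sample name is reused as the read group ID, which is an
    assumption that only holds with one read group per sample.
    """
    return [
        "java", "-jar", picard_jar, "AddOrReplaceReadGroups",
        f"I={in_bam}", f"O={out_bam}",
        f"RGID={sample}", f"RGLB={library}", f"RGPL={platform}",
        f"RGPU={unit}", f"RGSM={sample}",
    ]

print(" ".join(add_read_groups_command("sample.bam", "sample.rg.bam", "sample1")))
```

To do this earlier in the pipeline, `bwa mem` accepts a read group header line through its `-R` option, which would make this separate step unnecessary.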

Indexing the files

GATK index

This step identifies the intervals around indels that are candidates for realignment.

find intervals

Realignment step

GATK realign
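The two realignment steps can be sketched as a pair of command lines. This assumes GATK 3.x, because `RealignerTargetCreator` and `IndelRealigner` were removed in GATK 4. The jar path and all file names are hypothetical.

```python
def realign_commands(reference, in_bam, out_bam, intervals,
                     gatk_jar="GenomeAnalysisTK.jar"):
    """Build argv lists for the two GATK 3.x indel realignment steps."""
    base = ["java", "-jar", gatk_jar]
    # Step 1: find the intervals that need realignment.
    find_targets = base + [
        "-T", "RealignerTargetCreator",
        "-R", reference, "-I", in_bam, "-o", intervals,
    ]
    # Step 2: realign reads within those intervals.
    realign = base + [
        "-T", "IndelRealigner",
        "-R", reference, "-I", in_bam,
        "-targetIntervals", intervals, "-o", out_bam,
    ]
    return find_targets, realign

for cmd in realign_commands("dmel-r6.23.fasta", "sample.rg.bam",
                            "sample.realigned.bam", "sample.intervals"):
    print(" ".join(cmd))
```

Both steps expect the input BAM to be indexed and to carry read groups, which is why those steps come first.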

Finally, I removed duplicate reads using Picard

remove duplicates
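For reference, duplicate removal with Picard is usually done with `MarkDuplicates` and `REMOVE_DUPLICATES=true`. The sketch below assumes that tool and option; the jar path and file names are hypothetical.

```python
def remove_duplicates_command(in_bam, out_bam, metrics,
                              picard_jar="picard.jar"):
    """Build an argv list for Picard MarkDuplicates that drops duplicates.

    Without REMOVE_DUPLICATES=true, Picard only flags duplicate reads
    and leaves them in the output file.
    """
    return [
        "java", "-jar", picard_jar, "MarkDuplicates",
        f"I={in_bam}", f"O={out_bam}", f"M={metrics}",
        "REMOVE_DUPLICATES=true",
    ]

print(" ".join(remove_duplicates_command(
    "sample.realigned.bam", "sample.dedup.bam", "sample.dup_metrics.txt")))
```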

At this point, I can calculate the read depth for each sample

calculate read depth
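As an illustration of the depth calculation, the sketch below averages per-position depth from tab-separated text in the format printed by `samtools depth` (chromosome, position, depth). Using `samtools depth` is an assumption about the linked script, and the input here is toy data. With real data, `samtools depth -a` also reports zero-depth positions so they count toward the mean.

```python
import io

def mean_depth(depth_lines):
    """Average the third column (depth) over all reported positions."""
    total = 0
    positions = 0
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _chrom, _pos, depth = line.split("\t")[:3]
        total += int(depth)
        positions += 1
    return total / positions if positions else 0.0

# Toy input: three positions with depths 10, 20 and 0.
toy = io.StringIO("chrToy\t1\t10\nchrToy\t2\t20\nchrToy\t3\t0\n")
print(mean_depth(toy))  # 10.0
```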

I also merged treatments together at this point for the artificial selection experiment.

merging ee lineages

At this point, I was able to move on to creating pileup/mpileup files and population genetics analysis.