Accounting for genome region specific coverage biases #17

AyushSaxena · 2018-07-10T17:24:59Z

We have observed in our data (generated on several different Illumina machines and with several library prep methods) that local coverage density varies across the genome, and that it does so predictably across all genotypes. When we compute read coverage in fixed-size bins for any two genotypes, the per-bin coverage of one genotype correlates with the per-bin coverage of the other. If sampling across the genome were random, we would expect no correlation. In the real data, the correlation coefficient also stays the same regardless of bin size.
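To make the comparison concrete, here is a minimal sketch of the per-bin calculation, assuming read start positions are available as 0-based integers on a single contig. The function names (`bin_counts`, `pearson`), the genome length, and the read counts are illustrative assumptions, not our actual pipeline. It also checks the expectation stated above: two independent, uniformly sampled read sets should give a correlation close to zero.

```python
import math
import random


def bin_counts(read_starts, genome_length, bin_size):
    """Count reads per fixed-width bin from 0-based read start positions."""
    n_bins = math.ceil(genome_length / bin_size)
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy setup: a 1 Mb contig, 20,000 reads per "genotype",
# both drawn uniformly and independently (no regional bias).
GENOME_LENGTH = 1_000_000
N_READS = 20_000
rng = random.Random(0)
sample_a = [rng.randrange(GENOME_LENGTH) for _ in range(N_READS)]
sample_b = [rng.randrange(GENOME_LENGTH) for _ in range(N_READS)]

for bin_size in (1_000, 10_000):
    cov_a = bin_counts(sample_a, GENOME_LENGTH, bin_size)
    cov_b = bin_counts(sample_b, GENOME_LENGTH, bin_size)
    print(bin_size, round(pearson(cov_a, cov_b), 3))
```

With real data, the two position lists would come from the aligned reads of each genotype instead of a random number generator; everything else stays the same.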

Reads simulated with wg-sim also show this correlation, although the correlation coefficient is smaller and only approaches that of the real data at bin sizes of >100 kb. Is there a way to control this correlation coefficient ourselves?

Ayush
