varscan report wrong DP related values, almost 50% off #46

Open
yingchen69 opened this issue Aug 20, 2019 · 2 comments

@yingchen69

Hi,

I just found a weird issue with VarScan, and it happens with both 2.4.3 and 2.4.4. Basically, the VarScan VCF files report DP-related values that are roughly 50% lower than the values shown in IGV or in the Mutect2 VCF files.

Here is the varscan 2.4.4 vcf for EGFR L858R:

chr7 55259515 . T G . VarBaseQual ADP=1191;WT=0;HET=1;HOM=0;NC=0;ANN=G|missense_variant|MODERATE|EGFR|EGFR|transcript|NM_005228.3|protein_coding|21/28|c.2573T>G|p.L858R|2819/5600|2573/3633|858/1210||,G|sequence_feature|LOW|EGFR|EGFR|helix:combinatorial_evidence_used_in_manual_assertion|NM_005228.3|protein_coding|21/28|c.2573T>G||||||,G|upstream_gene_variant|MODIFIER|EGFR-AS1|EGFR-AS1|transcript|NR_047551.1|pseudogene||n.-2873A>C|||||2873| GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:27:1191:1191:1179:12:1.01%:1.6693E-3:53:54:1013:166:11:1

Here is gatk Mutect2 vcf for the same variant:

chr7 55259515 . T G . PASS AC=1;AF=0.500;AN=2;CONTQ=93;ClippingRankSum=0.125;DP=2318;ECNT=2;FS=0.000;GERMQ=93;LikelihoodRankSum=-1.190;MBQ=20,20;MFRL=172,176;MMQ=60,60;MPOS=37;MQ=60.00;MQRankSum=0.000;POPAF=7.30;ROQ=77;ReadPosRankSum=0.086;SEQQ=28;SOR=0.615;STRANDQ=31;TLOD=6.62;UNIQ_ALT_READ_COUNT=26;ANN=G|missense_variant|MODERATE|EGFR|EGFR|transcript|NM_005228.3|protein_coding|21/28|c.2573T>G|p.L858R|2819/5600|2573/3633|858/1210||,G|sequence_feature|LOW|EGFR|EGFR|helix:combinatorial_evidence_used_in_manual_assertion|NM_005228.3|protein_coding|21/28|c.2573T>G||||||,G|upstream_gene_variant|MODIFIER|EGFR-AS1|EGFR-AS1|transcript|NR_047551.1|pseudogene||n.-2873A>C|||||2873| GT:AD:AF:DP:F1R2:F2R1:SB 0/1:2146,26:9.246e-03:2172:1016,11:1122,14:1029,1117,13,13

When I look at the BAM file in IGV, the depth numbers from IGV are almost identical to the values in the Mutect2 VCF.
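
To make the discrepancy concrete, here is a minimal Python sketch that pairs each FORMAT key with its sample value for the two records above and compares the depth fields. The FORMAT and sample strings are copied from the records in this issue; the helper name `parse_sample` is only illustrative and not part of VarScan or GATK.

```python
def parse_sample(format_col, sample_col):
    """Pair each FORMAT key with the matching value in the sample column."""
    return dict(zip(format_col.split(":"), sample_col.split(":")))


# FORMAT and sample columns copied from the VarScan 2.4.4 record above.
varscan = parse_sample(
    "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR",
    "0/1:27:1191:1191:1179:12:1.01%:1.6693E-3:53:54:1013:166:11:1",
)

# FORMAT and sample columns copied from the Mutect2 record above.
mutect2 = parse_sample(
    "GT:AD:AF:DP:F1R2:F2R1:SB",
    "0/1:2146,26:9.246e-03:2172:1016,11:1122,14:1029,1117,13,13",
)

# VarScan reports reference and variant read counts in separate fields (RD, AD).
varscan_dp = int(varscan["DP"])
varscan_ref = int(varscan["RD"])
varscan_alt = int(varscan["AD"])

# Mutect2 reports both counts in one comma-separated AD field (ref first, then alt).
mutect2_dp = int(mutect2["DP"])
mutect2_ref, mutect2_alt = (int(x) for x in mutect2["AD"].split(","))

depth_ratio = varscan_dp / mutect2_dp

print(f"VarScan DP={varscan_dp} (RD={varscan_ref}, AD={varscan_alt})")
print(f"Mutect2 DP={mutect2_dp} (ref={mutect2_ref}, alt={mutect2_alt})")
print(f"VarScan DP / Mutect2 DP = {depth_ratio:.2f}")
```

For these two records the VarScan sample depth is a little over half of the Mutect2 sample depth, and in both callers the reference and variant counts add up to the reported sample DP.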

Any explanation?

Thanks,

Ying

yingchen69 changed the title from "varscan report run DP related values, almost 50% off" to "varscan report wrong DP related values, almost 50% off" on Aug 20, 2019

@bioinfolusx

I have the same question.

@nh13

nh13 commented May 21, 2020

Have you taken a look at the FAQ entry "Read counts from VarScan are different from SAMtools/IGV counts"?

Also, the "DP" reported by VarScan2 seems to count all the reads at the position, not just those that observe the called alleles. For example, suppose there are three candidate alleles A, C, and G, but the genotype is A/G. Then DP would be the sum of the reads observing all three candidate alleles, while AD and RD would be the counts for the G and A alleles respectively (assuming A is the reference). Note that the counts depend on the samtools pileup, so if you've filtered bases there, that would also reduce the depth counts.
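
The three-allele example can be sketched with a few lines of Python. The per-base counts below are hypothetical numbers chosen only for illustration, and the arithmetic follows the reading of DP, RD, and AD described in this comment rather than anything confirmed from the VarScan source.

```python
# Hypothetical per-base read counts at one position (not taken from the issue).
base_counts = {"A": 50, "C": 3, "G": 40}

ref_allele = "A"  # assumed reference base
alt_allele = "G"  # called variant allele; the genotype is A/G

rd = base_counts[ref_allele]    # reads supporting the reference allele
ad = base_counts[alt_allele]    # reads supporting the called variant allele
dp = sum(base_counts.values())  # all reads, including the uncalled C allele

# Reads that count toward DP but toward neither RD nor AD.
uncalled = dp - (rd + ad)

print(f"DP={dp}, RD={rd}, AD={ad}, reads on uncalled alleles={uncalled}")
```

Under this reading, DP can exceed RD + AD whenever reads support a candidate allele that is not part of the genotype.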
