Data normalisation, Annotate FACETS output and how to identify tandem duplication pattern #206

Open
tanayb001 opened this issue Jan 25, 2025 · 0 comments


Dear Developers,

Thank you for developing the FACETS tools. I am using FACETS to identify CNVs in my WES data, and I have some questions about the output and about how to identify certain patterns. I run FACETS through the wrapper as follows:

cnv_facets.R -n P03-BD.fixmate.sorted.duprm.recal.bam -t P03-TD/IITK-P03-TD.fixmate.sorted.duprm.recal.bam -vcf gnomad/renamed.hg38.vcf.gz -T covered_liftoverto_hg38.bed -cv 25 200 -g hg38 -a covered_liftoverto_hg38.bed -N 24 -o P03-TD

I get the following output files:
P03-TD_25_200.cnv.png P03-TD_25_200.csv.gz P03-TD_25_200.vcf.gz
P03-TD_25_200.cov.pdf P03-TD_25_200.spider.pdf P03-TD_25_200.vcf.gz.tbi

[Image: output CNV figure]

Please see the output CNV figure. I am not able to extract meaningful information from it. The figure also looks poorly resolved: in your paper, the y-axis of the CNV plot in Figure 7 is better resolved, but in my case I am not able to call an LOH event.

I have a few questions regarding the output and the labeling of the CNVs.

Q1: If my tumor sample coverage is 200X and the matched normal sample coverage is 100X, how do I normalize the depth ratio? If we do not normalize, most of my calls would be amplifications, which is not true.
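To make the concern in Q1 concrete, here is a minimal sketch of the generic idea of centering the tumor/normal log-ratio on its sample-wide median, so that a uniform 2:1 difference in sequencing depth cancels out. The depths are made-up numbers and `centered_log_ratios` is a hypothetical helper; this is not FACETS's internal code, only an illustration of the normalization being asked about.

```python
import math
from statistics import median


def centered_log_ratios(tumor_depths, normal_depths):
    """Return log2(tumor/normal) per position, centered on the sample median.

    Centering removes any global depth difference between the two libraries,
    so only position-specific deviations remain.
    """
    raw = [math.log2(t / n) for t, n in zip(tumor_depths, normal_depths)]
    center = median(raw)
    return [r - center for r in raw]


# Hypothetical depths: tumor sequenced at about 200X, normal at about 100X.
# The fourth position has a doubled tumor depth relative to the rest.
tumor = [200, 210, 190, 400, 200]
normal = [100, 105, 95, 100, 100]

log_ratios = centered_log_ratios(tumor, normal)
```

Without centering, every position here would have a log2 ratio of at least 1 and would look like a gain; after centering, only the fourth position stands out.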

Q2: The attached figure is from a CDK12-mutated sample. Since tandem duplication is a signature of CDK12 mutations, I want to find the spikes corresponding to tandem duplications in the CNV figure. How can I identify the tandem duplication pattern in my samples?
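One way to describe the "spikes" in Q2 is as focal gain segments flanked on both sides by segments of lower copy number on the same chromosome. The sketch below assumes the segments have already been loaded into dictionaries with hypothetical keys `chrom`, `start`, `end`, and `tcn`, sorted by position; the size bounds are illustrative placeholders, not validated thresholds. Copy number alone cannot confirm that a gain is a tandem duplication, so this would only shortlist candidates.

```python
def focal_gain_spikes(segments, min_len=100_000, max_len=10_000_000):
    """Return segments that are focal gains relative to both neighbours.

    `segments` must be sorted by chromosome and start position.
    `min_len` and `max_len` are illustrative size bounds in base pairs.
    """
    spikes = []
    # Walk over consecutive triples: (previous, current, next).
    for prev, seg, nxt in zip(segments, segments[1:], segments[2:]):
        same_chrom = prev["chrom"] == seg["chrom"] == nxt["chrom"]
        length = seg["end"] - seg["start"]
        higher_than_flanks = seg["tcn"] > prev["tcn"] and seg["tcn"] > nxt["tcn"]
        if same_chrom and min_len <= length <= max_len and higher_than_flanks:
            spikes.append(seg)
    return spikes


# Hypothetical segments, for illustration only.
example_segments = [
    {"chrom": "chr1", "start": 0, "end": 50_000_000, "tcn": 2},
    {"chrom": "chr1", "start": 50_000_000, "end": 51_000_000, "tcn": 3},
    {"chrom": "chr1", "start": 51_000_000, "end": 100_000_000, "tcn": 2},
    {"chrom": "chr2", "start": 0, "end": 100_000_000, "tcn": 3},
]

candidates = focal_gain_spikes(example_segments)
```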

Q3: How did you label the relevant CNVs in the CNV figure, as in Figure 7 of the FACETS paper?

Q4: How can I identify and label heterozygous losses, homozygous deletions, copy-neutral LOH, and the actual copy number (as you did for PPM1D in Figure 7)? I know I can infer the CNAs by looking at the values, but since there is a huge number of segments, it is hard to extract the relevant information. It would be better if I could label them directly.
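The kind of direct labeling meant in Q4 can be sketched as a rule on the total copy number and the minor-allele copy number of each segment, such as the values FACETS reports in its `tcn.em` and `lcn.em` columns. The sketch assumes a diploid baseline; the function name and the label strings are illustrative and are not part of FACETS or the wrapper.

```python
def label_segment(tcn, lcn):
    """Label one segment from total (tcn) and minor-allele (lcn) copy number.

    Assumes a diploid baseline. `lcn` may be None when the allelic state
    could not be estimated for the segment.
    """
    if tcn is None:
        return "undetermined"
    if tcn == 0:
        return "homozygous deletion"
    if tcn == 1:
        return "heterozygous loss"
    if lcn is None:
        # Total copy number is known, but LOH status cannot be decided.
        if tcn > 2:
            return "gain (allelic state undetermined)"
        return "neutral (allelic state undetermined)"
    if tcn == 2:
        return "copy-neutral LOH" if lcn == 0 else "neutral"
    return "gain with LOH" if lcn == 0 else "gain"


# Hypothetical (tcn, lcn) pairs, for illustration only.
labels = [label_segment(t, l) for t, l in [(0, 0), (1, 0), (2, 0), (2, 1), (4, 0), (3, 1)]]
```

Applying such a function to every row of the segment table, and then intersecting the labeled segments with gene coordinates, would give per-gene labels without reading the values by eye.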

Q5: How can I identify subclonal loss in my sample, as you did for Figure 2 in the FACETS paper?
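A possible starting point for Q5, as a sketch only: FACETS reports a cellular fraction per segment (the `cf.em` column), and segments whose cellular fraction is clearly below the sample purity estimate are candidates for subclonal events. The dictionary keys and the `tolerance` value below are assumptions made for illustration, not recommended settings.

```python
def subclonal_segments(segments, purity, tolerance=0.1):
    """Return segments whose cellular fraction is well below the purity.

    `tolerance` is an illustrative margin to avoid flagging segments whose
    cellular fraction differs from purity only by estimation noise.
    """
    return [
        s for s in segments
        if s["cf"] is not None and s["cf"] < purity - tolerance
    ]


# Hypothetical segments, for illustration only.
example = [
    {"chrom": "chr3", "start": 1, "end": 2_000_000, "tcn": 1, "cf": 0.80},
    {"chrom": "chr9", "start": 1, "end": 5_000_000, "tcn": 1, "cf": 0.35},
    {"chrom": "chr12", "start": 1, "end": 1_000_000, "tcn": 2, "cf": None},
]

subclonal = subclonal_segments(example, purity=0.80)
```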

Apologies for so many questions, but the answers would help me a great deal in understanding the FACETS output for my data.

I look forward to hearing from you soon.

Regards,

Tanay
