AssertionError #7
Comments
Hi Alejandro, can you please share how you solved the problem? I am getting the following. Arguments are: Valid sample for THetA analysis: Reading in query file... Thanks!
Hi @ChiragNepal, I couldn't solve the problem. I used other tools in the end: CopywriteR for relative copy number alterations, ABSOLUTE for tumour purity and absolute copy numbers, and PyClone for the cellular prevalence of the mutations. I hope that helps. Best,
I'm also having this problem. Removing multiprocessing and changing
I don't have any problems running the example, for some reason. Just my own data.
It has been suggested in another forum that at least 1 SNP has to be present in every single interval of interval_count. I assume that the example works perfectly because the number of intervals is very limited compared to the number of SNPs. Can someone confirm this hypothesis? Is THetA still supported by the developers? If it is confirmed, then anyone who wants a higher resolution in terms of intervals (e.g. for WGS data presenting chromothripsis/chromoplexy events) will find it almost impossible to have 1 SNP per interval.
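One way to test this hypothesis on your own data is to list the intervals that contain no SNP at all. The following is only a minimal sketch: the function name `intervals_without_snps` and the in-memory tuples are hypothetical, coordinates are assumed to be inclusive, and it does not read THetA's file formats, so you would need to parse your own interval and SNP files into these tuples first.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def intervals_without_snps(intervals, snps):
    """Return the intervals that contain no SNP position.

    intervals: iterable of (chrom, start, end), inclusive coordinates (assumption)
    snps: iterable of (chrom, pos)
    """
    # Group SNP positions by chromosome and sort them for binary search.
    positions = defaultdict(list)
    for chrom, pos in snps:
        positions[chrom].append(pos)
    for chrom in positions:
        positions[chrom].sort()

    empty = []
    for chrom, start, end in intervals:
        pos_list = positions.get(chrom, [])
        # Count SNPs with start <= pos <= end.
        lo = bisect_left(pos_list, start)
        hi = bisect_right(pos_list, end)
        if hi == lo:
            empty.append((chrom, start, end))
    return empty


# Hypothetical toy data, not taken from any real sample.
example_intervals = [("1", 100, 200), ("1", 201, 300), ("2", 1, 50)]
example_snps = [("1", 150), ("1", 200), ("2", 60)]
print(intervals_without_snps(example_intervals, example_snps))
```

If the returned list is non-empty for your sample but empty for the example data, that would at least be consistent with the hypothesis, although it does not prove that this is the cause of the AssertionError.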
Dear @BaptisteAmeline, As you can read in the manual (https://github.com/raphael-group/THetA/blob/master/doc/MANUAL.txt), the usage of SNPs is part of a recommended but optional step. However, to clarify your point (since almost all allele-specific copy-number callers consider only segments having at least one heterozygous SNP): germline SNPs are considered, and their location is given with respect to the reference genome that you used to align the reads. Therefore, catastrophic events such as chromothripsis/chromoplexy do not affect in any way the position or presence of germline heterozygous SNPs in the reference genome. In the human reference genome you may expect about 1 SNP every 1k bases, so you should expect to have many SNPs in your intervals with standard bin sizes. The allele counts from these SNPs are used to infer the allele-specific copy numbers of each segment in the tumor sample resulting from any aberration (including chromothripsis/chromoplexy). E.g. a heterozygous SNP should have an allele proportion of 50%; if one of the alleles is lost, the expected proportion of that allele is 0%, assuming the loss occurs in all cells of the sample. You can search for B-allele frequency (BAF) to learn more about this.
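To make the BAF reasoning above concrete, here is a small sketch of the arithmetic. It is a generic illustration and not THetA's actual model: the function names are hypothetical, normal cells are assumed to carry one copy of each allele, and `purity` is assumed to be the fraction of tumor cells in the sample.

```python
def observed_baf(ref_count, alt_count):
    """B-allele frequency observed at one SNP from its read counts."""
    total = ref_count + alt_count
    if total == 0:
        return None  # no coverage, BAF is undefined
    return alt_count / total


def expected_baf(purity, tumor_a, tumor_b):
    """Expected BAF at a germline heterozygous SNP.

    Assumes a mixture of normal cells (1 copy of A, 1 copy of B) and tumor
    cells carrying tumor_a copies of allele A and tumor_b copies of allele B.
    """
    b_copies = purity * tumor_b + (1 - purity) * 1
    all_copies = purity * (tumor_a + tumor_b) + (1 - purity) * 2
    if all_copies == 0:
        return None  # e.g. homozygous deletion in a 100% pure sample
    return b_copies / all_copies


print(expected_baf(1.0, 1, 1))  # balanced heterozygous segment
print(expected_baf(1.0, 1, 0))  # allele B lost in every cell
print(expected_baf(0.5, 1, 0))  # allele B lost in half of the cells
```

The first two calls reproduce the 50% and 0% cases described above; the third shows how normal-cell contamination pulls the expected proportion back towards 50%, which is why the allele counts carry information about both copy number and purity.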
I successfully analysed the example files with RunTheta; however, when I tried to analyse my own sample it produced an error. I have whole exome sequencing BAM files for the normal and the primary tumour. Following the instructions, I created the THetA input file and the snp.withCounts files, one for the normal and one for the tumour. However, when I run the following in the THetA-master directory:
I get:
The normal_snp.withCounts file looks like this:
The primary_snp.withCounts looks like this:
The normal_primary.input looks like this:
I would appreciate very much any help provided.
Many thanks,
Alejandro