We need to think about how we model the expected distribution for each state:
Normal read depth can be modelled as normal, Poisson, or negative binomial. With high enough read depth it approximates a normal distribution, except that it can't take negative values. Previously we were performing a negative binomial transformation to give a normal distribution, so that we could use the distribution of the difference between two normal distributions. However, we could model it directly as negative binomial.
Duplications and a single-copy deletion will be similar, just with different parameters.
Two-copy deletions will be near uniformly 0.
Alternatively, we could use the distribution for the normal copy number as the only distribution and base the probabilities on distance from its mean, i.e. depth > mean with low p = high probability of duplication.
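A rough sketch of what per-state emission distributions could look like, using scipy's negative binomial parameterised by mean/variance. All parameter values here are placeholders rather than fitted estimates, and the low-mean Poisson for the two-copy deletion is just one way to stand in for "near uniformly 0":

```python
from scipy import stats

def nbinom_from_moments(mean, var):
    """Convert a mean/variance pair into scipy's (n, p) parameterisation."""
    p = mean / var
    n = mean ** 2 / (var - mean)
    return stats.nbinom(n, p)

normal_depth = 30.0  # illustrative mean read depth for two copies

emissions = {
    # normal, duplication and single-copy deletion share the same family,
    # just with different parameters
    "normal":      nbinom_from_moments(normal_depth, 2.0 * normal_depth),
    "deletion":    nbinom_from_moments(0.5 * normal_depth, 1.0 * normal_depth),
    "duplication": nbinom_from_moments(1.5 * normal_depth, 3.0 * normal_depth),
    # two-copy deletion: depth piles up near 0; a low-mean Poisson stands in here
    "double_deletion": stats.poisson(0.5),
}

# probability of observing a read depth of 12 in a window under each state
state_probs = {state: dist.pmf(12) for state, dist in emissions.items()}
```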
How far astray do you think the following approach sounds: cluster the observations into as many clusters as there are states, then for each cluster fit a distribution and use that as the emission probability distribution for the corresponding state in the HMM? A rough sketch of what I mean is below.
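A minimal sketch of the idea, assuming scikit-learn's KMeans for the clustering and a per-cluster negative binomial fitted by moments (the function names and state count are illustrative):

```python
import numpy as np
from scipy import stats
from sklearn.cluster import KMeans

def fit_state_distributions(read_depths, n_states=4):
    """Cluster 1-D read depths and fit a distribution to each cluster."""
    X = np.asarray(read_depths, dtype=float).reshape(-1, 1)
    labels = KMeans(n_clusters=n_states, n_init=10).fit_predict(X)

    distributions = []
    for k in range(n_states):
        cluster = X[labels == k].ravel()
        mean, var = cluster.mean(), cluster.var()
        if var <= mean:
            # negative binomial needs var > mean; fall back to Poisson
            distributions.append(stats.poisson(mean))
        else:
            p = mean / var
            n = mean ** 2 / (var - mean)
            distributions.append(stats.nbinom(n, p))
    return distributions

# toy usage: emission probability of a window depth under each fitted state
dists = fit_state_distributions(np.random.poisson(30, size=1000), n_states=4)
print([d.pmf(25) for d in dists])
```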
Sure, any starting point to get the model going will be good; we can always change the distributions.
We could keep it even simpler and base it on the normal copy number as the only distribution. Each window is assessed against that, i.e. probably normal, probably above normal, or probably below normal, and then we combine the two sets of probabilities to calculate the likelihood of each state, e.g. sig below and sig below = 0.90 deletion, 0.05 TUF, 0.04 normal, 0.01 dup; normal and normal = 0.75 normal, 0.10 deletion, 0.10 dup, 0.05 TUF, etc.
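Something like the following, assuming the "two sets of probabilities" are two read-depth tracks assessed for the same window; the normal-copy-number model, the significance cutoff, and the lookup numbers are taken straight from the illustrative figures above, not calibrated values:

```python
from scipy import stats

# placeholder model for read depth at normal copy number
normal_dist = stats.norm(loc=30.0, scale=5.0)

def classify(depth, alpha=0.05):
    """Call a window significantly below, normal, or significantly above."""
    cdf = normal_dist.cdf(depth)
    if cdf < alpha:
        return "below"
    if cdf > 1.0 - alpha:
        return "above"
    return "normal"

# map the pair of calls to state probabilities (only two rows filled in here)
STATE_TABLE = {
    ("below", "below"):   {"deletion": 0.90, "tuf": 0.05, "normal": 0.04, "dup": 0.01},
    ("normal", "normal"): {"normal": 0.75, "deletion": 0.10, "dup": 0.10, "tuf": 0.05},
}

calls = (classify(12), classify(14))
state_probs = STATE_TABLE.get(calls)
```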
OK, cool. I will start looking into the clustering approach and see what we get. I will add sklearn to our requirements to use their clustering algorithms, although we may have to implement others ourselves.