Adapt hl.de_novo function #760
base: main
thanks for adding these functions! I have some questions (many because I'm not that familiar with this work) and suggestions
)

def transform_pl_to_pp(pl_expr: hl.expr.ArrayExpression) -> hl.expr.ArrayExpression:
Naive question: is the `pp` here posterior probability?
It's the conditional probability of each genotype given the data, as calculated by HaplotypeCaller; we're transforming it back.
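For readers following along, here is a minimal plain-Python sketch of what a PL-to-PP transformation usually looks like. It assumes PL values are Phred-scaled (so each converts back via 10^(-PL/10)) and that the result is normalized to sum to 1. The helper name `pl_to_pp` is hypothetical; it is not the function under review, which operates on Hail expressions.

```python
from typing import List


def pl_to_pp(pl: List[int]) -> List[float]:
    """Convert Phred-scaled genotype likelihoods to normalized linear probabilities.

    Hypothetical helper for illustration only.
    """
    # Undo the Phred scaling: PL = -10 * log10(likelihood).
    linear = [10 ** (-value / 10) for value in pl]
    # Normalize so the per-genotype values sum to 1.
    total = sum(linear)
    return [value / total for value in linear]


# PLs for hom-ref, het, and hom-alt genotypes (made-up values).
pp = pl_to_pp([0, 10, 20])
```

The genotype with PL 0 ends up with the largest share, and the three values sum to 1.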
"HIGH", | ||
) | ||
.when((p_de_novo > med_conf_p) & (proband_ab > high_med_conf_ab), "MEDIUM") | ||
.when((p_de_novo > low_conf_p) & (proband_ab >= low_conf_ab), "LOW") |
I didn't dig into Kaitlin's code, but her documentation uses `>` and not `>=` for AB:
p_dn > 0.05 and child_AD > 0.2
Yeah, I knew that, but < 0.2 is failing; what happens with AD = 0.2?
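The boundary case being discussed can be sketched in plain Python. This assumes the fail filter is `proband_ab < min_proband_ab`, as in the diff excerpt, and uses the 0.05 and 0.2 cutoffs quoted from the documentation. The function names and the `inclusive` switch are hypothetical and exist only to compare the two operators.

```python
def passes_min_ab_filter(proband_ab: float, min_proband_ab: float = 0.2) -> bool:
    """A call fails only when AB is strictly below the cutoff."""
    return not (proband_ab < min_proband_ab)


def is_low_confidence(
    p_de_novo: float,
    proband_ab: float,
    low_conf_p: float = 0.05,
    low_conf_ab: float = 0.2,
    inclusive: bool = True,
) -> bool:
    """Apply the LOW-confidence rule with either `>=` or `>` on AB."""
    if inclusive:
        ab_ok = proband_ab >= low_conf_ab
    else:
        ab_ok = proband_ab > low_conf_ab
    return p_de_novo > low_conf_p and ab_ok


at_cutoff = 0.2
passes_filter = passes_min_ab_filter(at_cutoff)            # not failed by `<`
low_with_ge = is_low_confidence(0.1, at_cutoff, inclusive=True)
low_with_gt = is_low_confidence(0.1, at_cutoff, inclusive=False)
```

With a strict `<` fail filter, a call at exactly 0.2 passes the filter; `>=` assigns it LOW, while `>` would leave it with no confidence level at all.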
de_novo_prior=de_novo_prior,
)

# Determine genomic context
Rather than running this in this function and in `calculate_de_novo_post_prob`, why not switch the order and run `get_genomic_context` first and pass in the three return values to `calculate_de_novo_post_prob` as function arguments?
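A rough plain-Python sketch of the suggested call order follows. Everything in it is a stand-in: `genomic_context`, `post_prob_model`, the three field names, and the contig logic are hypothetical and only illustrate computing the context once and passing it along, not the real Hail-based functions.

```python
from typing import NamedTuple


class GenomicContext(NamedTuple):
    """Hypothetical container for the three context flags."""

    diploid: bool
    hemi_x: bool
    hemi_y: bool


def genomic_context(contig: str, in_par: bool, is_female: bool) -> GenomicContext:
    """Stand-in for the context helper: classify a site once."""
    on_sex_chrom = contig in ("chrX", "chrY")
    diploid = (not on_sex_chrom) or in_par or (contig == "chrX" and is_female)
    hemi_x = contig == "chrX" and not in_par and not is_female
    hemi_y = contig == "chrY" and not in_par and not is_female
    return GenomicContext(diploid, hemi_x, hemi_y)


def post_prob_model(diploid: bool, hemi_x: bool, hemi_y: bool) -> str:
    """Stand-in for the probability step: it receives the flags as arguments
    instead of recomputing them, and here only reports which branch applies."""
    if hemi_x:
        return "hemi_x"
    if hemi_y:
        return "hemi_y"
    return "diploid" if diploid else "missing"


# Compute the context once in the caller, then pass the three values in.
ctx = genomic_context("chrX", in_par=False, is_female=False)
model = post_prob_model(ctx.diploid, ctx.hemi_x, ctx.hemi_y)
```

The caller can then reuse `ctx` for its own filters without a second context computation.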
.or_missing()
)

parent_sum_ad_0 = (
These aren't sums; should this be renamed `parent_ad_0`, `parent_ad_0_check`, or something similar?
Great catch, it should be `hl.sum()`.
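A plain-Python sketch of a summed check, assuming the intent is to flag sites where the two parents' `AD[0]` values add up to 0. The exact condition is not visible in the excerpt, so the comparison and the helper name below are hypothetical.

```python
from typing import Sequence


def parent_sum_ad_0_flag(father_ad: Sequence[int], mother_ad: Sequence[int]) -> bool:
    """Return True when the parents' AD[0] values sum to zero (assumed fail condition)."""
    return sum([father_ad[0], mother_ad[0]]) == 0


# Made-up allele depths in [AD[0], AD[1]] order.
flag_zero = parent_sum_ad_0_flag([0, 12], [0, 9])
flag_nonzero = parent_sum_ad_0_flag([15, 0], [0, 9])
```

Testing each parent separately would answer a different question, which is the naming mismatch raised above.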
"min_dp_ratio": dp_ratio < min_dp_ratio, | ||
"parent_sum_ad_0": parent_sum_ad_0, | ||
"max_parent_ab": fail_max_parent_ab, | ||
"min_proband_ab": proband_ab < min_proband_ab, |
Should this be `<=` instead of `<`?
Okay, I will change it here.
locus_expr, is_female_expr
)

is_de_novo = (
the current setup means you're calculating this probability on all variants, right? could you filter to variants that are eligible for being de novos first and then calculate probabilities? or is there a reason you don't want to do this filter upfront?
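The filter-first idea can be sketched in plain Python. The eligibility rule used here (proband heterozygous, both parents homozygous reference) is a simplified autosomal assumption, and `is_eligible`, `expensive_post_prob`, and the genotype strings are hypothetical stand-ins rather than the PR's code.

```python
from typing import Dict, List

call_count = 0


def is_eligible(trio: Dict[str, str]) -> bool:
    """Simplified autosomal rule: het proband, hom-ref parents (assumption)."""
    return (
        trio["proband_gt"] == "0/1"
        and trio["father_gt"] == "0/0"
        and trio["mother_gt"] == "0/0"
    )


def expensive_post_prob(trio: Dict[str, str]) -> float:
    """Stand-in for the probability calculation; counts how often it runs."""
    global call_count
    call_count += 1
    return 0.5  # placeholder value, not a real probability


trios: List[Dict[str, str]] = [
    {"proband_gt": "0/1", "father_gt": "0/0", "mother_gt": "0/0"},
    {"proband_gt": "0/1", "father_gt": "0/1", "mother_gt": "0/0"},
    {"proband_gt": "0/0", "father_gt": "0/0", "mother_gt": "0/0"},
]

# Filter first, then compute only on the candidates that can be de novo.
eligible = [trio for trio in trios if is_eligible(trio)]
probs = [expensive_post_prob(trio) for trio in eligible]
```

Only one of the three trios reaches the probability step, so the expensive function runs once.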
I also forgot to ask in my initial review -- could you add tests for the new functions?
Co-authored-by: Katherine Chao <[email protected]>
These are smaller functions modified from Julia's work on combining hl.de_novo and Kaitlin’s code.