Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestions to per sample counts 4.1 PR #581

Merged

Conversation

jkgoodrich
Copy link
Contributor

No description provided.

Copy link
Contributor

@KoalaQin KoalaQin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the changes, it's cleaner than ours! My first round of going through without running the test yet, lmk when you fix the "class too large" issue.


Aggregated statistics can also be computed by ancestry.
"""
# TODO: Maybe move to a folder called assessment and rename to
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matren395 @KoalaQin thoughts on this TODO?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does 'assessment' exist in older versions or other projects?

this would be creating stats to go with the release, so I'm very okay leaving it in create_release. However, if more assessments are coming down the road, then making a new folder could make sense? I think I'm leaning against, but not super strongly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have assesment under gnomad_qc, but we do have it under gnomad_methods and there's the summary_stats.py there. One question though, are we publicly releasing this per_sample count table or just the final aggregate stats? I'm not sure if we're allowed to, like meta isn't public.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, we have not productionized release stats before, but I was using assessment because that is what we used for some release level stats that went into gnomad_methods. Nope, we can't release the HT, and the stats wont actually be fully released, we will add some of them to the stats page or to other pages of the browser. That's why I was thinking moving them to an assessment folder where we can also put the script in this PR into it.

Copy link
Contributor

@KoalaQin KoalaQin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get Julia's PR in before thinking about combinations?

Copy link
Contributor

@KoalaQin KoalaQin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Running test only took ~3.5 minutes.

@jkgoodrich jkgoodrich merged commit cbecc2f into dm/per_sample_counts_4_1 Mar 20, 2024
2 checks passed
@jkgoodrich jkgoodrich deleted the jg/suggestions_to_per_sample_counts_4_1 branch March 20, 2024 22:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants