Create evaluation utility to compute residue conservation from MSA #61

Open
jeffreyruffolo opened this issue Oct 25, 2023 · 0 comments

Conservation of amino acids in multiple sequence alignments is an indicator of functional importance. In lieu of experimental function assays, one way to evaluate the design capabilities of our model is to measure how likely the model is to generate a sequence with correct functional residues.

Toward this goal, we need a utility that identifies and quantifies the conservation of individual amino acids in a sequence given an MSA. Given a query sequence and an MSA, it should compute a per-position conservation measure (e.g., entropy of the amino acid distribution in each column) for every position aligned to the query.
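As a starting point, here is a minimal sketch of the entropy variant. The function name `column_entropies`, the gap characters, and the input layout (the query given as its aligned row, the MSA as a list of equal-length strings) are assumptions made for illustration, not an existing API in this project. Gaps are ignored when counting, and columns where the query itself has a gap are skipped.

```python
import math
from collections import Counter

# Assumption: "-" and "." both denote gaps, as in common FASTA/A3M-style alignments.
GAP_CHARS = {"-", "."}


def column_entropies(query, msa):
    """Shannon entropy (in bits) of the residue distribution at each query position.

    `query` is the aligned query row; `msa` is a list of aligned rows of the
    same length. Lower entropy means a more conserved position.
    """
    if any(len(row) != len(query) for row in msa):
        raise ValueError("all rows must have the same aligned length")

    entropies = []
    for col, query_res in enumerate(query):
        if query_res in GAP_CHARS:
            continue  # column is not aligned to a query residue

        # Count residues in this column, ignoring gaps.
        counts = Counter(
            row[col].upper() for row in msa if row[col] not in GAP_CHARS
        )
        total = sum(counts.values())
        if total == 0:
            entropies.append(math.nan)  # no residues observed in this column
            continue

        entropies.append(
            sum(-(n / total) * math.log2(n / total) for n in counts.values())
        )
    return entropies
```

With this definition a fully conserved column scores 0 bits and a column split evenly between two residues scores 1 bit. Whether the query row is itself included in `msa` is left to the caller.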

Accounting for alignment depth at each position would be a nice-to-have feature, for example by marking positions whose depth falls below some threshold with NaN/None values.
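One way the depth handling could look, as a rough sketch: depth is taken to be the number of rows with a non-gap character in a column, and scores at shallow positions are replaced with NaN. The name `mask_shallow_positions`, the default `min_depth=10`, and the gap characters are all hypothetical choices, and `scores` is assumed to hold one value per non-gap query position.

```python
import math

# Assumption: "-" and "." both denote gaps.
GAP_CHARS = {"-", "."}


def mask_shallow_positions(scores, query, msa, min_depth=10):
    """Return a copy of `scores` with NaN wherever alignment depth < `min_depth`.

    `scores` holds one conservation value per non-gap position of the aligned
    `query` row; `msa` is a list of aligned rows of the same length as `query`.
    """
    # Columns of the alignment that correspond to query residues.
    query_cols = [c for c, res in enumerate(query) if res not in GAP_CHARS]
    if len(scores) != len(query_cols):
        raise ValueError("expected one score per non-gap query position")

    masked = []
    for score, col in zip(scores, query_cols):
        # Depth = number of sequences with a residue (not a gap) in this column.
        depth = sum(1 for row in msa if row[col] not in GAP_CHARS)
        masked.append(score if depth >= min_depth else math.nan)
    return masked
```

NaN is used here rather than `None` so the result stays a plain list of floats; callers that prefer `None` can swap the sentinel.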
