Aggregate ISM score across sequence [QUESTION] #29

Open
AndreaMariani-AM opened this issue Jan 17, 2025 · 1 comment

Comments

@AndreaMariani-AM

Hi Jacob,

Thanks for the great package and work.

I've got a question and I'd like your two cents on it.

I'm having a hard time figuring out how to aggregate ISM scores across multiple sequences. Here's an example:

  • Suppose you have 10 sequences, and those sequences have roughly the same motifs, maybe in slightly different variants or positions. What I would like to do is find a way to represent those 10 sequences in an aggregated way (imagine a single logo plot that captures the attributions across sequences).

Do you think it's feasible? Can you spot major issues with the interpretation that I'm not seeing?
The first step would be to simply calculate the mean attribution score per base per substitution across the sequences.

Any thoughts?

Thanks again!
Andrea

@jmschrei
Owner

Hi Andrea,

I think the primary question is whether the sequences are aligned. If it's reasonable to say that position 5 in sequence 0 corresponds to position 5 in sequences 1 through 9, it's reasonable to just take the average score at each position across the sequences. This would allow mismatches to be represented, because you would plot multiple characters at those positions. But if there are any indels or larger structural variants, positions would no longer correspond across sequences and the plots would become less meaningful.
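
For the aligned case, a minimal NumPy sketch of that averaging might look like the following. It is an illustration under stated assumptions, not this package's API: the function name `aggregate_aligned_attributions`, the toy sequences, and the score values are all made up, and it assumes each sequence's ISM result has already been reduced to one score per position and projected onto the observed base, giving an array of shape (n_sequences, 4, length).

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot_encode(seqs):
    """One-hot encode equal-length DNA strings into shape (n, 4, length)."""
    lengths = {len(s) for s in seqs}
    if len(lengths) != 1:
        # Unequal lengths imply indels, so positions do not correspond.
        raise ValueError("sequences must be aligned to the same length")
    ohe = np.zeros((len(seqs), len(ALPHABET), lengths.pop()))
    for i, seq in enumerate(seqs):
        for j, char in enumerate(seq):
            ohe[i, ALPHABET.index(char), j] = 1.0
    return ohe

def aggregate_aligned_attributions(attributions):
    """Average (n, 4, length) attributions over sequences -> (4, length)."""
    attributions = np.asarray(attributions, dtype=float)
    if attributions.ndim != 3:
        raise ValueError("expected shape (n_sequences, 4, length)")
    return attributions.mean(axis=0)

# Hypothetical toy data: three aligned sequences with one mismatch at
# position 1 (C vs. T), and made-up per-position attribution scores.
seqs = ["ACGT", "ACGT", "ATGT"]
scores = np.array([
    [0.3, 0.9, 0.6, 0.0],
    [0.1, 0.7, 0.6, 0.0],
    [0.2, 0.5, 0.6, 0.0],
])

# Project each per-position score onto the base observed at that position.
attributions = one_hot_encode(seqs) * scores[:, None, :]

# Rows are A, C, G, T; columns are positions in the alignment.
mean_attr = aggregate_aligned_attributions(attributions)
```

At the mismatched position, both the C and T rows of `mean_attr` end up nonzero, which is what lets a logo plot stack multiple characters there. Note that this divides by the total number of sequences, so a base seen in only one sequence is scaled down accordingly; dividing by per-base counts instead would be a different design choice.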
