Aggregate ISM score across sequence [QUESTION] #29

AndreaMariani-AM · 2025-01-17T11:16:10Z

Hi Jacob,

Thanks for the great package and work.

I've got a question and I'd like your two cents on it.

I'm having a hard time figuring out how to aggregate ISM scores across multiple sequences. I'll give you an example:

Suppose you have 10 sequences, and those sequences have roughly the same motifs, maybe in slightly different variants or positions. What I would like to do is find a way to represent those 10 sequences in an aggregated way (imagine a single logo plot that captures the attributions across sequences).

Do you think it's feasible? Can you spot any major issues with the interpretation that I'm not seeing?
The first step would be to simply calculate the mean attribution score per base per substitution across the sequences.

Any thoughts?

Thanks again!
Andrea

jmschrei · 2025-01-17T20:37:10Z

Hi Andrea

I think the primary question is whether the sequences are aligned. If it's reasonable to say that position 5 in sequence 0 corresponds to position 5 in sequences 1 through 9, it's reasonable to just take the average score across the sequences. This would allow mismatches to be represented because you would plot multiple characters at those positions. But if there are any indels or larger structural variants, the plots would become less meaningful.
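
A minimal sketch of that averaging, assuming the sequences are already aligned and of equal length, and assuming the ISM output has already been reduced to one score per position for each sequence (how that reduction is done is left open here). The function name and the toy sequences and scores are hypothetical and only for illustration; the sketch uses plain NumPy rather than any package-specific API.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot_encode(seq):
    """Return a (4, length) one-hot array for a DNA string (hypothetical helper)."""
    encoded = np.zeros((len(ALPHABET), len(seq)), dtype=float)
    for position, base in enumerate(seq):
        encoded[ALPHABET.index(base), position] = 1.0
    return encoded

def average_aligned_attributions(sequences, position_scores):
    """Average per-position scores across aligned, equal-length sequences.

    sequences:       list of n DNA strings, all of the same length
    position_scores: (n, length) array, one score per position per sequence
    Returns a (4, length) array that can be drawn as a single logo.
    """
    position_scores = np.asarray(position_scores, dtype=float)
    one_hot = np.stack([one_hot_encode(seq) for seq in sequences])  # (n, 4, length)
    if one_hot.shape[0] != position_scores.shape[0] or one_hot.shape[2] != position_scores.shape[1]:
        raise ValueError("sequences and position_scores must agree in count and length")
    # Place each score on the base observed in that sequence, then average over sequences.
    projected = one_hot * position_scores[:, None, :]
    return projected.mean(axis=0)

# Toy example: two aligned sequences with a mismatch at position 1 (C vs. T).
sequences = ["ACG", "ATG"]
scores = [[1.0, 2.0, 3.0],
          [3.0, 4.0, 5.0]]
avg = average_aligned_attributions(sequences, scores)
print(avg)
```

At the mismatched position, both C and T receive a nonzero averaged value, so a logo drawn from `avg` would show two characters there. Because the mean is taken over all sequences, a base seen in only a few of them is scaled down accordingly. With indels, the columns no longer correspond to one another and this average stops being meaningful.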
