Hey! Again, just wanted to say what a great resource this package is which is why I'm keen to improve it/call out any issues I find.
To this end, I came across this issue when running tangermeme.tools.fimo on different PWMs from JASPAR. First, consider the example below, which takes two JASPAR motifs and runs each one separately through tangermeme's FIMO:
from pyjaspar import jaspardb
import torch

# Create the JASPAR2024 release object
jdb_obj = jaspardb(release='JASPAR2024')
# Fetch motifs by ID
jasp_motifs = [jdb_obj.fetch_motif_by_id('MA0634.2'), jdb_obj.fetch_motif_by_id('MA0875.2')]
alphabet = ['A', 'C', 'G', 'T']
pwm_list = []
pwm_names = []
for pwm_i in jasp_motifs:
    pwm_names.append(pwm_i.matrix_id)
    pwm_alph = []
    for base in alphabet:
        pwm_alph.append(torch.tensor(pwm_i.counts[base]).unsqueeze(0))
    pwm_list.append(torch.concat(pwm_alph, dim=0).T)
pwms = dict(zip(pwm_names, pwm_list))
# pwms['MA0634.2'].shape => torch.Size([6, 4]) (motif_length, alphabet_size), as expected
from tangermeme.tools.fimo import FIMO
from tangermeme.utils import random_one_hot

motif_len = 40
batch_size = 20
X = random_one_hot((batch_size, 4, motif_len), random_state=0)
for ind, key_i in enumerate(pwms.keys()):
    model = FIMO({key_i: pwms[key_i]})
    hits = model.hits(X, threshold=0.01)
This runs without error (using the same motif length as your example code). However, when I set motif_len = 100 instead of 40, it fails on MA0634.2 with:
File ~/anaconda3/envs/g-cads/lib/python3.12/site-packages/tangermeme/tools/fimo.py:312
--> 312 pval = math.pow(2, self._score_to_pval[motif_idx][score_idx])
IndexError: index 784 is out of bounds for axis 0 with size 783
I tried tracing this back. The cause appears to be either in _pwm_to_mapping(), where the smallest value or the shape of mapping is off, or in the _score_to_pval() lookup, where the index is calculated from the score.
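One way an index can land just past the end of such a table is a rounding mismatch. The sketch below is a minimal, self-contained illustration and is not tangermeme's actual code: the function names, the bin size, and the toy scores are all made up. It assumes the table is sized from per-position scores that are rounded to integer bins before being summed, while the lookup index comes from binning the unrounded total score.

```python
# Hypothetical illustration only; none of these names come from tangermeme.
BIN_SIZE = 0.1

def table_bounds(pos_min, pos_max, bin_size=BIN_SIZE):
    """Size a lookup table from per-position scores rounded to integer bins."""
    smallest = sum(round(v / bin_size) for v in pos_min)
    largest = sum(round(v / bin_size) for v in pos_max)
    return smallest, largest - smallest + 1

def score_to_index(score, smallest, bin_size=BIN_SIZE):
    """Bin the unrounded total score, then shift by the table's smallest bin."""
    return int(score / bin_size) - smallest

# Toy per-position minimum and maximum scores for a length-4 motif.
pos_min = [0.0, 0.0, 0.0, 0.0]
pos_max = [0.24, 0.24, 0.24, 0.24]

smallest, table_size = table_bounds(pos_min, pos_max)  # each 0.24 rounds down to bin 2
best_score = sum(pos_max)                              # unrounded total, about 0.96
raw_idx = score_to_index(best_score, smallest)         # 9, one past the last valid index (8)

# Defensive workaround (an assumption, not a confirmed fix): clamp into range.
safe_idx = min(max(raw_idx, 0), table_size - 1)
```

In this toy case the best-scoring window gets index 9 in a table of size 9, the same shape of failure as in the tracebacks. Clamping only hides the symptom; whether the right fix is clamping, sizing the table differently, or binning consistently in both places is a judgment call for the maintainer.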
To note, this error is not dependent on the length (motif_len) of the sequences being tested against: I got the same error (just with different index values) with MA1535.2. Changing the code above to this motif, we get the error:
IndexError: index 1390 is out of bounds for axis 0 with size 1390
So it appears to be the combination of the length of the sequences and the specific motif PWM.
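A small sweep over motifs and sequence lengths might make that combination explicit. The sketch below is a generic harness; sweep and fake_run are hypothetical names, and fake_run is a stand-in that fails on purpose so the snippet runs without tangermeme installed. In a real check, the callable would build FIMO({motif_id: pwms[motif_id]}) and call hits() on random_one_hot sequences of the given length, as in the example above.

```python
def sweep(run, motif_ids, seq_lens):
    """Call run(motif_id, seq_len) for every pair and record any IndexError."""
    failures = {}
    for motif_id in motif_ids:
        for seq_len in seq_lens:
            try:
                run(motif_id, seq_len)
            except IndexError as err:
                failures[(motif_id, seq_len)] = str(err)
    return failures

def fake_run(motif_id, seq_len):
    """Stand-in for the real FIMO call; the failure condition here is invented."""
    if motif_id == 'MA0634.2' and seq_len >= 100:
        raise IndexError('simulated out-of-bounds lookup')

failures = sweep(fake_run, ['MA0634.2', 'MA0875.2'], [40, 100])
for (motif_id, seq_len), message in failures.items():
    print(f'{motif_id} failed at length {seq_len}: {message}')
```

Recording failures per (motif, length) pair rather than stopping at the first exception should show whether a given motif fails at every length or only once enough windows have been scored.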
Secondly, as you might imagine, this is specific to hits() and running y = model(X.float()) doesn't return any errors.
Happy to dig into this further if you have ideas of where to start?
Cheers,
Alan.
Package versions