Question about modisco meme result #58

Open
Yurun-Li-1024 opened this issue Jan 2, 2025 · 1 comment

Comments

@Yurun-Li-1024

Hi!

When I use modisco meme to generate a MEME file with CWM-PFM, every value in my result is around 0.25 for each position and nucleotide. Those values are very close to the background frequencies, which seems to mean that they represent no pattern.

I noticed that there is a similar result in the example ModiscoDemonstration.ipynb. I thought the job of modisco meme was to export the patterns in the h5 file produced by modisco to the MEME file format, but the output seems to tell me that the model did not learn any patterns. Do I misunderstand the role of modisco meme?

By the way, how do I understand the difference between CWM and CWM-PFM? And how do I determine which mode to use?

Thank you very much for any advice ^_^ !~

@jmschrei
Owner

jmschrei commented Jan 5, 2025

Without seeing the output or an example it's difficult to know exactly what is going on, but it is possible that some of the "patterns" learned by modisco are not actually real patterns but just an artifact from the clustering process. Are you observing this for every pattern?

The documentation for the command says:

A case-sensitive string specifying the desired data of the output file.
The options are as follows:
- 'PFM':      The position-frequency matrix.
- 'CWM':      The contribution-weight matrix.
- 'hCWM':     The hypothetical contribution-weight matrix; hypothetical
              contribution scores are the contributions of nucleotides not encoded
              by the one-hot encoding sequence.
- 'CWM-PFM':  The softmax of the contribution-weight matrix.
- 'hCWM-PFM': The softmax of the hypothetical contribution-weight matrix.
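
One possible reason for CWM-PFM values near 0.25 follows from the description above: a softmax over four nucleotides returns almost uniform probabilities whenever the input scores are small in magnitude. The sketch below illustrates this with plain NumPy. It is not the modisco implementation. It assumes the softmax is taken per position across A, C, G, T, and the `cwm` values are invented for illustration rather than taken from any real output.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


# Hypothetical CWM for a 3-position motif; columns are A, C, G, T.
# These numbers are made up for illustration, not modisco output.
cwm = np.array([
    [0.05, -0.01, 0.00, -0.02],
    [-0.01, 0.08, -0.02, 0.00],
    [0.00, -0.01, -0.01, 0.06],
])

# Assumption: softmax is applied per position (row) across the 4 nucleotides.
cwm_pfm = softmax(cwm, axis=1)
print(np.round(cwm_pfm, 3))

# The preferred nucleotide at each position is unchanged by the softmax,
# even though every probability stays close to 0.25.
print(cwm.argmax(axis=1), cwm_pfm.argmax(axis=1))

# Only to show the dependence on magnitude (not a recommendation):
# the same matrix scaled by 100 gives sharply peaked probabilities.
scaled_pfm = softmax(cwm * 100, axis=1)
print(np.round(scaled_pfm, 3))
```

If this is what is happening, a flat CWM-PFM does not by itself mean the pattern is empty: the CWM may still rank one nucleotide clearly above the others at each position, and the PFM option reports the observed nucleotide frequencies instead.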
