Dear Dr. Metsky,
Is there a way to configure CATCH to avoid generating probes with ambiguous nucleotides and only use standard nucleotides (A, T, C, G)? I understand that it will lead to a greater number of probes for the given dataset.
Thank you very much for any advice and this great tool 👍🏻
Regards,
Sviat
I'm glad that you find CATCH useful. Yes, there is an option to do what you're looking for! It's --expand-n. The help message for that argument is:
Expand each probe so that 'N' bases are replaced by real
bases; for example, the probe 'ANA' would be replaced
with the probes 'AAA', 'ATA', 'ACA', and 'AGA'; this is
done combinatorially across all 'N' bases in a probe, and
thus the number of new probes grows exponentially with the
number of 'N' bases in a probe. If followed by a command-
line argument (INT), this only expands at most INT randomly
selected N bases, and the rest are replaced with random
unambiguous bases (default INT is 3).
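To make the combinatorial growth described in the help message concrete, here is a minimal Python sketch of the expansion idea. It is only an illustration, not CATCH's actual implementation: the function name `expand_n` is hypothetical, and it expands every 'N' rather than modeling the INT cap or the random replacement of the remaining 'N' bases.

```python
from itertools import product


def expand_n(probe, bases="ATCG"):
    """Return every sequence obtained by replacing each 'N' with a real base.

    Hypothetical helper for illustration only; it is not part of CATCH.
    """
    # Positions of the 'N' bases that need to be filled in.
    n_positions = [i for i, base in enumerate(probe) if base == "N"]
    expanded = []
    # One combination of real bases per way of filling all 'N' positions,
    # so the output has len(bases) ** len(n_positions) sequences.
    for combo in product(bases, repeat=len(n_positions)):
        seq = list(probe)
        for pos, base in zip(n_positions, combo):
            seq[pos] = base
        expanded.append("".join(seq))
    return expanded


print(expand_n("ANA"))         # ['AAA', 'ATA', 'ACA', 'AGA']
print(len(expand_n("ANNNA")))  # 64, i.e. 4 ** 3
```

A probe with k 'N' bases yields 4^k probes here, which is why capping the number of combinatorially expanded positions matters for probes with many 'N' bases.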
For example, setting --expand-n 10 combinatorially expands up to 10 N nucleotides with real nucleotides, and replaces the rest randomly with real nucleotides. You could set the value to be the probe length if you want to combinatorially expand all Ns. Note that this does not work with non-N ambiguity characters (e.g., Y); if you have those, my suggestion would be to replace them with N in the input.
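If the input does contain non-N ambiguity characters, one way to follow the suggestion above is to preprocess the FASTA file before running CATCH. The sketch below is an assumption about how that could be done, not something CATCH provides: the function name `mask_ambiguity_codes` and the file names in the usage comment are hypothetical, and it assumes a plain-text FASTA file where header lines start with '>'.

```python
import re

# IUPAC ambiguity codes other than N, matched in upper or lower case.
_AMBIGUOUS = re.compile(r"[RYSWKMBDHV]", re.IGNORECASE)


def mask_ambiguity_codes(fasta_lines):
    """Yield FASTA lines with non-N ambiguity codes replaced by 'N'.

    Hypothetical helper for illustration only; it is not part of CATCH.
    """
    for line in fasta_lines:
        if line.startswith(">"):
            # Header lines are passed through unchanged.
            yield line
        else:
            # Sequence lines: replace each ambiguity code with 'N'.
            yield _AMBIGUOUS.sub("N", line)


# Example usage (file names are placeholders):
# with open("input.fasta") as src, open("input.masked.fasta", "w") as dst:
#     dst.writelines(mask_ambiguity_codes(src))
```

The masked file can then be used as the input, with --expand-n handling the resulting 'N' bases.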