Alignment of very short peptides (<12) to protein database #469
This has something to do with the seed shape patterns, see here for a list: https://github.com/bbuchfink/diamond/blob/master/src/search/setup.cpp For example, in default mode There are some more internal cutoffs that could cause short hits like this to be filtered out. For this you probably need to look into changing the Sorry that it's all a bit complicated, but I never really designed the tool to accommodate this particular type of query.
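To illustrate one way a seed shape can impose a minimum query length: a spaced seed can only be placed at offsets where its full span fits inside the query. The sketch below is a toy illustration, not DIAMOND's code. The shape `SHAPE` and the function `seed_windows` are made up for this example and are not taken from `setup.cpp`.

```python
def seed_windows(peptide: str, shape: str) -> list[str]:
    """Return the seed extracted at every offset where `shape` fits in `peptide`.

    `shape` is a string of '1' (position is used in the seed)
    and '0' (position is ignored).
    """
    span = len(shape)
    seeds = []
    # range() is empty when the peptide is shorter than the shape's span
    for start in range(len(peptide) - span + 1):
        window = peptide[start:start + span]
        seeds.append("".join(aa for aa, bit in zip(window, shape) if bit == "1"))
    return seeds


# Hypothetical shape for illustration only (span 12, weight 8);
# NOT one of the shapes listed in DIAMOND's setup.cpp.
SHAPE = "110100110111"

for peptide in ("ACDEFGHIK", "ACDEFGHIKLMN", "ACDEFGHIKLMNPQRS"):
    print(len(peptide), len(seed_windows(peptide, SHAPE)))
```

With this made-up shape, a 9-residue peptide yields no seeds at all, so no score or e-value cutoff could recover a hit for it, while a 12-residue peptide yields exactly one seed and a 16-residue peptide yields five.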
Dear Benjamin,
I am currently using your very nice Diamond software to align peptides to the complete UniProtKB protein database. My datasets contain peptides between 5 and 45 residues long. In general the alignment works great; however, I noticed that Diamond only produces alignments for peptide queries of length 12 or greater. At first, I figured that relaxing the e-value cutoff or lowering the minimal bitscore for reporting an alignment would do the trick (as the significance of an alignment of a short peptide is of course much lower). However, even at a minimal bitscore of 3 I did not see much improvement. Subsequently, I tried a more sensitive mode of Diamond, and this did seem to have a positive effect: compared to the 'normal' mode, the 'sensitive' mode also produced alignments for peptides of length 10 and 11. This of course came with a computation time penalty, which is also relevant to my project (faster is always better). Peptides with a length between 7 and 9, however, are still not being aligned.
Therefore, my question is: which settings should I change (or what else in my approach) to push Diamond into also reporting alignments for these short peptides? Should I reduce the minimal bitscore further? Or should I use a more sensitive mode for these short peptides only?
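For the last option, here is a minimal sketch of how I could split my queries by length so that only the short peptides go through the slower, more sensitive run. The helper names (`read_fasta`, `split_by_length`) and the cutoff of 12 are my own assumptions for illustration, not part of Diamond.

```python
from typing import Iterable, Iterator


def read_fasta(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(records, cutoff=12):
    """Split records into (shorter than cutoff, cutoff or longer)."""
    short, long_ = [], []
    for header, seq in records:
        (short if len(seq) < cutoff else long_).append((header, seq))
    return short, long_


def to_fasta(records) -> str:
    """Format (header, sequence) pairs back into FASTA text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


# Made-up example peptides, only to show the split.
example = """>pep1
ACDEFGH
>pep2
ACDEFGHIKLMNPQRSTVWY
>pep3
ACDEFGHIKLM
"""
short, long_ = split_by_length(read_fasta(example.splitlines()))
print(to_fasta(short))   # would go to the sensitive run
print(to_fasta(long_))   # would go to the default run
```

The two resulting files could then be searched separately, so that the computation time penalty of the sensitive mode only applies to the short peptides.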
Thank you in advance!
Kind regards,
Ramon