I am simulating reads from non-complete bacterial genomes. They tend to have a lot of short contigs.
For example, see Lactobacillus malefermentans KCTC 3548.
So each time the program tries to get a read from such a contig, it correctly outputs:
[wgsim_core] skip sequence 'gi|338736693|dbj|BACN01000170.1|' as it is shorter than 500!
However, each time it outputs this, a read that should have been written to the output file is skipped.
So for an input file with many such short contigs, the resulting file has far fewer reads than specified via -N X.
As a workaround, I ask it to generate more reads and then keep the first X with "head -n X*4" (each FASTQ record is four lines).
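For anyone who wants to script the trimming step, here is a minimal Python sketch of the same idea as the "head" workaround. It assumes the output is plain four-line FASTQ (no wrapped sequence lines); the function name `keep_first_reads` and its parameters are made up for this illustration and are not part of wgsim.

```python
from itertools import islice


def keep_first_reads(in_path, out_path, n_reads):
    """Copy at most n_reads complete FASTQ records from in_path to out_path.

    Assumes each record is exactly four lines (header, sequence, '+', quality).
    Returns the number of records actually written.
    """
    written = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        while written < n_reads:
            # Read the next four lines as one record.
            record = list(islice(src, 4))
            if len(record) < 4:
                # End of file, or a truncated trailing record: stop here.
                break
            dst.writelines(record)
            written += 1
    return written
```

The return value makes it easy to check whether the oversampled run still came up short of X. For paired-end output, calling it on both read files with the same `n_reads` should keep the mates in step, assuming both files list the pairs in the same order.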
However, I believe it's a bug :)