Incorrect number of reads generated #13

Open
elgartmi opened this issue Jul 18, 2016 · 0 comments

I am simulating reads from incomplete (draft) bacterial genomes, which tend to have many short contigs.
For example, see Lactobacillus malefermentans KCTC 3548.

Each time the program tries to get a read from such a contig, it correctly outputs:
[wgsim_core] skip sequence 'gi|338736693|dbj|BACN01000170.1|' as it is shorter than 500!

However, each time it prints this message, a read that should have gone into the output file is skipped.
So for a reference with many such short contigs, the resulting file has far fewer reads than requested via -N X.
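
To illustrate how such a shortfall could arise, here is a minimal sketch. It assumes (not verified against the wgsim source here) that each contig is given a share of the N reads proportional to its length, and that the share of a skipped contig is dropped rather than redistributed. The function name and the contig lengths are made up for illustration.

```python
def expected_reads(contig_lengths, n_reads, min_len=500):
    """Reads produced under the ASSUMED model: each contig gets a
    length-proportional quota, and contigs shorter than min_len are skipped."""
    total = sum(contig_lengths)
    produced = 0
    for length in contig_lengths:
        quota = int(n_reads * length / total + 0.5)  # rounded share for this contig
        if length < min_len:
            continue  # quota is lost, not handed to the other contigs
        produced += quota
    return produced


# Hypothetical draft assembly: two long contigs and three short ones.
lengths = [4000, 3000, 400, 300, 300]
print(expected_reads(lengths, 1000))
```

Under this assumed model the short contigs account for 1000 of the 8000 bases, so 125 of the 1000 requested reads never appear in the output.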

As a workaround, I ask it to generate more reads and then keep the first X with "head -n X*4".
However, I believe it's a bug :)
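
The workaround above can also be sketched in Python. This assumes the output is FASTQ with exactly four lines per record; the function name and file paths are hypothetical.

```python
from itertools import islice


def keep_first_reads(src, dst, n_reads):
    """Copy the first n_reads records from src to dst.

    Assumes 4-line FASTQ records (header, sequence, '+', qualities).
    """
    with open(src) as fin, open(dst, "w") as fout:
        # islice stops after n_reads * 4 lines, or earlier if the file is shorter
        fout.writelines(islice(fin, n_reads * 4))
```

For paired-end output, calling it with the same `n_reads` on both read files keeps the mates in sync, for example `keep_first_reads("out.read1.fq", "top.read1.fq", 100000)` and the same for the second file.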
