benchmark
V-pipe also provides a unified benchmarking platform by incorporating two additional modules: a read simulator and a module to evaluate the accuracy of the results.
We implemented four operating modes:
1. We generate a random sequence, which we call the "master" haplotype sequence. By default, this sequence is stored as a FASTA file in `references/haplotype_master.fasta`. Simulated haplotypes are generated from the master sequence with fixed mutation, insertion and deletion rates. Then, reads are generated from the set of underlying haplotype sequences. The master haplotype is used as the reference for the read alignment.
2. The user provides the master haplotype sequence, and its location is specified as follows:

       [input]
       reference = `path/to/reference.fasta`

   Haplotypes and reads are generated as above.
[//]: # (TODO 3. The user provides the set of underlying haplotypes, for instance, corresponding to known isolates.)
4. The user provides the FASTQ files containing the sequencing reads. This can be the case for control samples, that is, mock mixtures in which known isolates are mixed in the laboratory and then sequenced. In this case, the user needs to provide a file containing the sequences of the haplotypes and store it in `<datadir>/<sample-ID>/<sample-data>/references/haplotypes/haplotypes.fasta`. In addition, the user should indicate that haplotypes and reads do not need to be simulated:

       [general]
       simulate = False
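
To make the first operating mode more concrete, here is a minimal Python sketch of how haplotypes could be derived from a random master sequence with fixed mutation, insertion and deletion rates. It is an illustration only and not V-pipe's actual simulator: the function names (`random_master`, `simulate_haplotype`), the per-site independent error model, and the rate values are all assumptions made for this example.

```python
import random

ALPHABET = "ACGT"


def random_master(length, rng):
    """Return a random nucleotide sequence (hypothetical stand-in for the master haplotype)."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def simulate_haplotype(master, mut_rate, ins_rate, del_rate, rng):
    """Derive one haplotype from the master sequence.

    Assumed model: each site is independently deleted with probability
    del_rate, substituted with probability mut_rate, and followed by a
    random inserted base with probability ins_rate.
    """
    out = []
    for base in master:
        r = rng.random()
        if r < del_rate:
            continue  # deletion: skip this site entirely
        if r < del_rate + mut_rate:
            # substitution: replace with a different nucleotide
            base = rng.choice([b for b in ALPHABET if b != base])
        out.append(base)
        if rng.random() < ins_rate:
            out.append(rng.choice(ALPHABET))  # insertion after this site
    return "".join(out)


# Illustrative values only; seeded so the run is reproducible.
rng = random.Random(42)
master = random_master(200, rng)
haplotypes = [
    simulate_haplotype(master, mut_rate=0.05, ins_rate=0.01, del_rate=0.01, rng=rng)
    for _ in range(3)
]
```

In this sketch, setting all three rates to zero returns the master sequence unchanged, which is a quick sanity check on the model. Reads would then be sampled from `haplotypes` and aligned against `master`, as described in the first mode.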