GSS is a small tool for simulating sequence reads from a reference genome.
It builds with CMake.
Here is a build example. Enter the src directory and run:
mkdir build
cd build
cmake ..
make
The executable binary is at bin/Bin/GSS.
After a successful build, enter the example directory and run the example.sh script.
Run GSS with a config.json file. All settings are in config.json.
Here is the config.json from the example:
{ #Top level is an object.
"filePath": "./sequence.fasta", #Path to the reference genome file.
"fileType": "fasta", #Genome file type. Only fasta is supported for now.
"variable":0.0001,
"InDel":0.015,
"InDel_Extern":0.3,
"error":0.0001,
"output": [ #Array of simulation settings, one entry per output.
{
"file_name":"f0", #Output file name prefix.
"file_type":"fastq", #Output file type; fasta and fastq are supported.
"read_type": "single", #Read type, e.g. "single" or "pair-end".
"read_len": 100, #Read length.
"insert_len": 100, #Insert length.
"depth":80 #Expected sequencing depth per nucleotide.
},
{
"file_name":"f1",
"file_type":"fastq",
"read_type": "pair-end",
"read_len": 100,
"depth":100,
"insert_len": 250
}
]
}
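The `#` annotations in the listing above are not valid JSON, and it is not stated here whether GSS tolerates them. The following is a minimal Python sketch, assuming the annotations are explanatory only, that produces the same settings as strict JSON. It also includes a rough read-count estimate based on the usual coverage relation (depth ≈ reads × read length / genome length); how GSS actually interprets `depth`, in particular for `pair-end` output, is an assumption. The names `write_config` and `expected_reads` and the genome length used in the comment are hypothetical.

```python
import json

# Same settings as the annotated example, without the "#" annotations.
config = {
    "filePath": "./sequence.fasta",
    "fileType": "fasta",
    "variable": 0.0001,
    "InDel": 0.015,
    "InDel_Extern": 0.3,
    "error": 0.0001,
    "output": [
        {
            "file_name": "f0",
            "file_type": "fastq",
            "read_type": "single",
            "read_len": 100,
            "insert_len": 100,
            "depth": 80,
        },
        {
            "file_name": "f1",
            "file_type": "fastq",
            "read_type": "pair-end",
            "read_len": 100,
            "depth": 100,
            "insert_len": 250,
        },
    ],
}

# Strict JSON text that any standard JSON parser can read.
config_text = json.dumps(config, indent=2)


def write_config(path):
    """Write the configuration to `path` as strict JSON (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(config_text + "\n")


def expected_reads(depth, genome_len, read_len):
    """Rough number of reads needed to reach `depth` over `genome_len` bases.

    Assumption: depth = reads * read_len / genome_len. Whether GSS counts
    both mates of a pair toward the depth is not documented here.
    """
    return round(depth * genome_len / read_len)


# Example with a hypothetical 1,000,000 bp genome:
# expected_reads(80, 1_000_000, 100) gives 800000 reads for the "f0" entry.
```

Calling `write_config("config.json")` writes the file to the current directory.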
Here are the first 10 lines of each file generated by the above config.json:
==> f0.fastq <==
@NC_007205.1_503982_504082_0:0:0_0/2
CTGGAACGGAGTATTCGTTTTAAAAGAGGGTGTATAAGTTTTAAAAAAAAATATTTAAGAACTATAAATATTTAACAAAAAAAATAAAGAAATATAAAAA
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_430241_430341_0:0:0_1/2
TTTTATCAATTTTTTTTATTTAAATAGAATCAACCAAGTTCATACCCTCGCACCGAGAGGAATTTAGTTAAATTTATAAAATTTCTAGTTTAATTTCCAA
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_1139113_1139213_0:0:0_2/1
ATACATAATAAGCACTATCTCTAAAGGATTTTTTTATTTGTTTTTTAGCATTATAAATTTCAATAAATGCATTTCTCATACTATCTTCTAGACCTGAAAA
==> f1_0.fastq <==
@NC_007205.1_1137336_1137436_0:0:0_0/1
CTTACACAATTATTTTGGTAGGCAATTTTACCTCTAGCAATCATTAATTGAGAATCTCTTGTGATTGGATAATGATTATCAGAATAAGAAATGCTACTTA
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_706428_706528_0:0:0_1/2
AAAATTCAGAGGAACTCCGTAACTTTTACCTAACACTGATTTAAGTTAAAATAATCAATGAAGTAAAAAAAGTTTGTAACCTATTAGTTGTTGGGTTGTT
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_555447_555547_0:0:0_2/2
TTTATGCCTACCTACGGTGGAGTAGGATGATGTATAAAGAATTCTTTTAAATAAATATTTCAACCTTTCAAATTATGTTTAATATTTTATAGGAGACTGA
==> f1_1.fastq <==
@NC_007205.1_1138736_1138836_0:0:0_0/2
ATTCATCTCGGATATATGGTTTGAGATTAGATAGAAAATAATCAAAATAATTTCAATGGAAACTATTTCCCCATTACTACTTACGTTGTAATCGATTACC
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_705028_705128_0:0:0_1/1
TATGGCAAAATATTTGAATTCATATATAAAATTGACAAGATTAGTTTAAAAAAAAAGAAATCTATCGATTTTAAAATTATTAATGAAGCTCTTGAGGAAT
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_554047_554147_0:0:0_2/1
TTTGATAGATTTTTAAGTTTTAAAAAATCTTTAGACCTAATTTCAAAAACTGGAATATTTATTTTTGATAATTCTGAAGGATATAATGAACCAAGAGATA
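The read names in the output above appear to follow the pattern `reference_start_end_a:b:c_index/mate`. The following is a minimal Python sketch for splitting such a name into fields. The field meanings are inferred from the sample output only and are not documented here, so the names `ReadName`, `parse_read_name`, and `counters` are hypothetical. The reference name itself contains an underscore (`NC_007205.1`), so the name is split from the right.

```python
from typing import NamedTuple, Tuple


class ReadName(NamedTuple):
    reference: str              # e.g. "NC_007205.1"
    start: int                  # assumed start coordinate on the reference
    end: int                    # assumed end coordinate on the reference
    counters: Tuple[int, ...]   # the "0:0:0" triple; meaning not documented
    index: int                  # assumed running read index
    mate: int                   # number after "/", 1 or 2 in the sample


def parse_read_name(header: str) -> ReadName:
    """Split a FASTQ header line from the sample output into its fields."""
    name = header.strip().lstrip("@")
    body, mate = name.rsplit("/", 1)
    # Split from the right so underscores inside the reference name survive.
    reference, start, end, counters, index = body.rsplit("_", 4)
    return ReadName(
        reference=reference,
        start=int(start),
        end=int(end),
        counters=tuple(int(c) for c in counters.split(":")),
        index=int(index),
        mate=int(mate),
    )


record = parse_read_name("@NC_007205.1_503982_504082_0:0:0_0/2")
print(record.reference, record.end - record.start, record.mate)
```

For the first `f0.fastq` record, the difference between the two coordinates is 100, which matches the configured `read_len`.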