The code on the GERP++ home page is no longer available, but I found a copy on GitHub at tvkent/GERPplusplus. You can also find it here.
After downloading it, unzip it and run `make`. (Note that it only builds on Linux, not on Windows.)
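As a rough sketch of the build step, the shell function below refuses to run outside Linux and then calls `make` in the unpacked source directory. The function name `build_gerp` is hypothetical, and it assumes the archive unpacks to a directory that contains a `Makefile`.

```shell
# Hypothetical helper: build gerpcol from an unpacked source directory.
# Assumes the directory contains a file named "Makefile".
build_gerp() {
    src_dir="$1"
    # The build is only expected to work on Linux.
    if [ "$(uname -s)" != "Linux" ]; then
        echo "GERP++ is only expected to build on Linux" >&2
        return 1
    fi
    if [ ! -f "$src_dir/Makefile" ]; then
        echo "no Makefile found in $src_dir" >&2
        return 1
    fi
    # Run make inside the source directory without changing the current one.
    make -C "$src_dir"
}
```

For example, `build_gerp ./GERPplusplus`, where the directory name is only an assumption about where the archive was unpacked.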
How to run?
- `-t`: tree file
- `-f`: FASTA file
- `-e`: target sequence
- `-a`: input is in FASTA format
- `-v`: verbose logging
./gerpcol -t NP_000006.2.blast.dnd -f NP_000006.2.blast.aln.fa -e NP_000006.2 -v -j -a
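The same call can be wrapped in a small function that checks the inputs first and then looks for the result file. This is only a sketch: the name `run_gerpcol` is hypothetical, it assumes `gerpcol` sits in the current directory, and it assumes the scores land in `<fasta file>.rates`, as in the example files listed further down.

```shell
# Hypothetical wrapper around the gerpcol call shown above.
# Arguments: tree file, FASTA alignment, name of the target sequence.
run_gerpcol() {
    tree="$1"; fasta="$2"; ref="$3"
    # Fail early if either input file is unreadable.
    for f in "$tree" "$fasta"; do
        if [ ! -r "$f" ]; then
            echo "cannot read input file: $f" >&2
            return 1
        fi
    done
    ./gerpcol -t "$tree" -f "$fasta" -e "$ref" -v -j -a || return 1
    # Assumption: output is written next to the alignment as <fasta>.rates.
    if [ -s "$fasta.rates" ]; then
        echo "scores written to $fasta.rates"
    else
        echo "gerpcol finished but $fasta.rates is missing or empty" >&2
        return 1
    fi
}
```

Usage: `run_gerpcol NP_000006.2.blast.dnd NP_000006.2.blast.aln.fa NP_000006.2`.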
The code from GitHub did not run as-is. After I corrected it, it ran successfully, but it can only calculate scores for `ATCG`.
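Because only `ATCG` is scored, it can help to check an alignment before running it. The sketch below prints every distinct character in the sequence lines that is not A, C, G, T, N, or a gap. The function name is hypothetical, and treating `N` and `-` as acceptable is my assumption rather than something confirmed for gerpcol.

```shell
# Hypothetical check: list the distinct characters in the sequence lines of a
# FASTA file that are not nucleotides (A, C, G, T, N, any case) or gaps (-).
# An empty result suggests a nucleotide alignment.
non_nucleotide_chars() {
    grep -v '^>' "$1" |        # drop header lines
        tr -d '\r\n' |         # join all sequence lines
        tr -d 'ACGTNacgtn-' |  # remove accepted characters (trailing - is literal)
        fold -w 1 |            # one character per line
        LC_ALL=C sort -u |     # keep each character once
        tr -d '\n'
}
```

A protein alignment will print letters such as `K`, `L`, or `M`; a DNA alignment should print nothing.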
- Error Source: When reading the MSA FASTA sequences, the original code reads one extra special character (maybe `\n`), so it cannot map the MSA to the tree file.
- How to solve?
- Example
  - Original code from GitHub: gerp++KRT
  - Modified code: gerp++KRT_patch
  - gerpcol executable: gerpcol
  - MSA FASTA input: NP_000006.2.blast.aln.fa
  - Tree input: NP_000006.2.blast.dnd
  - Result: NP_000006.2.blast.aln.fa.rates
  - Command: ./gerpcol -t NP_000006.2.blast.dnd -f NP_000006.2.blast.aln.fa -e NP_000006.2 -v -j -a
- Note
  - My input examples come from Clustal Omega, and the FASTA file was generated with Biopython.
  - Put the tree file on a single line with no whitespace.
  - It only calculates results for `ATCG` sequences; it cannot be applied to protein sequences.
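The input preparation described in these notes can be sketched in shell. This is not the patch applied to the source code; it only works on the input files. Both function names are hypothetical. `flatten_tree` assumes leaf names contain no spaces, and `missing_in_tree` does a plain substring search, so it is a loose check rather than a Newick parser.

```shell
# Hypothetical helper: collapse a Newick tree onto one line with no whitespace.
# Assumes leaf names themselves contain no spaces.
flatten_tree() {
    tr -d ' \t\r\n' < "$1"
    echo   # end the single line with a newline
}

# Hypothetical helper: print FASTA record names that do not occur anywhere in
# the tree file. A name is the header text after '>' up to the first
# whitespace character, which also drops a trailing carriage return.
missing_in_tree() {
    grep '^>' "$1" | sed -e 's/^>//' -e 's/[[:space:]].*$//' |
    while IFS= read -r name; do
        # Substring match only; prints the name when the tree lacks it.
        grep -qF -- "$name" "$2" || printf '%s\n' "$name"
    done
}
```

For example, `flatten_tree NP_000006.2.blast.dnd > tree.oneline.dnd` (the output file name is made up), followed by `missing_in_tree NP_000006.2.blast.aln.fa tree.oneline.dnd`, which should print nothing when every sequence name appears in the tree.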