How to replicate the paper results

Mateo Torres edited this page Jan 12, 2021 · 2 revisions

Download the right version of the involved databases and dependencies

S2F's installer will attempt to download the latest version of every database involved, but to reproduce our results exactly, you will need:

You can install S2F using a custom installation configuration file (see installation instructions) and set the interpro, string_links, string_sequences, and string_species options to the respective downloaded files.

Download the input data

You need to download the input data from S2F's website.

Run the predictions

The input data comes in pairs: each file with a .fasta extension has a matching file with a .blacklist extension.

The simplest way to replicate our results is to run the following command for every pair of files:

python S2F.py predict --obo go.obo --fasta <tax_id>.fasta --alias S2F-<tax_id> --transfer-blacklist <tax_id>.blacklist --hmmer-blacklist <tax_id>.blacklist

where <tax_id> is the NCBI taxonomy ID shared by the two files.
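If there are many pairs, the command above can be scripted rather than typed by hand. The following is a minimal sketch, not part of S2F: the helper names `build_commands` and `run_all` are hypothetical, and it assumes that all .fasta and .blacklist files sit in one directory, that go.obo and S2F.py are reachable from the working directory, and that a .fasta file without a matching .blacklist file should be skipped.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(data_dir, obo="go.obo", script="S2F.py"):
    """Return one `S2F.py predict` command per <tax_id>.fasta/.blacklist pair.

    Hypothetical helper: it only assembles the command shown above,
    using the flags that appear there.
    """
    commands = []
    for fasta in sorted(Path(data_dir).glob("*.fasta")):
        blacklist = fasta.with_suffix(".blacklist")
        if not blacklist.exists():
            # Assumption: an unpaired FASTA file is skipped, not run.
            continue
        tax_id = fasta.stem  # file name without the extension
        commands.append([
            sys.executable, script, "predict",
            "--obo", obo,
            "--fasta", str(fasta),
            "--alias", f"S2F-{tax_id}",
            "--transfer-blacklist", str(blacklist),
            "--hmmer-blacklist", str(blacklist),
        ])
    return commands


def run_all(data_dir, **kwargs):
    """Run the predictions one after another, stopping at the first failure."""
    for cmd in build_commands(data_dir, **kwargs):
        subprocess.run(cmd, check=True)
```

Calling `run_all("input_data")` (the directory name is an example) would then launch one prediction per pair, sequentially. To preview the commands without running anything, print the result of `build_commands("input_data")` first.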