Masker
The masker module performs a genome self-comparison using BLASTn (or takes an outfile) and uses those alignments to mask repetitive regions in a provided genome.
The masked genome is output in the directory the program is called from.
Usage: FACET masker <genome.fasta> [options]
Acceptable Module Aliases: masker, m
This is a FASTA file that will be masked either by performing a self-comparison or using alignments in an outfile.
Prints a help message with brief descriptions of each option and exits the program.
If the user has a BLASTn/FACET output file that they would like to mask the given genome with, they can provide a file using this option. The outfmt option must match the outfmt of the provided outfile!
This option allows users to specify an output format for CSV files. It functions similarly to BLASTn's -outfmt flag (see the BLASTn documentation for details). The outfmt fields can be given in any order, as long as the string contains the fields FACET needs to run (see --outfmt facet below).
The two pre-defined outfmts are facet and 6:
- Specifying --outfmt 6 in FACET is the same as specifying -outfmt 6 in BLASTn.
- Specifying --outfmt facet in FACET (this is the default behavior) is the same as specifying -outfmt "6 sseqid sstart send qseqid qstart qend sstrand pident" in BLASTn.
If the user is defining a custom outfmt, the flag is used like so: --outfmt "6 sseqid sstart send qseqid qstart qend". The quotation marks and the 6 must be present for FACET to properly interpret the outfmt string!
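To make the column ordering concrete, here is a minimal Python sketch that maps an outfmt string to column positions. It is not FACET's own parser: the function name outfmt_columns is hypothetical, and the only facts it relies on are the facet column order given above and BLASTn's standard column order for plain -outfmt 6.

```python
# Column order BLASTn uses for plain "-outfmt 6".
BLAST_6_DEFAULT = (
    "qseqid sseqid pident length mismatch gapopen "
    "qstart qend sstart send evalue bitscore"
).split()

# Column order this page gives for "--outfmt facet".
FACET_DEFAULT = "sseqid sstart send qseqid qstart qend sstrand pident".split()


def outfmt_columns(outfmt):
    """Return {column name: 0-based index} for an outfmt string.

    Hypothetical helper for illustration; FACET's real parsing may differ.
    """
    if outfmt == "facet":
        fields = FACET_DEFAULT
    else:
        tokens = outfmt.split()
        # A custom outfmt must start with the literal 6.
        if not tokens or tokens[0] != "6":
            raise ValueError("outfmt must be 'facet' or start with 6")
        # A bare "6" falls back to BLASTn's default columns.
        fields = tokens[1:] or BLAST_6_DEFAULT
    return {name: index for index, name in enumerate(fields)}


columns = outfmt_columns("6 sseqid sstart send qseqid qstart qend")
row = ["contig1", "100", "250", "contig2", "5", "155"]  # made-up alignment row
subject_start = int(row[columns["sstart"]])
```

Looking columns up by name rather than by fixed position is what lets a custom outfmt list its fields in any order.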
The depth of coverage needed for a base to be considered repetitive. Any base that has a coverage >= the provided value will be masked with the mask character. The default value is 2 because a self-comparison will always generate an alignment that covers each contig (from the contig matching to itself).
Some important notes:
- If you are trying to mask known repetitive elements in a genome (e.g., searching a genome for known repetitive elements using FACET and using the resulting CSV file as an input), this value should be set to 1.
- If the organism you are performing a self-comparison with is diploid, this value should be set to 3, as a chromosome will align to itself and to its homologous chromosome.
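The depth threshold described above can be illustrated with a short Python sketch. This is an assumption-laden illustration, not FACET's implementation: the function name mask_by_coverage is hypothetical, alignments are assumed to be 1-based inclusive (start, end) pairs on a single contig, and "N" is only a placeholder mask character, since this page does not state the default.

```python
def mask_by_coverage(sequence, intervals, depth=2, mask_char="N"):
    """Mask every base covered by at least `depth` alignment intervals.

    intervals: 1-based inclusive (start, end) pairs; start may exceed end,
    as BLASTn reports for minus-strand hits. Hypothetical helper.
    """
    # Difference array: +1 where an interval begins, -1 just past its end.
    diff = [0] * (len(sequence) + 1)
    for start, end in intervals:
        lo = max(min(start, end), 1)
        hi = min(max(start, end), len(sequence))
        if lo > hi:
            continue  # interval lies entirely outside the sequence
        diff[lo - 1] += 1
        diff[hi] -= 1

    masked = []
    coverage = 0
    for base, delta in zip(sequence, diff):
        coverage += delta  # running sum gives per-base coverage
        masked.append(mask_char if coverage >= depth else base)
    return "".join(masked)


# Self-comparison: the contig aligns to itself (1-10) plus one repeat hit (3-6).
self_comparison = mask_by_coverage("ACGTACGTAC", [(1, 10), (3, 6)], depth=2)

# Known-repeat search: no self-hit exists, so a depth of 1 is enough.
known_repeats = mask_by_coverage("ACGTACGTAC", [(6, 3)], depth=1)
```

In the self-comparison case the whole-contig hit contributes a coverage of 1 everywhere, which is why only bases with a second overlapping alignment reach the default threshold of 2.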
The character that is used to mask a repetitive base.
Sets the evalue cutoff for BLASTn alignment consideration. Only alignments with an evalue <= this value will be considered by FACET.
The number of threads used for the BLASTn process. This flag is the same as BLASTn's -num_threads option.
Speeds up BLASTn searches by using more CPUs.
By default, FACET exits if running the given command will overwrite existing user files. Using this flag forces the program to overwrite those files.
FACET's default behavior is to create a database using the subject FASTA file. If a database with the same name is already present, FACET will reuse it rather than overwrite it unless this option is specified. Reusing the database reduces the number of times FACET has to create one and decreases runtime in future runs.
Prints more information to stdout while FACET is running.