Releases: steineggerlab/Metabuli
Metabuli v1.0.9
DB creation process improved
- Added `updateDB` module for adding new sequences to an existing database.
- Added `--cds-info` parameter in the `build` module. Users can provide CDS information to skip Prodigal's gene prediction.
  - Currently, only NCBI RefSeq or GenBank CDS files (`*cds_from_genomic.fna`) are supported.
  - For the accessions included in the files, the provided CDS info will be used, skipping Prodigal's gene prediction.
- Added `--max-ram` parameter to the `build` module.
- Added compatibility with taxdump files generated using taxonkit.
- 1.0.9-2: Fixes for Bioconda.
Metabuli v1.0.8
- Added `extract` module: It extracts reads classified under a specific taxon at any rank. It can be used after running `classify`.
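The idea of selecting reads "under" a taxon can be sketched as follows. This is only an illustration, not Metabuli's implementation: it assumes that "classified under a taxon" means the read was assigned to that taxon or to any of its descendants, and the function names (`is_under`, `reads_under`), the toy taxonomy, and the read IDs are all hypothetical.

```python
def is_under(taxon, target, parent):
    """Return True if `taxon` equals `target` or descends from it.

    `parent` maps each taxon to its parent; the root maps to itself.
    """
    while True:
        if taxon == target:
            return True
        if parent[taxon] == taxon:  # reached the root without meeting target
            return False
        taxon = parent[taxon]


def reads_under(assignments, target, parent):
    """Select read IDs whose assigned taxon lies in the subtree of `target`."""
    return [read for read, taxon in assignments.items()
            if is_under(taxon, target, parent)]


# Hypothetical toy taxonomy (child -> parent) and read assignments.
parent = {
    "root": "root",
    "genus_A": "root",
    "genus_B": "root",
    "sp_A1": "genus_A",
    "sp_A2": "genus_A",
    "sp_B1": "genus_B",
}
assignments = {
    "read1": "sp_A1",
    "read2": "genus_A",
    "read3": "sp_B1",
    "read4": "root",
}

selected = reads_under(assignments, "genus_A", parent)
print(selected)
```

Because the check walks up the lineage, the same function works whether the target is a species, a genus, or any higher rank.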
Metabuli v1.0.7
Metabuli became faster than v1.0.6
- Dataset
  - Query: SRR24315757_1.fastq, SRR24315757_2.fastq
    - 22,107,398 paired-end reads
    - 6,632,219,400 nt in total
  - DB: GTDB
    - Complete Genome or Chromosome level assemblies
    - CheckM completeness > 90 and contamination < 5
    - 36,203 genomes from 8,465 species
- Windows: ~8.3 times faster
  - Machine: Intel(R) Core(TM) i9-9900 CPU, 32GB RAM
  - `--max-ram`: 32, `--threads`: 8
    - v1.0.6: 825s for the first 587,593 reads (2.7% of all). Total time not measured
    - v1.0.7: 100s for the first 587,593 reads. 1h 7m 22s in total
- macOS: ~1.7 times faster
  - Machine: MacBook Pro 14-inch 2023, M2 Pro chip, 32GB RAM
  - `--max-ram`: 32, `--threads`: 8
    - v1.0.6: 71m 34s
    - v1.0.7: 42m 58s
- Linux: ~1.3 times faster
  - Machine: A server with 64-core AMD EPYC 7742 CPU and 1 TB of RAM
  - `--max-ram`: 128, `--threads`: 32
    - v1.0.6: 13m 34s
    - v1.0.7: 9m 58s
  - `--threads`: 64
    - v1.0.6: 9m 36s
    - v1.0.7: 7m 19s
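The speedup factors can be recomputed from the reported wall-clock times as (v1.0.6 time) / (v1.0.7 time). The short script below is a convenience sketch for that arithmetic; the helper name `to_seconds` and the dictionary labels are made up for this example, while the durations are copied from the list above.

```python
import re


def to_seconds(text):
    """Convert a duration such as '1h 7m 22s' or '825s' to seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u]
               for n, u in re.findall(r"(\d+)\s*([hms])", text))


# (v1.0.6 time, v1.0.7 time) as reported in the release notes.
timings = {
    "Windows (first 587,593 reads)": ("825s", "100s"),
    "macOS": ("71m 34s", "42m 58s"),
    "Linux, 32 threads": ("13m 34s", "9m 58s"),
    "Linux, 64 threads": ("9m 36s", "7m 19s"),
}

speedups = {label: to_seconds(old) / to_seconds(new)
            for label, (old, new) in timings.items()}

for label, factor in speedups.items():
    print(f"{label}: {factor:.2f}x")
```

Note that the Windows figure compares only the first 587,593 reads, since the total v1.0.6 run time was not measured.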
Metabuli v1.0.6
Windows OS is supported
Metabuli v1.0.5
The CMake file was edited to pass the Bioconda PR test.
Other than that it is the same as v1.0.4.
Metabuli v1.0.4
- Fixed a minor reproducibility issue.
- Fixed a performance-harming bug occurring with sequences containing lowercased bases.
- Auto adjustment of `--match-per-kmer` parameter. Issue #20 solved.
- Record version info. in `db.parameter`
Metabuli v1.0.3
- New parameter: `--tie-ratio` in `classify` module. [default 0.95]
  When the best-matching species has a score MAX, any species with `score >= (MAX * --tie-ratio)` is considered tied with the best match. When tied species occur for a read, the read is classified into their LCA (lowest common ancestor).
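The tie rule can be illustrated with a small sketch. This is not Metabuli's code: the function names (`tied_species`, `lca`), the toy taxonomy, and the scores are hypothetical, and the sketch assumes a simple child-to-parent map in which the root is its own parent.

```python
def tied_species(scores, tie_ratio=0.95):
    """Return the species whose score is >= MAX * tie_ratio, sorted by name."""
    best = max(scores.values())
    return sorted(sp for sp, score in scores.items()
                  if score >= best * tie_ratio)


def lca(taxa, parent):
    """Lowest common ancestor of `taxa` in a child -> parent map."""
    def lineage(taxon):
        # Path from the taxon up to the root (root maps to itself).
        path = [taxon]
        while parent[taxon] != taxon:
            taxon = parent[taxon]
            path.append(taxon)
        return path

    shared = set(lineage(taxa[0]))
    for taxon in taxa[1:]:
        shared &= set(lineage(taxon))
    # The first shared node on the way up from any taxon is the LCA.
    for node in lineage(taxa[0]):
        if node in shared:
            return node


# Hypothetical toy taxonomy and per-species scores for one read.
parent = {
    "root": "root",
    "genus_A": "root",
    "genus_B": "root",
    "sp_A1": "genus_A",
    "sp_A2": "genus_A",
    "sp_B1": "genus_B",
}
scores = {"sp_A1": 0.90, "sp_A2": 0.87, "sp_B1": 0.60}

ties = tied_species(scores)      # threshold is 0.90 * 0.95 = 0.855
classification = lca(ties, parent)
print(ties, classification)
```

With these toy numbers, `sp_A1` and `sp_A2` tie, so the read is assigned to their common genus rather than to either species. A lower ratio admits more species into the tie and pushes the assignment toward higher ranks.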
Metabuli v1.0.2
- `--accession-level` option for `build` and `classify` workflows: It reports not only the taxon but also the accession of the best match.
- Fix minor bugs in `build` and `classify` workflows.
- Generate `taxonomyDB` during `build` and load it during `classify` workflow for faster loading of taxonomy information.
- Support gzipped FASTA/FASTQ files in `add-to-library` and `classify` workflows.
- Low-complexity filtering in `build` workflow as default with `--mask-prob 0.9`.
Metabuli v1.0.1
- Fixed memory-related bugs.
- `classify` generates a Krona report file.
- New option of `classify`: Low-complexity masking by `--mask` and `--mask-prob` (#20)
- New option of `classify`: `--match-per-kmer` option in classify workflow (#20)
- `databases` downloads DBs as `tar.gz` and unzips them.
- `classify` ignores reads shorter than 26 nt.
- `database-report` generates a report of taxa included in a database.
Metabuli v1.0.0
It is the first release of Metabuli.
Metabuli is a metagenomic classifier that jointly analyzes both DNA and amino-acid sequences to achieve high specificity and sensitivity at the same time. It is implemented based on a novel k-mer structure, metamer, to efficiently index and compare sequences at both the DNA and amino-acid levels.