Merge 0.4.1 (#155)
* Update version to 0.4.1-dev

* Update CircleCI badge in README to monitor develop branch

* Add Singularity deffile for stag-mwc env

* Add Singularity image for main env up to kraken2

* Add link to Singularity bind docs

* Add stag-mwc-main singularity image to all rules that require it

* Add Singularity images for biobakery and assembly

* Set rule threads from cluster_config if defined

* Update CHANGELOG

* Docs: Update module intro paragraph, closes #151

* Add {cluster.extra} to slurm-submit.py script call

* Update CHANGELOG

* Fix typo in kraken2 rules

* the host removal script was ignoring its flag in config.yaml

* Update CHANGELOG

* Update CHANGELOG

* Tweak docs, update changelog, prepare for release

* Amrplusplus (#156)

* amrplusplus

* amrplusplus scripts

* added scripts

* changed input in config.yaml

Co-authored-by: Fredrik Boulund <[email protected]>

* default db empty

Co-authored-by: Fredrik Boulund <[email protected]>

* db -> megares

Co-authored-by: Fredrik Boulund <[email protected]>

* annotation --> megares_annotation

Co-authored-by: Fredrik Boulund <[email protected]>

* added amrplusplus to circleci

* updated CHANGELOG.md

* updated amrplusplus in config.yaml

* added amrplusplus to modules.rst

* updated amrplusplus.smk

* added license and removed bin dir

* updated modules.rst

* added align_to_amr

* fixed default database

* amrplusplus db empty in conf

* indentations

* clarification regarding database

* Update docs/source/modules.rst

Co-authored-by: Fredrik Boulund <[email protected]>

* Update rules/antibiotic_resistance/amrplusplus.smk

Co-authored-by: Fredrik Boulund <[email protected]>

* Update docs/source/modules.rst grammar

Co-authored-by: Fredrik Boulund <[email protected]>

* Update CHANGELOG.md

Co-authored-by: Fredrik Boulund <[email protected]>

* added bwa to stag-mwc.yaml

* added conda environment

* amrplusplus exectuable with conda

* amrplusplus executable with conda

* samtools version 1.10

* amrplusplus conda executable

* added amrplusplus.yaml to not break stag-mwc.yaml dependencies

* removed amrplusplus dependencies because they broke it

* now using amrplusplus.yaml for conda instead of stag-mwc.yaml

Co-authored-by: Fredrik Boulund <[email protected]>
Co-authored-by: Aron Arzoomand <[email protected]>

* Change report generation call (#161)

* Change report generation call

Ignore additional arguments to snakemake when creating report.

Intended to avoid shell escaping issue with complex arguments to e.g. Singularity, issue #160

* removed snakemake_call

* added report generation workaround

* report generation

Co-authored-by: Fredrik Boulund <[email protected]>

* report generation

Co-authored-by: Fredrik Boulund <[email protected]>

* indentation

Co-authored-by: Aron Arzoomand <[email protected]>

* Slurm profile for CTMR UCP environment (#159)

* First draft for CTMR UCP cluster profile

* Remove references to rackham profile, remove account check

* Updated resource requests for all rules

* Final tweaks

* Rename ctmr_ucp profile to ctmr_gandalf

* Fix yml-yaml typo

* Fix incorrectly named singularity images

* Remove --use prefix for argument to metawrap

* Add missing singularity container to create_kaiju_krona_plot

* added dbdir and threads for align_to_amr

* Disable area plot for Kaiju

* Minor updates to docs (#162)

* Update badge URL in README

Co-authored-by: Kristaps <[email protected]>
Co-authored-by: Aron Arzoomand <[email protected]>
Co-authored-by: Aron Arzoomand <[email protected]>
Co-authored-by: AroArz <[email protected]>
5 people authored Feb 2, 2021
1 parent 0c46f88 commit b9b0df2
Showing 44 changed files with 1,123 additions and 77 deletions.
3 changes: 2 additions & 1 deletion .circleci/config.yml
@@ -71,7 +71,8 @@ jobs:
sed -i 's/kraken2: False/kraken2: True/' config.yaml
sed -i 's/metaphlan2: False/metaphlan2: True/' config.yaml
sed -i 's/humann2: False/humann2: True/' config.yaml
-sed -i 's/antibiotic_resistance: False/antibiotic_resistance: True/' config.yaml
+sed -i 's/groot: False/groot: True/' config.yaml
+sed -i 's/amrplusplus: False/amrplusplus: True/' config.yaml
sed -i 's/assembly: False/assembly: True/' config.yaml
sed -i 's/binning: False/binning: True/' config.yaml
sed -i 's|db_path: \"\"|db_path: \"db/hg19\"|' config.yaml
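
The `sed` calls in this hunk enable modules by rewriting `False` to `True` in `config.yaml`. A rough Python equivalent is sketched below. The function name `enable_modules` is hypothetical and not part of StaG-mwc, and unlike the `sed` expressions it only matches keys that start a line (after optional indentation).

```python
import re


def enable_modules(config_text: str, modules: list[str]) -> str:
    """Flip `<module>: False` to `<module>: True` for the named modules.

    Hypothetical helper for illustration; the CI job itself uses sed.
    """
    for name in modules:
        # Capture the key and its spacing so only the boolean is replaced.
        pattern = rf"^([ \t]*{re.escape(name)}:[ \t]*)False\b"
        config_text = re.sub(pattern, r"\g<1>True", config_text, flags=re.MULTILINE)
    return config_text


example = "groot: False\namrplusplus: False\nassembly: False\n"
updated = enable_modules(example, ["groot", "amrplusplus"])
print(updated)
```

Modules not named in the list, such as `assembly` in this example, are left untouched.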
30 changes: 30 additions & 0 deletions CHANGELOG.md
@@ -13,6 +13,36 @@ files), and the patch version is typically incremented for any set of changes
committed to the master branch that does not trigger any of the aforementioned
situations.

## [0.4.1] Unreleased
### Added
- Created Singularity images for all conda environments. Run with
`--use-singularity` (do not combine with `--use-conda`).
- New cluster profile "pseudo-rules" for anonymous rules for mappers: `bbmap`
and `bowtie2` can now accept threads from `n` in the cluster profile. They
still use the time allocation for the `__default__` rule, however.
- Added possibility to use `extra:` to define additional arguments passed on to
Slurm submissions. Useful to request e.g. fat nodes with `extra: "-C fat"`
- Added custom reimplementation of AMRPlusPlus v2.0 which can be executed with
either `--use-singularity` or `--use-conda`.

### Fixed
- The host removal module now correctly identifies setting `host_removal: False`
in the config file. Thank you chrsb!

### Changed
- Do not combine `--use-singularity` with `--use-conda` anymore. The new
Singularity images already contain all dependencies.
- All rules now define the number of threads from cluster_config if defined.
Old defaults are still used for local execution.
- The shebang of `area_plot.py` has been changed to work in more environments.
- Implemented workaround for error caused by automatic report generation when
using Singularity.
- Disabled taxonomic area plot for Kaiju outputs due to issues processing the
output files.
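
The "threads from cluster_config" entry above can be illustrated with a small sketch. It assumes the lookup falls back to a per-rule default when no cluster config entry exists, as the entry describes for local execution. The helper name `threads_for` is hypothetical and is not taken from the StaG-mwc rules.

```python
def threads_for(rule_name, cluster_config, default):
    """Return `n` for a rule from cluster_config, or the old default.

    Hypothetical helper: `cluster_config` stands in for the dictionary
    Snakemake fills from a cluster config file (empty for local runs).
    """
    return int(cluster_config.get(rule_name, {}).get("n", default))


cluster_config = {"__default__": {"n": 2}, "kraken2": {"n": 10}}

print(threads_for("kraken2", cluster_config, default=4))  # entry defined in the profile
print(threads_for("kraken2", {}, default=4))              # local execution, old default
```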

### Removed


## [0.4.0] 2020-02-18
### Added
- Added resource limiter for HUMAnN2 due to its intense use of huge temporary
14 changes: 11 additions & 3 deletions README.md
@@ -20,8 +20,11 @@ Go to https://stag-mwc.readthedocs.org for the full documentation.
[Snakemake](https://snakemake.readthedocs.io) are required to be able to use
StaG-mwc. Most people would probably want to install
[Miniconda](https://conda.io/miniconda.html) and install Snakemake into their
-base environment. Conda will automatically install the required versions of
-all tools required to run StaG-mwc.
+base environment. When running StaG with the `--use-conda` or
+`--use-singularity` flags, all dependencies are managed automatically. If
+using conda it will automatically install the required versions of all tools
+required to run StaG-mwc. There is no need to combine the flags: the
+Singularity images already contain all required dependencies.

### Step 1: Clone workflow
To use StaG-mwc, you need a local copy of the workflow repository. Start by
@@ -58,11 +61,16 @@ Make sure you edit the Slurm project account in
documentation](https://snakemake.readthedocs.io) for further details on how to
run Snakemake workflows on other types of cluster resources.

+Note that in all examples above, `--use-conda` can essentially be replaced
+with `--use-singularity` to run in Singularity containers instead of using a
+locally installed conda. Read more about it under the Running section in the
+docs.

## Testing
A very basic continuous integration test is currently in place. It merely
validates the syntax by trying to let Snakemake build the dependency graph if
-all outputs are activated.
+all outputs are activated. Suggestions for how to improve the automated
+testing of StaG-mwc are very welcome!


## Contributing
10 changes: 4 additions & 6 deletions Snakefile
@@ -16,8 +16,7 @@ min_version("5.5.4")

from rules.publications import publications

-stag_version = "0.4.0"
-singularity: "docker://continuumio/miniconda3:4.7.10"
+stag_version = "0.4.1"

onstart:
print("\n".join([
@@ -76,6 +75,7 @@ include: "rules/functional_profiling/humann2.smk"
# Antibiotic resistance
#############################
include: "rules/antibiotic_resistance/groot.smk"
+include: "rules/antibiotic_resistance/amrplusplus.smk"

#############################
# Mappers
@@ -153,12 +153,10 @@ onsuccess:
Path("citations.rst").unlink()
Path("citations.rst").symlink_to(citation_filename)

-snakemake_call = " ".join(argv)
-shell("{snakemake_call} --unlock".format(snakemake_call=snakemake_call))
+shell("{snakemake_call} --unlock".format(snakemake_call=argv[0]))
shell("{snakemake_call} --report {report}-{datetime}.html".format(
-snakemake_call=snakemake_call,
+snakemake_call=argv[0],
report=config["report"],
datetime=report_datetime,
)
)
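
The hunk above stops re-joining the full argument list and calls only `argv[0]` when generating the report. The sketch below shows why a naive re-join can go wrong when an argument contains spaces, as Singularity bind arguments often do. The example argument values and the helper `report_command` are hypothetical. Only the format string is taken from the Snakefile.

```python
import shlex


def report_command(argv, report, datetime):
    """Build the report call from the executable only, ignoring other arguments."""
    return "{snakemake_call} --report {report}-{datetime}.html".format(
        snakemake_call=argv[0],
        report=report,
        datetime=datetime,
    )


# Hypothetical invocation: the last element is ONE argument containing a space.
argv = ["snakemake", "--use-singularity", "--singularity-args", "-B /db:/db"]

naive = " ".join(argv)          # quoting around "-B /db:/db" is lost here
reparsed = shlex.split(naive)   # how a shell would split the re-joined string

print(len(argv), len(reparsed))
print(report_command(argv, "StaG_report", "20210202"))
```

The re-joined string splits into more tokens than the original argument list, which is the kind of escaping problem the commit message refers to. Using only `argv[0]` sidesteps it, at the cost of dropping the extra arguments for the report call.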

8 changes: 8 additions & 0 deletions cluster_configs/ctmr_gandalf/config.yaml
@@ -0,0 +1,8 @@
jobscript: "cluster_configs/ctmr_gandalf/slurm-jobscript.sh"
cluster: "cluster_configs/ctmr_gandalf/slurm-submit.py --time {cluster.time} --error {cluster.stderr} --output {cluster.stdout} --job-name '{cluster.jobname}' {cluster.extra}"
cluster-status: "cluster_configs/ctmr_gandalf/slurm-status.py"
cluster-config: "cluster_configs/ctmr_gandalf/ctmr_gandalf.yaml"
max-jobs-per-second: 10
max-status-checks-per-second: 10
local-cores: 1
jobs: 999
90 changes: 90 additions & 0 deletions cluster_configs/ctmr_gandalf/ctmr_gandalf.yaml
@@ -0,0 +1,90 @@
# Cluster config file for StaG-mwc for use on CTMR UCP
__default__:
    account: "bio"
    partition: "ctmr"
    extra: ""
    time: "03:00:00"
    n: 2
    stderr: "slurm_logs/slurm-{rule}-{wildcards}.stderr"
    stdout: "slurm_logs/slurm-{rule}-{wildcards}.stdout"
    jobname: "[{rule}]: {wildcards}"


#############################
# Pre-processing
#############################
fastp:
    n: 8
    time: "01:00:00"
remove_host:
    n: 8
    time: "01:00:00"
bbcountunique:
    n: 4
    time: "00:45:00"

#############################
# Naive comparisons
#############################
sketch:
    n: 8
    time: "00:20:00"

#############################
# Taxonomic profiling
#############################
kaiju:
    n: 10
    time: "02:00:00"
kraken2:
    n: 10
    time: "02:00:00"
metaphlan2:
    n: 8
    time: "01:30:00"
bracken:
    n: 2
    time: "01:00:00"

#############################
# Functional profiling
#############################
humann2:
    n: 12
    time: "12:00:00"

#############################
# Antibiotic resistance
#############################
groot_align:
    n: 8
    time: "01:00:00"
align_to_amr:
    n: 10
    time: "04:00:00"

#############################
# Mappers
#############################
bbmap:
    n: 10
    time: "02:00:00"
bowtie2:
    n: 10
    time: "02:00:00"

#############################
# Assembly
#############################
assembly:
    n: 20
    time: "05:00:00"
consolidate_bins:
    n: 20
    time: "05:00:00"
blobology:
    n: 20
    time: "05:00:00"
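
To show how the per-rule entries in this file relate to `__default__` and to the `{cluster.*}` placeholders in the profile's `cluster:` command, here is a simplified model in Python. It is a sketch under assumptions, not Snakemake's implementation: the function names are hypothetical, only a few keys are included, and the command template is shortened.

```python
from types import SimpleNamespace

# A subset of the values from the cluster config above.
DEFAULTS = {
    "time": "03:00:00",
    "n": 2,
    "extra": "",
    "jobname": "[{rule}]: {wildcards}",
}
OVERRIDES = {"humann2": {"n": 12, "time": "12:00:00"}}

# Shortened version of the `cluster:` command in the profile's config.yaml.
SUBMIT_TEMPLATE = (
    "slurm-submit.py --time {cluster.time} "
    "--job-name '{cluster.jobname}' {cluster.extra}"
)


def resolve(rule, wildcards=""):
    """Merge rule-specific settings over the defaults (rule values win)."""
    settings = {**DEFAULTS, **OVERRIDES.get(rule, {})}
    settings["jobname"] = settings["jobname"].format(rule=rule, wildcards=wildcards)
    return settings


def submit_command(rule, wildcards=""):
    """Fill the {cluster.*} placeholders with the resolved settings."""
    cluster = SimpleNamespace(**resolve(rule, wildcards))
    return SUBMIT_TEMPLATE.format(cluster=cluster)


print(submit_command("humann2", "sample=A"))
print(resolve("multiqc")["time"])  # rule without its own entry uses the default
```

A rule without its own entry (the `multiqc` name here is only an example) inherits every value from `__default__`, which is why an empty `extra: ""` is defined there.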
3 changes: 3 additions & 0 deletions cluster_configs/ctmr_gandalf/slurm-jobscript.sh
@@ -0,0 +1,3 @@
#!/bin/bash
# properties = {properties}
{exec_job}
61 changes: 61 additions & 0 deletions cluster_configs/ctmr_gandalf/slurm-status.py
@@ -0,0 +1,61 @@
#!/usr/bin/env python3
import re
import subprocess as sp
import shlex
import sys
import time
import logging
logger = logging.getLogger("__name__")

STATUS_ATTEMPTS = 20

jobid = sys.argv[1]

for i in range(STATUS_ATTEMPTS):
    try:
        sacct_res = sp.check_output(shlex.split("sacct -P -b -j {} -n".format(jobid)))
        res = {x.split("|")[0]: x.split("|")[1] for x in sacct_res.decode().strip().split("\n")}
        break
    except sp.CalledProcessError as e:
        logger.error("sacct process error")
        logger.error(e)
    except IndexError as e:
        pass
    # Try getting job with scontrol instead in case sacct is misconfigured
    try:
        sctrl_res = sp.check_output(shlex.split("scontrol -o show job {}".format(jobid)))
        m = re.search("JobState=(\w+)", sctrl_res.decode())
        res = {jobid: m.group(1)}
        break
    except sp.CalledProcessError as e:
        logger.error("scontrol process error")
        logger.error(e)
        if i >= STATUS_ATTEMPTS - 1:
            print("failed")
            exit(0)
        else:
            time.sleep(1)

status = res[jobid]

if (status.startswith("BOOT_FAIL")):
    print("failed")
elif (status.startswith("CANCELLED")):
    print("failed")
elif (status.startswith("COMPLETED")):
    print("success")
elif (status.startswith("DEADLINE")):
    print("failed")
elif (status.startswith("FAILED")):
    print("failed")
elif (status.startswith("NODE_FAIL")):
    print("failed")
elif (status.startswith("PREEMPTED")):
    print("failed")
elif (status.startswith("TIMEOUT")):
    print("failed")
# Unclear whether SUSPENDED should be treated as running or failed
elif (status.startswith("SUSPENDED")):
    print("failed")
else:
    print("running")
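
The status mapping at the end of the script can also be read as a single lookup. The sketch below is an equivalent formulation for illustration only. The function name `classify` is hypothetical, and the script in this commit uses the `if`/`elif` chain instead.

```python
# Slurm states that the script reports as "failed".
FAILED_PREFIXES = (
    "BOOT_FAIL",
    "CANCELLED",
    "DEADLINE",
    "FAILED",
    "NODE_FAIL",
    "PREEMPTED",
    "TIMEOUT",
    "SUSPENDED",  # treated as failed, as in the script
)


def classify(status: str) -> str:
    """Map a Slurm job state to one of: success, failed, running."""
    if status.startswith("COMPLETED"):
        return "success"
    # str.startswith accepts a tuple and checks each prefix in turn.
    if status.startswith(FAILED_PREFIXES):
        return "failed"
    return "running"


for state in ("COMPLETED", "CANCELLED by 1000", "PENDING", "RUNNING"):
    print(state, "->", classify(state))
```

Prefix matching rather than equality keeps states with trailing detail, such as a `CANCELLED by ...` string, in the failed group. Any state not listed, including `PENDING`, is reported as running.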