Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: brentp/mosdepth
Failed to load repositories. Confirm that selected base ref is valid, then try again.
Loading
base: v0.2.8
Choose a base ref
...
head repository: brentp/mosdepth
Failed to load repositories. Confirm that selected head ref is valid, then try again.
Loading
compare: master
Choose a head ref

Commits on Feb 21, 2020

  1. stores average coverage (not per-base) for region (like goleft indexcov)

    * regions and global dist should be different
    
    * cleanup of test output
    
    
    * missing :
    
    * adds upper limit to avoid overflowing the array (maybe should increase from 1000 buckets for targeted sequencing or chrM )
    
    * updated doc for -b
    
    * fixes fuctional test of -b <digits> mode
    
    * moves doc to README
    bwlang authored Feb 21, 2020

    Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature.
    Copy the full SHA
    fe07b1c View commit details

Commits on Feb 25, 2020

  1. use cumsum from math

    brentp committed Feb 25, 2020
    Copy the full SHA
    068c9fe View commit details
  2. Copy the full SHA
    84f199c View commit details
  3. remove errant change

    brentp committed Feb 25, 2020
    Copy the full SHA
    cc8a03e View commit details
  4. fast int to str

    brentp committed Feb 25, 2020
    Copy the full SHA
    9b39879 View commit details
  5. fewer allocations

    brentp committed Feb 25, 2020
    Copy the full SHA
    53ff157 View commit details

Commits on Mar 2, 2020

  1. prep for release

    brentp committed Mar 2, 2020
    Copy the full SHA
    4331b8f View commit details

Commits on Mar 25, 2020

  1. exclude more chroms from plot

    brentp committed Mar 25, 2020
    Copy the full SHA
    083d037 View commit details

Commits on Apr 21, 2020

  1. handle chroms with "-"

    see #115
    brentp committed Apr 21, 2020
    Copy the full SHA
    e14d85b View commit details

Commits on May 2, 2020

  1. Copy the full SHA
    ada9402 View commit details

Commits on May 11, 2020

  1. add .github/workflow

    from @danielecook
    
    requires changes to functional tests due to zcat differences on osx
    brentp committed May 11, 2020
    3
    Copy the full SHA
    48eae5f View commit details

Commits on Jun 22, 2020

  1. improve readme (#121)

    rollf authored Jun 22, 2020
    Copy the full SHA
    5bd49f0 View commit details

Commits on Jul 24, 2020

  1. update install intstructions

    re #57
    brentp committed Jul 24, 2020
    Copy the full SHA
    d34b5fb View commit details
  2. Copy the full SHA
    c5027de View commit details

Commits on Sep 16, 2020

  1. D4 (#126)

    * d4 support
    
    * add d4-nim dep
    brentp authored Sep 16, 2020
    Copy the full SHA
    6cdf0c1 View commit details
  2. add d4 to readme

    brentp committed Sep 16, 2020
    Copy the full SHA
    76ac47f View commit details

Commits on Sep 29, 2020

  1. Copy the full SHA
    389ca70 View commit details
  2. ci

    brentp committed Sep 29, 2020
    Copy the full SHA
    7fbd938 View commit details

Commits on Nov 3, 2020

  1. Update README.md

    NagaComBio authored Nov 3, 2020
    Copy the full SHA
    58f90d8 View commit details
  2. Copy the full SHA
    aa9a070 View commit details

Commits on Nov 20, 2020

  1. update link to blog post

    closes #132
    brentp committed Nov 20, 2020
    Copy the full SHA
    a9e4e6f View commit details

Commits on May 24, 2021

  1. Copy the full SHA
    e77263d View commit details
  2. Merge pull request #145 from hyphaltip/master

    small typo in wording 'noly' to 'only'
    brentp authored May 24, 2021
    Copy the full SHA
    17c3ae8 View commit details

Commits on Jul 19, 2021

  1. handle read-groups correctly

    closes #153
    brentp committed Jul 19, 2021
    Copy the full SHA
    503d7e7 View commit details
  2. Copy the full SHA
    fed74a2 View commit details

Commits on Feb 2, 2022

  1. allow specifying a custom index (via hts-nim)

    use /path/to/bam##idx##/other-path/to/bam.index
    brentp committed Feb 2, 2022
    Copy the full SHA
    be00d6f View commit details
  2. fix build.yml

    brentp committed Feb 2, 2022
    Copy the full SHA
    9cdf4c0 View commit details

Commits on Mar 18, 2022

  1. Create FUNDING.yml

    brentp authored Mar 18, 2022
    Copy the full SHA
    15de135 View commit details
  2. drop travis

    brentp committed Mar 18, 2022
    Copy the full SHA
    a7faf1c View commit details
  3. d4 build

    brentp committed Mar 18, 2022
    Copy the full SHA
    c84faf7 View commit details
  4. d4 build

    brentp committed Mar 18, 2022
    Copy the full SHA
    893135e View commit details

Commits on May 1, 2022

  1. bump version for gs builds

    brentp committed May 1, 2022
    Copy the full SHA
    2b784f4 View commit details
  2. Copy the full SHA
    0c2c2e9 View commit details
  3. bump hts version

    brentp committed May 1, 2022
    Copy the full SHA
    5577e1f View commit details

Commits on May 2, 2022

  1. bump version in readme

    brentp committed May 2, 2022
    Copy the full SHA
    e015d2e View commit details
  2. Copy the full SHA
    556a460 View commit details

Commits on Mar 8, 2023

  1. Copy the full SHA
    6e5428c View commit details
  2. Merge pull request #196 from jethror1/master

    Update Docker installation in README from v0.2.4 -> v0.3.3
    brentp authored Mar 8, 2023
    Copy the full SHA
    8d9e381 View commit details

Commits on Mar 16, 2023

  1. dont error when region larger than chrom

    closes #197
    brentp committed Mar 16, 2023
    Copy the full SHA
    92b822f View commit details

Commits on Jun 23, 2023

  1. update changes for release

    brentp committed Jun 23, 2023
    Copy the full SHA
    1d9b913 View commit details

Commits on Sep 15, 2023

  1. fix but on summary stats with regions

    closes #207
    brentp committed Sep 15, 2023
    Copy the full SHA
    1bfeaa2 View commit details

Commits on Oct 6, 2023

  1. pin docopt

    brentp committed Oct 6, 2023
    Copy the full SHA
    8cb4ad3 View commit details

Commits on Nov 11, 2023

  1. fragment-length filtering (#214)

    * Adds length filter in coverage
    
    * Adds min_len and max_len to CLI
    
    * tries isize instead of tlen
    
    * Passes min_len and max_len to coverage()
    
    * replaces template length with insert size in docs
    
    * Removes bad let
    
    * replaces fragment length with insert size in docs
    
    * Sets min_len default to -1
    
    Otherwise, this PR might cause reads with insert size 0 to be excluded, which might be okay but is not the intended change I want to make
    
    * Adheres to argument naming standard ("-" not "_")
    
    * Updates argnames with "-frag-". Adds error when max < min.
    
    * Adds tests of fragment length filtering
    
    ---------
    
    Co-authored-by: Ludvig <ludvig@clin.au.dk>
    LudvigOlsen and Ludvig authored Nov 11, 2023
    Copy the full SHA
    8696b5b View commit details

Commits on Nov 22, 2023

  1. bump versions and fix #216

    closes #216
    brentp committed Nov 22, 2023
    Copy the full SHA
    c300eb2 View commit details
  2. update changes for 0.3.6

    brentp committed Nov 22, 2023
    Copy the full SHA
    5fcd103 View commit details

Commits on Mar 20, 2024

  1. bump version

    update binary to fix #224 by building with htslib v1.19.1
    brentp committed Mar 20, 2024
    Copy the full SHA
    71e5fd2 View commit details
  2. update CHANGES

    brentp committed Mar 20, 2024
    Copy the full SHA
    33b9aca View commit details

Commits on Apr 11, 2024

  1. Update Github actions (#227)

    * Update Github Actions' versions and runners
    
    Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
    
    * Add a workaround for 38/d4-format#77
    
    Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
    
    * Use dtolnay/rust-toolchain Github Action to install Rustc/Cargo
    
    Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
    
    * Do not fail fast the build matrix
    
    Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
    
    * Build with Nim 1.6.x and 2.x
    
    Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
    
    ---------
    
    Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
    martin-g authored Apr 11, 2024
    Copy the full SHA
    388d2e0 View commit details

Commits on Apr 17, 2024

  1. work on #229

    brentp committed Apr 17, 2024
    Copy the full SHA
    61e4ec2 View commit details
  2. bump versions

    brentp committed Apr 17, 2024
    Copy the full SHA
    cae7970 View commit details
Showing with 752 additions and 248 deletions.
  1. +13 −0 .github/FUNDING.yml
  2. +126 −0 .github/workflows/build.yml
  3. +6 −0 .gitignore
  4. +0 −20 .travis.yml
  5. +53 −0 CHANGES.md
  6. +39 −47 README.md
  7. +3 −5 depthstat.nim
  8. +63 −10 functional-tests.sh
  9. +95 −0 int2str.nim
  10. +303 −157 mosdepth.nim
  11. +2 −2 mosdepth.nimble
  12. +22 −5 scripts/install.sh
  13. +3 −0 scripts/plot-dist.py
  14. BIN tests/full-fragment-pairs.bam
  15. BIN tests/full-fragment-pairs.bam.bai
  16. +24 −2 tests/funcs.nim
13 changes: 13 additions & 0 deletions .github/FUNDING.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# These are supported funding model platforms

github: [brentp]
patreon: # Replace with a single Patreon username
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
otechie: # Replace with a single Otechie username
lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']
126 changes: 126 additions & 0 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
# copied from Daniel Cook's Seq collection
name: Build

on:
push:
branches:
- master
pull_request:
branches:
- master

jobs:
build:

runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-20.04, macos-13]
version:
- 1.6.18
- 2.0.2


steps:
- uses: actions/checkout@v4

# Caching
- name: Cache choosenim
id: cache-choosenim
uses: actions/cache@v4
with:
path: ~/.choosenim
key: ${{ runner.os }}-choosenim-stable

- name: Cache nimble
id: cache-nimble
uses: actions/cache@v4
with:
path: ~/.nimble
key: ${{ runner.os }}-nimble-stable

- name: Cache htslib
id: cache-htslib
uses: actions/cache@v4
with:
path: $HOME/htslib
key: ${{ runner.os }}-htslib-1.10

# Install Dependencies
- name: Install dependencies (Linux)
if: runner.os == 'Linux'
run: |
sudo apt-get update
sudo apt-get -qy install bwa make build-essential cmake libncurses-dev ncurses-dev libbz2-dev lzma-dev liblzma-dev \
curl libssl-dev libtool autoconf automake libcurl4-openssl-dev zlib1g-dev
# Setup htslib
- name: Install htslib (linux)
if: runner.os == 'Linux'
run: |
cd
git clone -b 1.18 --recursive https://github.com/samtools/htslib.git
cd htslib && autoheader && autoconf && ./configure --enable-libcurl
sudo make -j 4 install
sudo ldconfig
#echo "::set-env name=LD_LIBRARY_PATH::${LD_LIBRARY_PATH}:${HOME}/htslib"
#ls -lh $HOME/htslib/*.so
- name: Install hstlib (macos)
if: runner.os == 'macOS'
run: |
brew install htslib
- name: Set DYLD_LIBRARY_PATH
if: runner.os == 'macOS'
run: |
echo "DYLD_LIBRARY_PATH=$(brew --prefix htslib)/lib" >> $GITHUB_ENV
- name: Install d4
run: |
#export HTSLIB=system
git clone https://github.com/38/d4-format
echo 'location' >> ~/.curlrc # https://github.com/38/d4-format/pull/77#issuecomment-2044438359
cd d4-format
cargo build --release --all-features --package=d4binding
sudo cp target/release/libd4binding.* /usr/local/lib
sudo cp d4binding/include/d4.h /usr/local/include/
sudo ldconfig || true
- uses: iffy/install-nim@v5
with:
version: ${{ matrix.version }}

- name: Rust Toolchain
uses: dtolnay/rust-toolchain@nightly
with:
toolchain: stable

# Build and Test
- name: Build test executable
run: nimble build -Y mosdepth.nimble

- name: "Copy binary"
run: chmod +x mosdepth && mkdir bin && cp mosdepth bin/mosdepth_debug_${{ matrix.os }}

- name: "Build and Copy release binary"
run: nim c --mm:refc -d:danger -d:release -o:bin/mosdepth_${{ matrix.os }} mosdepth

- name: Functional Tests
env:
TERM: "xterm"
run: |
bash ./functional-tests.sh
- name: Unit Tests
run: |
nim c -r tests/all.nim
- name: Upload Artifact
if: success()
uses: actions/upload-artifact@v4
with:
name: mosdepth_${{ matrix.os }}_nim${{ matrix.version }}_executable
path: bin/
6 changes: 6 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,8 +1,14 @@
tests/all
tests/funcs
nimcache
*.d4
*.bam
*.cram
*.bai
*.crai
profile_results.txt
t.*.bed.gz
t.*.bed.gz.csi
*.bed.gz*
*.dist.txt
*.summary.txt
20 changes: 0 additions & 20 deletions .travis.yml

This file was deleted.

53 changes: 53 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,56 @@
v0.3.11
=======
+ add --fragment-mode (#246 from @LudvigOlsen). calculates coverage over a full fragment, including insert.

v0.3.10
=======
+ write sfi index in d4 files (#243)

v0.3.9
======
+ fix d4 output (#237)

v0.3.8
======
+ mosdepth is now much faster on bams/crams with a large number of contigs (#229)

v0.3.7
======
+ support CRAM v3.1. only updates htslib that binary is built with to v1.19.1 (#224)

v0.3.6
======
+ allow filtering on fragment length thanks @LudvigOlsen for implementing! (#214)
+ fix bug where empty chromosomes are not reported as having 0 depth (#216)

v0.3.5
======
+ fix bug with summary min for regions (#207 thanks to Xavier for supplying test-case)

v0.3.4
======
+ bump version for build supporting gs:// urls
+ dont error on regions larger than chromosome.

v0.3.3
======
+ allow specifying a custom index by passing '/path/to/bam##idx##/other-path/to/index.bai'

v0.3.1
======
+ fix bug with regions and d4 that would cause error even when --d4 was not used.

v0.3.0
======
+ allow chromosome names containing '-' as arguments for -c
+ d4 output

0.2.9
=====
+ modifies region.dist.txt to contain the aggregate coverage of each window when -b (integer) is specified
(otherwise region.dist.txt and global.disk.txt are identical with -b (integer) )
+ improve speed by ~30% when using per-base output with better int2str method

0.2.8
=====
+ fix off-by-one error in CSI index (but not data) of output bed files (#98)
86 changes: 39 additions & 47 deletions README.md
Original file line number Diff line number Diff line change
@@ -4,34 +4,33 @@ fast BAM/CRAM depth calculation for **WGS**, **exome**, or **targeted sequencing

![logo](https://user-images.githubusercontent.com/1739/29678184-da1f384c-88ba-11e7-9d98-df4fe3a59924.png "logo")

[![Build Status](https://travis-ci.org/brentp/mosdepth.svg?branch=master)](https://travis-ci.org/brentp/mosdepth)

[![citation](https://img.shields.io/badge/cite-open%20access-orange.svg)](https://academic.oup.com/bioinformatics/article/doi/10.1093/bioinformatics/btx699/4583630?guestAccessKey=35b55064-4566-4ab3-a769-32916fa1c6e6)
[![Build](https://github.com/brentp/mosdepth/actions/workflows/build.yml/badge.svg)](https://github.com/brentp/mosdepth/actions/workflows/build.yml) [![citation](https://img.shields.io/badge/cite-open%20access-orange.svg)](https://academic.oup.com/bioinformatics/article/doi/10.1093/bioinformatics/btx699/4583630?guestAccessKey=35b55064-4566-4ab3-a769-32916fa1c6e6)

`mosdepth` can output:

+ per-base depth about 2x as fast `samtools depth`--about 25 minutes of CPU time for a 30X genome.
+ mean per-window depth given a window size--as would be used for CNV calling.
+ the mean per-region given a BED file of regions.
* the mean or median per-region cumulative coverage histogram given a window size
+ a distribution of proportion of bases covered at or above a given threshold for each chromosome and genome-wide.
+ quantized output that merges adjacent bases as long as they fall in the same coverage bins e.g. (10-20)
+ threshold output to indicate how many bases in each region are covered at the given thresholds.
+ A summary of mean depths per chromosome and within specified regions per chromosome.
+ a [d4](https://github.com/38/d4-format) file (better than bigwig).

when appropriate, the output files are bgzipped and indexed for ease of use.

## usage

```
mosdepth 0.2.3
mosdepth 0.3.11
Usage: mosdepth [options] <prefix> <BAM-or-CRAM>
Arguments:
<prefix> outputs: `{prefix}.mosdepth.global.dist.txt`
`{prefix}.mosdepth.summary.txt`
`{prefix}.mosdepth.region.dist.txt` (if --by is specified)
`{prefix}.per-base.bed.gz` (unless -n/--no-per-base is specified)
`{prefix}.regions.bed.gz` (if --by is specified)
`{prefix}.quantized.bed.gz` (if --quantize is specified)
@@ -41,26 +40,29 @@ Arguments:
Common Options:
-t --threads <threads> number of BAM decompression threads. (use 4 or fewer) [default: 0]
-t --threads <threads> number of BAM decompression threads [default: 0]
-c --chrom <chrom> chromosome to restrict depth calculation.
-b --by <bed|window> optional BED file or (integer) window-sizes.
-n --no-per-base dont output per-base depth. skipping this output will speed execution
substantially. prefer quantized or thresholded values if possible.
-f --fasta <fasta> fasta file for use with CRAM files.
-f --fasta <fasta> fasta file for use with CRAM files [default: ].
Other options:
-F --flag <FLAG> exclude reads with any of the bits in FLAG set [default: 1796]
-i --include-flag <FLAG> only include reads with any of the bits in FLAG set. default is unset. [default: 0]
-x --fast-mode dont look at internal cigar operations or correct mate overlaps (recommended for most use-cases).
-q --quantize <segments> write quantized output see docs for description.
-Q --mapq <mapq> mapping quality threshold. reads with a mapping quality less than this are ignored [default: 0]
-T --thresholds <thresholds> for each interval in --by, write number of bases covered by at
least threshold bases. Specify multiple integer values separated
by ','.
-R --read-groups <string> only calculate depth for these comma-separated read groups IDs.
-h --help show help
-F --flag <FLAG> exclude reads with any of the bits in FLAG set [default: 1796]
-i --include-flag <FLAG> only include reads with any of the bits in FLAG set. default is unset. [default: 0]
-x --fast-mode dont look at internal cigar operations or correct mate overlaps (recommended for most use-cases).
-a --fragment-mode count the coverage of the full fragment including the full insert (proper pairs only).
-q --quantize <segments> write quantized output see docs for description.
-Q --mapq <mapq> mapping quality threshold. reads with a quality less than this value are ignored [default: 0]
-l --min-frag-len <min-frag-len> minimum insert size. reads with a smaller insert size than this are ignored [default: -1]
-u --max-frag-len <max-frag-len> maximum insert size. reads with a larger insert size than this are ignored. [default: -1]
-T --thresholds <thresholds> for each interval in --by, write number of bases covered by at
least threshold bases. Specify multiple integer values separated
by ','.
-m --use-median output median of each region (in --by) instead of mean.
-R --read-groups <string> only calculate depth for these comma-separated read groups IDs.
-h --help show help
```
If you use mosdepth please cite [the publication in bioinformatics](https://academic.oup.com/bioinformatics/article/doi/10.1093/bioinformatics/btx699/4583630?guestAccessKey=35b55064-4566-4ab3-a769-32916fa1c6e6)

@@ -133,20 +135,30 @@ Output will go to `$sample.mosdepth.dist.txt`

This also forces the output to have 5 decimals of precision rather than the default of 2.

## D4

D4 is a format created by [Hao Hou](https://github.com/38) in the Quinlan lab. It is
incorporated into `mosdepth` as of version 0.3.0 for per-base output with the `--d4` flag.
It improves write speed dramatically; for one test-case it takes **24.8s** to write a
per-base.bed.gz with mosdepth compared to **7.7s** to write a d4 file. For the same case,
running `mosdepth` without writing per-base takes 5.9 seconds so D4 greatly mitigates
the cost of outputing per-base depth **and** the output is more useful.

## Installation

The simplest way is to [![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat-square)](http://bioconda.github.io/recipes/mosdepth/README.html)

The simplest option is to download the [binary from the releases](https://github.com/brentp/mosdepth/releases).

Another quick way is to [![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat-square)](http://bioconda.github.io/recipes/mosdepth/README.html)

It can also be installed with `brew` as `brew install brewsci/bio/mosdepth` or used via docker with quay:
```
docker pull quay.io/biocontainers/mosdepth:0.2.4--he527e40_0
docker pull quay.io/biocontainers/mosdepth:0.3.3--h37c5b7d_2
docker run -v /hostpath/:/opt/mount quay.io/biocontainers/mosdepth:0.2.4--he527e40_0 mosdepth -n --fast-mode -t 4 --by 1000 /opt/mount/sample /opt/mount/$bam
```

Unless you want to install [nim](https://nim-lang.org), simply download the
[binary from the releases](https://github.com/brentp/mosdepth/releases).

`mosdepth` uses requires htslib version 1.4 or later. If you get an error
The binary from releases is static, with no dependencies. If you build it yourself,
`mosdepth` requires htslib version 1.4 or later. If you get an error
about "`libhts.so` not found", set `LD_LIBRARY_PATH` to the directory that
contains `libhts.so`. e.g.

@@ -155,27 +167,7 @@ contains `libhts.so`. e.g.
If you get the error `could not import: hts_check_EOF` you may need to
install a more recent version of htslib.

`mosdepth` also requires a recent version of [PCRE](https://www.pcre.org/), and will give the error `could not import: pcre_free_study` if the version of PCRE on your system is too old.
If you do not have root and cannot get the system version of PCRE upgraded, you download and compile a local copy
```
cd ~/src/
wget ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-8.41.tar.gz
tar zxvf pcre-8.41.tar.gz
cd pcre-8.41/
./configure
make
```

Then pass that path to mosdepth just like we did with htslib
```
LD_LIBRARY_PATH=~/src/pcre-8.41/.libs/:/~/src/htslib/ mosdepth -h
```

If you still see an error about `could not import: pcre_free_study` then
for some, the solution has been to do: `ln -s /usr/local/lib/libpcre.so /usr/local/lib/libpcre.so.3`

If you do want to install from source, see the [travis.yml](https://github.com/brentp/mosdepth/blob/master/.travis.yml)
and the [install.sh](https://github.com/brentp/mosdepth/blob/master/scripts/install.sh).
If you do want to install from source, see the [install.sh](https://github.com/brentp/mosdepth/blob/master/scripts/install.sh).

If you use archlinux, you can [install as a package](https://aur.archlinux.org/packages/mosdepth/)

@@ -189,7 +181,7 @@ for at least a given coverage value. It does this for each chromosome, and for t
whole genome.

Each row will indicate:
+ chromosome (or "genome")
+ chromosome (or "total")
+ coverage level
+ proportion of bases covered at that level

@@ -211,7 +203,7 @@ close to 30X coverage for almost 40% of the genome.

![Y Example](https://user-images.githubusercontent.com/1739/29646191-2a246564-883f-11e7-951a-aa68d7a1a6ed.png "Y Example")

See [this blog post](http://www.gettinggeneticsdone.com/2014/03/visualize-coverage-exome-targeted-ngs-bedtools.html) for
See [this blog post](https://web.archive.org/web/20181018084459/http://www.gettinggeneticsdone.com/2014/03/visualize-coverage-exome-targeted-ngs-bedtools.html) for
more details.

## thresholds
8 changes: 3 additions & 5 deletions depthstat.nim
Original file line number Diff line number Diff line change
@@ -15,12 +15,10 @@ proc initCountStat*[T](size:int=32768): CountStat[T] =

proc add*[T](c: var CountStat, value: T) {.inline.} =
c.n.inc
if value.int > c.counts.high.int:
c.counts[c.counts.high].inc
elif value < 0:
raise newException(IndexError, "error setting negative depth value:" & $value)
if value < 0:
raise newException(IndexDefect, "error setting negative depth value:" & $value)
else:
c.counts[value].inc
c.counts[min(c.counts.high.T, value)].inc

proc median*[T](c: CountStat[T]): int {.inline.} =
var stop_n = int(0.5 + c.n.float64 * 0.5)
73 changes: 63 additions & 10 deletions functional-tests.sh
100644 → 100755
Original file line number Diff line number Diff line change
@@ -10,8 +10,10 @@ test -e ssshtest || wget -q https://raw.githubusercontent.com/ryanlayer/ssshtest

set -o nounset


set -e
nim c --boundChecks:on -x:on mosdepth.nim
nim c --boundChecks:on -x:on -d:useSysAssert -d:useGcAssert --lineDir:on --debuginfo --mm:refc mosdepth.nim
nim c -x:on --mm:refc -d:useSysAssert -d:useGcAssert --lineDir:on --debuginfo -r tests/funcs.nim
set +e
exe=./mosdepth
bam=/data/human/NA12878.subset.bam
@@ -45,7 +47,7 @@ assert_exit_code 1

run unordered_bed $exe --by tests/unordered.bed t tests/ovl.bam
assert_exit_code 0
assert_equal $(zcat t.regions.bed.gz | wc -l) 2
assert_equal $(zcat < t.regions.bed.gz | wc -l) 2

# theres data left in the bam but the region tree is empty...
run missing_bed_chrom $exe --by tests/missing.bed t tests/ovl.bam
@@ -56,6 +58,40 @@ assert_exit_code 0
assert_equal $(zgrep -c "MT" t.per-base.bed.gz) 2
assert_equal "MT 0 16569 0.00" "$(zgrep ^MT t.regions.bed.gz)"


# fragment length filtering
run length_filter $exe t tests/ovl.bam --min-frag-len 80 --max-frag-len 80
assert_exit_code 0
assert_equal $(zgrep -c "MT" t.per-base.bed.gz) 2
assert_equal "MT 0 80 1 MT 80 16569 0 " "$(zgrep ^MT t.per-base.bed.gz | tr -s '[:space:]' ' ')"

run length_filter $exe t tests/ovl.bam --min-frag-len 81
assert_exit_code 0
assert_equal "MT 0 16569 0" "$(zgrep ^MT t.per-base.bed.gz)"

run length_filter $exe t tests/ovl.bam --max-frag-len 79
assert_exit_code 0
assert_equal "MT 0 16569 0" "$(zgrep ^MT t.per-base.bed.gz)"

run bad_frag_len_filter $exe t tests/ovl.bam --min-frag-len 10 --max-frag-len 9
assert_in_stderr "--max-frag-len was lower than --min-frag-len."
assert_exit_code 2


# fragment-mode
run fragment_mode $exe t --fragment-mode tests/full-fragment-pairs.bam
assert_equal "$(zcat < t.per-base.bed.gz)" "chr22:20000000-23000000 0 17318 0
chr22:20000000-23000000 17318 17320 1
chr22:20000000-23000000 17320 17420 2
chr22:20000000-23000000 17420 17756 1
chr22:20000000-23000000 17756 52130 0
chr22:20000000-23000000 52130 52135 1
chr22:20000000-23000000 52135 52235 2
chr22:20000000-23000000 52235 52546 1
chr22:20000000-23000000 52546 3000001 0"
assert_exit_code 0


unset MOSDEPTH_Q0
unset MOSDEPTH_Q1
unset MOSDEPTH_Q2
@@ -72,13 +108,13 @@ assert_exit_code 0

rm -f t.thresholds.bed.gz*
run threshold_test $exe --by 100 -T 0,1,2,3,4,5 -c MT t tests/ovl.bam
assert_equal "$(zcat t.thresholds.bed.gz | tail -n +2 | head -1)" "MT 0 100 unknown 100 80 0 0 0 0"
assert_equal "0" "$(zcat t.thresholds.bed.gz | tail -n+2 | cut -f 7 | uniq)"
assert_equal "$(zcat < t.thresholds.bed.gz | tail -n +2 | head -1)" "MT 0 100 unknown 100 80 0 0 0 0"
assert_equal "0" "$(zcat < t.thresholds.bed.gz | tail -n+2 | cut -f 7 | uniq)"
assert_exit_code 0

rm -f t.thresholds.bed.gz*
run threshold_test_by $exe --by tests/track.bed -T 0,1,2 -c MT t tests/ovl.bam
assert_equal "$(zcat t.thresholds.bed.gz | tail -n +2)" "MT 2 80 aregion 78 78 0"
assert_equal "$(zcat < t.thresholds.bed.gz | tail -n +2)" "MT 2 80 aregion 78 78 0"
assert_exit_code 0

export MOSDEPTH_Q0=AAA
@@ -95,7 +131,7 @@ assert_equal "$(tail -n 1 t.mosdepth.summary.txt)" "total 16569 80 0.0048283 0 1

run track_header $exe --by tests/track.bed t tests/ovl.bam
assert_exit_code 0
assert_equal "$(zcat t.regions.bed.gz)" "MT 2 80 aregion 1.00"
assert_equal "$(zcat < t.regions.bed.gz)" "MT 2 80 aregion 1.00"

run track_header_by $exe --by tests/bad.bed t tests/ovl.bam
assert_exit_code 1
@@ -120,18 +156,32 @@ assert_equal $(cat t.mosdepth.summary.txt | cut -f 5 | tail -n +2 | uniq) 0
assert_equal $(cat t.mosdepth.summary.txt | cut -f 6 | tail -n +2 | uniq) 1
assert_exit_code 0

rm -f tt.mosdepth.region.dist.txt
rm -f t.mosdepth.region.dist.txt
rm -f tt.*
rm -f t.*

run empty_tids $exe t -n --thresholds 1,5 --by tests/empty-tids.bed tests/empty-tids.bam
assert_exit_code 0
assert_equal $(grep -w HPV26 -c t.mosdepth.region.dist.txt) 0
rm -f t.*

#regions and global should not be the same when -b is specified
run regions $exe t_regions -n -b 100 tests/empty-tids.bam
assert_equal $(diff -u t_regions.mosdepth.region.dist.txt t_regions.mosdepth.global.dist.txt | wc -l) 1436
rm -f t_regions.*

rm -f t.per-base.bed.gz*
run overlappingPairs $exe t tests/overlapping-pairs.bam
assert_equal "$(zcat t.per-base.bed.gz)" "1 0 565173 0
assert_equal "$(zcat < t.per-base.bed.gz)" "1 0 565173 0
1 565173 565253 1
1 565253 249250621 0"
assert_exit_code 0
rm -f t.*


nim c -d:d4 --boundChecks:on -x:on mosdepth.nim
rm -f t.per-base.bed.gz
run d4test $exe --d4 t --fast-mode tests/ovl.bam
assert_exit_code 0


test -e $bam || exit

@@ -147,3 +197,6 @@ run check_exit_code_on_bad_args $exe -t4prefix sample.bam
assert_exit_code 1
assert_in_stderr "error parsing arguments"




95 changes: 95 additions & 0 deletions int2str.nim
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
import bitops

#[
this intToStr is adpated from: https://github.com/miloyip/itoa-benchmark countlut method.
https://github.com/miloyip/itoa-benchmark/blob/master/license.txt
Copyright (C) 2014 Milo Yip
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
]#


const gDigitsLut = [
'0','0','0','1','0','2','0','3','0','4','0','5','0','6','0','7','0','8','0','9',
'1','0','1','1','1','2','1','3','1','4','1','5','1','6','1','7','1','8','1','9',
'2','0','2','1','2','2','2','3','2','4','2','5','2','6','2','7','2','8','2','9',
'3','0','3','1','3','2','3','3','3','4','3','5','3','6','3','7','3','8','3','9',
'4','0','4','1','4','2','4','3','4','4','4','5','4','6','4','7','4','8','4','9',
'5','0','5','1','5','2','5','3','5','4','5','5','5','6','5','7','5','8','5','9',
'6','0','6','1','6','2','6','3','6','4','6','5','6','6','6','7','6','8','6','9',
'7','0','7','1','7','2','7','3','7','4','7','5','7','6','7','7','7','8','7','9',
'8','0','8','1','8','2','8','3','8','4','8','5','8','6','8','7','8','8','8','9',
'9','0','9','1','9','2','9','3','9','4','9','5','9','6','9','7','9','8','9','9'
];

const powers_of_10 = [
0'i32,
10,
100,
1000,
10000,
100000,
1000000,
10000000,
100000000,
1000000000]

proc countdigits(value:int32): int {.inline.} =
result = (32 - countLeadingZeroBits(value or 1)) * 1233 shr 12
result = result - int(value < powers_of_10[result]) + 1

proc fastIntToStr*(value:int32, outstr:var string, offset:int=0) {.inline.} =
outstr.setLen(offset + countdigits(value))
var value = value
var L = outstr.high
while value >= 100:
let i = (value mod 100) shl 1
value = int32(value / 100)

outstr[L] = gDigitsLut[i + 1]
outstr[L-1] = gDigitsLut[i]
L -= 2
if value < 10:
outstr[L] = (value + '0'.int).char

else:
let i = value shl 1
outstr[L] = gDigitsLut[i + 1]
outstr[L-1] = gDigitsLut[i]


when isMainModule:

import times
import strutils

var t0 = cpuTime()
var outstr = newString(10)
for i in 0'i32..200_000_000:
fastIntToStr(i, outstr)
doAssert outstr == $i
echo cpuTime() - t0

t0 = cpuTime()
echo "only to 100m for $"
for i in 0'i32..100_000_000:
let outstr = intToStr(i)
echo cpuTime() - t0
460 changes: 303 additions & 157 deletions mosdepth.nim

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions mosdepth.nimble
Original file line number Diff line number Diff line change
@@ -1,13 +1,13 @@
# Package

version = "0.2.8"
version = "0.3.11"
author = "Brent Pedersen"
description = "fast depth"
license = "MIT"

# Dependencies

requires "hts >= 0.3.1", "docopt >= 0.6.8", "nim >= 1.0.0"
requires "hts >= 0.3.22", "docopt == 0.7.1", "nim >= 1.0.0", "https://github.com/brentp/d4-nim >= 0.0.5"

bin = @["mosdepth"]
skipDirs = @["tests"]
27 changes: 22 additions & 5 deletions scripts/install.sh
Original file line number Diff line number Diff line change
@@ -21,10 +21,27 @@ cd $base
nimble refresh
$base/nim-$BRANCH/bin/nimble install -y

git clone --recursive https://github.com/samtools/htslib.git
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -qy

export HTSLIB=system
sudo apt-get update
sudo apt-get install git llvm curl wget libcurl4-openssl-dev
wget https://github.com/samtools/htslib/archive/1.11.tar.gz
tar xzf 1.11.tar.gz
cd htslib-1.11/
autoheader && autoconf && ./configure --disable-libcurl
sudo make -j4 install
git clone https://github.com/38/d4-format
cd d4-format

~/.cargo/bin/cargo build --release
sudo cp ../d4-format/target/release/libd4binding.* /usr/local/lib
sudo cp ./d4binding/include/d4.h /usr/local/include/
sudo ldconfig


cd htslib && git checkout 1.10 && autoheader && autoconf && ./configure --enable-libcurl
cd ..
make -j 4 -C htslib
export LD_LIBRARY_PATH=$base/htslib
ls -lh $base/htslib/*.so
export LD_LIBRARY_PATH=$base/htslib-1.10.2
ls -lh $base/htslib-1.10.2/*.so

nimble install -y https://github.com/brentp/d4-nim/
3 changes: 3 additions & 0 deletions scripts/plot-dist.py
Original file line number Diff line number Diff line change
@@ -19,6 +19,9 @@ def main():
for chrom, data in it.groupby(gen, itemgetter(0)):
if chrom.startswith("GL"):
continue
if "Un" in chrom: continue
if "random" in chrom or "HLA" in chrom: continue
if chrom.endswith("alt"): continue
chroms[chrom] = True
xs, ys = [], []
v50 = 0
Binary file added tests/full-fragment-pairs.bam
Binary file not shown.
Binary file added tests/full-fragment-pairs.bam.bai
Binary file not shown.
26 changes: 24 additions & 2 deletions tests/funcs.nim
Original file line number Diff line number Diff line change
@@ -1,7 +1,21 @@
import unittest, mosdepth as mosdepth
import unittest
import mosdepth
import depthstat
import os

suite "mosdepth-suite":

test "depthstat min":
var d = newSeq[int32]()
var t = newDepthStat(d)
check t.min_depth > 0

var dd: depth_stat
# "not that this always starts as 0 so we must set clear to increase it"
check dd.min_depth == 0
dd.clear()
check dd.min_depth > 0

test "test-quantize-args":

var rs = get_quantize_args(":1")
@@ -64,6 +78,14 @@ suite "mosdepth-suite":
check ts[2] == 3
check ts.len == 3

test "name splitting":

var r = region_line_to_region("Super-Scaffold_52")
check r.chrom == "Super-Scaffold_52"
check r.start == 0
check r.stop == 0


r = region_line_to_region("Super-Scaffold_52:2-1000")
check r.chrom == "Super-Scaffold_52"
check r.start == 1
check r.stop == 1000