Add option for using SRA Run IDs (#15)
* add modules for sra fetch and test conf

* fix label resources

* add gitpod starter

* SRA inclusion fixed for ILLUMINA data

* fix module tag

* fixed nanopore workflow for SRA and Local files

* workflow fixed, now all techs are running properly

* remove .view()

* update test config

* update index.html docs

* fix indentation

* update manual with --sra_ids information

* add info about test profile

* fix channels when SRA ids are not used

* on pacbio, barcode input should be file() as it is optional

* fix URL

* avoid using fixOwnership

* update comment

* small changes on namings and conventions

* fixed list of tools and changelog

* Adding date to changelog
fmalmeida authored Oct 30, 2022
1 parent c44cdf7 commit 2e1d74c
Showing 36 changed files with 652 additions and 407 deletions.
22 changes: 0 additions & 22 deletions .github/workflows/docker-pull-image.yml

This file was deleted.

14 changes: 14 additions & 0 deletions .gitpod.yml
@@ -0,0 +1,14 @@
image: nfcore/gitpod:latest

vscode:
  extensions: # based on nf-core.nf-core-extensionpack
    - codezombiech.gitignore # Language support for .gitignore files
    # - cssho.vscode-svgviewer # SVG viewer
    - esbenp.prettier-vscode # Markdown/CommonMark linting and style checking for Visual Studio Code
    - eamodio.gitlens # Quickly glimpse into whom, why, and when a line or code block was changed
    - EditorConfig.EditorConfig # override user/workspace settings with settings found in .editorconfig files
    - Gruntfuggly.todo-tree # Display TODO and FIXME in a tree view in the activity bar
    - mechatroner.rainbow-csv # Highlight columns in csv files in different colors
    # - nextflow.nextflow # Nextflow syntax highlighting
    - oderwat.indent-rainbow # Highlight indentation level
    - streetsidesoftware.code-spell-checker # Spelling checker for source code
2 changes: 1 addition & 1 deletion .zenodo.json
@@ -2,7 +2,7 @@
"description": "<p>The pipeline</p>\n\n<p>ngs-preprocess is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. It is an easy to use pipeline that uses state-of-the-art software for quality check and pre-processing ngs reads of Illumina, Pacbio and Oxford Nanopore Technologies.</p>",
"license": "other-open",
"title": "fmalmeida/ngs-preprocess: A pipeline for preprocessing short and long sequencing reads",
- "version": "v2.4.2",
+ "version": "v2.5",
"upload_type": "software",
"creators": [
{
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,12 @@

The tracking for changes started in v2.2

+ ## v2.5 -- [2022-Oct-30]

+ Users can now automatically fetch FASTQ files from the NCBI SRA database. To do so, use the `--sra_ids` parameter, passing a file that lists SRA run IDs, one per line.

+ > More tools have been added, so the version and Docker image are now v2.5.

## v2.4.2 -- [2022-Oct-17]

Cleanup change. Short-read outputs are now written to "preprocessed_reads/short_reads" instead of "preprocessed/illumina", since other short-read technologies may sometimes be used.
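
As a minimal sketch of the `--sra_ids` input described in the v2.5 entry above: the file is plain text with one run ID per line. The accessions below are placeholders, the format check (SRR/ERR/DRR followed by digits) is an assumption about typical run accessions rather than something the pipeline is known to enforce, and the final command is only echoed so the sketch runs without Nextflow installed. Other pipeline options may be needed in practice.

```shell
# Hypothetical run accessions, used only for illustration.
cat > sra_ids.txt <<'EOF'
SRR0000001
SRR0000002
EOF

# Count lines that do not look like a run accession (SRR/ERR/DRR + digits).
# grep -c prints 0 and exits non-zero when nothing matches, hence `|| true`.
bad_lines=$(grep -Evc '^[SED]RR[0-9]+$' sra_ids.txt || true)
echo "malformed lines: ${bad_lines}"

# The invocation the changelog describes, shown as a string rather than executed.
run_cmd="nextflow run fmalmeida/ngs-preprocess -profile docker --sra_ids sra_ids.txt"
echo "${run_cmd}"
```

Checking the list before launching is cheap compared with a failed download step, which is the only reason the validation is included here.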
7 changes: 5 additions & 2 deletions Dockerfile
@@ -8,7 +8,10 @@ COPY environment.yml /
RUN mamba env create --quiet -f /environment.yml && mamba clean -a

# Add conda installation dir to PATH (instead of doing 'conda activate')
- ENV PATH /opt/conda/envs/ngs-preprocess-2.4/bin:$PATH
+ ENV PATH /opt/conda/envs/ngs-preprocess-2.5/bin:$PATH

# Dump the details of the installed packages to a file for posterity
- RUN conda env export --name ngs-preprocess-2.4 > ngs-preprocess-2.4.yml
+ RUN conda env export --name ngs-preprocess-2.5 > ngs-preprocess-2.5.yml

+ # cp config
+ RUN cp -R /root/.ncbi / && chmod -R 777 /root/.ncbi /.ncbi
7 changes: 5 additions & 2 deletions README.md
@@ -38,6 +38,7 @@ It wraps up the following software:

| Step | tools |
| :--- | :---- |
+ | SRA NCBI fetch | [Entrez-direct](https://anaconda.org/bioconda/entrez-direct) & [sra-tools](https://github.com/ncbi/sra-tools) |
| Illumina pre-processing | [Fastp](https://github.com/OpenGene/fastp) |
| Nanopore pre-processing | [Porechop](https://github.com/rrwick/Porechop), [pycoQC](https://github.com/tleonardi/pycoQC), [NanoPack](https://github.com/wdecoster/nanopack) |
| Pacbio pre-processing | [bam2fastx](https://github.com/PacificBiosciences/bam2fastx), [bax2bam](https://github.com/PacificBiosciences/bax2bam), [lima](https://github.com/PacificBiosciences/barcoding), [pacbio ccs](https://ccs.how/) |
@@ -66,7 +67,7 @@ This pipeline has two complementary pipelines (also written in nextflow) for [ge

```bash
# for docker
- docker pull fmalmeida/ngs-preprocess:v2.4
+ docker pull fmalmeida/ngs-preprocess:v2.5
# run
nextflow run fmalmeida/ngs-preprocess -profile docker [options]
@@ -82,7 +83,7 @@ This pipeline has two complementary pipelines (also written in nextflow) for [ge
export NXF_SINGULARITY_CACHEDIR=MY_SINGULARITY_CACHE # your singularity cache dir
singularity pull \
--dir $NXF_SINGULARITY_LIBRARYDIR \
- fmalmeida-ngs-preprocess-v2.4.img docker://fmalmeida/ngs-preprocess:v2.4
+ fmalmeida-ngs-preprocess-v2.5.img docker://fmalmeida/ngs-preprocess:v2.5
# run
nextflow run fmalmeida/ngs-preprocess -profile singularity [options]
@@ -212,6 +213,8 @@ This pipeline uses code and infrastructure developed and maintained by the [nf-c

In addition, users are encouraged to cite the programs used in this pipeline whenever they are used. Links to resources of tools and data used in this pipeline are as follows:

+ * [Entrez-direct](https://anaconda.org/bioconda/entrez-direct)
+ * [sra-tools](https://github.com/ncbi/sra-tools)
* [Fastp](https://github.com/OpenGene/fastp)
* [Porechop](https://github.com/rrwick/Porechop)
* [pycoQC](https://github.com/a-slide/pycoQC)
2 changes: 1 addition & 1 deletion conf/base.config
@@ -18,7 +18,7 @@ process {
withLabel:process_low {
cpus = { check_max( 2 * task.attempt, 'cpus' ) }
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
- time = { check_max( 1.h * task.attempt, 'time' ) }
+ time = { check_max( 4.h * task.attempt, 'time' ) }
}
withLabel:process_medium {
cpus = { check_max( 4 * task.attempt, 'cpus' ) }
2 changes: 1 addition & 1 deletion conf/conda.config
@@ -2,5 +2,5 @@
singularity.enabled = false
docker.enabled = false
process {
- conda = "$CONDA_PREFIX/envs/ngs-preprocess-2.4"
+ conda = "$CONDA_PREFIX/envs/ngs-preprocess-2.5"
}
3 changes: 3 additions & 0 deletions conf/defaults.config
@@ -25,6 +25,9 @@ params {
// Sets output directory
output = "output"

+ // Inputs from SRA: a file containing one SRA ID per line.
+ sra_ids = null

/*
3 changes: 1 addition & 2 deletions conf/docker.config
@@ -2,5 +2,4 @@
singularity.enabled = false
docker.enabled = true
docker.runOptions = '-u \$(id -u):\$(id -g)'
- docker.fixOwnership = true
- process.container = "fmalmeida/ngs-preprocess:v2.4"
+ process.container = "fmalmeida/ngs-preprocess:v2.5"
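
The hunk above removes `docker.fixOwnership` and leaves the `-u $(id -u):$(id -g)` run option in place, so containers run with the invoking user's UID and GID. As a minimal sketch (not part of the commit), this is what that expression expands to on the host. The variable name `user_flag` is only illustrative.

```shell
# Build the same user flag that docker.runOptions passes to `docker run`.
# `id -u` prints the numeric user ID, `id -g` the numeric primary group ID.
user_flag="-u $(id -u):$(id -g)"
echo "${user_flag}"
```

Files written by a container started with this flag are owned by the calling user, which is presumably why a separate ownership fix is no longer needed.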
2 changes: 1 addition & 1 deletion conf/singularity.config
@@ -2,5 +2,5 @@
docker.enabled = false
singularity.enabled = true
singularity.autoMounts = true
- process.container = "docker://fmalmeida/ngs-preprocess:v2.4"
+ process.container = "docker://fmalmeida/ngs-preprocess:v2.5"
singularity.autoMounts = true
9 changes: 9 additions & 0 deletions conf/test.config
@@ -0,0 +1,9 @@
// small test config using SRA IDs
params {
sra_ids = 'https://github.com/fmalmeida/test_datasets/raw/main/sra_ids.txt'
output = 'test_output'
tracedir = 'test_output/pipeline_info'
max_cpus = 2
max_memory = '6.GB'
max_time = '6.h'
}
78 changes: 41 additions & 37 deletions docs/index.rst
@@ -16,55 +16,59 @@ and `Docker <https://www.docker.com/>`_. It was designed to provide an easy-to-u
It wraps up the following tools:

.. list-table::
:widths: 10 60 40
:header-rows: 1
:widths: 10 60 40
:header-rows: 1

* - Software
- Analysis step
- Source
* - Software
- Analysis step
- Source

* - sra-tools & entrez-direct
- Interaction with SRA database for fetching fastqs and metadata
- https://anaconda.org/bioconda/entrez-direct ; https://github.com/ncbi/sra-tools

* - Fastp
- tool designed to provide fast all-in-one preprocessing for FastQ files
- https://github.com/OpenGene/fastp
* - Fastp
- tool designed to provide fast all-in-one preprocessing for FastQ files
- https://github.com/OpenGene/fastp

* - Porechop
- ONT reads trimming and demultiplexing
- https://github.com/rrwick/Porechop
* - Porechop
- ONT reads trimming and demultiplexing
- https://github.com/rrwick/Porechop

* - pycoQC
- ONT reads QC
- https://github.com/tleonardi/pycoQC
* - pycoQC
- ONT reads QC
- https://github.com/tleonardi/pycoQC

* - NanoPack
- Long reads QC and filter
- https://github.com/wdecoster/nanopack
* - NanoPack
- Long reads QC and filter
- https://github.com/wdecoster/nanopack

* - bax2bam
- Convert PacBio bax files to bam
- https://github.com/PacificBiosciences/bax2bam
* - bax2bam
- Convert PacBio bax files to bam
- https://github.com/PacificBiosciences/bax2bam

* - bam2fastx
- Extract reads from PacBio bam files
- https://github.com/PacificBiosciences/bam2fastx
* - bam2fastx
- Extract reads from PacBio bam files
- https://github.com/PacificBiosciences/bam2fastx

* - lima
- PacBio reads demultiplexing
- https://github.com/PacificBiosciences/barcoding
* - lima
- PacBio reads demultiplexing
- https://github.com/PacificBiosciences/barcoding

* - pacbio ccs
- Generate PacBio Highly Accurate Single-Molecule Consensus Reads
- https://ccs.how/
* - pacbio ccs
- Generate PacBio Highly Accurate Single-Molecule Consensus Reads
- https://ccs.how/


.. toctree::
:hidden:

installation
profiles
quickstart
manual
config
examples
:hidden:

installation
profiles
quickstart
manual
config
examples

Support Contact
***************
8 changes: 4 additions & 4 deletions docs/installation.rst
@@ -31,7 +31,7 @@ This pipeline `Nextflow <https://www.nextflow.io/docs/latest/index.html>`_ requi
.. code-block:: bash
# for docker
- docker pull fmalmeida/ngs-preprocess:v2.4
+ docker pull fmalmeida/ngs-preprocess:v2.5
nextflow run fmalmeida/ngs-preprocess -profile docker [options]
# for singularity
@@ -40,8 +40,8 @@ This pipeline `Nextflow <https://www.nextflow.io/docs/latest/index.html>`_ requi
export NXF_SINGULARITY_LIBRARYDIR=MY_SINGULARITY_IMAGES # your singularity storage dir
export NXF_SINGULARITY_CACHEDIR=MY_SINGULARITY_CACHE # your singularity cache dir
singularity pull \
- --dir $NXF_SINGULARITY_LIBRARYDIR \
- fmalmeida-ngs-preprocess-v2.4.img docker://fmalmeida/ngs-preprocess:v2.4
+ --dir $NXF_SINGULARITY_LIBRARYDIR \
+ fmalmeida-ngs-preprocess-v2.5.img docker://fmalmeida/ngs-preprocess:v2.5
nextflow run fmalmeida/ngs-preprocess -profile singularity [options]
# for conda
@@ -55,4 +55,4 @@ This pipeline `Nextflow <https://www.nextflow.io/docs/latest/index.html>`_ requi

.. note::

- Now, everything is set up and ready to run. Remember to always keep your Docker images up to date (Docker pull will always download the latest).
+ Now, everything is set up and ready to run. Remember to always keep your Docker images up to date (Docker pull will always download the latest).