Several recent studies have demonstrated that circulating free DNA (cfDNA) sequencing can provide early prognostication, improved molecular profiling, and monitoring of disease dynamics, with many applications in genomics-driven oncology.
We provide a bioinformatics pipeline that offers:
- cfDNA data de-duplication using UMIs
- sensitive detection of SNVs using VarDict and duplexCaller
- annotation of variants using VEP
- evaluation of fragmentation profiles of cfDNA (NEW)
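To illustrate the idea behind UMI-based de-duplication, here is a minimal sketch. It is not the pipeline's own code: the record layout (dicts with `name`, `chrom`, `start`, `umi`, `mean_qual` keys) and the function name `deduplicate_by_umi` are hypothetical, and the "keep the highest-quality read" rule is one simple choice among several. Reads that share a mapping position and a UMI are assumed to come from the same original molecule, so only one representative is kept per group.

```python
from collections import defaultdict


def deduplicate_by_umi(reads):
    """Keep one read per (chrom, start, umi) group.

    `reads` is an iterable of dicts with the hypothetical keys
    'name', 'chrom', 'start', 'umi' and 'mean_qual'.
    """
    groups = defaultdict(list)
    for read in reads:
        # Reads with the same position and UMI are treated as duplicates.
        key = (read["chrom"], read["start"], read["umi"])
        groups[key].append(read)

    # Keep the read with the highest mean base quality in each group.
    kept = [max(group, key=lambda r: r["mean_qual"]) for group in groups.values()]
    return sorted(kept, key=lambda r: (r["chrom"], r["start"], r["umi"]))


# Toy input: r1 and r2 are duplicates; r3 differs by UMI, r4 by position.
example_reads = [
    {"name": "r1", "chrom": "chr1", "start": 100, "umi": "ACGT", "mean_qual": 30.0},
    {"name": "r2", "chrom": "chr1", "start": 100, "umi": "ACGT", "mean_qual": 35.0},
    {"name": "r3", "chrom": "chr1", "start": 100, "umi": "TTAG", "mean_qual": 28.0},
    {"name": "r4", "chrom": "chr2", "start": 500, "umi": "ACGT", "mean_qual": 33.0},
]
unique_reads = deduplicate_by_umi(example_reads)
print([r["name"] for r in unique_reads])  # ['r2', 'r3', 'r4']
```

Real UMI workflows typically also tolerate sequencing errors within the UMI and build a consensus read per group rather than picking a single read; the sketch omits both for brevity.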
Title: Detection of genomic alterations in breast cancer with circulating free DNA sequencing
Journal: The paper is under 2nd major revision in Scientific Reports.
Published: pre-print available at bioRxiv https://www.biorxiv.org/content/10.1101/733691v1?rss=1
DOI: https://doi.org/10.1101/733691
Our bioinformatics pipeline was developed with Python 2.7.10 and requires the following packages:
- bwa version 0.7.15
- ensembl-vep version 96.0
- fgbio version 0.8.1
- picard version 2.20.3
- pysam version 0.15.0.1
- R version 3.6.0
- samtools version 1.9
- vardict version 2019.06.04
To resolve all of the above dependencies, we recommend installing these packages in a conda environment. The following bash script installs miniconda and all the required packages automatically:
myEnvConfig.sh
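After setting up the environment, a quick sanity check can confirm that the command-line tools are reachable. The snippet below is a sketch and not part of the repository; the executable names in `ASSUMED_TOOLS` are assumptions about how the packages expose their commands and may differ on your installation. It runs under both Python 2.7 and Python 3.

```python
try:
    from shutil import which  # Python 3.3+
except ImportError:
    from distutils.spawn import find_executable as which  # Python 2.7

# Assumed executable names; adjust them to match your installation.
ASSUMED_TOOLS = ["bwa", "samtools", "picard", "fgbio", "vep", "R"]


def find_missing(tools):
    """Return the tools from `tools` that are not found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = find_missing(ASSUMED_TOOLS)
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All listed tools were found on PATH.")
```

The check only confirms that an executable with the given name exists; it does not verify versions, and it does not cover Python modules such as pysam.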
Please refer to Execution_examples.md for more information. We also provide step-by-step execution guidelines.
Our bioinformatics pipeline was developed on a Mac running macOS Mojave 10.14.5 and tested on Amazon EC2 m4.xlarge instances running CentOS 7.
19-Jul-2019 : Beta version 1 (installed and tested on CentOS)
5-Jun-2020 : We updated the code and improved the documentation. Some comments containing "TODO" reminders were removed in response to a reviewer's concern.
Comments and bug reports are welcome. Please email Dimitrios Kleftogiannis ([email protected]) or Liew Jun Xian ([email protected]).
We are also interested in hearing how you have used our source code, including any improvements that you have implemented.
You are free to modify, extend or distribute our source code, as long as our copyright notice remains unchanged and is included in its entirety.
This project is licensed under the MIT License.
Copyright 2019 Genome Institute of Singapore (GIS) and Agency for Science, Technology and Research (A*STAR).
You may only use the source code in this repository in compliance with the license provided in this repository. For more details, please refer to the file named "LICENSE.md".