feat: fusion calling #222

Merged 68 commits on Feb 27, 2024
Changes from 18 commits
210603c
Initial commit
FelixMoelder Dec 15, 2022
2659312
concat arriba calls
FelixMoelder Dec 19, 2022
d7b814f
generalize workflow
FelixMoelder Jan 13, 2023
f6a1345
Merge branch 'master' into fusion_calling
FelixMoelder Mar 16, 2023
b4b1538
update arriba
FelixMoelder Mar 16, 2023
f499e26
Update tests
FelixMoelder Mar 16, 2023
6bcaa1d
typo
FelixMoelder Mar 16, 2023
788080d
Merge branch 'master' into fusion_calling
FelixMoelder May 10, 2023
273fbdc
intermediate changes
FelixMoelder May 12, 2023
5a24102
Merge branch 'fusion_calling' of github.com:snakemake-workflows/dna-s…
FelixMoelder May 12, 2023
0cb7f25
remove repetitive code
FelixMoelder May 12, 2023
08ae08c
fix typo
FelixMoelder May 12, 2023
1cea901
indexing
FelixMoelder Aug 31, 2023
b5f29fe
Merge branch 'master' into fusion_calling
FelixMoelder Aug 31, 2023
910803f
snakefmt
FelixMoelder Aug 31, 2023
9b141b3
refactoring
FelixMoelder Sep 6, 2023
b27f1f3
fmt
FelixMoelder Sep 6, 2023
0043a9f
fmt
FelixMoelder Sep 6, 2023
6e21bac
fixed incompatibilities
FelixMoelder Sep 7, 2023
92f4434
skip vep
FelixMoelder Sep 11, 2023
e1e9b62
improve report
FelixMoelder Sep 12, 2023
ab94e13
Fix final output
FelixMoelder Sep 15, 2023
d8e27f9
fix output
FelixMoelder Sep 15, 2023
1357b44
handle mutational burden
FelixMoelder Sep 15, 2023
e42caa2
remove unused rules, separate groups
FelixMoelder Sep 15, 2023
b0d24ec
fix minor issues
FelixMoelder Sep 20, 2023
176f681
cleanup
FelixMoelder Sep 20, 2023
b3d574e
Add missing script
FelixMoelder Sep 20, 2023
0201f7f
update readme
FelixMoelder Sep 21, 2023
8f90131
update freebayes
FelixMoelder Sep 22, 2023
333ae5e
reset download revel
FelixMoelder Oct 17, 2023
289315a
Merge branch 'master' into fusion_calling
FelixMoelder Oct 26, 2023
91815b6
renaming
FelixMoelder Oct 27, 2023
cdc8d2d
fix datatype handling
FelixMoelder Oct 27, 2023
3e5a3ba
fmt
FelixMoelder Oct 27, 2023
805df4d
fmt
FelixMoelder Oct 27, 2023
1d8a179
merge master
FelixMoelder Nov 13, 2023
b968321
Update wrapper
FelixMoelder Nov 13, 2023
aa8d20a
Merge branch 'master' into fusion_calling
FelixMoelder Nov 29, 2023
251597d
Merge branch 'master' into fusion_calling
FelixMoelder Nov 30, 2023
b88a90c
Merge branch 'master' into fusion_calling
FelixMoelder Dec 1, 2023
dad77e0
breaking up workflow (not yet working)
FelixMoelder Dec 12, 2023
721d5c9
Merge branch 'fusion_calling' of github.com:snakemake-workflows/dna-s…
FelixMoelder Dec 12, 2023
612c041
fmt
FelixMoelder Dec 12, 2023
aa9e2fa
fmt
FelixMoelder Dec 12, 2023
e536e5e
fmt
FelixMoelder Dec 12, 2023
2b48e51
Merge branch 'master' into fusion_calling
FelixMoelder Dec 12, 2023
d1e2383
invoked datatype and candidate-calling
FelixMoelder Jan 16, 2024
ada0627
fix formatting
FelixMoelder Jan 16, 2024
c71aef4
unified workflow
FelixMoelder Jan 18, 2024
8a15e4e
formatting
FelixMoelder Jan 18, 2024
2bbbb37
update samplesheet
FelixMoelder Jan 18, 2024
479c144
Add read group for star
FelixMoelder Jan 18, 2024
1bd1720
fix param
FelixMoelder Jan 18, 2024
0e4f5c1
support fusions and variants in rna
FelixMoelder Jan 19, 2024
d6f4a9c
feat: render canonical transcript source
FelixMoelder Jan 24, 2024
eb8268f
clean report
FelixMoelder Jan 26, 2024
26c92dd
Merge remote-tracking branch 'origin/feat/render_canonical_source' in…
FelixMoelder Jan 29, 2024
6cbea56
cleanup template
FelixMoelder Feb 13, 2024
bae0416
github action workaround
FelixMoelder Feb 13, 2024
8e8e581
update readme
FelixMoelder Feb 15, 2024
c738abb
cleanup
FelixMoelder Feb 26, 2024
486722a
fmt
FelixMoelder Feb 26, 2024
02a693d
introduce pattern delegation
FelixMoelder Feb 26, 2024
2976f30
fmt
FelixMoelder Feb 26, 2024
fca151c
add comment to script
FelixMoelder Feb 26, 2024
6ec5818
typo
FelixMoelder Feb 26, 2024
a02b21f
Update convert_fusions_to_vcf.sh
FelixMoelder Feb 26, 2024
4 changes: 2 additions & 2 deletions .test/config-chm-eval/samples.tsv
@@ -1,2 +1,2 @@
-sample_name group alias platform
-chm chm ILLUMINA
+sample_name group alias platform datatype
+chm chm ILLUMINA dna
4 changes: 2 additions & 2 deletions .test/config-giab/samples.tsv
@@ -1,2 +1,2 @@
-sample_name alias group platform purity
-NA12878 NA12878 NA12878 ILLUMINA
+sample_name alias group platform purity datatype
+NA12878 NA12878 NA12878 ILLUMINA dna
6 changes: 3 additions & 3 deletions .test/config-no-candidate-filtering/samples.tsv
@@ -1,3 +1,3 @@
-sample_name group alias platform
-a a ILLUMINA
-b b ILLUMINA
+sample_name group alias platform datatype
+a a ILLUMINA dna
+b b ILLUMINA dna
10 changes: 5 additions & 5 deletions .test/config-simple/samples.tsv
@@ -1,5 +1,5 @@
-sample_name group alias platform
-a one x ILLUMINA
-b one y ILLUMINA
-b two x ILLUMINA
-a two y ILLUMINA
+sample_name group alias platform datatype
+a one x ILLUMINA dna
+b one y ILLUMINA dna
+b two x ILLUMINA dna
+a two y ILLUMINA dna
6 changes: 3 additions & 3 deletions .test/config-sra/samples.tsv
@@ -1,3 +1,3 @@
-sample_name group alias platform
-a a ILLUMINA
-b b ILLUMINA
+sample_name group alias platform datatype
+a a ILLUMINA dna
+b b ILLUMINA dna
6 changes: 3 additions & 3 deletions .test/config-target-regions/samples.tsv
@@ -1,3 +1,3 @@
-sample_name group alias platform
-a a ILLUMINA
-b b ILLUMINA
+sample_name group alias platform datatype
+a a ILLUMINA dna
+b b ILLUMINA dna
6 changes: 3 additions & 3 deletions .test/config_primers/samples.tsv
@@ -1,3 +1,3 @@
-sample_name group alias platform
-a a ILLUMINA
-b b ILLUMINA
+sample_name group alias platform datatype
+a a ILLUMINA dna
+b b ILLUMINA dna
1 change: 1 addition & 0 deletions config/config.yaml
@@ -117,6 +117,7 @@ calling:
       # only alphanumerics and underscores
       # ("somatic" below is just an example and can be modified as needed).
       some_id:
+        types: ["variants", "fusions"]
         # labels for the callset, displayed in the report. Will fall back to id if no labels specified
         labels:
           some-label: label text
2 changes: 1 addition & 1 deletion config/samples.tsv
@@ -1 +1 @@
-sample_name alias group platform purity panel umi_read umi_read_structure
+sample_name alias group platform purity panel datatype umi_read umi_read_structure
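
The new `datatype` column lets downstream code tell samples apart by sequencing type. A minimal sketch of how a sample sheet with this column could be grouped by datatype is shown here. The inline sheet, the helper name `samples_by_datatype`, and the `rna` value are assumptions for illustration; the test sheets in this diff only use `dna`.

```python
import csv
import io

# Hypothetical sample sheet, tab-separated like config/samples.tsv.
# The "rna" value is an assumption; the diff above only shows "dna".
SAMPLES_TSV = (
    "sample_name\talias\tgroup\tplatform\tdatatype\n"
    "a\tx\tone\tILLUMINA\tdna\n"
    "b\ty\tone\tILLUMINA\trna\n"
)


def samples_by_datatype(tsv_text):
    """Group sample names by the value of their `datatype` column."""
    grouped = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        grouped.setdefault(row["datatype"], []).append(row["sample_name"])
    return grouped


by_type = samples_by_datatype(SAMPLES_TSV)
print(by_type)
```

A sheet that lacks the column would raise a `KeyError` in this sketch, which matches the reason every test sheet in this PR gains a `datatype` value.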
1 change: 1 addition & 0 deletions workflow/Snakefile
@@ -35,6 +35,7 @@ include: "rules/table.smk"
 include: "rules/regions.smk"
 include: "rules/plugins.smk"
 include: "rules/datavzrd.smk"
+include: "rules/fusion_calling.smk"
 include: "rules/testcase.smk"


5 changes: 5 additions & 0 deletions workflow/envs/arriba.yaml
@@ -0,0 +1,5 @@
+channels:
+  - conda-forge
+  - bioconda
+dependencies:
+  - arriba =2.4
2 changes: 1 addition & 1 deletion workflow/envs/bcftools.yaml
@@ -2,4 +2,4 @@ channels:
   - conda-forge
   - bioconda
 dependencies:
-  - bcftools =1.14
+  - bcftools =1.16
16 changes: 16 additions & 0 deletions workflow/resources/datavzrd/fusion-calls-template.datavzrd.yaml
@@ -0,0 +1,16 @@
+name: ?f"Fusion calls {wildcards.event}"
+
+default-view: ?f"{params.groups[0]}-fusions"
+max-in-memory-rows: 1500
+
+datasets:
+  ?for group, path in zip(params.groups, params.fusion_calls):
+    ?f"{group}-fusions":
+      path: ?path
+      separator: "\t"
+
+views:
+  ?for group in params.groups:
+    ?f"{group}-fusions":
+      desc: ?f"Fusion calls.\n{config['calling']['fdr-control']['events'][wildcards.event]['desc']}"
+      dataset: ?f"{group}-fusions"
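
To make the template's loops easier to follow, here is a plain-Python sketch of the structure the `?for` and `?f` expressions appear to produce: one dataset and one view per group, with the first group as the default view. This is not the template renderer itself; the function name `expand_fusion_template` and the example paths are hypothetical.

```python
def expand_fusion_template(event, groups, fusion_calls, event_desc):
    """Mimic the template above: pair each group with its fusion call table."""
    datasets = {
        f"{group}-fusions": {"path": path, "separator": "\t"}
        for group, path in zip(groups, fusion_calls)
    }
    views = {
        f"{group}-fusions": {
            "desc": f"Fusion calls.\n{event_desc}",
            "dataset": f"{group}-fusions",
        }
        for group in groups
    }
    return {
        "name": f"Fusion calls {event}",
        "default-view": f"{groups[0]}-fusions",
        "max-in-memory-rows": 1500,
        "datasets": datasets,
        "views": views,
    }


# Hypothetical inputs; real values would come from the rule's params and wildcards.
rendered = expand_fusion_template(
    event="some_id",
    groups=["one", "two"],
    fusion_calls=["results/one.tsv", "results/two.tsv"],
    event_desc="example description",
)
print(sorted(rendered["views"]))
```

Because datasets are built with `zip`, `params.groups` and `params.fusion_calls` need to be in the same order and of equal length, otherwise trailing groups would get a view without a dataset.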
12 changes: 6 additions & 6 deletions workflow/rules/annotation.smk
@@ -24,16 +24,16 @@ rule annotate_candidate_variants:

 rule annotate_variants:
     input:
-        calls="results/calls/{group}.{scatteritem}.bcf",
+        calls="results/calls/{group}.{analysis}.{scatteritem}.bcf",
         cache="resources/vep/cache",
         plugins="resources/vep/plugins",
         revel=lambda wc: get_plugin_aux("REVEL"),
         revel_tbi=lambda wc: get_plugin_aux("REVEL", True),
         fasta=genome,
         fai=genome_fai,
     output:
-        calls="results/calls/{group}.{scatteritem}.annotated.bcf",
-        stats="results/calls/{group}.{scatteritem}.stats.html",
+        calls="results/calls/{group}.{analysis}.{scatteritem}.annotated.bcf",
+        stats="results/calls/{group}.{analysis}.{scatteritem}.stats.html",
     params:
         # Pass a list of plugins to use, see https://www.ensembl.org/info/docs/tools/vep/script/vep_plugins.html
         # Plugin args can be added as well, e.g. via an entry "MyPlugin,1,FOO", see docs.
@@ -42,7 +42,7 @@ rule annotate_variants:
             config["annotations"]["vep"]["final_calls"]["params"]
         ),
     log:
-        "logs/vep/{group}.{scatteritem}.annotate.log",
+        "logs/vep/{group}.{analysis}.{scatteritem}.annotate.log",
     threads: get_vep_threads()
     wrapper:
         "v2.5.0/bio/vep/annotate"
@@ -89,9 +89,9 @@ rule gather_annotated_calls:
         calls=get_gather_annotated_calls_input(),
         idx=get_gather_annotated_calls_input(ext="bcf.csi"),
     output:
-        "results/final-calls/{group}.annotated.bcf",
+        "results/final-calls/{group}.{analysis}.annotated.bcf",
     log:
-        "logs/gather-annotated-calls/{group}.log",
+        "logs/gather-annotated-calls/{group}.{analysis}.log",
     params:
         extra="-a",
     wrapper:
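
The hunks above thread a new `{analysis}` wildcard through every call path, so results for different analyses of the same group no longer collide. A small sketch of the resulting naming scheme follows. The helper names are hypothetical, the `analysis` values are assumed to mirror the `types` entries in `config/config.yaml`, and the `1-of-2` style for `scatteritem` is an assumption about Snakemake's scatter naming.

```python
def scattered_annotated_calls(group, analysis, n_scatter):
    """Per-scatteritem annotated call paths, following the annotate_variants output pattern."""
    return [
        f"results/calls/{group}.{analysis}.{i}-of-{n_scatter}.annotated.bcf"
        for i in range(1, n_scatter + 1)
    ]


def final_call_path(group, analysis):
    """Gathered output path, following the gather_annotated_calls output pattern."""
    return f"results/final-calls/{group}.{analysis}.annotated.bcf"


# Hypothetical group and analysis names.
for analysis in ("variants", "fusions"):
    print(final_call_path("one", analysis))
```

With the old patterns both analyses would have resolved to `results/final-calls/one.annotated.bcf`; the extra wildcard keeps them apart.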
27 changes: 15 additions & 12 deletions workflow/rules/calling.smk
@@ -26,7 +26,8 @@ rule varlociraptor_alignment_properties:
     input:
         ref=genome,
         ref_idx=genome_fai,
-        bam="results/recal/{sample}.bam",
+        bam=get_sample_bam,
+        bai=lambda wc: get_sample_bam(wc, bai=True),
     output:
         "results/alignment-properties/{group}/{sample}.json",
     log:
@@ -42,8 +43,8 @@ rule varlociraptor_preprocess:
         ref=genome,
         ref_idx=genome_fai,
         candidates=get_candidate_calls(),
-        bam="results/recal/{sample}.bam",
-        bai="results/recal/{sample}.bai",
+        bam=get_sample_bam,
+        bai=lambda wc: get_sample_bam(wc, bai=True),
         alignment_props="results/alignment-properties/{group}/{sample}.json",
     output:
         "results/observations/{group}/{sample}.{caller}.{scatteritem}.bcf",
@@ -66,7 +67,7 @@ rule varlociraptor_call:
         obs=get_group_observations,
         scenario="results/scenarios/{group}.yaml",
     output:
-        temp("results/calls/{group}.{caller}.{scatteritem}.bcf"),
+        temp("results/calls/{group}.{caller}.{scatteritem}.unsorted.bcf"),
     log:
         "logs/varlociraptor/call/{group}.{caller}.{scatteritem}.log",
     params:
@@ -86,28 +87,30 @@ rule varlociraptor_call:

 rule sort_calls:
     input:
-        "results/calls/{group}.{caller}.{scatteritem}.bcf",
+        "results/calls/{group}.{caller}.{scatteritem}.unsorted.bcf",
     output:
         temp("results/calls/{group}.{caller}.{scatteritem}.bcf"),
+    params:
+        # Set to True, in case you want uncompressed BCF output
+        uncompressed_bcf=False,
+        # Extra arguments
+        extras="",
     log:
         "logs/bcf-sort/{group}.{caller}.{scatteritem}.log",
-    conda:
-        "../envs/bcftools.yaml"
     resources:
         mem_mb=8000,
-    shell:
-        "bcftools sort --max-mem {resources.mem_mb}M --temp-dir `mktemp -d` "
-        "-Ob {input} > {output} 2> {log}"
+    wrapper:
+        "v1.21.0/bio/bcftools/sort"


 rule bcftools_concat:
     input:
         calls=get_scattered_calls(),
         indexes=get_scattered_calls(ext="bcf.csi"),
     output:
-        "results/calls/{group}.{scatteritem}.bcf",
+        "results/calls/{group}.{analysis}.{scatteritem}.bcf",
     log:
-        "logs/concat-calls/{group}.{scatteritem}.log",
+        "logs/concat-calls/{group}.{analysis}.{scatteritem}.log",
     params:
         extra="-a",  # TODO Check this
     wrapper:
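
The hard-coded `results/recal/{sample}.bam` inputs are replaced by an input function, `get_sample_bam`, that is also called with `bai=True` for the index. Its definition is not part of the hunks shown here, so the following is only a guess at its shape: the datatype lookup table and the RNA path are invented for illustration, and only the `results/recal/` paths and the `bai` keyword come from the diff.

```python
from types import SimpleNamespace

# Hypothetical lookup; in the workflow this would come from the sample sheet's `datatype` column.
SAMPLE_DATATYPES = {"a": "dna", "b": "rna"}


def get_sample_bam(wildcards, bai=False):
    """Return the BAM (or its index) for a sample, depending on its datatype."""
    ext = "bai" if bai else "bam"
    if SAMPLE_DATATYPES[wildcards.sample] == "rna":
        # Invented path: the location of RNA alignments is not shown in this diff.
        return f"results/mapped/rna/{wildcards.sample}.{ext}"
    # DNA samples keep the recalibrated BAM used before this change.
    return f"results/recal/{wildcards.sample}.{ext}"


# Stand-in for Snakemake's wildcards object.
wc = SimpleNamespace(sample="a")
print(get_sample_bam(wc), get_sample_bam(wc, bai=True))
```

Using one function for both the BAM and its index keeps the two inputs in sync, which is why the rules pass `lambda wc: get_sample_bam(wc, bai=True)` instead of a second path pattern.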