cleanup
JohannesGawron committed Jun 7, 2024
1 parent 6cc4408 commit 0ce2ee8
Showing 60 changed files with 1,251 additions and 5,335 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -7,3 +7,5 @@ experiments/.snakemake/*
.Rapp.history
*.Rhistory
experiments/data/markdowns/
+experiments/data/htmls/*files
+experiments/logs/
36 files renamed without changes.
129 changes: 0 additions & 129 deletions cancel_cluster_jobs.sh

This file was deleted.

26 changes: 0 additions & 26 deletions compare_filters.py

This file was deleted.

1 change: 1 addition & 0 deletions experiments/config/config.yaml
@@ -1,2 +1,3 @@
#sample: ['Lu2']
sample: ['Br11', 'Br7', 'Br61', 'Br38', 'LM2', 'Pr9', 'Br23','Br39', 'Br57', 'Lu2', 'Br16_AC', 'Br16_B', 'Br16_C', 'Br26', 'Br44', 'Lu7', 'Br30', 'Br45', 'Ov8', 'Br37', 'Br46', 'Brx50', 'Pr6']
+author: Johannes Gawron
205 changes: 98 additions & 107 deletions experiments/data/htmls/Br11.html

Large diffs are not rendered by default.

958 changes: 0 additions & 958 deletions experiments/data/htmls/Br16_AC.html

This file was deleted.

997 changes: 0 additions & 997 deletions experiments/data/htmls/Br16_B.html

This file was deleted.

227 changes: 105 additions & 122 deletions experiments/data/htmls/Br23.html

Large diffs are not rendered by default.

207 changes: 99 additions & 108 deletions experiments/data/htmls/Br26.html

Large diffs are not rendered by default.

140 changes: 69 additions & 71 deletions experiments/data/htmls/Br37.html

Large diffs are not rendered by default.

516 changes: 150 additions & 366 deletions experiments/data/htmls/Br38.html

Large diffs are not rendered by default.

770 changes: 0 additions & 770 deletions experiments/data/htmls/Br39.html

This file was deleted.

140 changes: 69 additions & 71 deletions experiments/data/htmls/Br44.html

Large diffs are not rendered by default.

140 changes: 69 additions & 71 deletions experiments/data/htmls/Br45.html

Large diffs are not rendered by default.

140 changes: 69 additions & 71 deletions experiments/data/htmls/Br46.html

Large diffs are not rendered by default.

257 changes: 114 additions & 143 deletions experiments/data/htmls/Br61.html

Large diffs are not rendered by default.

206 changes: 97 additions & 109 deletions experiments/data/htmls/Brx50.html

Large diffs are not rendered by default.

885 changes: 0 additions & 885 deletions experiments/data/htmls/LM2.html

This file was deleted.

176 changes: 83 additions & 93 deletions experiments/data/htmls/Lu2.html

Large diffs are not rendered by default.

138 changes: 66 additions & 72 deletions experiments/data/htmls/Ov8.html

Large diffs are not rendered by default.

214 changes: 101 additions & 113 deletions experiments/data/htmls/Pr9.html

Large diffs are not rendered by default.

80 changes: 42 additions & 38 deletions experiments/workflow/resources/functions.R
@@ -234,40 +234,44 @@ produce_Distance_Posterior <- function(leaf1, leaf2,postSampling, treeName,nCell

tryCatch(
expr = {
-plot(
-ggplot(data, aes(x = StatisticsOfMutationPlacement)) +
-geom_histogram(bins = 10, fill = "skyblue", color = "skyblue", alpha = 0.7)+
-xlab("S") + ylab("total count") +
-ggtitle("Posterior sampling of branching probabilities") +
-geom_vline(xintercept = mean(StatisticsOfMutationPlacement),color = "blue", linetype = "dashed", linewidth = 1) +
-labs(subtitle = sprintf("Tree %s - %s", treeName, clusterName),caption = "mean indicated by dashed blue line") +
-theme_minimal() +
-theme(
-plot.title = element_text(size = 20, face = "bold"),
-axis.title.x = element_text(size = 18),
-axis.title.y = element_text(size = 18),
-plot.subtitle = element_text(size= 18),
-axis.text = element_text(size = 16)
-)
-)
+histo <- ggplot(data, aes(x = StatisticsOfMutationPlacement)) +
+geom_histogram(bins = 10, fill = "skyblue", color = "skyblue", alpha = 0.7)+
+xlab("Splitting score") + ylab("total count") +
+ggtitle("Posterior sampling of branching probabilities") +
+geom_vline(xintercept = mean(StatisticsOfMutationPlacement),color = "blue", linetype = "dashed", linewidth = 1) +
+labs(subtitle = sprintf("Tree %s - %s", treeName, clusterName)) +
+theme_minimal() +
+theme(
+plot.title = element_text(size = 20, face = "bold"),
+axis.title.x = element_text(size = 18),
+axis.title.y = element_text(size = 18),
+plot.subtitle = element_text(size= 18),
+axis.text = element_text(size = 16)
+)
+hist_data <- ggplot_build(histo)$data[[1]]
+max_y <- max(hist_data$count)
+histo <- histo + annotate("text", x = mean(StatisticsOfMutationPlacement) + 0.08, y = 0.9 * max_y, label="mean", color = "blue", size = 7)
+print(histo)
},
error = function(e){
-plot(
-ggplot(data, aes(x = log(StatisticsOfMutationPlacement))) +
-geom_histogram(bins = 10, fill = "skyblue", color = "skyblue", alpha = 0.7)+
-xlab("Maximal probability of branching evolution") + ylab("total count") +
-ggtitle("Posterior sampling of branching probabilities - Logarithmic Scale") +
-geom_vline(xintercept = log(mean(StatisticsOfMutationPlacement)),color = "blue", linetype = "dashed", linewidth = 1) +
-labs(subtitle = sprintf("Tree %s - %s", treeName, clusterName),caption = "mean indicated by dashed blue line") +
-theme_minimal() +
-theme(
-plot.title = element_text(size = 20, face = "bold"),
-axis.title.x = element_text(size = 18),
-axis.title.y = element_text(size = 18),
-plot.subtitle = element_text(size= 18),
-axis.text = element_text(size = 16)
-)
-)
+histo <- ggplot(data, aes(x = log(StatisticsOfMutationPlacement))) +
+geom_histogram(bins = 10, fill = "skyblue", color = "skyblue", alpha = 0.7)+
+xlab("log(Splitting score)") + ylab("total count") +
+ggtitle("Posterior sampling of branching probabilities - Logarithmic Scale") +
+geom_vline(xintercept = log(mean(StatisticsOfMutationPlacement)),color = "blue", linetype = "dashed", linewidth = 1) +
+labs(subtitle = sprintf("Tree %s - %s", treeName, clusterName),caption = "mean indicated by dashed blue line") +
+theme_minimal() +
+theme(
+plot.title = element_text(size = 20, face = "bold"),
+axis.title.x = element_text(size = 18),
+axis.title.y = element_text(size = 18),
+plot.subtitle = element_text(size= 18),
+axis.text = element_text(size = 16)
+)
+hist_data <- ggplot_build(histo)$data[[1]]
+max_y <- max(hist_data$count)
+histo <- histo + annotate("text", x = log(mean(StatisticsOfMutationPlacement)) + 0.08, y = 0.9 * max_y, label="log(mean)", color = "blue", size = 7)
+print(histo)
}
)

@@ -424,12 +428,12 @@ computeClusterSplits <- function(sampleDescription, postSampling, treeName, nCel
)
}

-plot(
-splittingProbs %>% group_by(Cluster) %>% summarize(meanSplittingProbability = mean(Splitting_probability)) %>%
-ggplot(aes(x = Cluster, y = meanSplittingProbability)) +
-geom_col() +
-theme_minimal()
-)
+# plot(
+# splittingProbs %>% group_by(Cluster) %>% summarize(meanSplittingProbability = mean(Splitting_probability)) %>%
+# ggplot(aes(x = Cluster, y = meanSplittingProbability)) +
+# geom_col() +
+# theme_minimal()
+# )

return(list(splittingProbs = splittingProbs, aggregatedBranchingProbabilities = aggregatedProbabilities))
}
30 changes: 16 additions & 14 deletions experiments/workflow/resources/template.Rmd
@@ -2,7 +2,9 @@
title: "__tree__"
author: "__author__"
date: "__date__"
-output: html_document
+output:
+  html_document:
+    keep_md: yes
---

```{r setup, include=FALSE}
@@ -13,7 +15,7 @@ knitr::opts_chunk$set(echo = TRUE)

This code analyses splitting statistics for CTC-clusters.

-The analysis takes a list of trees sampled from its posterior distribution as input and samples mutations placements for each of the trees.
+The analysis takes a list of trees sampled from the posterior distribution as input and computes the mutation placement probability distribution for each one of them. From this distribution we derive a score that quantifies the probability that two cells have experienced divergent evolution. This score is called the splitting score.


## Configure the script
@@ -26,11 +28,11 @@ nMutationSamplingEvents <- __nSamplingEvents__
```

## Loading data
-```{r load}
-quiet(source("__functionsScript__"))
+```{r load, results="hide"}
+source("__functionsScript__")
-input <- quiet(load_data(inputFolder, treeName))
+input <- load_data(inputFolder, treeName)
```


@@ -44,14 +46,14 @@ Column description:
- color: Indicates the color of the cluster in the tree, as described in the nodeDescription.tsv
file.

-```{r Describe samples}
-print(sampleDescription)
+```{r sample-description}
+print(input$sample_description)
```



## General overview
-We sample __nSamplingEvents__ many trees.
+We sample __nSamplingEvents__ trees.

For each pair of cells in the same cluster and each sampled tree we compute the splitting score, that is, the probability that the two cells have experienced divergent evolution. A low splitting score (close to 0) indicates that the two cells are likely genealogically closely related, while a high splitting score (close to 1) indicates that the two cells have evolved in a divergent manner.

@@ -68,7 +70,7 @@ Finally, we print the empirical distribution of the splitting scores for all
The latter is used to specify the cutoff for oligo-clonality: It is defined as the 95%-percentile of the aggregated distribution of splitting scores.
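
As a minimal sketch of this cutoff rule, assume placeholder scores drawn from a beta distribution stand in for the splitting scores of the simulated clusters. The `cutoffs` and `simulatedScores` names and all values are illustrative assumptions, not output of the analysis. Under that assumption, the 95th percentile per cluster size could be computed like this:

```r
# Illustrative only: placeholder scores replace the scores obtained from
# the simulated clusters; nothing here is a result of the real analysis.
set.seed(1)
cutoffs <- data.frame(clusterSize = integer(), Cutoff = numeric())

for (clusterSize in 2:5) {
  # Hypothetical splitting scores in (0, 1) for simulated clusters of this size.
  simulatedScores <- rbeta(500, shape1 = 1, shape2 = 4 + clusterSize)

  # The cutoff is the 95th percentile of the aggregated score distribution.
  cutoffs <- rbind(
    cutoffs,
    data.frame(
      clusterSize = clusterSize,
      Cutoff = unname(quantile(simulatedScores, probs = 0.95))
    )
  )
}

print(cutoffs)
```

By construction, about 5% of the simulated scores for each cluster size lie above the corresponding cutoff.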


-```{r}
+```{r computing-simulated-clusters, results="hide", dev='png'}
cutoffsSplittingProbs <- data.frame(clusterSize = vector(), Cutoff = vector())
cutoffsBranchingProbabilities <- data.frame(clusterSize = vector(), Cutoff = vector())
@@ -77,7 +79,7 @@ for (clusterSize in 2:5){
{treeNameSimulated <- paste(treeName, clusterSize, sep = '_')
-inputSimulated <- quiet(load_data(simulationInputFolder, treeNameSimulated))
+inputSimulated <- load_data(simulationInputFolder, treeNameSimulated)
sampleDescriptionSimulated <- inputSimulated$sample_description
@@ -107,14 +109,14 @@ print(cutoffsBranchingProbabilities)

Now we can compute the aggregated splitting score distributions for each cluster. The distribution's mean is compared to the cutoffs computed above, and if it is higher than the cutoff, we call the cluster oligo-clonal.
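
The decision rule can be sketched as follows. The cutoff table and the scores are hypothetical numbers chosen for illustration, and the variable names are assumptions rather than the ones used in the template:

```r
# Hypothetical cutoffs per cluster size (illustrative numbers, not results).
cutoffs <- data.frame(clusterSize = 2:5, Cutoff = c(0.30, 0.35, 0.40, 0.45))

# Hypothetical splitting scores of one observed cluster of size 3.
clusterSize <- 3
splittingScores <- c(0.52, 0.61, 0.47, 0.58)

# Mean of the cluster's aggregated splitting score distribution.
meanScore <- mean(splittingScores)

# Look up the cutoff that belongs to clusters of this size.
cutoff <- cutoffs$Cutoff[cutoffs$clusterSize == clusterSize]

# The cluster is called oligo-clonal if its mean score exceeds the cutoff.
oligoclonal <- meanScore > cutoff
print(oligoclonal)
```

With these made-up numbers the mean score (0.545) is above the cutoff for size-3 clusters (0.35), so the cluster would be called oligo-clonal.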

-```{r}
+```{r computing-real-clusters}
nTumorClusters <- 0
nOligoclonalClusters2 <- 0
splittingSummary2 <- data.frame(Color = vector(), Oligoclonal = vector(), ClusterSize = vector())
for(clusterSize in 2:5){
try({
-clusterColor <- sampleDescription %>%
+clusterColor <- input$sample_description %>%
filter(WBC ==0 & color != 'gray93') %>%
group_by(color) %>%
filter(n() == clusterSize) %>%
@@ -131,7 +133,7 @@ for(clusterSize in 2:5){
splittingProbs <- mean(distance$splittingProbs$Splitting_probability)
branchingProbs <- mean(distance$aggregatedBranchingProbabilities)
nTumorClusters <- nTumorClusters + 1
oligoclonal <- FALSE
@@ -145,7 +147,7 @@ for(clusterSize in 2:5){
}
-numberOfCancerClusters <- sampleDescription %>%
+numberOfCancerClusters <- input$sample_description %>%
filter(WBC ==0 & color != 'gray93') %>%
group_by(color) %>%
filter(n() > 1) %>%
2 changes: 1 addition & 1 deletion experiments/workflow/rules/base.smk
@@ -34,5 +34,5 @@ rule render_markdown_file:
PROJECT_DIR / 'logs' / 'render_markdown_file.{SAMPLE}.log',
shell:
"""
-( Rscript -e "rmarkdown::render('{input}', output_file = '{output}')" ) %> {log}
+( Rscript -e "rmarkdown::render('{input}', output_file = '{output}')" ) &> {log}
"""
