update singlecell_lung_adenocarcinoma example
josschavezf committed Feb 20, 2024
1 parent 2e235a0 commit 3b746e9
Showing 4 changed files with 107 additions and 74 deletions.
Binary file modified vignettes/images/singlecell_lung_adenocarcinoma/4_Cluster.png
181 changes: 107 additions & 74 deletions vignettes/singlecell_lung_adenocarcinoma.Rmd
@@ -6,19 +6,22 @@ vignette: >
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

# Dataset Explanation


[Maynard et al.](https://pubmed.ncbi.nlm.nih.gov/32822576/) performed Illumina single-cell RNA-seq on metastatic lung cancer, profiling 49 clinical biopsies obtained from 30 patients before and during targeted therapy. The raw data are available from the [NCBI BioProject](https://www.ncbi.nlm.nih.gov/bioproject/591860).

To run this vignette, download the files from this [Google Drive folder](https://drive.google.com/drive/folders/1sDzO0WOD4rnGC7QfTKwdcQTx3L36PFwX).


# Set up Giotto Environment

```{r, eval=FALSE}
# Ensure Giotto Suite is installed.
if(!"Giotto" %in% installed.packages()) {
pak::pkg_install("drieslab/Giotto")
}
# Ensure GiottoData, a small, helper module for tutorials, is installed.
if(!"GiottoData" %in% installed.packages()) {
pak::pkg_install("drieslab/GiottoData")
}
# Ensure the Python environment for Giotto has been installed.
genv_exists = Giotto::checkGiottoEnvironment()
if(!genv_exists){
@@ -27,13 +30,9 @@ if(!genv_exists){
}
```


## Set up Giotto Environment

``` {r, eval=FALSE}
library(Giotto)
library(GiottoData)
# 1. set working directory
results_folder = 'path/to/result'
@@ -49,57 +48,67 @@ instrs = createGiottoInstructions(save_dir = results_folder,
python_path = my_python_path)
```

## Dataset Explanation

[Maynard et al.](https://pubmed.ncbi.nlm.nih.gov/32822576/) performed Illumina single-cell RNA-seq on metastatic lung cancer, profiling 49 clinical biopsies obtained from 30 patients before and during targeted therapy. The raw data are available from the [NCBI BioProject](https://www.ncbi.nlm.nih.gov/bioproject/591860).

To run this vignette, download the files from this [Google Drive folder](https://drive.google.com/drive/folders/1sDzO0WOD4rnGC7QfTKwdcQTx3L36PFwX).

## Part 1: Create Giotto object
# 1. Create the Giotto object

Load data

```{r, eval=FALSE}
raw.data <- read.csv("Data_input/csv_files/S01_datafinal.csv",
header=T, row.names = 1)
data_dir = "Data_input/csv_files/"
raw.data <- read.csv(paste0(data_dir, "S01_datafinal.csv"),
header = T,
row.names = 1)
```

Load metadata

```{r, eval=FALSE}
metadata <- read.csv("Data_input/csv_files/S01_metacells.csv",
row.names=1, header=T)
metadata <- read.csv(paste0(data_dir, "S01_metacells.csv"),
row.names = 1,
header = T)
```
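
The two `read.csv()` calls above rely on `row.names = 1` to turn the first CSV column into row names. The sketch below illustrates that behavior on a small temporary file; the toy genes, cells, and counts are made up for illustration and are not part of the Maynard et al. data.

```r
# Toy stand-in for a genes-by-cells CSV (hypothetical values).
toy_counts <- data.frame(cell_A = c(5, 0, 2),
                         cell_B = c(1, 3, 0),
                         row.names = c("EPCAM", "ERCC-00002", "RPS3"))
toy_file <- tempfile(fileext = ".csv")
write.csv(toy_counts, toy_file)

# row.names = 1 uses the first CSV column (gene IDs) as row names,
# leaving only the per-cell count columns in the data frame.
toy_raw <- read.csv(toy_file, header = TRUE, row.names = 1)
dim(toy_raw)
rownames(toy_raw)
```

Spelling out `TRUE` rather than `T` avoids surprises if a variable named `T` is ever defined in the session.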

Find the ERCC spike-in controls, compute the fraction of ERCC counts per cell, and drop the ERCC rows from the raw data.

```{r, eval=FALSE}
erccs <- grep(pattern = "^ERCC-",
x = rownames(x = raw.data),
value = TRUE)
percent.ercc <- Matrix::colSums(raw.data[erccs, ])/Matrix::colSums(raw.data)
ercc.index <- grep(pattern = "^ERCC-",
x = rownames(x = raw.data),
value = FALSE)
raw.data <- raw.data[-ercc.index,]
```
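
As a rough illustration of the ERCC step, here is a minimal base-R sketch on a toy matrix. The matrix values and the names `toy`, `is_ercc`, and `toy_no_ercc` are hypothetical; the sketch also uses logical indexing instead of negative integer indexing, which is a substitution on my part rather than what the vignette does.

```r
# Toy count matrix: 2 ERCC spike-ins and 2 endogenous genes across 3 cells.
toy <- matrix(c(10,  0,  5,
                 0, 10,  5,
                40, 50, 45,
                50, 40, 45),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("ERCC-00002", "ERCC-00003", "EPCAM", "PTPRC"),
                              c("cell_1", "cell_2", "cell_3")))

is_ercc <- grepl("^ERCC-", rownames(toy))

# Fraction of each cell's counts that come from spike-ins.
percent_ercc <- colSums(toy[is_ercc, , drop = FALSE]) / colSums(toy)

# Logical indexing stays correct when no ERCC rows exist, whereas
# x[-integer(0), ] would select zero rows.
toy_no_ercc <- toy[!is_ercc, , drop = FALSE]
```

The negative-index form is fine as long as at least one ERCC row is present, which is presumably the case for this dataset.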

Create Giotto object

``` {r, eval=FALSE}
giotto_SC <- createGiottoObject(expression = raw.data,
instructions = instrs)
```

Calculate the fraction of ribosomal gene counts per cell and add it to the cell metadata.

```{r, eval=FALSE}
ribo.genes <- grep(pattern = "^RP[SL][[:digit:]]",
x = rownames(raw.data), value = TRUE)
x = rownames(raw.data),
value = TRUE)
percent.ribo <- Matrix::colSums(raw.data[ribo.genes, ])/Matrix::colSums(raw.data)
giotto_SC <- addCellMetadata(giotto_SC,
new_metadata = data.frame(percent_ribo = percent.ribo))
```
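
To see what the pattern `^RP[SL][[:digit:]]` does and does not select, the sketch below applies it to a handful of gene symbols chosen for illustration. It suggests the pattern is best read as a heuristic: it can pick up non-ribosomal genes such as the kinase RPS6KA1 and skip ribosomal proteins such as RPLP0.

```r
# Illustrative gene symbols, not taken from the dataset.
genes <- c("RPS3", "RPL10", "RPS6KA1", "RPLP0", "MRPL12", "EPCAM")

# Same pattern as in the vignette: "RP", then S or L, then a digit,
# anchored at the start of the symbol.
ribo_hits <- grep(pattern = "^RP[SL][[:digit:]]", x = genes, value = TRUE)
ribo_hits
```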

## Part 2: Process Giotto Object
# 2. Process Giotto Object

``` {r, eval=FALSE}
## filter
giotto_SC <- filterGiotto(gobject = giotto_SC,
expression_threshold = 1,
feat_det_in_min_cells = 10,
@@ -108,39 +117,45 @@ giotto_SC <- filterGiotto(gobject = giotto_SC,
verbose = T)
## normalize
giotto_SC <- normalizeGiotto(gobject = giotto_SC, scalefactor = 6000)
giotto_SC <- normalizeGiotto(gobject = giotto_SC,
scalefactor = 6000)
## add gene & cell statistics
giotto_SC <- addStatistics(gobject = giotto_SC, expression_values = 'raw')
giotto_SC <- addStatistics(gobject = giotto_SC,
expression_values = 'raw')
```
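
The following base-R sketch shows the general idea behind feature filtering and library-size normalization on a toy matrix. It is not Giotto's implementation: the `>=` comparison, the `log2(x + 1)` transform, and the toy threshold of 2 cells are assumptions made for illustration, and any further scaling Giotto may apply is omitted.

```r
# Toy genes-by-cells count matrix (hypothetical values).
toy <- matrix(c(0, 2, 3,
                1, 0, 0,
                4, 4, 2),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("g1", "g2", "g3"), c("c1", "c2", "c3")))

expression_threshold  <- 1     # same value as in the vignette
feat_det_in_min_cells <- 2     # toy value; the vignette uses 10
scalefactor           <- 6000  # same value as in the vignette

# Keep features detected (count >= threshold) in enough cells.
detected   <- toy >= expression_threshold
keep_feats <- rowSums(detected) >= feat_det_in_min_cells
filtered   <- toy[keep_feats, , drop = FALSE]

# Library-size normalization: scale every cell to the same total, then log.
scaled     <- sweep(filtered, 2, colSums(filtered), "/") * scalefactor
normalized <- log2(scaled + 1)
```

After scaling, every cell sums to `scalefactor`, so differences in sequencing depth no longer dominate comparisons between cells.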

## Part 3: Dimension Reduction
# 3. Dimension Reduction

``` {r, eval=FALSE}
## PCA ##
giotto_SC <- calculateHVF(gobject = giotto_SC)
giotto_SC <- runPCA(gobject = giotto_SC,
center = TRUE,
scale_unit = TRUE)
screePlot(giotto_SC,
ncp = 30,
save_param = list(save_name = '3_scree_plot'))
```

![](images/singlecell_lung_adenocarcinoma/3_scree_plot.png)
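
A scree plot shows how much variance each principal component explains, which helps when choosing how many components to pass on to the clustering steps. The sketch below computes those values with base R's `prcomp()` on the built-in `iris` data; it is a stand-in for illustration and does not reproduce how `runPCA()` works internally.

```r
# Centered and scaled PCA on a built-in dataset.
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component (the scree values).
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Cumulative percentage, useful for picking a cutoff.
cum_percent <- round(100 * cumsum(var_explained), 1)
cum_percent
```

A common rule of thumb is to keep the components before the curve flattens out.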

## Part 4: Cluster
# 4. Cluster

``` {r, eval=FALSE}
## cluster and run UMAP ##
# sNN network (default)
giotto_SC <- createNearestNetwork(gobject = giotto_SC,
dim_reduction_to_use = 'pca',
dim_reduction_name = 'pca',
dimensions_to_use = 1:10, k = 15)
dimensions_to_use = 1:10,
k = 15)
# UMAP
giotto_SC = runUMAP(giotto_SC, dimensions_to_use = 1:10)
giotto_SC <- runUMAP(giotto_SC,
dimensions_to_use = 1:10)
# Leiden clustering
giotto_SC <- doLeidenCluster(gobject = giotto_SC,
@@ -156,15 +171,16 @@ plotUMAP(gobject = giotto_SC,

![](images/singlecell_lung_adenocarcinoma/4_Cluster.png)
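
The clustering step builds a shared nearest neighbor (sNN) network before running Leiden. As a hedged illustration of the idea only, the base-R sketch below counts shared neighbors between toy points; the points, `k = 2`, and all variable names are hypothetical, and Giotto's own network construction differs in its details.

```r
# Six points in 2-D forming two well-separated groups.
pts <- rbind(c(0, 0), c(0, 1), c(1, 0),
             c(10, 10), c(10, 11), c(11, 10))
k <- 2

d <- as.matrix(dist(pts))
diag(d) <- Inf  # a point is not its own neighbor

# Row i holds the indices of the k nearest neighbors of point i.
knn <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))

# Number of neighbors shared by every pair of points.
n <- nrow(pts)
shared <- matrix(0L, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) shared[i, j] <- length(intersect(knn[i, ], knn[j, ]))
  }
}
shared
```

Pairs within a group share neighbors while pairs across groups share none, which is the signal a graph-clustering algorithm such as Leiden can exploit.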

## Part 5: Differential Expression
# 5. Differential Expression

``` {r, eval=FALSE}
markers_scran = findMarkers_one_vs_all(gobject=giotto_SC,
method="scran",
expression_values="normalized",
cluster_column='leiden_clus',
min_feats=3)
markergenes_scran = unique(markers_scran[, head(.SD, 2), by="cluster"][["feats"]])
markers_scran = findMarkers_one_vs_all(gobject = giotto_SC,
method = "scran",
expression_values = "normalized",
cluster_column = 'leiden_clus',
min_feats = 3)
markergenes_scran = unique(markers_scran[, head(.SD, 2), by = "cluster"][["feats"]])
plotMetaDataHeatmap(giotto_SC,
expression_values = "normalized",
@@ -177,7 +193,7 @@ plotMetaDataHeatmap(giotto_SC,

![](images/singlecell_lung_adenocarcinoma/5_metaheatmap.png)
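
The expression `markers_scran[, head(.SD, 2), by = "cluster"]` takes the first two rows per cluster. A base-R analogue on a toy table is sketched below; the table contents are made up, and the sketch assumes the rows are already ranked within each cluster, which I have not verified for the `findMarkers_one_vs_all()` output.

```r
# Toy marker table, assumed to be ordered by rank within each cluster.
markers <- data.frame(
  cluster = c(1, 1, 1, 2, 2, 2),
  feats   = c("EPCAM", "KRT8", "KRT18", "PTPRC", "KRT8", "CD3E"),
  stringsAsFactors = FALSE
)

# First two rows per cluster, then de-duplicate genes shared across clusters.
top2 <- do.call(rbind, lapply(split(markers, markers$cluster), head, n = 2))
top_marker_genes <- unique(top2$feats)
top_marker_genes
```

The `unique()` call matters because the same gene can rank highly in more than one cluster.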

## Part 6: FeaturePlot
# 6. FeaturePlot

``` {r, eval=FALSE}
# Plot known marker genes across different cell types. e.g. EPCAM for epithelial cells
@@ -189,7 +205,7 @@ dimFeatPlot2D(giotto_SC,

![](images/singlecell_lung_adenocarcinoma/6_featureplot.png)

## Part 7: Cell type Annotation
# 7. Cell type Annotation

```{r}
marker_genes = list(
@@ -211,6 +227,7 @@ marker_genes = list(
```{r, eval=FALSE}
library(dplyr)
library(ComplexHeatmap)
heatmap_table <- calculateMetaTable(gobject = giotto_SC,
expression_values = 'normalized',
metadata_cols = 'leiden_clus',
@@ -239,9 +256,11 @@ heatmap_table <- heatmap_table %>%
))
heatmap_matrix <- heatmap_table[,c("leiden_clus", "variable","zscores_rescaled_per_feat")]
heatmap_matrix <- tidyr::pivot_wider(heatmap_matrix,
names_from = "leiden_clus",
values_from = "zscores_rescaled_per_feat")
rownames_matrix <- heatmap_matrix$variable
colnames_matrix <- colnames(heatmap_matrix)
@@ -261,7 +280,7 @@ panel_fun = function(index, nm) {
}
## heatmap z-score per leiden cluster
png(filename = "results/6_heatmap_all_clusters_cell_types.png",
png(filename = file.path(results_folder, "6_heatmap_all_clusters_cell_types.png"),
width = 2000,
height = 1500,
res = 300)
@@ -334,10 +353,12 @@ lung_labels<-c("carcinoma_cells",#1
)
names(lung_labels) <- 1:32
giotto_SC <- annotateGiotto(gobject = giotto_SC,
annotation_vector = lung_labels ,
cluster_column = 'leiden_clus',
name = 'lung_labels')
dimPlot2D(gobject = giotto_SC,
dim_reduction_name = 'umap',
cell_color = "lung_labels",
@@ -347,19 +368,20 @@ dimPlot2D(gobject = giotto_SC,
```

![](images/singlecell_lung_adenocarcinoma/7_Annotation.png)
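
The annotation step maps each cluster ID to a label through a named vector. The sketch below shows that lookup in base R; apart from `carcinoma_cells`, the labels and the cluster assignments are hypothetical, and this only illustrates the mapping idea rather than what `annotateGiotto()` does internally.

```r
# Named vector: names are cluster IDs, values are cell type labels.
toy_labels <- c("carcinoma_cells", "T_cells", "fibroblasts")
names(toy_labels) <- 1:3

# Hypothetical cluster assignment for four cells.
cell_clusters <- c(2, 1, 3, 1)

# Index by name (character), not by position, so the lookup stays correct
# even if the cluster IDs are not 1..n in order.
cell_types <- unname(toy_labels[as.character(cell_clusters)])
cell_types
```
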

# 8. Session Info

```{r, eval=FALSE}
sessionInfo()
```

```{r, eval=FALSE}
R version 4.3.2 (2023-10-31)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.3
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
@@ -368,42 +390,53 @@ time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] Giotto_4.0.2 GiottoClass_0.1.3
[1] ComplexHeatmap_2.18.0 dplyr_1.1.4 Giotto_4.0.3 GiottoClass_0.1.3
loaded via a namespace (and not attached):
[1] generics_0.1.3 utf8_1.2.4
[3] SparseArray_1.2.3 bitops_1.0-7
[5] gtools_3.9.5 lattice_0.21-9
[7] magrittr_2.0.3 grid_4.3.2
[9] Matrix_1.6-5 GenomeInfoDb_1.38.5
[11] fansi_1.0.6 SingleCellExperiment_1.24.0
[13] scales_1.3.0 codetools_0.2-19
[15] abind_1.4-5 cli_3.6.2
[17] rlang_1.1.3 crayon_1.5.2
[19] XVector_0.42.0 Biobase_2.62.0
[21] munsell_0.5.0 colorRamp2_0.1.0
[23] DelayedArray_0.28.0 S4Arrays_1.2.0
[25] parallel_4.3.2 tools_4.3.2
[27] GiottoUtils_0.1.3 dplyr_1.1.4
[29] colorspace_2.1-0 ggplot2_3.4.4
[31] SpatialExperiment_1.12.0 GenomeInfoDbData_1.2.11
[33] SummarizedExperiment_1.32.0 BiocGenerics_0.48.1
[35] vctrs_0.6.5 R6_2.5.1
[37] matrixStats_1.2.0 stats4_4.3.2
[39] lifecycle_1.0.4 magick_2.8.2
[41] zlibbioc_1.48.0 GiottoVisuals_0.1.2
[43] S4Vectors_0.40.2 IRanges_2.36.0
[45] pkgconfig_2.0.3 terra_1.7-65
[47] pillar_1.9.0 gtable_0.3.4
[49] data.table_1.14.10 glue_1.7.0
[51] Rcpp_1.0.12 tidyselect_1.2.0
[53] tibble_3.2.1 GenomicRanges_1.54.1
[55] rstudioapi_0.15.0 MatrixGenerics_1.14.0
[57] rjson_0.2.21 compiler_4.3.2
[59] RCurl_1.98-1.14
[1] colorRamp2_0.1.0 bitops_1.0-7 rlang_1.1.3
[4] magrittr_2.0.3 clue_0.3-65 GetoptLong_1.0.5
[7] RcppAnnoy_0.0.22 GiottoUtils_0.1.5 matrixStats_1.2.0
[10] compiler_4.3.2 DelayedMatrixStats_1.24.0 png_0.1-8
[13] systemfonts_1.0.5 vctrs_0.6.5 shape_1.4.6
[16] pkgconfig_2.0.3 SpatialExperiment_1.12.0 crayon_1.5.2
[19] fastmap_1.1.1 backports_1.4.1 magick_2.8.3
[22] XVector_0.42.0 scuttle_1.12.0 labeling_0.4.3
[25] utf8_1.2.4 rmarkdown_2.25 ragg_1.2.7
[28] purrr_1.0.2 xfun_0.42 bluster_1.12.0
[31] zlibbioc_1.48.0 beachmat_2.18.1 GenomeInfoDb_1.38.6
[34] jsonlite_1.8.8 DelayedArray_0.28.0 BiocParallel_1.36.0
[37] terra_1.7-71 irlba_2.3.5.1 parallel_4.3.2
[40] cluster_2.1.6 R6_2.5.1 RColorBrewer_1.1-3
[43] limma_3.58.1 reticulate_1.35.0 parallelly_1.37.0
[46] GenomicRanges_1.54.1 iterators_1.0.14 Rcpp_1.0.12
[49] SummarizedExperiment_1.32.0 knitr_1.45 future.apply_1.11.1
[52] usethis_2.2.3 IRanges_2.36.0 Matrix_1.6-5
[55] igraph_2.0.2 tidyselect_1.2.0 rstudioapi_0.15.0
[58] abind_1.4-5 yaml_2.3.8 doParallel_1.0.17
[61] codetools_0.2-19 listenv_0.9.1 lattice_0.22-5
[64] tibble_3.2.1 Biobase_2.62.0 withr_3.0.0
[67] evaluate_0.23 future_1.33.1 circlize_0.4.16
[70] pillar_1.9.0 MatrixGenerics_1.14.0 foreach_1.5.2
[73] checkmate_2.3.1 stats4_4.3.2 generics_0.1.3
[76] dbscan_1.1-12 RCurl_1.98-1.14 S4Vectors_0.40.2
[79] ggplot2_3.4.4 sparseMatrixStats_1.14.0 munsell_0.5.0
[82] scales_1.3.0 gtools_3.9.5 globals_0.16.2
[85] glue_1.7.0 metapod_1.10.1 tools_4.3.2
[88] GiottoVisuals_0.1.4 BiocNeighbors_1.20.2 data.table_1.15.0
[91] ScaledMatrix_1.10.0 locfit_1.5-9.8 fs_1.6.3
[94] scran_1.30.2 Cairo_1.6-2 cowplot_1.1.3
[97] tidyr_1.3.1 edgeR_4.0.15 colorspace_2.1-0
[100] SingleCellExperiment_1.24.0 GenomeInfoDbData_1.2.11 BiocSingular_1.18.0
[103] cli_3.6.2 rsvd_1.0.5 textshaping_0.3.7
[106] fansi_1.0.6 S4Arrays_1.2.0 uwot_0.1.16
[109] gtable_0.3.4 digest_0.6.34 progressr_0.14.0
[112] BiocGenerics_0.48.1 dqrng_0.3.2 SparseArray_1.2.4
[115] ggrepel_0.9.5 rjson_0.2.21 farver_2.1.1
[118] htmltools_0.5.7 lifecycle_1.0.4 GlobalOptions_0.1.2
[121] statmod_1.5.0
```

