Merge branch 'master' into sjspielman/2024-dec-render-fixes

AlexsLemonade · Dec 4, 2024 · e006b6f
2 parents db56490 + 247b968
Showing 6 changed files with 909 additions and 181 deletions.
diff --git a/components/dictionary.txt b/components/dictionary.txt
@@ -62,6 +62,7 @@ CellMarker
 centric
 cheatsheet
 cheatsheets
+clustering's
 clusterProfiler
 Cmd
 colData
@@ -70,6 +71,7 @@ ComplexHeatmap
 concordantly
 conda
 config
+connectedness
 CPMs
 csv
 Ctrl
@@ -338,6 +340,7 @@ PBMCs
 pDC
 PDX
 ped
+permalink
 phenotypes
 Phred
 Picelli
@@ -397,6 +400,7 @@ Sca
 scater
 SCE
 SCE's
+ScPCA
 scran
 scRNA
 Sebire

diff --git a/scRNA-seq-advanced/exercise_01-citeseq.Rmd b/scRNA-seq-advanced/exercise_01-citeseq.Rmd
@@ -72,7 +72,7 @@ Finally, in this chunk, include code to check if the `output_dir` directory exis
 
 # Define the output RDS file
 
-# Check if the output directory exists, and if not, create it
+# Create the output directory if it doesn't exist
 
 ```
 
@@ -85,6 +85,7 @@ Next, set the random seed to ensure reproducibility of steps involving randomnes
 
 
 In the following chunk, read in the Cell Ranger results files from `pbmc_dir` using the function `DropletUtils::read10xCounts()`, saving the result as `raw_sce`.
+Make sure to also specify the argument `col.names = TRUE` to ensure barcodes are set as column names in the resulting SCE object.
 
 ```{r read cellranger, solution = TRUE}
 # Read in the raw 10x dataset
@@ -650,7 +651,7 @@ We expect that no cells have any `IgG1` expression, since we filtered out those
 ```{r plot umap igg1}
 # Plot the UMAP colored by IgG1 expression
 scater::plotUMAP(normalized_sce,
-                 colour_by = "IgG1")
+                 color_by = "IgG1")
 ```
 Indeed, this is a very boring plot – all cells have the same expression of 0!
 

diff --git a/scRNA-seq-advanced/exercise_02-integration.Rmd b/scRNA-seq-advanced/exercise_02-integration.Rmd
@@ -58,7 +58,7 @@ Make sure that directory actually exists, and create it if it doesn't!
 ```{r output, solution = TRUE}
 # Define a directory to save the integrated SCE object
 
-# Create output directory if it doesn't exist
+# Create the output directory if it doesn't exist
 
 ```
 
@@ -246,11 +246,13 @@ As our PCA is stored in the `"PCA"` `reducedDim` slot, we will similarly store t
 ```
 
 Now we can use `scater::plotReducedDim()` or `scater::plotUMAP()` to visualize our merged but uncorrected results.
-Use the chunk below to create a UMAP plot of the merged data, colored (`colour`ed) by donor.
+Use the chunk below to create a UMAP plot of the merged data, colored by donor.
 
 ```{r plot uncorrected UMAP, solution = TRUE}
 # Plot UMAP colored by donor
 
+  # add more CVD-friendly color scale and legend title
+
 ```
 
 What do you see in this plot?
@@ -337,6 +339,8 @@ Ideally, we would see that the donors are all mixed within each "blob" of cells.
 ```{r fastMNN UMAP by donor, solution = TRUE}
 # UMAP plot colored by donor
 
+  # add more CVD-friendly color scale and legend title
+
 ```
 
 It looks like there is probably some good overlap there, but now we have a different problem.
@@ -358,6 +362,8 @@ Use the chunk below to do that!
 ```{r plot shuffled SCE object, solution = TRUE}
 # shuffled UMAP plot colored by donor
 
+  # add more CVD-friendly color scale and legend title
+
 ```
 
 Are you satisfied with this integration result?

diff --git a/scRNA-seq-advanced/exercise_03-diffexp.Rmd b/scRNA-seq-advanced/exercise_03-diffexp.Rmd
@@ -63,11 +63,10 @@ data_dir <- file.path("data", "rms")
 #  as created during instruction
 rms_sce_file <- file.path(data_dir, "integrated",  "rms_subset_sce.rds")
 
-# analysis results directory, which should exist from instruction
+# ensure analysis results directory has been created
+# it should exist already from instruction
 deseq_dir <- file.path("analysis", "rms", "deseq")
-if(!dir.exists(deseq_dir)){
-  dir.create(deseq_dir, recursive = TRUE)
-}
+fs::dir_create(deseq_dir)
 
 # File where we will output results from mesoderm DE analysis
 deseq_mesoderm_file <- file.path(deseq_dir, "rms_mesoderm_deseq_results.tsv")
@@ -396,7 +395,7 @@ Let's do the same with the gene that is upregulated in ERMS, and again think abo
 # Plot UMAP showing ENSG00000115762 expression across diagnosis groups
 scater::plotReducedDim(mesoderm_sce,
                        dimred = "fastmnn_UMAP",
-                       colour_by = "ENSG00000115762",
+                       color_by = "ENSG00000115762",
                        other_fields = "diagnosis_group") +
   facet_wrap(vars(diagnosis_group)) +
   theme_bw()
@@ -580,7 +579,7 @@ deseq_results_all |>
 ```
 
 Something you'll see in these results are some pretty different P-values between cell types (also note that the `NA` genes here are lncRNAs with no formally assigned gene symbol).
-Specifically, the myoblast P-values are all 3-15 orders of magnitude lower than their mesoderm counterparts, which _may_ be a result of the relatively higher sample size for myoblast tests - larger sample sizes lead to more extreme P-values.
+Specifically, the myoblast P-values are all 3-7 orders of magnitude lower than their mesoderm counterparts, which _may_ be a result of the relatively higher sample size for myoblast tests - larger sample sizes lead to more extreme P-values.
 Importantly, we do _not_ want to compare these P-values directly and conclude that a given gene was "more or less significant" in one cell type or another, since P-values cannot be compared across tests (again, for a more robust assessment of differential expression, use a multivariate model that accounts for cell types!).
 
 To wrap up, feel free to perform some quick visualization of some of these genes!