2024 December updates #830

sjspielman · 2024-12-09T21:50:19Z

This issue tracks items we notice need fixing from the December 2024 advanced scRNA-seq workshop.

sjspielman · 2024-12-09T21:51:00Z

In the integration notebook, we should probably label the y-axis "proportion" rather than leaving its default label, "count":

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Line 657 in a53758e

    # Use ggplot2 to make a barplot the cell types across samples
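A minimal sketch of the relabeling, assuming the bar plot is built with `geom_bar(position = "fill")`, which is one way a proportion axis ends up with the default `count` label. The `cell_df` data frame and its values are toy stand-ins, not the notebook's data.

```r
library(ggplot2)

# Toy stand-in for the per-cell metadata (hypothetical values)
cell_df <- data.frame(
  sample = c("A", "A", "A", "B", "B", "B"),
  celltype_broad = c("Tumor", "Tumor", "Immune", "Tumor", "Immune", "Immune")
)

# position = "fill" scales each bar to a height of 1, so the y-axis shows
# proportions even though ggplot2 labels it "count" by default
celltype_barplot <- ggplot(cell_df, aes(x = sample, fill = celltype_broad)) +
  geom_bar(position = "fill") +
  # Replace the default axis label with one that matches what is plotted
  labs(y = "Proportion")
```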

sjspielman · 2024-12-10T14:54:57Z

I didn't catch the full sentence Josh said during intro slides, but I caught the bit that we need to add an arrow somewhere in the "single sample roadmap" diagram (I believe related to marker gene or gene-set analysis).

jaclyn-taroni · 2024-12-10T15:46:09Z

This comment should say that it is the path to the Cell Ranger matrix directory:

training-modules/scRNA-seq-advanced/01-read_filter_normalize_scRNA.Rmd

Line 79 in a53758e

    # Path to the Cell Ranger matrix file

jaclyn-taroni · 2024-12-10T18:35:37Z

The "Will it integrate?" slide in the integration slides should say "healthy and tumor" instead of "healthy and normal"

jaclyn-taroni · 2024-12-10T19:54:04Z

The first plotReducedDim() call in the integration notebook (line 463) has pretty complicated syntax -- I am not sure it's completely necessary that it is live.

sjspielman · 2024-12-11T14:56:26Z

In integration, I wonder if we want to change up a little of the opening chunks where we set file names and read them in. Some of these thoughts are based on (unrelated to training) code reviews that @jashapiro had left elsewhere, if he wants to weigh in on this potential change too:

Rather than dir(), we might use list.files() to list what's in the data directory.

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Lines 88 to 90 in a53758e

```{r input dir, live = TRUE}
dir(input_dir)
```

Currently we use file.path() to form all the input file paths, but it might simplify the code to use list.files(full.names = TRUE) in the first place (or to show list.files() first without this argument and then with it).

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Lines 171 to 175 in a53758e

    
    ```{r define sce_paths, live = TRUE}
    # Now, convert these to file paths: <input_dir>/<sample_name>.rds
    sce_paths <- file.path(input_dir,
                           glue::glue("{sample_names}.rds")
    )

We add list names to sce_list only after we read in the files. We might want to add those names beforehand

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Lines 203 to 206 in a53758e

    
    ```{r add list names, live = TRUE}
    # Assign the sample names as the names for sce_list
    names(sce_list) <- sample_names
    ```
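The three suggestions above could fit together roughly as follows. This is only a sketch under stated assumptions: the temporary directory and the two small `.rds` files are stand-ins for the real SCE files, and deriving `sample_names` from the file names is an assumption rather than what the notebook currently does.

```r
# Set up a throwaway directory with two small .rds files standing in for SCEs
input_dir <- file.path(tempdir(), "integration_demo")
dir.create(input_dir, showWarnings = FALSE)
saveRDS(1:3, file.path(input_dir, "sample_a.rds"))
saveRDS(4:6, file.path(input_dir, "sample_b.rds"))

# full.names = TRUE returns paths that already include input_dir
sce_paths <- list.files(input_dir, pattern = "\\.rds$", full.names = TRUE)

# Name the paths by sample before reading anything in
sample_names <- tools::file_path_sans_ext(basename(sce_paths))
names(sce_paths) <- sample_names

# purrr::map() carries the names of its input through to the output list
sce_list <- purrr::map(sce_paths, readRDS)
names(sce_list)
```

Because the names travel with `sce_paths`, a separate `names(sce_list) <- sample_names` step would no longer be needed.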

sjspielman · 2024-12-11T14:58:45Z

Suggested integration changes:

We should beef up some of the explanation for how the cell types in Patel et al were obtained in the first place, since this helps contextualize how we use them to assess integration results

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Lines 213 to 216 in a53758e

    
    If you look closely at the printed SCE objects, you may notice that they all contain `colData` table columns `celltype_fine` and `celltype_broad`.
    These columns (which we added to SCE objects during [pre-processing](https://github.com/AlexsLemonade/training-modules/tree/master/scRNA-seq-advanced/setup/rms)) contain putative _cell type annotations_ as assigned in [Patel _et al._ (2022)](https://doi.org/10.1016/j.devcel.2022.04.003).
    We will end up leveraging these cell type annotations to explore how successful our integration is; after integration, we expect cell types from different samples to group together, rather than being separated by batches.

This code would make more sense with length(), not head()

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Lines 304 to 316 in a53758e

    
    ```{r shared genes}
    # Define vector of shared genes
    shared_genes <- sce_list |>
      # get rownames (genes) for each SCE in sce_list
      purrr::map(rownames) |>
      # reduce to the _intersection_ among lists
      purrr::reduce(intersect)
    ```

    ```{r print shared genes, live = TRUE}
    # Use head to look at the vector of shared genes:
    head(shared_genes)
    ```
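A small sketch of why `length()` may be the more informative call here: it reports how many genes survive the intersection, which `head()` does not. The gene IDs below are made-up stand-ins for the `rownames()` of real SCE objects.

```r
# Toy stand-in: gene IDs per sample instead of rownames() from real SCEs
gene_list <- list(
  sample_a = c("GENE1", "GENE2", "GENE3", "GENE4"),
  sample_b = c("GENE2", "GENE3", "GENE4", "GENE5"),
  sample_c = c("GENE3", "GENE4", "GENE6")
)

# Reduce to the intersection across all samples
shared_genes <- purrr::reduce(gene_list, intersect)

# How many genes are shared across every sample?
length(shared_genes)
```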

We should probably use sce or similar, not x, in these spots to emphasize that it's nice that you can have informative "loop variables" with this new(-ish) syntax

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Lines 290 to 294 in a53758e

    
    ```{r compare rowdata, live = TRUE}
    # Use `purrr::map()` to quickly extract rowData column names for all SCEs
    purrr::map(sce_list,
               \(x) colnames(rowData(x)))
    ```

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Lines 334 to 337 in a53758e

    
    ```{r compare coldata}
    purrr::map(sce_list,
               \(x) colnames(colData(x)) )
    ```
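To illustrate the naming point without loading any SCE objects, here is a sketch that uses plain data frames as stand-ins (the table contents are hypothetical). In the notebook the argument would be `sce`, giving `\(sce) colnames(rowData(sce))`.

```r
# Toy stand-ins for per-sample tables (hypothetical column names)
sample_tables <- list(
  sample_a = data.frame(gene_id = "GENE1", mean = 1.2),
  sample_b = data.frame(gene_id = "GENE2", detected = 40)
)

# Name the lambda argument after what it holds rather than `x`
# (the \(arg) shorthand requires R 4.1 or later)
table_columns <- purrr::map(
  sample_tables,
  \(sample_table) colnames(sample_table)
)
```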

This code would make more sense with table(), not unique()

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Lines 427 to 428 in a53758e

    # What are the unique values in the `sample` column?
    unique( colData(merged_sce)$sample )
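A quick comparison of the two calls on a toy vector, which stands in for `colData(merged_sce)$sample` (the values are made up):

```r
# Toy stand-in for the `sample` column of the merged object
sample_column <- c("sample_a", "sample_a", "sample_a", "sample_b", "sample_b")

# unique() lists the distinct values only
unique(sample_column)

# table() lists the distinct values and how many cells carry each one
sample_counts <- table(sample_column)
sample_counts
```

With `table()`, the same line confirms both which samples are present and how many cells each one contributes to the merged object.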

jaclyn-taroni · 2024-12-11T15:25:05Z

The DE dimension reduction plots by cell type:

training-modules/scRNA-seq-advanced/03-differential_expression.Rmd

Lines 226 to 234 in a53758e

    
    ```{r celltype UMAP}
    # UMAP of all samples labeled by cell type
    scater::plotReducedDim(integrated_sce,
                           dimred = "fastmnn_UMAP",
                           # color each point by cell type
                           color_by = "celltype_broad",
                           point_size= 0.5,
                           point_alpha = 0.4)
    ```

Could probably use a tweak to the legend along the lines of what we have in the integration notebook:

training-modules/scRNA-seq-advanced/02-dataset_integration.Rmd

Line 471 in a53758e

    
             guides(color = guide_legend(override.aes = list(size = 3, alpha = 1))) + # Modify the legend key with larger, easier to see points

To make the cell type colors easier to see when projected, etc.
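Since `scater::plotReducedDim()` returns a ggplot object, the same `guides()` layer can be appended to its output with `+`. The sketch below shows the effect on a plain ggplot2 scatter plot; the `umap_df` data frame is randomly generated toy data, not real UMAP coordinates.

```r
library(ggplot2)

# Toy stand-in for UMAP coordinates and cell type labels (random values)
set.seed(2024)
umap_df <- data.frame(
  UMAP1 = rnorm(200),
  UMAP2 = rnorm(200),
  celltype_broad = sample(c("Tumor", "Immune", "Vascular"), 200, replace = TRUE)
)

celltype_umap <- ggplot(umap_df, aes(x = UMAP1, y = UMAP2, color = celltype_broad)) +
  # Small, transparent points, as in the notebook's plot
  geom_point(size = 0.5, alpha = 0.4) +
  # Draw legend keys larger and fully opaque, independent of the plotted points
  guides(color = guide_legend(override.aes = list(size = 3, alpha = 1)))
```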

sjspielman · 2024-12-11T16:36:15Z

This is not in fact the "last thing" we do in this notebook, but rather the "next thing":

training-modules/scRNA-seq-advanced/03-differential_expression.Rmd

Lines 630 to 631 in a53758e

    
    The last thing that we will do is take a look at how many genes are significant.
    Here we will want to use the adjusted p-value, found in the `padj` column of the results, as this accounts for multiple test correction.

jashapiro · 2024-12-12T15:20:24Z

Add set.seed() to setup in pathway analysis notebook

training-modules/scRNA-seq-advanced/04-gene_set_enrichment_analysis.Rmd

Line 40 in a53758e

    ## Set up
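A minimal sketch of what the seed buys, presumably the motivation here: any step that draws random numbers gives the same result on every run. The seed value `2024` is an arbitrary choice for illustration.

```r
# Set the seed once, near the top of the notebook (2024 is an arbitrary choice)
set.seed(2024)
first_draw <- runif(3)

# Resetting the same seed reproduces the same random numbers
set.seed(2024)
second_draw <- runif(3)

identical(first_draw, second_draw)
```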

jashapiro · 2024-12-12T15:21:17Z

We no longer use marker genes as GSEA input; we use differential expression analysis results instead, so this comment needs updating:

training-modules/scRNA-seq-advanced/04-gene_set_enrichment_analysis.Rmd

Line 56 in a53758e

    # We'll use the marker genes as GSEA input
