Add scAdvanced GSEA notebook that uses pseudobulk RMS DE results #808

jaclyn-taroni · 2024-11-25T15:05:49Z

Closes #711

Here, I am updating the GSEA notebook in the advanced single-cell module to use the output of the 03-differential_expression notebook in that module: pseudobulk DE results comparing expression in myoblasts between ERMS and ARMS samples.

I've written this using the Hallmark gene sets, which are fairly comprehensive and designed to work with GSEA. There are also only 50 of them.

I'm also deleting the ORA notebook and the old GSEA notebook.

My rationale for teaching GSEA rather than ORA is that many pathway methods that work at the individual-cell level are rank-based (including AUCell – #806), so teaching a seminal functional class scoring (FCS) method, run in a way that is fairly quick, makes sense to me. Accordingly, I plan to revise the pathway analysis slides to cover FCS methods more generally (like ssGSEA).

jaclyn-taroni · 2024-11-25T15:06:35Z

scRNA-seq-advanced/04-gene_set_enrichment_analysis.Rmd

+
+Normalized enrichment scores (NES) are enrichment scores that are scaled to make gene sets that contain different numbers of genes comparable.
+
+Pathways with significant, highly positive NES are enriched in ERMS myoblasts, whereas pathways with significant, highly negative NES are enriched in ARMS myoblasts.


Before figuring out who to request for a full review, I'm going to ask @jashapiro to take a look at this interpretation as the instructor of the 03 notebook in the upcoming workshop.

This is correct. The scores are relative to ARMS, so:

Positive values ---> ERMS is higher than ARMS ---> Enriched for ERMS.

Negative values ---> ERMS is lower than ARMS ---> Enriched for ARMS.
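
For anyone who wants to sanity-check the sign convention, here is a minimal sketch of the rule described above. It assumes the ranking statistic is ERMS relative to ARMS, as stated in this thread. The `GseaRow` class, the `enriched_group` helper, the 0.05 cutoff, and the gene set labels are all hypothetical and are not taken from the notebook. The numbers are made-up toy values, not results.

```python
from dataclasses import dataclass


@dataclass
class GseaRow:
    gene_set: str  # hypothetical gene set label
    nes: float     # normalized enrichment score
    padj: float    # adjusted p-value


def enriched_group(row: GseaRow, alpha: float = 0.05) -> str:
    """Label a gene set by the sign of its NES (ERMS relative to ARMS).

    The alpha cutoff is an assumption made for this sketch.
    """
    if row.padj >= alpha:
        return "not significant"
    if row.nes > 0:
        # Positive NES: ERMS is higher than ARMS.
        return "ERMS"
    if row.nes < 0:
        # Negative NES: ERMS is lower than ARMS.
        return "ARMS"
    return "no direction"


# Made-up values for illustration only; not results from the notebook.
toy_results = [
    GseaRow("GENE_SET_A", nes=2.1, padj=0.001),
    GseaRow("GENE_SET_B", nes=-1.8, padj=0.010),
    GseaRow("GENE_SET_C", nes=0.9, padj=0.400),
]

labels = {row.gene_set: enriched_group(row) for row in toy_results}
for gene_set, label in labels.items():
    print(f"{gene_set}: {label}")
```

The significance check comes first so that a large NES with a weak adjusted p-value is not read as enrichment in either group.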

sjspielman

This seems fine to me, I didn't really have any code comments.

The main thing I'm not 100% sure about with the lesson swap overall is the fact that we lose teaching gene identifier conversion, since it was only in the ORA notebook. I think this is important to touch on since it comes up a lot! I wonder if we can still at least introduce the concept in this notebook, where we say "no need to do gene conversion!" Maybe we also say, "but if we had to, we might use AnnotationDbi, and here's a nice vignette about that: https://hbctraining.github.io/DGE_workshop_salmon_online/lessons/AnnotationDbi_lesson.html".

sjspielman · 2024-11-25T16:42:01Z

scRNA-seq-advanced/04-gene_set_enrichment_analysis.Rmd

+
+#### Other resources
+
+* For another example using `clusterProfiler` for GSEA, see [_Intro to DGE: Functional Analysis._ from Harvard Chan Bioinformatics Core Training.](https://hbctraining.github.io/DGE_workshop/lessons/09_functional_analysis.html)


I realized this training has been read-only for a little while.

I think this is probably their more maintained version? https://hbctraining.github.io/Training-modules/DGE-functional-analysis/lessons/02_functional_analysis.html

jaclyn-taroni · 2024-11-25T17:08:55Z

> The main thing I'm not 100% sure about with the lesson swap overall is the fact that we lose teaching gene identifier conversion, since it was only in the ORA notebook. I think this is important to touch on since it comes up a lot!

If this were the scRNA-seq training, I would have retained it. However, these pathway instruction notebooks need to be on the shorter side, and I think it's fair to assume that some participants in this offering will be familiar with gene identifier conversion. I think we can revisit this in https://github.com/AlexsLemonade/exercise-notebook-answers/issues/227.

Co-authored-by: Stephanie Spielman <[email protected]>

jaclyn-taroni · 2024-11-25T17:55:33Z

Thank you, @sjspielman. This is ready for another look!

sjspielman

LGTM assuming test passes, and I have no reason to think it won't!

jaclyn-taroni added 3 commits November 25, 2024 09:56

Add renumbered version of GSEA notebook that uses pseudobulk DE results

a555c22

Remove 05 version of GSEA that uses marker genes

c4ae9f9

Remove ORA notebook

6afb720

jaclyn-taroni commented Nov 25, 2024

jaclyn-taroni added 2 commits November 25, 2024 15:17

geneset -> gene sets and rerun

086d1ae

Make corresponding changes to render live script

fb4a2a8

jaclyn-taroni requested a review from sjspielman November 25, 2024 15:54

sjspielman reviewed Nov 25, 2024

jaclyn-taroni and others added 3 commits November 25, 2024 12:22

Apply suggestions from code review

fff81bf

Co-authored-by: Stephanie Spielman <[email protected]>

Respond to review and rerun

44f3b76

Merge branch 'master' into jaclyn-taroni/711-use-rms-de

fd4e209

jaclyn-taroni requested a review from sjspielman November 25, 2024 17:55

sjspielman approved these changes Nov 25, 2024

jaclyn-taroni mentioned this pull request Nov 25, 2024

Add AUCell notebook for scAdvanced pathway analysis #809

Merged

jaclyn-taroni merged commit 6d4bf3c into master Nov 25, 2024
2 checks passed

jaclyn-taroni deleted the jaclyn-taroni/711-use-rms-de branch November 25, 2024 18:19
