Add scAdvanced GSEA notebook that uses pseudobulk RMS DE results #808
Normalized enrichment scores (NES) are enrichment scores that are scaled to make gene sets that contain different numbers of genes comparable.
Pathways with significant, highly positive NES are enriched in ERMS myoblasts, whereas pathways with significant, highly negative NES are enriched in ARMS myoblasts.
Before figuring out who to request for a full review, I'm going to ask @jashapiro to take a look at this interpretation as the instructor of the `03` notebook in the upcoming workshop.
This is correct. The scores are relative to ARMS, so:
- Positive values ---> ERMS is higher than ARMS ---> Enriched for ERMS.
- Negative values ---> ERMS is lower than ARMS ---> Enriched for ARMS.
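To make the sign convention above concrete, here is a minimal base R sketch. The pathway names, NES values, adjusted p-values, and the 0.05 cutoff are all made up for illustration and are not taken from the notebook; the column names are also assumptions rather than the actual `clusterProfiler` output columns.

```r
# Hypothetical GSEA results; every value here is invented for illustration
gsea_results <- data.frame(
  pathway = c("PATHWAY_A", "PATHWAY_B", "PATHWAY_C"),
  NES = c(2.1, -1.8, 0.4),
  p_adjusted = c(0.001, 0.01, 0.60)
)

# Assumed significance cutoff (not specified in this thread)
alpha <- 0.05

# Scores are relative to ARMS:
#   significant and positive NES -> enriched in ERMS
#   significant and negative NES -> enriched in ARMS
gsea_results$enriched_in <- ifelse(
  gsea_results$p_adjusted >= alpha,
  "not significant",
  ifelse(gsea_results$NES > 0, "ERMS", "ARMS")
)

print(gsea_results)
```

The direction only flips if the contrast in the DE step is set up the other way around, so it is worth checking which group was the reference before reading the signs.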
This seems fine to me; I didn't really have any code comments.
The main thing I'm not 100% sure about with the lesson swap overall is that we lose teaching gene identifier conversion, since it was only in the ORA notebook. I think this is important to touch on since it comes up a lot! I wonder if we can still at least introduce the concept in this notebook, where we say "no need to do gene conversion!" Maybe we also say, "but if we had to, we might use `AnnotationDbi`, and here's a nice vignette about that: https://hbctraining.github.io/DGE_workshop_salmon_online/lessons/AnnotationDbi_lesson.html".
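As a rough idea of what such a pointer could look like, here is a small sketch of identifier conversion with `AnnotationDbi::mapIds()`. It assumes human data, the `org.Hs.eg.db` annotation package, and Ensembl gene IDs as input; the two example IDs are illustrative and none of this is taken from the notebook itself.

```r
# Assumes the Bioconductor packages AnnotationDbi and org.Hs.eg.db are installed
library(org.Hs.eg.db)

# Illustrative Ensembl gene IDs (TP53 and GAPDH), not from the notebook
ensembl_ids <- c("ENSG00000141510", "ENSG00000111640")

# Map Ensembl gene IDs to gene symbols.
# multiVals = "first" keeps one symbol per ID when several match.
symbols <- AnnotationDbi::mapIds(
  org.Hs.eg.db,
  keys = ensembl_ids,
  keytype = "ENSEMBL",
  column = "SYMBOL",
  multiVals = "first"
)

# Named character vector: names are the input IDs, values are the symbols
print(symbols)
```

IDs with no match come back as `NA`, so a real analysis would likely need to decide how to handle those before running GSEA.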
#### Other resources
* For another example using `clusterProfiler` for GSEA, see [_Intro to DGE: Functional Analysis._ from Harvard Chan Bioinformatics Core Training.](https://hbctraining.github.io/DGE_workshop/lessons/09_functional_analysis.html)
I realized this training has been read-only for a little while.
I think this is probably their more maintained version? https://hbctraining.github.io/Training-modules/DGE-functional-analysis/lessons/02_functional_analysis.html
If this were the |
Co-authored-by: Stephanie Spielman <[email protected]>
Thank you, @sjspielman. This is ready for another look!
LGTM assuming test passes, and I have no reason to think it won't!
Closes #711
Here, I am updating the GSEA notebook in the advanced single-cell module to use the output of the `03-differential_expression` notebook in that module: pseudobulk DE results comparing expression in myoblasts between ERMS and ARMS samples.
I've written this using Hallmark gene sets, which are fairly comprehensive and designed to work with GSEA. There are also only 50 of them.
I'm also deleting the ORA and old GSEA notebook.
My rationale for teaching GSEA and not ORA is that many pathway methods that work at the individual cell level are rank-based (including AUCell – #806), so teaching a seminal functional class scoring (FCS) method, run in a way that is fairly quick, makes sense to me. Accordingly, I plan to make the pathway analysis slides talk about FCS methods more generally (like ssGSEA).