First sections of methods #38
I will review for the level of detail. When we sprint plan, I want to discuss potentially spreading methods pull request review across the team. We want to avoid too many cooks at first, of course, but this seems like the section that is safest to have multiple reviewers, and I anticipate that it would make me much less of a bottleneck.
Level of detail looks good 🎉
Let's have someone who has spent more time with the `scpca-nf` and `scpcaTools` code bases take the final look.
Co-authored-by: Jaclyn Taroni <[email protected]>
I addressed both of @jaclyn-taroni's comments. I'm going to send this over to @jashapiro for review now.
Overall I think this looks good! My comments are mostly nitpicky edits and some attempts to smooth flow. I think we could reduce the `alevin-fry` description just a bit, but that was really the biggest thing I saw.
content/04.methods.md
Outdated
- Parameter choices for alevin-fry

To quantify each single-cell and single-nuclei RNA-seq gene expression, `scpca-nf` uses `salmon alevin` [@doi:10.1186/s13059-020-02151-8] and `alevin-fry`[@doi:10.1038/s41592-022-01408-3] to generate a gene by cell counts matrix.
Prior to mapping, we generated an index using transcripts from both spliced cDNA and intronic regions, denoted as the `splici` index.
Just noting that we do include flanking regions too. I might just say "spliced and unspliced cDNA sequences" but also we might want to cite something about `splici`?
content/04.methods.md
Outdated
Raw data was generated, and sample metadata was compiled by each lab and institution contributing to the Portal.
Single-cell or single-nuclei libraries were generated using one of the commercially available kits from 10X Genomics.
For bulk RNA-seq, RNA was collected and sequenced using either paired-end or single-end sequencing.
For spatial transcriptomics, cDNA libraries were generated using the Visium kit from 10X Genomics.
Suggested change:
- For spatial transcriptomics, cDNA libraries were generated using the Visium kit from 10X Genomics.
+ For spatial transcriptomics, cDNA libraries were generated using the Visium kit from 10x Genomics.
content/04.methods.md
Outdated
This output is read into R to create a `SingleCellExperiment` using the `fishpond::load_fry()` function.
The resulting `SingleCellExperiment` contains a `counts` assay with a gene-by-cell counts matrix where all spliced and unspliced reads for a given gene are totaled together.
We also include a `spliced` assay, which includes a gene-by-cell counts matrix for only spliced reads.
These matrices include all potential cells, including empty droplets, and are provided in the unfiltered objects included in downloads from the Portal.
I wonder if we might want to put "unfiltered" and "filtered" and "processed" in quotes to make it clear that these are labels more than anything?
Suggested change:
- These matrices include all potential cells, including empty droplets, and are provided in the unfiltered objects included in downloads from the Portal.
+ These matrices include all potential cells, including empty droplets, and are provided in the "unfiltered" objects included in downloads from the Portal.
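For readers following this thread, the assay layout described in the quoted methods text (a `counts` assay that totals spliced and unspliced reads per gene, plus a separate `spliced` assay) can be illustrated with a small sketch. This is not the `scpca-nf` or `fishpond` implementation; the helper names `add_matrices` and `build_assays` and the toy counts are hypothetical, and the sketch assumes only what the quoted text states.

```python
# Illustrative sketch only: toy data and hypothetical helper names,
# not the scpca-nf pipeline or the fishpond API.
# Rows are genes; columns are barcodes (all potential cells,
# including empty droplets, as in the "unfiltered" objects).

def add_matrices(a, b):
    """Element-wise sum of two equally shaped gene-by-cell matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def build_assays(spliced, unspliced):
    """Return a dict mimicking the two assays described in the methods text."""
    return {
        # spliced and unspliced reads for each gene totaled together
        "counts": add_matrices(spliced, unspliced),
        # spliced reads only (copied so the input is not aliased)
        "spliced": [row[:] for row in spliced],
    }


# Hypothetical toy counts: 2 genes x 3 barcodes
spliced = [[5, 0, 2], [1, 3, 0]]
unspliced = [[1, 0, 4], [0, 2, 0]]

assays = build_assays(spliced, unspliced)
print(assays["counts"])   # [[6, 0, 6], [1, 5, 0]]
print(assays["spliced"])  # [[5, 0, 2], [1, 3, 0]]
```

The second barcode of the first gene has zero counts in both matrices, which is the kind of entry an empty droplet would contribute before any filtering.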
Co-authored-by: Joshua Shapiro <[email protected]>
I updated the text to reflect this comment and added the ref for the alevin-fry paper, since that's where they introduce the splici index. Alternatively, we could link to the tutorial that describes the splici index, but I think the paper is better?
I'm torn on this. If we do it here I think we need to do it every time we refer to these objects throughout the manuscript. Are we okay with that? Also for the And then I updated the alevin-fry description of the parameters based on your suggestions; this should be ready for another look.
LGTM.
I personally think the quotes for "unfiltered" etc. are fine here, even if we don't use them everywhere, but it might depend a bit on the context, which I haven't looked at yet! This might also be something where other people have stronger opinions than mine.
An alternative might be to rephrase to say something like "objects labeled as unfiltered." But that gets a bit clunky.
I also suggested a more specific update to the alevin-fry section. Feel free to update/modify that to your taste.
- Parameter choices for alevin-fry

To quantify RNA-seq gene expression for each cell or nucleus in a library, `scpca-nf` uses `salmon alevin` [@doi:10.1186/s13059-020-02151-8] and `alevin-fry`[@doi:10.1038/s41592-022-01408-3] to generate a gene by barcode counts matrix.
Prior to mapping, we generated an index using transcripts from both spliced cDNA and unspliced cDNA sequences, denoted as the `splici` index [@doi:10.1038/s41592-022-01408-3].
I agree the paper is the better reference here.
content/04.methods.md
Outdated
- HVG selection
- PCA and UMAP calculation

The output from running `alevin-fry` includes a gene by cell counts matrix, with reads from both spliced and unspliced reads.
Do we want to say "barcode" here too, and then convert this to "cell" after filtering?
That or we should just always use "gene by cell" and maybe add a note that some of the "cells" correspond to barcodes that were not actually observed before this point.
I made an update to use "gene by cell" throughout, but included a caveat that the output from alevin-fry contains all potential cell barcodes. I think this should be sufficient?
Looks good to me.
Co-authored-by: Joshua Shapiro <[email protected]>
Just noting that I'm going to leave the quotes for now, but I think we can gather opinions on them when going through everything.
Here I'm adding the first few sections of the methods.
`alevin-fry` and post-processing of just regular single-cell/single-nuclei RNA-seq libraries. Is this level of detail okay? I want to break this up so it's not one giant PR, so I'll file issues to track completing the rest of the sections.