First sections of methods #38

allyhawkins · 2024-02-20T20:13:14Z

Here I'm adding the first few sections of the methods.

I combined the data generation and data processing into one section, since how data was generated dictates who processed it.
In that same section, I wasn't quite sure how much detail to go into regarding processing. We could just say all libraries were processed in the contributor's lab and remove the rest if we think it's too much?
Here, I added the sections for processing with alevin-fry and post-processing of just regular single-cell/single-nuclei RNA-seq libraries. Is this level of detail okay?

I want to break this up so it's not one giant PR, so I'll file issues to track completing the rest of the sections.

github-actions · 2024-02-20T20:15:17Z

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 479f968.

Manuscript build

github-actions · 2024-02-20T20:28:07Z

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 16079cf.

Manuscript build

jaclyn-taroni · 2024-02-24T19:25:10Z

I will review for the level of detail. When we sprint plan, I want to discuss potentially spreading methods pull request review across the team. We want to avoid too many cooks at first, of course, but this seems like the section that is safest to have multiple reviewers, and I anticipate that it would make me much less of a bottleneck.

jaclyn-taroni

Level of detail looks good 🎉

Let's have someone who has spent more time with the scpca-nf and scpcaTools code bases take the final look.

github-actions · 2024-02-26T21:40:49Z

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 6519ca6.

Manuscript build

allyhawkins · 2024-02-26T21:44:29Z

I addressed both of @jaclyn-taroni's comments. I'm going to send this over to @jashapiro for review now.

github-actions · 2024-02-26T21:45:52Z

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 371f392.

Manuscript build

jashapiro

Overall I think this looks good! My comments are mostly nitpicky edits and some attempts to smooth flow. I think we could reduce the alevin-fry description just a bit, but that was really the biggest thing I saw.

jashapiro · 2024-02-27T20:50:30Z

content/04.methods.md

-  - Parameter choices for alevin-fry
+
+To quantify each single-cell and single-nuclei RNA-seq gene expression, `scpca-nf` uses `salmon alevin` [@doi:10.1186/s13059-020-02151-8] and `alevin-fry`[@doi:10.1038/s41592-022-01408-3] to generate a gene by cell counts matrix.
+Prior to mapping, we generated an index using transcripts from both spliced cDNA and intronic regions, denoted as the `splici` index.


Just noting that we do include flanking regions too. I might just say "spliced and unspliced cDNA sequences" but also we might want to cite something about splici?

jashapiro · 2024-02-27T20:53:36Z

content/04.methods.md

+Raw data was generated, and sample metadata was compiled by each lab and institution contributing to the Portal.
+Single-cell or single-nuclei libraries were generated using one of the commercially available kits from 10X Genomics.
+For bulk RNA-seq, RNA was collected and sequenced using either paired-end or single-end sequencing. 
+For spatial transcriptomics, cDNA libraries were generated using the Visium kit from 10X Genomics.


Suggested change

-For spatial transcriptomics, cDNA libraries were generated using the Visium kit from 10X Genomics.
+For spatial transcriptomics, cDNA libraries were generated using the Visium kit from 10x Genomics.

jashapiro · 2024-02-27T21:14:20Z

content/04.methods.md

+This output is read into R to create a `SingleCellExperiment` using the `fishpond::load_fry()` function. 
+The resulting `SingleCellExperiment` contains a `counts` assay with a gene-by-cell counts matrix where all spliced and unspliced reads for a given gene are totaled together. 
+We also include a `spliced` assay, which includes a gene-by-cell counts matrix for only spliced reads. 
+These matrices include all potential cells, including empty droplets, and are provided in the unfiltered objects included in downloads from the Portal.


I wonder if we might want to put "unfiltered" and "filtered" and "processed" in quotes to make it clear that these are labels more than anything?

Suggested change

-These matrices include all potential cells, including empty droplets, and are provided in the unfiltered objects included in downloads from the Portal.
+These matrices include all potential cells, including empty droplets, and are provided in the "unfiltered" objects included in downloads from the Portal.
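
For readers following the thread, the relationship between the two assays described in the hunk above can be sketched as follows. This is only a minimal illustration in plain Python with made-up numbers, not the R code path that the manuscript text describes; the function name `build_assays` and the toy matrices are hypothetical and do not come from `scpca-nf` or `scpcaTools`.

```python
# Toy gene-by-cell matrices (rows are genes, columns are barcodes).
# All values are invented for illustration only.
spliced = [
    [3, 0, 1],
    [0, 2, 0],
]
unspliced = [
    [1, 0, 0],
    [4, 1, 0],
]


def build_assays(spliced, unspliced):
    """Hypothetical helper mirroring the text: `counts` totals spliced and
    unspliced reads per gene and barcode, `spliced` keeps spliced reads only."""
    same_shape = len(spliced) == len(unspliced) and all(
        len(s_row) == len(u_row) for s_row, u_row in zip(spliced, unspliced)
    )
    if not same_shape:
        raise ValueError("spliced and unspliced matrices must have the same shape")
    counts = [
        [s + u for s, u in zip(s_row, u_row)]
        for s_row, u_row in zip(spliced, unspliced)
    ]
    # No barcodes are dropped here: like the "unfiltered" objects, every
    # potential cell (including likely empty droplets) is retained.
    return {"counts": counts, "spliced": [row[:] for row in spliced]}


assays = build_assays(spliced, unspliced)
```

Note that the third barcode has almost no reads but is still present in both assays, which is the sense in which the "unfiltered" label applies.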

github-actions · 2024-02-28T15:41:55Z

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 45344ec.

Manuscript build

github-actions · 2024-02-28T15:52:36Z

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 15f4f75.

Manuscript build

allyhawkins · 2024-02-28T15:54:16Z

> Just noting that we do include flanking regions too. I might just say "spliced and unspliced cDNA sequences" but also we might want to cite something about splici?

I updated the text to reflect this comment and added the ref for the alevin-fry paper, since that's where they introduce the splici index. Alternatively, we could link to the tutorial that describes the splici index, but I think the paper is better?

> I wonder if we might want to put "unfiltered" and "filtered" and "processed" in quotes to make it clear that these are labels more than anything?

I'm torn on this. If we do it here, I think we need to do it every time we refer to these objects throughout the manuscript. Are we okay with that?

Also, for the gene by cell/droplets wording, I changed it to gene by barcode. What do you think of that?
I removed the hyphens throughout, because I think we aren't using hyphens in the figure legends I've been looking at, and I can also remove any that have slipped into the results so far.

I also updated the alevin-fry description of the parameters based on your suggestions, so this should be ready for another look.

github-actions · 2024-02-28T15:57:30Z

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 4bf0b99.

Manuscript build

jashapiro

LGTM.

I personally think the quotes for "unfiltered" etc. are fine here, even if we don't use them everywhere, but it might depend a bit on the context, which I haven't looked at yet! This might also be something where other people have stronger opinions than mine.

An alternative might be to rephrase to say something like "objects labeled as unfiltered." But that gets a bit clunky.

I also suggested a more specific update to the alevin-fry section. Feel free to update/modify that to your taste.

jashapiro · 2024-02-28T19:51:24Z

content/04.methods.md

-  - Parameter choices for alevin-fry
+
+To quantify RNA-seq gene expression for each cell or nucleus in a library, `scpca-nf` uses `salmon alevin` [@doi:10.1186/s13059-020-02151-8] and `alevin-fry`[@doi:10.1038/s41592-022-01408-3] to generate a gene by barcode counts matrix.
+Prior to mapping, we generated an index using transcripts from both spliced cDNA and unspliced cDNA sequences, denoted as the `splici` index [@doi:10.1038/s41592-022-01408-3].


I agree the paper is the better reference here.

jashapiro · 2024-02-28T19:54:11Z

content/04.methods.md

-  - HVG selection
-  - PCA and UMAP calculation
+
+The output from running `alevin-fry` includes a gene by cell counts matrix, with reads from both spliced and unspliced reads.


Do we want to say "barcode" here too, and then convert this to "cell" after filtering?
That or we should just always use "gene by cell" and maybe add a note that some of the "cells" correspond to barcodes that were not actually observed before this point.

I made an update to use "gene by cell" throughout, but including a caveat that the output from alevin-fry is all potential cell barcodes. I think this should be sufficient?

Looks good to me.

github-actions · 2024-02-28T21:05:22Z

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit dd17e57.

Manuscript build

allyhawkins · 2024-02-28T21:07:51Z

> I personally think the quotes for "unfiltered" etc. are fine here, even if we don't use them everywhere, but it might depend a bit on the context, which I haven't looked at yet! This might also be something where other people have stronger opinions than mine.

Just noting that I'm going to leave the quotes for now, but I think we can gather opinions on them when going through everything.

github-actions · 2024-02-28T21:08:43Z

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit b4b962c.

Manuscript build

allyhawkins and others added 2 commits February 20, 2024 13:16

start methods

f20c5d1

consistent tense and dois

479f968

allyhawkins requested a review from jaclyn-taroni February 20, 2024 20:13

fix miQC ref?

16079cf

jaclyn-taroni reviewed Feb 24, 2024

Apply suggestions from code review

6519ca6

Co-authored-by: Jaclyn Taroni

add genome build

371f392

allyhawkins requested a review from jashapiro February 26, 2024 21:44

allyhawkins mentioned this pull request Feb 27, 2024

Methods for processing CITE seq and HTO data #52

Merged

jashapiro reviewed Feb 27, 2024

Apply suggestions from code review

45344ec

Co-authored-by: Joshua Shapiro

add splici ref, simplify fry description, and quote objects

15f4f75

remove hyphens in gene by cell

07bc951

Merge branch 'main' into allyhawkins/methods-data-processing-and-sing…

4bf0b99

…le-cell

allyhawkins requested a review from jashapiro February 28, 2024 15:54

jashapiro approved these changes Feb 28, 2024

update af description

dd17e57

Co-authored-by: Joshua Shapiro

allyhawkins added 2 commits February 28, 2024 15:06

use gene by cell and add caveat about potential cells

bab6f90

Merge branch 'main' into allyhawkins/methods-data-processing-and-sing…

b4b962c

…le-cell

allyhawkins merged commit de8bf69 into main Feb 28, 2024
1 check passed

allyhawkins deleted the allyhawkins/methods-data-processing-and-single-cell branch February 28, 2024 21:11
