Fig S2 legend #67

Merged
merged 10 commits into from
Mar 6, 2024
31 changes: 31 additions & 0 deletions content/100.figure-table-legends.md
@@ -138,6 +138,37 @@ Distributions reflect broad agreement between platforms in the total number of g
<!-- Figure S2 -->
![**Processing additional single-cell modalities in `scpca-nf`.**](https://raw.githubusercontent.com/AlexsLemonade/scpca-paper-figures/main/figures/compiled_figures/pngs/figure_s2.png?sanitize=true){#fig:figs2 tag="S2" width="7in"}

<!-- TODO: DO WE WANT TO ADD AN ADT-BY-CELL FILE TO THIS FIGURE?-->
A. Overview of the `scpca-nf` workflow for processing libraries that include CITE-seq or antibody-derived tag (ADT) data.
The workflow mirrors that shown in Figure {@fig:fig2}A with several differences accounting for the presence of ADT data.
First, both an RNA FASTQ file and an ADT FASTQ file are required as input to `alevin-fry`, along with a TSV file containing information about ADT barcodes.
The gene-by-cell and ADT-by-cell count matrices are produced and read into `R` to create a `SingleCellExperiment` (SCE) object.
Second, during post-processing, statistics are calculated to filter cells based on ADT counts, but the filter is not applied.
ADT counts are also normalized and included in the `Processed SCE Object`.
Third, the summary QC report will include a `CITE-seq` section with additional information about ADT-level processing.
Fourth, the workflow exports `SCE` objects containing both RNA and ADT results, whereas RNA and ADT results are exported as separate `AnnData` objects.

Panels B-D show example figures that appear in the CITE-seq section of the summary QC report, shown here for `SCPCL000290`.

B. The percentage of mitochondrial reads in each cell plotted against the number of genes detected in that cell.
The panel labeled "Keep" displays cells that are retained based on both RNA and ADT counts.
The panel labeled "Filter (ADT only)" displays cells that are filtered based on only ADT counts.
The panel labeled "Filter (RNA only)" displays cells that are filtered based on only RNA counts.
The panel labeled "Filter (RNA & ADT)" displays cells that are filtered based on both RNA and ADT counts.
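
The four panel labels in B correspond to the four combinations of two per-cell filtering outcomes.
The sketch below is only an illustration of that mapping, not the `scpca-nf` implementation: the function name `filter_panel` and the boolean inputs `rna_pass` and `adt_pass` are hypothetical, and it assumes each cell has already been given a pass/fail result for RNA-based and ADT-based filtering.

```python
def filter_panel(rna_pass: bool, adt_pass: bool) -> str:
    """Map one cell's two filtering outcomes to its panel B label.

    Hypothetical helper: `rna_pass` and `adt_pass` are assumed to be
    precomputed pass/fail flags for RNA-based and ADT-based filtering.
    """
    if rna_pass and adt_pass:
        return "Keep"                # retained on both RNA and ADT counts
    if rna_pass:
        return "Filter (ADT only)"   # fails only the ADT-based filter
    if adt_pass:
        return "Filter (RNA only)"   # fails only the RNA-based filter
    return "Filter (RNA & ADT)"      # fails both filters


# Label a few example cells given as (rna_pass, adt_pass) pairs.
example_cells = [(True, True), (True, False), (False, True), (False, False)]
example_labels = [filter_panel(rna, adt) for rna, adt in example_cells]
```

Every cell falls into exactly one of the four labels, which is why the panels partition the library without overlap.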

C. Density plots of the log-normalized ADT counts shown for the four most variable ADTs in the library.

D. UMAP embeddings of log-normalized RNA expression values where each cell is colored by the expression of the given highly variable ADT.

E. Overview of the `scpca-nf` workflow for multiplexed libraries.
The workflow mirrors that shown in Figure {@fig:fig2}A with several differences accounting for the presence of multiplexed data.
First, a TSV file providing information about library pools is required as input to `alevin-fry` along with the RNA FASTQ file.
Second, in parallel, the RNA FASTQ file, the HTO FASTQ file, and, if available, a corresponding bulk RNA FASTQ file for each sample present in the multiplexed library are provided to a demultiplexing subprocess.
The workflow calculates demultiplexing results based on HTO counts, as well as genetic demultiplexing results if the library has corresponding bulk RNA FASTQ files.
Demultiplexing results are stored in all exported `SCE` objects (`Unfiltered`, `Filtered`, and `Processed`), but libraries themselves are not demultiplexed.
Third, only `SCE` files are provided for multiplexed libraries; no corresponding `AnnData` files are provided.



<br><br>
