diff --git a/content/100.figure-table-legends.md b/content/100.figure-table-legends.md
index 7147f0c..9daf84e 100644
--- a/content/100.figure-table-legends.md
+++ b/content/100.figure-table-legends.md
@@ -138,6 +138,37 @@ Distributions reflect broad agreement between platforms in the total number of g
 ![**Processing additional single-cell modalities in `scpca-nf`.**](https://raw.githubusercontent.com/AlexsLemonade/scpca-paper-figures/main/figures/compiled_figures/pngs/figure_s2.png?sanitize=true){#fig:figs2 tag="S2" width="7in"}
+
+A. Overview of the `scpca-nf` workflow for processing libraries with CITE-seq, or antibody-derived tag (ADT), data.
+The workflow mirrors that shown in Figure {@fig:fig2}A with several differences accounting for the presence of ADT data.
+First, both an RNA FASTQ file and an ADT FASTQ file are required as input to `alevin-fry`, along with a TSV file containing information about ADT barcodes.
+The gene-by-cell and ADT-by-cell count matrices are produced and read into `R` to create a `SingleCellExperiment` (SCE) object.
+Second, during post-processing, statistics are calculated to filter cells based on ADT counts, but the filter is not applied.
+ADT counts are also normalized and included in the `Processed SCE Object`.
+Third, the summary QC report includes a `CITE-seq` section with additional information about ADT-level processing.
+Fourth, the workflow exports `SCE` objects containing both RNA and ADT results, but exports separate `AnnData` objects for RNA and ADT.
+
+Panels B-D show example figures that appear in the CITE-seq section of the summary QC report, shown here for `SCPCL000290`.
+
+B. The percentage of mitochondrial reads in each cell plotted against the number of genes detected in that cell.
+The panel labeled "Keep" displays cells that are retained based on both RNA and ADT counts.
+The panel labeled "Filter (ADT only)" displays cells that are filtered based on only ADT counts.
+The panel labeled "Filter (RNA only)" displays cells that are filtered based on only RNA counts.
+The panel labeled "Filter (RNA & ADT)" displays cells that are filtered based on both RNA and ADT counts.
+
+C. Density plots of the log-normalized ADT counts shown for the four most variable ADTs in the library.
+
+D. UMAP embeddings of log-normalized RNA expression values where each cell is colored by the expression of the given highly variable ADT.
+
+E. Overview of the `scpca-nf` workflow for multiplexed libraries.
+The workflow mirrors that shown in Figure {@fig:fig2}A with several differences accounting for the presence of multiplexed data.
+First, a TSV file providing information about library pools is required as input to `alevin-fry` along with the RNA FASTQ file.
+Second, in parallel, the RNA FASTQ file, the HTO FASTQ file, and, if available, a corresponding bulk RNA FASTQ file for each sample present in the multiplexed library are provided to a demultiplexing subprocess.
+The workflow calculates demultiplexing results based on HTO counts, as well as genetic demultiplexing results if the library has corresponding bulk RNA FASTQ files.
+Demultiplexing results are stored in all exported `SCE` objects (`Unfiltered`, `Filtered`, and `Processed`), but libraries themselves are not demultiplexed.
+Third, only `SCE` files are provided for multiplexed libraries; no corresponding `AnnData` files are provided.
+
+
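The four facet labels described for panel B follow from two independent per-cell decisions: whether a cell passes the RNA-based filter and whether it passes the ADT-based filter. The sketch below illustrates that mapping in Python. It is not the `scpca-nf` implementation (the legend states that processing happens in `R`); the function name `panel_label`, the flag names, and the example cells are hypothetical and exist only for illustration.

```python
from collections import Counter


def panel_label(rna_pass: bool, adt_pass: bool) -> str:
    """Map two per-cell filter flags to the facet labels used in panel B.

    Both flags are assumptions for this sketch: True means the cell would be
    retained by that modality's filter, False means it would be filtered.
    """
    if rna_pass and adt_pass:
        return "Keep"
    if rna_pass:
        # Passes RNA, fails ADT: filtered based on only ADT counts.
        return "Filter (ADT only)"
    if adt_pass:
        # Passes ADT, fails RNA: filtered based on only RNA counts.
        return "Filter (RNA only)"
    return "Filter (RNA & ADT)"


# Hypothetical cells: (passes RNA filter, passes ADT filter)
cells = {
    "cell_1": (True, True),
    "cell_2": (True, False),
    "cell_3": (False, True),
    "cell_4": (False, False),
}

labels = {cell: panel_label(rna, adt) for cell, (rna, adt) in cells.items()}
panel_counts = Counter(labels.values())
```

Because the legend notes that the ADT filter is calculated but not applied, labels of this kind describe what each filter would do to a cell rather than which cells were actually removed.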