AlexsLemonade · allyhawkins · Feb 28, 2024 · Feb 15, 2024 · Feb 24, 2024 · Feb 26, 2024
diff --git a/content/03.results.md b/content/03.results.md
@@ -2,25 +2,30 @@
 
 ## The Single-cell Pediatric Cancer Atlas Portal
 
-1. History and overview of the Portal
-  - In 2022, the Childhood Cancer Data Lab launched the Single-cell Pediatric Cancer Atlas (ScPCA) Portal to make uniformly processed, summarized single-cell and single-nuclei RNA-seq data and de-identified metadata available for download
-  - The Portal currently holds X amount of samples from X amount of tumor types
-  - Data available on the Portal was obtained using two mechanisms - accepting raw data from ALSF-funded investigators and investigators who used our open-source pipeline to produce summarized gene expression data for inclusion on the portal.
-  - In addition to providing summarized gene expression data, we collect a core set of metadata that is provided on the Portal for all samples including, age, sex, diagnosis, subdiagnosis (if applicable), tissue location, and disease stage.
-  - All metadata that is provided by the submitter is reviewed to standardize as much as possible. We also utilize ontology ID's where possible.
-  - Fig. 1A shows how many samples we have from each type of tumor. For each diagnosis, we also indicate what proportion of the samples come from each disease stage (e.g., initial diagnosis, recurrence, post-mortem).
-  - The samples obtained on the portal are mostly from patient tumors, although some are from patient-derived xenografts and human cell lines
-  - In addition to single-cell and single-nuclei RNA-seq, many samples have associated bulk RNA-seq, ADT data (CITE-seq), cell hashing, or spatial transcriptomics.
-  - Fig. 1B summarizes the total number of samples that are single-cell vs. single-nuclei. Additionally, we show how many of the samples on the portal also have either bulk, CITE, cell hashing, or spatial data.
-  - Supplemental Table 1 shows a breakdown of how many of each modality is found in each project.
-
-2. Obtaining additional project information
-  - On the Portal, samples are organized by project. Each project is a collection of similar samples from a single investigator.
-  - To select projects of interest, users can filter based on diagnosis, modality included, single-cell or single-nuclei and 10X version. Additionally, users will be able to filter based on if the project includes cell line samples or xenografts.
-  - A summary of each project, including a list of samples found in each project, is displayed on the Portal.
-  - Fig.1C shows an example of this summary which include an abstract, links to any external information about the projects such as any associated publication information, and links to external places where data may be stored such as SRA or GEO.
-  - If a project includes bulk, CITE, spatial, or multiplexing, this will also be indicated on the project card.
+In March 2022, the Childhood Cancer Data Lab launched the Single-cell Pediatric Cancer Atlas (ScPCA) Portal to make uniformly processed, summarized single-cell and single-nuclei RNA-seq data and de-identified metadata from pediatric tumor samples available for download.
+Today, the Portal contains data from 500 samples and over 50 tumor types.
+Data available on the Portal was obtained using two different mechanisms.
+Either raw data was accepted from ALSF-funded investigators and processed using our open-source pipeline, `scpca-nf`, or investigators processed their own raw data using `scpca-nf` and submitted the resulting summarized gene expression data for inclusion on the Portal.
 
+All samples on the Portal include a core set of metadata obtained from investigators, including age, sex, diagnosis, subdiagnosis (if applicable), tissue location, and disease stage.
+Some investigators submitted additional metadata, such as treatment and tumor stage, which is also available on the Portal.
+All submitted metadata was standardized as much as possible to maintain consistency across projects before being added to the Portal.
+In addition to providing a human-readable value for the submitted metadata, we also provide an ontology term ID, if applicable.
+The total number of samples for each diagnosis is shown in Figure 1A, along with a breakdown of the proportion of samples from each disease stage within a diagnosis group.
+Figure 1A summarizes all samples from patient tumors or patient-derived xenografts currently available on the Portal.
+Along with the patient tumors, the Portal contains a handful of samples from human tumor cell lines.
+
+Each available sample has at minimum summarized gene expression data from either single-cell or single-nuclei RNA-seq.
+However, some samples include additional data, such as quantified data from tagging cells with antibody-derived tags (ADT), as in CITE-seq[@doi:10.1038/nmeth.4380], or from multiplexing samples with hashtag oligonucleotides (HTO)[@doi:10.1186/s13059-018-1603-1].
+In some cases, multiple libraries from the same sample were collected to conduct either bulk RNA-seq or spatial transcriptomics.
+A sample downloaded from the Portal includes sequencing data from all associated libraries, including data from any of the additional modalities mentioned here.
+A summary of the number of samples with each additional modality is shown in Figure 1B, and a detailed summary of the total samples with each sequencing method, broken down by project, is available in Supplemental Table 1.
+
+Samples on the Portal are organized by project, where each project is a collection of similar samples from a single investigator.
+Users can download all samples in a project or navigate to projects of interest and choose individual samples to download.
+To identify projects of interest, users can filter based on diagnosis, included modalities (e.g., CITE-seq, bulk RNA-seq), 10X Genomics version (e.g., 10Xv2, 10Xv3), and whether a project includes samples from patient-derived xenografts or cell lines.
+The project card displays an abstract, the total number of samples included, a list of diagnoses for all samples included in the project, and links to any external information associated with the project, such as publications or data deposited in external repositories like SRA or GEO (Figure 1C).
+The project card also indicates the type(s) of sequencing performed, including the 10X Genomics kit version, the suspension type (cell or nucleus), and whether additional sequencing, such as bulk RNA-seq or multiplexing, is present.
 
 ## Uniform processing of data available on the ScPCA Portal
 
@@ -91,4 +96,4 @@
   - `scpca-nf` is able to quantify both of these additional sequencing methods.
   - Bulk RNA FASTQ are first trimmed using `fastp` and then aligned using `salmon`. The bulk output is a single tsv file with the sample by gene matrix for all samples in that project.
   - For spatial transcriptomics, the spatial RNA FASTQ and slide image are input into `scpca-nf` and quantified using `spaceranger`. The output includes the spot by gene matrix along with a summary report, produced by `spaceranger`.
-  
+