Example Metadata & Configuration Files for `scpca-nf`

Example files

This directory contains the following example files:

File name	Description
`user_template.config`	An example Nextflow configuration file for the `scpca-nf` workflow
`example_run_metadata.tsv`	An example run metadata file for the `scpca-nf` workflow.
`example_sample_metadata.tsv`	An example sample metadata file for the `scpca-nf` workflow.
`example_multiplex_pools.tsv`	An example multiplex pool file for the `scpca-nf` workflow.
`example_project_celltype_metadata.tsv`	An example project cell type metadata file for performing optional cell type annotation in the `scpca-nf` workflow.

These files provide examples of expected formatting and content, but note that the specific values in these files may not be applicable or sufficient for running scpca-nf directly on your system.

Note that example_run_metadata.tsv additionally refers to several example files, whose descriptions can be found as follows:

The format for the example_barcode_files/cite_barcodes.tsv file is described in the ADT processing section of external-instructions.md.
The format for the example_barcode_files/cellhash_barcodes.tsv file is described in the multiplexed libraries section of external-instructions.md.
The format for the example_metadata_files/submitter_celltypes.tsv file is described in the cell type annotation section of external-instructions.md.

Testing your setup with example data

⚠️ These instructions are only intended to be used to test accurate set up of a configuration file. Before following these instructions, please ensure that you have already set up your own configuration file and have created and named a profile to use.

You can test your configuration setup by performing a test run with the example data that we have provided.

We recommend using the example 10X dataset from a human glioblastoma donor that was processed using the 10X Genomics' Next GEM Single Cell 3' Reagent Kits v3.1(note: you may be prompted to provide an email and register upon navigating to the 10X downloads site). The fastq files for this example data can be downloaded from the following link (note: These files will take approximately 10 GB of disk space upon download and expanding the tar file): Brain_Tumor_3p_fastqs.tar.

Following download and unzipping of the fastq files, you will need to create a tab-separated values run metadata file that looks like the following:

scpca_run_id	scpca_library_id	scpca_sample_id	scpca_project_id	technology	assay_ontology_term_id	seq_unit	sample_reference	files_directory
run01	library01	sample01	project01	10Xv3.1	EFO:XXX	cell	Homo_sapiens.GRCh38.104	/path/to/example_fastq_files

Be sure to enter the full path to the directory containing the fastq files in the files_directory column.

You will also need to create a tab-separated values sample metadata file. At a minimum, the sample metadata file must contain a column with scpca_sample_id as the header. The contents of this column should contain all unique sample ids that are present in the scpca_sample_id column of the run metadata file. Additionally, you may include columns is_cell_line and is_xenograft to indicate the sample type:

is_cell_line: Use TRUE if the sample is from a cell line and FALSE otherwise. Cell type annotation will not be performed for samples that are TRUE.
is_xenograft: Use TRUE if the sample is from a patient-derived xenograft and FALSE otherwise.

This information will be reflected in the summary QC report.

Below is an example of a sample metadata file:

scpca_sample_id	is_cell_line	is_xenograft	diagnosis	age
sample01	FALSE	FALSE	glioblastoma	71

Note that the diagnosis and age columns are shown as example sample metadata one might include in the sample metadata file. The metadata file that you create does not need to match this exactly, but it must contain the required scpca_sample_id column.

The following command can then be used to test your configuration setup with the example data:

nextflow run AlexsLemonade/scpca-nf \
  -config <path to config file>  \
  -profile <name of profile> \
  --run_metafile <path to run metadata file> \
  --sample_metafile <path to sample metadata file>

Where <path to config file> is the relative path to the configuration file that you have setup after following the instructions on creating a configuration file, <name of profile> is the name of the profile that you chose when creating a profile, <path to run metadata file> is the full path to the run metadata TSV you created, and <path to sample metadata file> is the full path to the sample metadata TSV you created. For the example configuration file that we provided, we used the profile name cluster and would indicate that we would like to use that profile at the command line with -profile cluster. For more detailed information on setting up the metadata file for your own data, see instructions on preparing the run metadata file and preparing the sample metadata file.

Example output

You can download an example of the expected output files here: scpca_out.zip. For more information on the file structure and what to expect see the description of the output files.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Example Metadata & Configuration Files for `scpca-nf`

Example files

Testing your setup with example data

Example output

Files

README.md

Latest commit

History

README.md

File metadata and controls

Example Metadata & Configuration Files for scpca-nf

Example files

Testing your setup with example data

Example output

Example Metadata & Configuration Files for `scpca-nf`