Output Plots
Peter van Galen edited this page Apr 19, 2022 · 1 revision
The WAT3R pipeline generates multiple plots useful for QC inspection and results interpretation. Next, we give a detailed description of the created graphs.
The `wat3r` command generates two plots that can help to set adequate QC thresholds.
- QCplots_preFiltering.pdf: the first page shows a scatterplot comparing the number of masked bases (Ns) per read vs the average qscore per read. The second page shows the distribution of average qscore per read. In both cases, the red line indicates the default qscore cut-off (25).
- QCplots_clusters.pdf is a visualization of the clustering of TCR sequences with the same BC and UMI. It shows the proportion of reads assigned to the most highly ranked TCR cluster (x) vs. the ratio of the reads in the first over the second ranked TCR cluster (y). Color scale indicates the number of reads per BC-UMI. The number in the subtitle shows the proportion of reads that is maintained with the default quality thresholds, indicated by the red lines. These thresholds can be changed with the `-p` and `-r` parameters in the `downstream` command.
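The quantities shown in these two QC plots can be illustrated with a minimal Python sketch. This is not WAT3R's own code: the function names `read_qc_metrics` and `cluster_metrics`, the example reads, and the use of `>=` for the cut-off comparison are assumptions made for illustration only.

```python
def read_qc_metrics(sequence, qscores):
    """Return (number of masked bases, average qscore) for one read."""
    n_masked = sequence.upper().count("N")
    mean_q = sum(qscores) / len(qscores)
    return n_masked, mean_q


def cluster_metrics(cluster_read_counts):
    """Return (top-cluster proportion, first/second ratio) for one BC-UMI.

    cluster_read_counts holds the number of reads in each TCR cluster
    that shares the same barcode and UMI.
    """
    ranked = sorted(cluster_read_counts, reverse=True)
    proportion = ranked[0] / sum(ranked)
    # Assumption: with a single cluster there is no second rank,
    # so the ratio is treated as infinite.
    ratio = ranked[0] / ranked[1] if len(ranked) > 1 else float("inf")
    return proportion, ratio


# Hypothetical read: 8 bases, two of them masked with low quality.
n_masked, mean_q = read_qc_metrics("ACGTNNAC", [30, 32, 28, 31, 2, 2, 30, 29])
passes_qscore = mean_q >= 25  # 25 is the default cut-off named above

# Hypothetical BC-UMI with three TCR clusters of 80, 15 and 5 reads.
proportion, ratio = cluster_metrics([80, 15, 5])
```

Here `proportion` corresponds to the x axis and `ratio` to the y axis of QCplots_clusters.pdf, while `n_masked` and `mean_q` correspond to the two axes of the first page of QCplots_preFiltering.pdf.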
After running `downstream`, two alternative scenarios can occur.
If the user does not provide a .txt file with cell barcode annotations, only two plots are generated.
- db_histograms.pdf shows a histogram with the distribution of read counts (>=3) per consensus sequence, and a histogram of the error rate per consensus sequence.
- ReadPercentage_FilteringSteps.pdf shows the percentage of reads remaining after each filtering step.
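As a rough illustration of what ReadPercentage_FilteringSteps.pdf summarizes, the sketch below expresses the reads left after each step as a percentage of the starting count. The step names and read counts are hypothetical and do not come from WAT3R; the helper `percent_remaining` is likewise an illustrative name.

```python
def percent_remaining(step_counts):
    """Convert (step name, reads remaining) pairs to percentages.

    The first entry is taken as the starting number of reads.
    """
    initial = step_counts[0][1]
    return [(name, 100.0 * count / initial) for name, count in step_counts]


# Hypothetical filtering steps and read counts.
steps = [
    ("raw reads", 1000),
    ("qscore filter", 800),
    ("cluster filter", 600),
]
percentages = percent_remaining(steps)
```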
On the other hand, if the user provides the list of cell annotations, `downstream` outputs several additional plots.
- CDR3_UMIcount_distribution.pdf shows a histogram of the number of unique UMIs (UMI counts) assigned to TRA and TRB CDR3 sequences. Frequencies are counted independently for barcodes that overlap with the scRNAseq dataset and for those that do not.
- scRNAseq_TCRrecovery_proportions.pdf is a set of barplots depicting the proportion of each cell annotation for which TRA and/or TRB genes were detected.
- valid_reads.pdf shows the total number of reads that are assigned to TRA and TRB genes and selected for the final results, separated by cell annotation.
- CDR3_clones_heatmap.pdf is a heatmap depicting the number of cells matching specific pairs of TRA and TRB CDR3 sequences.
- TRB_TRA_correspondence.pdf shows individual cell correspondence between TRA and TRB CDR3 sequences, together with cell annotation.
- TRA_TRB_clone_size.pdf is a set of plots showing the ranking of clone sizes for both TRA and TRB, expressed in total cell number and normalized cell number (i.e. % of the total cell number with TRA or TRB that belong to a clone).
- trb_top_clones.pdf represents the size and cell composition of the 50 biggest clones, using the TRB CDR3 sequences. Size is measured as total cell number.
- trb_top_clones_norm.pdf shows the size and cell composition of the 50 biggest clones, using the TRB CDR3 sequences. Size is measured as normalized cell number.
- TRB_clone_size_celltype.pdf is a scatter plot representing the size of the clone to which each of the annotated cells belongs, separated by annotation.
- TRB_distance_heatmap.pdf shows a heatmap with the Hamming distance between the TRB CDR3 sequences from the 50 biggest clones.
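Two of the quantities behind the clone plots, clone size (total and normalized) and the Hamming distance between CDR3 sequences, can be sketched as follows. This is an illustrative example, not the pipeline's implementation: the function names, the cell-to-CDR3 mapping, and the sequences are hypothetical. Hamming distance is only defined for sequences of equal length; how WAT3R treats CDR3 sequences of different lengths is not stated here, so this sketch simply raises an error in that case.

```python
from collections import Counter


def clone_sizes(cell_to_cdr3):
    """Rank clones by size.

    Returns (CDR3, number of cells, % of all cells with a CDR3) tuples,
    largest clone first.
    """
    counts = Counter(cell_to_cdr3.values())
    total = sum(counts.values())
    return [(cdr3, n, 100.0 * n / total) for cdr3, n in counts.most_common()]


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        # Assumption: unequal lengths are rejected rather than aligned.
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical cells and their TRB CDR3 sequences.
cells = {
    "cell1": "CASSLGQF",
    "cell2": "CASSLGQF",
    "cell3": "CASSPGQF",
    "cell4": "CASSLGQF",
}
ranked = clone_sizes(cells)

# Pairwise distance matrix between the ranked clones' CDR3 sequences.
top = [cdr3 for cdr3, _, _ in ranked]
distance_matrix = [[hamming(a, b) for b in top] for a in top]
```

In this toy example the largest clone contains 3 of 4 cells (75% normalized size), and the two CDR3 sequences differ at a single position.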