-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add plots to QC report #57
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this is coming together really nicely! I just wanted to comment on a few minor things:
- I like the placement of the legends and am not overly concerned about if they overlap data. I think for the majority of data where you've placed them should be outside of where the data lie.
- We should be consistent with the type of case we choose to use for the axis titles. I think we could use sentence case as we do most everywhere else, and I think you are mostly using that now so I suggested a few changes to make that consistent throughout.
- You mentioned in the comment that you made the legend go from 0-100% for mito content. I think we should do the same thing with plotting the miQC plot. Since we always anticipate that axis to be on the same scale.
inst/rmd/qc_report.rmd
Outdated
ggplot(unfiltered_celldata, aes(x = rank, y = sum, color = filter_status))+ | ||
geom_point(aes(shape = filter_status)) + | ||
scale_y_log10() + | ||
scale_color_manual(values = c("navyblue", "grey")) + | ||
geom_point(aes(shape = filter_status),alpha = 0.2, size = 2) + | ||
scale_y_log10() + | ||
scale_x_log10() + | ||
scale_color_manual(values = c("navyblue", "grey40")) + | ||
scale_shape_manual(values = c(16, 1)) + # filled and unfilled circles | ||
labs( | ||
x = "Rank", | ||
y = "UMI count", | ||
color = "Filter status", | ||
shape = "Filter status" | ||
) | ||
) + | ||
guides(color = guide_legend(override.aes = list(alpha = 1))) + | ||
theme(legend.position = c(0, 0), | ||
legend.justification = c(0,0), | ||
legend.background = element_rect(color = "grey20", size = 0.25), | ||
legend.box.margin = margin(rep(5, 4))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think changing the axis from scientific notation would be ideal if we could do that. For both the x and y axis. Also I'm not crazy about the colors because I personally think the purple and grey are hard to see the transition. What if we did red and grey?
inst/rmd/qc_report.rmd
Outdated
geom_point(aes(shape = filter_status),alpha = 0.2, size = 2) + | ||
scale_y_log10() + | ||
scale_x_log10() + | ||
scale_color_manual(values = c("navyblue", "grey40")) + | ||
scale_shape_manual(values = c(16, 1)) + # filled and unfilled circles | ||
labs( | ||
x = "Rank", | ||
y = "UMI count", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
y = "UMI count", | |
y = "Total UMI count", |
Can we change this to say total? I know its minor but its slightly more descriptive.
inst/rmd/qc_report.rmd
Outdated
|
||
## UMI - expressed gene count plot | ||
|
||
```{r, fig.cap="Total UMI x genes expressed"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a huge fan of this title. What about changing to something simple like Cell Metrics
. I also don't know if we need the line after that says "Total UMI x genes expressed." I'm not sure it adds anymore information than the plot does.
inst/rmd/qc_report.rmd
Outdated
labs(x = "Total UMI Count", | ||
y = "Number of Genes Expressed", | ||
color = "Mitochondrial\nPercent") + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
labs(x = "Total UMI Count", | |
y = "Number of Genes Expressed", | |
color = "Mitochondrial\nPercent") + | |
labs(x = "Total UMI count", | |
y = "Number of genes expressed", | |
color = "Mitochondrial\npercent") + |
If you like my suggestion of keeping all axis titles in sentence case then this should be changed.
inst/rmd/qc_report.rmd
Outdated
miQC::plotModel(filtered_sce, model = metadata(filtered_sce)$miQC_model) + | ||
theme(legend.position = c(1, 1), | ||
legend.justification = c(1,1), | ||
legend.background = element_rect(color = "grey20", size = 0.25), | ||
legend.box.margin = margin(rep(5, 4))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miQC::plotModel(filtered_sce, model = metadata(filtered_sce)$miQC_model) + | |
theme(legend.position = c(1, 1), | |
legend.justification = c(1,1), | |
legend.background = element_rect(color = "grey20", size = 0.25), | |
legend.box.margin = margin(rep(5, 4))) | |
miQC::plotModel(filtered_sce, model = metadata(filtered_sce)$miQC_model) + | |
theme(legend.position = c(1, 1), | |
legend.justification = c(1,1), | |
legend.background = element_rect(color = "grey20", size = 0.25), | |
legend.box.margin = margin(rep(5, 4))) + | |
coord_cartesian(ylim = c(0,100)) + | |
labs(x = "Number of genes expressed", | |
y = "Mitochondrial percent") |
To keep things consistent for mitochondrial percentages, I think it would be nice to always plot with the same y-axis range. Also to keep labels consistent, seem they are the same values being plotted maybe we should keep the same titles across the two plots?
I updated legends and titles as suggested, using sentence case everywhere, and unifying labels. I moved what I had put as figure captions to alt text because it was laregely redundant, as noted, but I thought we should have some kind of figure label beyond the title. For the knee plot, I chose a more contrasting color scheme, but I didn't want to go to red, because that would imply (to me) that those points failed some QC. So I went with a dark green, and lightened up the grey. I also went with smaller points for the filtered cells. Let me know what you think. Here is a full report file... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the changes and think the green in the knee plot looks a lot better! I don't necessarily think we need the change in size and have previously been taught not to double encode variables where we have both size and color showing if a cell is considered passed
or excluded
but I'll leave that final call up to you. I do think the changes highlight the passed cells which is what we want so I can see an argument for either one.
I had one other minor axis wording change to keep the labels consistent between the Cell read metrics
and miQC
plot, but other than that I think this looks good!
Co-authored-by: Ally Hawkins <[email protected]>
Yeah, I don't usually do this, but I thought it was worth it in this case, as I think it solves a problem where the two kinds of points overlap. |
I would agree, it does help show the separation a lot more. I think for the purposes of this plot its okay to keep it in. |
Here I am mostly adding the Total vs. gene count plot to the QC report. I also fixed the knee plot and made the session info hidden by default, so closing all three of those issues here.
I was also playing around with AlexsLemonade/scpca-nf#508, which I ended up abandoning, but in that process I added a bit of formatting to the tables, including making them not full-width, and striping the rows.
A few things to focus on that I was thinking about as I did them:
Please also look at all axis labels and legend titles! (Should I make the knee plot use non-scientific notation?)
For consistency among reports, I made the mitochondrial percentage go from 0-100% for all reports. This may make lower numbers harder to see in mostly-good plots (where no cells are above 50%, say). I thought the consistency was more important, but we could tweak the color scale if we think it matters.
The other potentially controversial aesthetic decision I made was to move the figure legends into the figure panels, at least for the knee & UMI/gene plots. In both of these plots, the location of the data is predictable, so in the vast majority of cases the positions I chose will not overlap data. Still, there is a chance they will, so if there is worry, I can move them back out. (when we add the miQC plot, the top right ought to work as a clear space)
I am attaching an example report (zipped, b/c github), and the two main figures I worked on below, for reference:
closes #39
closes #40
closes #19
closes #55