Is your feature request related to a problem? Please describe.
While miQC filtering is usually reliable, we do not currently detect all failure modes, and our current cell filtering can sometimes allow poor-quality cells into the data, as we are only constraining based on the number of genes expressed, with no mitochondrial filtering.
Sometimes this happens when we get a model fit with two very close lines. For example, in the dataset below (SCPCL000859), the model did fit, but included all cells.
We also have cases where the model fails with similar results, here for SCPCL000863 (a case where I would have totally expected the model to fit!).
Finally, I have also seen rare cases where the model seems far too conservative, excluding cells with mitochondrial percentages of <<5%, which means dropping cells that are probably just fine. Here is an example from SCPCL001202.
Describe the solution you'd like
While I think we are still well justified to use miQC as our primary cutoff, we might also want to implement some hard rules. For example, always exclude cells with >25% mitochondrial content (or 20%?) and always include cells with <5%, or maybe a bit higher. Essentially, the idea would be to constrain the miQC model at the back end.
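To make the idea concrete, here is a minimal sketch of what constraining the miQC calls at the back end could look like. It is written in Python purely for illustration and is not the pipeline's code: the function name `constrain_miqc_keep` and its inputs (a per-cell keep/drop call from miQC plus the per-cell mitochondrial percentage) are hypothetical, and the 5% and 25% defaults are just the candidate values floated above.

```python
def constrain_miqc_keep(miqc_keep, mito_percent,
                        always_keep_below=5.0, always_drop_above=25.0):
    """Apply hard mitochondrial bounds on top of per-cell miQC keep/drop calls.

    miqc_keep: list of bools, True if miQC would keep the cell (hypothetical input).
    mito_percent: list of floats, percent mitochondrial reads per cell.
    The default cutoffs are candidate values only, not settled choices.
    """
    if len(miqc_keep) != len(mito_percent):
        raise ValueError("miqc_keep and mito_percent must have the same length")

    constrained = []
    for keep, pct in zip(miqc_keep, mito_percent):
        if pct > always_drop_above:
            # Hard upper bound: drop regardless of the miQC call.
            constrained.append(False)
        elif pct < always_keep_below:
            # Hard lower bound: keep regardless of the miQC call.
            constrained.append(True)
        else:
            # In between, defer to miQC.
            constrained.append(bool(keep))
    return constrained
```

With this shape, miQC still decides everything between the two bounds, so it stays the primary cutoff and the hard rules only catch the extremes.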
Describe alternatives you've considered
It would be nice if we could find some way to reliably detect failures like the ones above and report on them, but I think improving the data would also be worth the effort here.
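As a rough illustration of the reporting alternative, the sketch below counts cells where the miQC call disagrees with simple mitochondrial bounds. Everything here is an assumption for illustration: the function name `summarize_miqc_conflicts`, the input format, and the 5%/25% bounds (borrowed from the proposal above). It would only flag suspicious libraries for review, not diagnose why the model misbehaved.

```python
def summarize_miqc_conflicts(miqc_keep, mito_percent, low=5.0, high=25.0):
    """Count cells where miQC's call looks suspicious given mito percentage.

    miqc_keep: list of bools, True if miQC would keep the cell (hypothetical input).
    mito_percent: list of floats, percent mitochondrial reads per cell.
    low/high: illustrative bounds, not validated thresholds.
    """
    pairs = list(zip(miqc_keep, mito_percent))
    return {
        "n_cells": len(pairs),
        # Cells miQC kept despite very high mitochondrial content.
        "kept_high_mito": sum(1 for keep, pct in pairs if keep and pct > high),
        # Cells miQC dropped despite very low mitochondrial content.
        "dropped_low_mito": sum(1 for keep, pct in pairs if not keep and pct < low),
        # True when the model filtered nothing at all.
        "all_cells_kept": all(keep for keep, _ in pairs),
    }
```

A library where `kept_high_mito` or `dropped_low_mito` is non-zero, or where `all_cells_kept` is true alongside high-mito cells, would be a candidate for a warning in the QC report.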
While I think we are still well justified to use miQC as our primary cutoff, we might also want to implement some hard rules. For example, always exclude cells with >25% mitochondrial content (or 20%?) and always include cells with <5%, or maybe a bit higher. Essentially, the idea would be to constrain the miQC model at the back end.
I think having a mito filter would definitely make sense. I think 5% is a little low if we are going to set a minimum, though, and I would feel very comfortable including everything with <10% mitochondrial content, at least for the single-cell samples. I would expect the single-nuclei samples to have a lower percentage, so if we really want to be careful we might set different thresholds for cells vs. nuclei.
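If we did go with separate thresholds, a simple lookup keyed on sample type might be enough. This is only a sketch in Python: the names `ALWAYS_KEEP_BELOW` and `always_keep_threshold` are hypothetical, the 10% value for cells is the one suggested here, and the 5% value for nuclei is a placeholder taken from the original proposal rather than something we have evaluated.

```python
# Percent mitochondrial content below which cells are always kept.
# "cell" uses the 10% suggested in this comment; "nucleus" is a placeholder
# (the 5% from the original proposal), not a validated value.
ALWAYS_KEEP_BELOW = {
    "cell": 10.0,
    "nucleus": 5.0,
}


def always_keep_threshold(sample_type):
    """Return the always-keep mito threshold for a sample type (hypothetical helper)."""
    try:
        return ALWAYS_KEEP_BELOW[sample_type]
    except KeyError:
        raise ValueError(f"unknown sample type: {sample_type!r}") from None
```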