Set minimum and maximum cutoffs for mitochondrial content filtering #802

Open
jashapiro opened this issue Nov 16, 2024 · 1 comment
jashapiro commented Nov 16, 2024

Is your feature request related to a problem? Please describe.

While miQC filtering is usually reliable, we do not currently detect all failure modes, and our current cell filtering can sometimes allow poor-quality cells into the data, as we are only constraining based on the number of genes expressed, with no mitochondrial filtering.

Sometimes this happens when we get a model fit with two very close lines. For example, in the dataset below (SCPCL000859), the model did fit, but included all cells.

[Screenshots: two images for SCPCL000859, not reproduced here]

We also have cases where the model fails to fit, with similar results. Here is one for SCPCL000863 (a case where I would have totally expected the model to fit!).

[Screenshots: two images for SCPCL000863, not reproduced here]

Finally, I have also seen rare cases where the model seems far too conservative, excluding cells with mitochondrial percentages of <<5% that are probably just fine. Here is an example from SCPCL001202:

[Screenshots: two images for SCPCL001202, not reproduced here]

Describe the solution you'd like

While I think we are still well justified in using miQC as our primary cutoff, we might also want to implement some hard rules. For example, always exclude cells with >25% mitochondrial content (or 20%?) and always include cells with <5%, or maybe a bit higher. Essentially, the idea would be to constrain the miQC model at the back end.
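To make the idea concrete, here is a minimal sketch of that back-end constraint. It is written in plain Python purely for illustration and is not taken from the project's pipeline. The function name `constrain_miqc_calls`, its parameter names, and the default cutoffs (5% and 25%, the numbers floated above) are all assumptions, and `miqc_keep` stands in for whatever per-cell keep/discard call miQC produced.

```python
def constrain_miqc_calls(mito_pct, miqc_keep,
                         always_keep_below=5.0, always_drop_above=25.0):
    """Clamp per-cell miQC keep/discard calls with hard mitochondrial cutoffs.

    mito_pct  -- mitochondrial percentage per cell (0-100)
    miqc_keep -- miQC's own keep (True) / discard (False) call per cell
    The two cutoffs are hypothetical defaults, not agreed-upon values.
    """
    if always_keep_below > always_drop_above:
        raise ValueError("always_keep_below must not exceed always_drop_above")
    if len(mito_pct) != len(miqc_keep):
        raise ValueError("mito_pct and miqc_keep must have the same length")

    constrained = []
    for pct, keep in zip(mito_pct, miqc_keep):
        if pct > always_drop_above:
            # Hard maximum: discard regardless of what the model said.
            constrained.append(False)
        elif pct < always_keep_below:
            # Hard minimum: keep regardless of what the model said.
            constrained.append(True)
        else:
            # In between, miQC remains the primary cutoff.
            constrained.append(bool(keep))
    return constrained


# Four cells: miQC wrongly drops a 2% cell and wrongly keeps a 40% cell.
print(constrain_miqc_calls([2.0, 12.0, 12.0, 40.0],
                           [False, True, False, True]))
# prints [True, True, False, False]
```

The same clamp could also serve as the whole filter when the model fails to fit, though cells between the two cutoffs would then need some default call.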

Describe alternatives you've considered

It would be nice if we could find some way to reliably detect failures like the ones above and report on them, but I think improving the data would also be worth the effort here.

@allyhawkins

> While I think we are still well justified in using miQC as our primary cutoff, we might also want to implement some hard rules. For example, always exclude cells with >25% mitochondrial content (or 20%?) and always include cells with <5%, or maybe a bit higher. Essentially, the idea would be to constrain the miQC model at the back end.

I think having a mito filter would definitely make sense. I think 5% is a little low, though, if we are going to set a minimum, and I would feel very comfortable including everything with <10% mitochondrial content, at least for the single-cell samples. I would expect the single-nuclei samples to have a lower percentage, so if we really want to be careful we might set different thresholds for cells vs. nuclei.
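A rough sketch of what separate thresholds could look like, in plain Python for illustration only. The table `MITO_BOUNDS`, the helper `keep_with_bounds`, and the sample-type labels are hypothetical names. The 10% floor for cells and the 25% ceiling are the numbers mentioned in this thread. The 5% floor for nuclei is only a placeholder, since no specific nuclei value has been proposed.

```python
# Hypothetical per-sample-type bounds: (always keep below, always drop above).
# Values are in percent mitochondrial content and are placeholders to discuss.
MITO_BOUNDS = {
    "cell": (10.0, 25.0),
    "nucleus": (5.0, 25.0),  # placeholder floor; no nuclei value agreed yet
}


def keep_with_bounds(mito_pct, miqc_keep, sample_type):
    """Return the final keep/discard call for one cell or nucleus.

    miqc_keep is miQC's own call; it only decides cases that fall
    between the two hard bounds for the given sample type.
    """
    try:
        keep_below, drop_above = MITO_BOUNDS[sample_type]
    except KeyError:
        raise ValueError(f"unknown sample type: {sample_type!r}")

    if mito_pct > drop_above:
        return False  # above the hard maximum, always excluded
    if mito_pct < keep_below:
        return True   # below the hard minimum, always included
    return bool(miqc_keep)


# An 8% droplet that miQC discarded is rescued for a single-cell sample,
# but the miQC call stands for a single-nuclei sample.
print(keep_with_bounds(8.0, False, "cell"))     # prints True
print(keep_with_bounds(8.0, False, "nucleus"))  # prints False
```

Keeping the bounds in one lookup table would make it easy to adjust either threshold, or add another sample type, without touching the filtering logic.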
