batch correction on expression counts and embeddings #18

zhangnan0107 · 2024-08-22T10:54:22Z

Thanks for sharing this dataset. For the data on cellxgene, is batch correction applied to both the low-dimensional embedding (e.g. UMAP) and the expression counts, or only to the embedding? Thanks : )

grst · 2024-08-22T11:03:40Z

Batch effect correction only applies to the low-dimensional embedding (adata.obsm["X_scANVI"]) and whatever is derived from it (e.g. neighborhood graph, UMAP).

For all downstream analyses, we accounted for batch effects independently by including covariates in the linear models used for comparison.
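To illustrate what "including covariates in the linear models" can look like, here is a minimal sketch on simulated data. It is not the authors' actual model: the variable names, effect sizes, and the use of plain least squares are all assumptions made for the example. The point is only that adding a batch column to the design matrix lets the condition effect be estimated separately from the batch shift, without modifying the expression values themselves.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200
condition = rng.integers(0, 2, size=n)  # hypothetical group label (0/1)
batch = rng.integers(0, 2, size=n)      # hypothetical batch label (0/1)

# Simulated log-expression of one gene (assumed values):
# baseline 0.5, condition effect 1.0, batch shift 2.0, small noise.
y = 0.5 + 1.0 * condition + 2.0 * batch + rng.normal(0.0, 0.1, size=n)

# Design matrix with an intercept, the condition of interest,
# and batch as an additional covariate.
design = np.column_stack([np.ones(n), condition, batch])

# Ordinary least squares fit; coef = [intercept, condition, batch].
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
condition_effect = coef[1]
batch_effect = coef[2]
print(f"condition effect: {condition_effect:.2f}, batch effect: {batch_effect:.2f}")
```

With more than two batches, the single batch column would be replaced by one indicator column per batch (minus a reference level).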

zhangnan0107 · 2024-08-23T11:55:56Z

Thanks for your reply! I have two follow-up questions about the expression counts in the cellxgene data:

1. There are three matrices: X, which looks like normalized data, and the layers count and counts_length_scaled. If I understand correctly, count holds the raw counts from the original studies, and counts_length_scaled holds length-scaled counts for the Smart-seq2 data only (so raw counts were kept for the other platforms?). Which normalization method was used for X?
2. Regarding batch effects, I think batch can be added as a covariate in analyses such as differential expression. For the dot plot of marker genes used for cell-type annotation, as in Figure S1, did you also account for batch effects in some way, or is it based on uncorrected counts?

Thanks

grst · 2024-08-31T17:21:50Z

Everything you say is correct. X is simply scanpy.pp.normalize_total followed by scanpy.pp.log1p on the length-scaled counts.
The dot plots showing the cell-type markers were not adjusted for batch effects (we are also not trying to make any quantitative claims here).
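For readers who want to see what those two steps compute, here is a rough NumPy sketch. It assumes the default behaviour of scanpy.pp.normalize_total (no target_sum given, so each cell is scaled to the median of the per-cell totals); the reply above does not say which parameters were actually used. The helper name normalize_and_log and the toy matrix are made up for the example.

```python
import numpy as np

def normalize_and_log(counts):
    """Scale each cell (row) to the median total count, then apply log1p.

    Hypothetical helper mirroring normalize_total + log1p under the
    assumption that target_sum is left at its default.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1)
    # Target is the median of per-cell totals, ignoring empty cells.
    target = np.median(totals[totals > 0])
    # Avoid dividing by zero for cells without any counts.
    scale = np.where(totals > 0, totals / target, 1.0)
    return np.log1p(counts / scale[:, None])

# Toy cells x genes matrix standing in for the length-scaled counts.
counts = np.array([
    [10, 0, 30],
    [1, 1, 2],
    [5, 5, 10],
])
X = normalize_and_log(counts)

# Undoing the log shows that every cell now has the same total.
print(np.expm1(X).sum(axis=1))
```

On an AnnData object, the equivalent would be calling scanpy.pp.normalize_total and then scanpy.pp.log1p on a copy whose X holds the length-scaled counts.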
