Need for batch correction after pseudobulking single-cell data? #9683
Unanswered
Chiranjit1504 asked this question in Q&A
Hi,
I have a fundamental question about pseudobulking that I have been trying to get my head around. I have matched single-cell data from 4 patients across two tissues (liver and lymph node), so 8 samples in total. I processed the data in Seurat following the standard pipeline and now have annotated clusters. Each cluster contains cells from all patients and both tissues, that is, all my samples are combined into one annotated UMAP. I have already run Harmony integration on patient and tissue type, so patient A's liver and lymph node samples, for example, are treated as 2 different samples in the integration step.
Now I want to find DEGs for each cluster between liver and lymph node. For example, I want to find the genes that differ in T cells between liver and lymph node, so I pseudobulked by cell type and sample. My question is: how do I correct for batch effects?
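To make the pseudobulking step concrete, here is a minimal base-R sketch of what "pseudobulking by cell type and sample" means: raw counts are summed over all cells that share the same cell type, patient, and tissue. The toy count matrix, the metadata columns (`patient`, `tissue`, `celltype`), and all object names are assumptions for illustration only; they are not taken from the real dataset. In Seurat the vignette does the equivalent aggregation with `AggregateExpression`.

```r
# Toy data standing in for a real Seurat object; every name here is hypothetical.
set.seed(1)
n_cells <- 400
meta <- data.frame(
  patient  = sample(paste0("P", 1:4), n_cells, replace = TRUE),
  tissue   = sample(c("Liver", "LymphNode"), n_cells, replace = TRUE),
  celltype = sample(c("Tcell", "Bcell"), n_cells, replace = TRUE)
)
counts <- matrix(
  rpois(50 * n_cells, lambda = 2),
  nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("cell", 1:n_cells))
)

# One pseudobulk column per cell type x patient x tissue combination:
# rowsum() sums the rows of the cells-by-genes matrix within each group.
group <- paste(meta$celltype, meta$patient, meta$tissue, sep = "_")
pseudobulk <- t(rowsum(t(counts), group = group))

# Genes in rows, pseudobulk samples in columns.
dim(pseudobulk)
```

With 4 patients, 2 tissues, and 2 toy cell types this gives up to 16 pseudobulk columns, so a T-cell comparison between liver and lymph node would use 8 of them (4 per tissue).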
I followed the https://satijalab.org/seurat/articles/de_vignette#perform-de-analysis-after-pseudobulking vignette, which runs FindMarkers with test.use = "DESeq2" after pseudobulking. My confusion arises because with bulk data it was easy to account for batches through the design formula of DESeq2, but how do I do that here? Is it even necessary to account for batches, given that the data were already integrated before clustering?
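For comparison, this is roughly what accounting for batch through the design looks like in a bulk-style analysis, sketched for the 8 pseudobulk samples of a single cell type. Treating `patient` as the batch (blocking) factor is an assumption here, as are the sample sheet, the toy counts, and all object names. The sketch only builds the design and the DESeq2 dataset; it does not run the test, and it is not a statement about what `FindMarkers` does internally.

```r
# Hypothetical sample sheet for one cell type: 4 patients x 2 tissues = 8 samples.
coldata <- data.frame(
  patient = factor(rep(paste0("P", 1:4), times = 2)),
  tissue  = factor(rep(c("Liver", "LymphNode"), each = 4))
)
rownames(coldata) <- paste(coldata$patient, coldata$tissue, sep = "_")

# Paired design: patient is the blocking factor, tissue is the effect of interest.
design <- model.matrix(~ patient + tissue, data = coldata)
colnames(design)

# If DESeq2 is installed, the same formula goes into its design argument.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  set.seed(1)
  # Toy negative-binomial counts in place of real pseudobulk counts.
  toy_counts <- matrix(
    rnbinom(100 * 8, mu = 50, size = 5),
    nrow = 100,
    dimnames = list(paste0("gene", 1:100), rownames(coldata))
  )
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = toy_counts,
    colData   = coldata,
    design    = ~ patient + tissue
  )
}
```

The design matrix has one column for the intercept, three for the patients, and one for tissue, so the tissue coefficient is estimated within patients rather than across them.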