faster logmap labels #209
@Gesmira or @dcollins15 |
@dcollins15 @mojaveazure @rsatija is anybody still working on Seurat or seurat-object? There are so many great PRs, e.g. satijalab/seurat#8197 or satijalab/seurat#8271 or this one, that are waiting for review and that would really make Seurat faster, safer and better. I find it quite sad that we haven't seen any reaction from anyone on the team for more than 6 months. Seurat is such a great package (and I've had great support, e.g. from Gesmira), but it has really needed more care for the past year or so. |
Hi @mihem, Thanks for the input. As I’m sure you can imagine, trying to balance development, maintenance, and user support for such widely used tools can be challenging for such a small group of developers. We appreciate your enthusiasm and your patience. Your example is helpful as a simple integration test and for showing off the speedup. Since I wasn’t involved in the development of the
Before this PR is ready to merge, please:
Once those updates are complete I’ll be more than happy to merge this in—should not be too long before the next CRAN release 🚀 |
credits to dcollins satijalab#209
@dcollins15 Thanks, this is great. I absolutely understand that this is a large project with lots of user requests that require a lot of resources. And I still think that Seurat is by far the best single-cell analysis tool out there. I was just worried because there hasn't been much response from the Seurat team in the last year (whereas in the three years or so before that, it worked pretty well). And @igrabski wrote in June that the team would have a look soon: satijalab/seurat#7879 (comment). So I guess you can also imagine that, at least for most PRs (like this one), users also put in time and resources and would like to see them merged so that their work pays off and Seurat improves. But now: great that you had a look and even provided tests, perfect! I know that there is a lot of work, that you cannot do everything on your own, and that I am repeating myself. But there are so many great PRs out there, and reviewing them should be much less work than implementing the features. E.g.: |
Closing and re-opening to trigger CI checks. |
Issue initially reported here:
satijalab/seurat#7879
Integration preprocessing took a long time (~10 min in my case, with 120,000 cells and 61 layers). Nearly all of that time was spent in Seurat:::CreateIntegrationGroup, more specifically on:
The slowdown was caused by the sapply call in seurat-object:
seurat-object/R/logmap.R
Lines 238 to 247 in 58bf437
I rewrote this (also thanks to ChatGPT) using logical indexing.
This speeds up computation by more than 1000x in my use case, from 10 min to less than 1 s, so the "real" integration steps start almost instantly.
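To illustrate the general idea of replacing a per-row sapply with logical indexing, here is a minimal base-R sketch. It is not the code from this PR: it uses a plain logical matrix as a stand-in for a LogMap, and the names membership, labels_sapply and labels_vectorised are hypothetical and chosen for illustration only.

```r
# Toy stand-in for a LogMap: a logical matrix with cells as rows and layers
# as columns. All object and function names here are hypothetical.
set.seed(42)
n_cells <- 2000
n_layers <- 5
membership <- matrix(
  FALSE,
  nrow = n_cells,
  ncol = n_layers,
  dimnames = list(
    paste0("cell", seq_len(n_cells)),
    paste0("layer", seq_len(n_layers))
  )
)
# Assign each cell to exactly one randomly chosen layer
membership[cbind(seq_len(n_cells), sample(n_layers, n_cells, replace = TRUE))] <- TRUE

# Per-row approach: one R-level function call and one row extraction per cell
labels_sapply <- function(m, cells) {
  sapply(
    X = cells,
    FUN = function(x) colnames(m)[which(m[x, , drop = TRUE])],
    simplify = FALSE,
    USE.NAMES = TRUE
  )
}

# Vectorised approach: a single which() over the whole sub-matrix, followed
# by one split() that groups the matching column names by cell
labels_vectorised <- function(m, cells) {
  idx <- which(m[cells, , drop = FALSE], arr.ind = TRUE, useNames = FALSE)
  split(
    x = colnames(m)[idx[, 2]],
    f = factor(cells[idx[, 1]], levels = cells)
  )
}

cells <- rownames(membership)
res_sapply <- labels_sapply(membership, cells)
res_vectorised <- labels_vectorised(membership, cells)

# Both approaches should produce the same named list of layer labels
stopifnot(isTRUE(all.equal(res_sapply, res_vectorised)))
```

The per-row version pays the overhead of an R function call for every cell, which is what tends to dominate at 100,000+ cells. The vectorised version does the same lookup in a handful of calls, so wrapping either function in system.time() on a larger matrix should make the difference visible.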
For a reproducible example use: