Just leaving a note since it's nice to share interest. My use-case is performing KDE many times on large (10 million) sets of data points.
I thought I would have a play and see if it would be low-hanging fruit to add support for CuPy to KDEpy. CuPy is a GPU library with NumPy-like syntax. One of the really nice features is that (when supported), NumPy functions will automatically use the CuPy equivalent when applied to a CuPy array. GPUs are ridiculously fast at calculating FFTs compared to NumPy, so I thought it might be nice to take the speedup provided by KDEpy even further. From what I can tell from the docstring of FFTKDE, the FFT (and not the linear binning) is the bottleneck.
After playing around a bit, I realised that the cutils code is a hard dependency, but also that you've written a (slower) numpy function. I'm a bit surprised that there don't exist faster numpy (and hence CuPy) binning implementations - perhaps this would be an idea to look out for? Do I understand correctly that the binning algorithm you're using is bilinear binning?
(btw, I kept getting `gcc: error: KDEpy/cutils.c: No such file or directory` (full error here) when trying to do a developer install with pip on Ubuntu)
Just thought I'd bring the topic up, as I thought your library was cool! :)
Hi. Thanks for letting me know about CuPy, and I'm very happy that you like KDEpy. Some thoughts:
> I thought I would have a play and see if it would be low-hanging fruit to add support for CuPy to KDEpy.
I'm not necessarily against it, but another integration/implementation might require support and debugging years into the future. I have relatively little time for continued work/support. Many people have made good suggestions for additional features to implement in KDEpy, but if it's (1) a rare use-case or (2) there is a chance that I will end up maintaining less-than-ideal code written by others, I usually reject it. It's not right for me to include it if I don't have the time to maintain it. I'd rather have KDEpy do a select few things really well.
> From what I can tell from the docstring of FFTKDE, the FFT (and not the linear binning) is the bottleneck.
That's true in theory, but in practice we should probably measure it. :)
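One rough way to measure it is sketched below. This is not KDEpy's code: the function name `time_binning_vs_fft`, the bandwidth of 0.1, and the grid size are assumptions made purely for illustration, and the binning step is a plain NumPy stand-in rather than the compiled `cutils` routine. It times linear binning of the data onto an equidistant grid and, separately, an FFT-based convolution of the binned data with a Gaussian kernel.

```python
import time

import numpy as np


def time_binning_vs_fft(n_points=1_000_000, n_grid=1024, seed=0):
    """Time a NumPy linear-binning step and an FFT convolution step (illustrative only)."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal(n_points)
    lo, hi = data.min(), data.max()
    dx = (hi - lo) / (n_grid - 1)  # spacing of the equidistant grid

    # Step 1: linear binning. Each point splits its weight between two grid neighbours.
    start = time.perf_counter()
    pos = (data - lo) / dx
    left = np.floor(pos).astype(int)
    frac = pos - left
    binned = np.bincount(left, weights=1 - frac, minlength=n_grid + 1)
    binned += np.bincount(left + 1, weights=frac, minlength=n_grid + 1)
    binned = binned[:n_grid] / n_points  # drop the overflow slot, normalise to sum 1
    t_bin = time.perf_counter() - start

    # Step 2: convolve with a Gaussian kernel via FFT (bandwidth 0.1 is arbitrary).
    start = time.perf_counter()
    offsets = (np.arange(n_grid) - n_grid // 2) * dx
    kernel = np.exp(-0.5 * (offsets / 0.1) ** 2)
    kernel /= kernel.sum()
    size = 2 * n_grid  # zero-padding so the circular convolution does not wrap around
    density = np.fft.irfft(np.fft.rfft(binned, size) * np.fft.rfft(kernel, size), size)
    t_fft = time.perf_counter() - start

    return {"binning_s": t_bin, "fft_s": t_fft, "mass": float(density.sum())}


if __name__ == "__main__":
    print(time_binning_vs_fft())
```

The timings depend on the machine, the number of points, and the grid size, so it is worth running this with sizes close to the real use-case before deciding which step to optimise.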
> I'm a bit surprised that there don't exist faster numpy (and hence CuPy) binning implementations - perhaps this would be an idea to look out for? Do I understand correctly that the binning algorithm you're using is bilinear binning?
The binning is basically bilinear interpolation as explained by Wikipedia, generalized to arbitrary dimensions. To my knowledge NumPy/SciPy does not implement it. There are binning algorithms, but they are more general (and slower) since they allow arbitrary grids instead of only equidistant grids.
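To make the idea concrete, here is a minimal one-dimensional sketch in plain NumPy. It is not the `cutils` implementation: the function name `linear_binning_1d` and its signature are hypothetical, and it assumes the grid is equidistant and covers all the data. Each data point splits its weight between the two nearest grid points in proportion to how close it is to each.

```python
import numpy as np


def linear_binning_1d(data, grid_points, weights=None):
    """Linear binning on an equidistant 1D grid (illustrative sketch, not KDEpy's code)."""
    data = np.asarray(data, dtype=float)
    grid_points = np.asarray(grid_points, dtype=float)
    n_grid = grid_points.size

    if data.min() < grid_points[0] or data.max() > grid_points[-1]:
        raise ValueError("all data points must lie within the grid")

    if weights is None:
        weights = np.full(data.size, 1.0 / data.size)  # uniform weights summing to 1
    else:
        weights = np.asarray(weights, dtype=float)

    # Because the grid is equidistant, one affine map turns a position
    # into a fractional grid index; no searching is required.
    dx = (grid_points[-1] - grid_points[0]) / (n_grid - 1)
    pos = (data - grid_points[0]) / dx
    left = np.floor(pos).astype(int)  # index of the grid point to the left
    frac = pos - left                 # fraction of the way to the right neighbour

    # The extra slot absorbs the zero-weight "right neighbour" of points
    # that sit exactly on the last grid point.
    out = np.bincount(left, weights=weights * (1 - frac), minlength=n_grid + 1)
    out += np.bincount(left + 1, weights=weights * frac, minlength=n_grid + 1)
    return out[:n_grid]
```

For example, on the grid `[0, 1, 2]` a single point at `0.5` puts half of its weight on grid point `0` and half on grid point `1`. The same fractional-index trick extends to higher dimensions, where each point is shared among the corners of the grid cell that contains it.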
In summary: I would merge a pull request if there's a real use-case here (great speedup on huge data sets) and if the code is well-written and complete.