We read the last element of `indptr` on the host to determine the size of the `out` tensor, and that read is what forces the sync. I don't think there is any real workaround besides passing in an `out` tensor as part of the input arguments (already supported). Would that work for you?
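To make the dependency concrete, here is a minimal pure-Python sketch of what a CSR gather computes. It is only a reference for the semantics, not the library's CUDA implementation; the function name `gather_csr_reference` is hypothetical, and plain lists stand in for tensors. It assumes `indptr` has one more entry than `src`, and that a caller-supplied `out` is already large enough.

```python
def gather_csr_reference(src, indptr, out=None):
    """Reference semantics: out[j] = src[i] for indptr[i] <= j < indptr[i + 1]."""
    if len(indptr) != len(src) + 1:
        raise ValueError("indptr must have len(src) + 1 entries")
    if out is None:
        # The output length is only known from indptr[-1]. When indptr lives
        # on the GPU, reading this value on the host is the blocking step.
        out = [None] * indptr[-1]
    # When `out` is supplied by the caller, nothing above needs the value of
    # indptr[-1]; on a GPU this loop would run inside the kernel.
    for i, value in enumerate(src):
        for j in range(indptr[i], indptr[i + 1]):
            out[j] = value
    return out


src = [10, 20, 30]
indptr = [0, 2, 2, 5]  # row 1 is empty, so 20 is never written

print(gather_csr_reference(src, indptr))           # size taken from indptr[-1]
print(gather_csr_reference(src, indptr, [0] * 5))  # size supplied by the caller
```

In a GAT-style forward pass the number of edges is usually known up front, so the caller can often allocate `out` once with that size instead of asking for `indptr[-1]`.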
I'd have to look more closely at PyG, but in general, what I'm trying to achieve is a fully non-blocking GAT forward pass (with csr data layout). I'd appreciate any hint if you know how to do this, or I'll take some time to figure it out next week.
Is it possible to run `gather_csr_cuda()` without a CPU-GPU sync?
I can only guess that the problem is in line 248 in csrc/cuda/segment_csr_cuda.cu: