Is this simply a "running out of memory" issue that should be addressed by using a smaller chunk size? Where is that determined?
Update:
Ah, I see from the linked issue that the problem actually emerges when more memory is available, so it is more likely to be a pointer arithmetic issue. Capping the chunk size automatically might still be a reasonable solution, but maybe we just need some bigger/unsigned integers.
I guess the next step is to put together a minimal example that can be used as a failing test.
Yes, I'll try to get a smaller test up (without Horace). In any case I think it's probably a good idea to use ptrdiff_t or size_t instead of int.
The chunk size is an input of the euphonic (Python) function but is set automatically in the Horace (Matlab) code based on the amount of system free memory - so for a large instance with 480GB of memory it just passes all the q-points requested in one chunk.
> Yes, I'll try to get a smaller test up (without Horace). In any case I think it's probably a good idea to use ptrdiff_t or size_t instead of int.
Yes, this is best practice for pointers anyway.
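To illustrate the suspected failure mode, here is a minimal sketch that emulates a flat array offset computed in a 32-bit `int` versus a 64-bit signed type (what `ptrdiff_t` is on typical 64-bit platforms). The helper names and the value of `elems_per_qpt` are hypothetical and are not taken from the Euphonic source; in real C code signed overflow is undefined behaviour, and the wrap-around shown here is only what commonly happens in practice.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a 32-bit signed int can hold


def flat_offset_int32(q_index, elems_per_qpt):
    """Emulate `int offset = q_index * elems_per_qpt` with a 32-bit C int.

    ctypes does not range-check, so the value wraps like a typical C int.
    """
    return ctypes.c_int32(q_index * elems_per_qpt).value


def flat_offset_wide(q_index, elems_per_qpt):
    """Same product held in a 64-bit signed integer."""
    return ctypes.c_int64(q_index * elems_per_qpt).value


# Hypothetical number of array elements stored per q-point (an assumption).
elems_per_qpt = 2000

for q_index in (1_000_000, 1_150_000):
    narrow = flat_offset_int32(q_index, elems_per_qpt)
    wide = flat_offset_wide(q_index, elems_per_qpt)
    print(q_index, narrow, wide, "OVERFLOW" if narrow != wide else "ok")
```

With these assumed numbers the 32-bit offset goes negative somewhere between 1.0e6 and 1.15e6 q-points, and a negative offset used for pointer arithmetic would be one plausible way to get a segfault only for very large chunks.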
> The chunk size is an input of the euphonic (Python) function but is set automatically in the Horace (Matlab) code based on the amount of system free memory - so for a large instance with 480GB of memory it just passes all the q-points requested in one chunk.
I see! It shouldn't be too difficult to build in a cap if that turns out to be necessary/sensible then. We probably don't want to do all the pointer arithmetic with long types just to save a few (already huge) chunks. (I think size_t is as long as possible anyway so perhaps that's a non-issue.)
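If a cap does turn out to be sensible, it could look roughly like the sketch below: limit the chunk so that the largest flat index still fits in a 32-bit signed integer, then split the q-points accordingly. The function names, the `elems_per_qpt` parameter and the choice of limit are assumptions made for illustration; they are not part of the euphonic or Horace APIs.

```python
INT32_MAX = 2**31 - 1  # limit of a 32-bit signed index


def capped_chunk_size(requested_chunk, elems_per_qpt, max_index=INT32_MAX):
    """Return the requested chunk size, reduced if needed so that
    chunk * elems_per_qpt stays within max_index (slightly conservative)."""
    if requested_chunk < 1 or elems_per_qpt < 1:
        raise ValueError("chunk size and elems_per_qpt must be positive")
    largest_safe = max_index // elems_per_qpt
    if largest_safe < 1:
        raise ValueError("a single q-point already exceeds the index limit")
    return min(requested_chunk, largest_safe)


def chunk_ranges(n_qpts, chunk):
    """Split range(n_qpts) into half-open (start, stop) pairs of at most `chunk`."""
    return [(start, min(start + chunk, n_qpts))
            for start in range(0, n_qpts, chunk)]


# Example with a hypothetical 2000 elements per q-point.
chunk = capped_chunk_size(5_000_000, elems_per_qpt=2000)
print(chunk, len(chunk_ranges(5_000_000, chunk)))
```

A cap like this would leave the caller's chunk size untouched in ordinary cases and only intervene for the very large chunks discussed here; switching the C index variables to a wider type would make it unnecessary.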
@davidvoneshen reported a bug (Horace-euphonic-interface#40) where using Euphonic together with Horace with a large number of q-points per chunk causes a segfault.
The bug seems to be in the Euphonic C code, but I'm not exactly sure where.
The system is relatively large with 80 atoms in the supercell, and the crash occurs when the chunk size is set larger than around 1.15e6 q-points.