HDF5 errors when writing Silo files #243
@eschnett thanks for the report. Which …?
Yes. I am using …
Ok, so you've linked to an HDF5 installation which is compiled for parallel. That is fine. But Silo is a serial library and will only ever open serial HDF5 files, so the fact that the HDF5 library is complaining about that seems off. @eschnett do you by any chance also manipulate any HDF5 files directly from the application where you are seeing this message? @brtnfld I am wondering whether the error message (in the orig. comment above) is potentially bogus? I am pretty sure that Silo is opening only serial HDF5 files, but the user is reporting that the HDF5 lib is complaining about Silo's use of the evict on close feature. Now, the application itself is indeed parallel, but I am fairly certain it's creating only serial HDF5 files. A little more confusing is that HDF5's error messages seem to be savvy to that fact... it's reporting MPI rank IDs. I am assuming it's somehow interrogating them for added convenience in reporting the error message? Or is that evidence that somehow the file itself was opened with an MPI communicator?
Currently, if HDF5 is built with parallel enabled, then calling H5Pset_evict_on_close will throw an error even when it is called with the sec2 driver. So this seems like a bug to me. I'm not sure why all the ranks would print the error stack message if Silo were called from one rank. That is strange. Maybe it is a feature when building with parallel enabled. I would assume that all the ranks would print the original error message if more than one rank calls Silo.
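For reference, a minimal sketch of the situation described above, assuming a parallel-enabled HDF5 build. The property-list calls are standard HDF5 API; whether H5Pset_evict_on_close actually fails depends on the build configuration.

```c
#include <hdf5.h>
#include <stdio.h>

int main(void)
{
    /* Serial file-access property list using the default POSIX (sec2) driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_sec2(fapl);

    /* On an HDF5 build configured with parallel support, this call is
     * reportedly expected to fail and push onto the error stack, even
     * though the chosen driver is purely serial. */
    if (H5Pset_evict_on_close(fapl, (hbool_t)1) < 0)
        fprintf(stderr, "H5Pset_evict_on_close failed on this build\n");

    H5Pclose(fapl);
    return 0;
}
```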
Well, the application may be running with one Silo file per processor, and all ranks are opening a Silo file with the sec2 driver. It seems like the message about evict on close is issued only from one rank, but an error "stack" is getting dumped from all ranks.
Yes, that would make sense; I forgot about the file-per-process case.
I like the idea of the parallel HDF5 library reporting MPI ranks in its error messages even when using non-parallel drivers. That could be useful at large scale where an odd-ball failure occurs on some of the ranks. That said, I think it can only be assuming MPI_COMM_WORLD, and those rank IDs might be confusing to an application that is somehow using a subsetting MPI communicator for all its I/O, without mentioning the fact that those are the rank IDs in the world communicator. When HDF5 is using … to … and maybe (or not) … when it actually has a file MPI_Comm, I am guessing the guard logic for this error message … is handled differently and maybe restricted by rank in some way? Because @eschnett didn't report 8 of those messages, just 8 HDF5 error stack messages.
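As an aside for anyone who concludes the stack dumps are harmless in their setup: HDF5's automatic error-stack printing can be switched off per process before calling into Silo. A small sketch below; note it silences all HDF5 error output on that rank, so it should only be used once the failures are understood.

```c
#include <hdf5.h>

/* Turn off HDF5's automatic printing of its error stack on this process.
 * Error codes are still returned to callers (Silo included); only the
 * stderr dump is suppressed. */
static void quiet_hdf5_error_output(void)
{
    H5Eset_auto2(H5E_DEFAULT, NULL, NULL);
}
```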
Yes, I am writing one Silo file per process.
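For context, a sketch of that file-per-process pattern. The filename scheme and the mesh-writing step are placeholders; DBCreate with DB_HDF5 is the usual serial Silo/HDF5 path, and no MPI communicator is handed to HDF5 anywhere.

```c
#include <mpi.h>
#include <silo.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    char name[64];
    DBfile *dbfile;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Every rank creates its own serial Silo/HDF5 file. */
    snprintf(name, sizeof(name), "output.%04d.silo", rank);

    /* DB_HDF5 selects Silo's serial HDF5 driver; the underlying HDF5 VFD
     * is the default serial one unless something else is requested. */
    dbfile = DBCreate(name, DB_CLOBBER, DB_LOCAL, "per-rank file", DB_HDF5);
    /* ... write meshes and variables here ... */
    DBClose(dbfile);

    MPI_Finalize();
    return 0;
}
```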
This problem means that I cannot use Silo 4.11, and I am thus using Silo 4.10 instead. I have recently learned (spack/spack#34786) that Silo 4.10 requires HDF5 1.8 and does not support later versions of HDF5. This combination of Silo<->HDF5 constraints is rather inconvenient... Is there a way to resolve one of them?
I receive the following (harmless?) HDF5 errors when writing Silo files. I am using Silo 4.11 and HDF5 1.12.1. HDF5 is configured with MPI. The relevant error message seems to be: …