Large transformation times for databases #68
Hi Aakash, the database transform is essentially transposing a giant matrix and involves no computation; everything is IO-limited and difficult to speed up. However, you can also use the raw database as output by axisem directly, at the cost of slower seismogram extraction from file. This may in large part be compensated by the built-in buffer in instaseis, which avoids file access altogether if the source location has been sampled before, and this may well be the case in your application (compare figure 16 in [1]; source inversion is not IO-bound in that test). Depending on your machine, you can further increase the cache hit likelihood by increasing the buffer size with the 'buffer_size_in_mb' option, which defaults to only 100 MB (https://instaseis.net/instaseis.html#open-db-function). An experimental additional tweak may be to write the database directly with a different chunking. There are two possible ways to achieve that:
Hope this helps.

[1] Driel, Martin van, Lion Krischer, Simon Christian Stähler, Kasra Hosseini, and Tarje Nissen-Meyer. 2015. “Instaseis: Instant Global Seismograms Based on a Broadband Waveform Database.” Solid Earth 6 (2): 701–17. https://doi.org/10.5194/se-6-701-2015.
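As an illustrative sketch of the buffering idea mentioned above (this is not Instaseis's implementation; the real knob is the `buffer_size_in_mb` argument of `instaseis.open_db`, and the function and data names here are made up), a memoized read works like this:

```python
from functools import lru_cache

def _read_element_from_disk(element_id):
    # Hypothetical stand-in for the expensive file access that extracts
    # one element's stored fields from the database.
    return tuple(element_id * 0.1 for _ in range(3))

@lru_cache(maxsize=4096)  # roughly analogous to capping the buffer size
def get_element(element_id):
    # The first request for an element pays the IO cost; repeated requests
    # for nearby sources falling in the same element are served from memory.
    return _read_element_from_disk(element_id)

get_element(7)
get_element(7)  # second call is a cache hit, no "disk" access
print(get_element.cache_info().hits)  # -> 1
```

This is why clustered source locations, as in a source inversion, can make the raw (untransposed) database much less of a penalty than the per-extraction cost suggests.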
Hi Martin, thank you for your suggestions. I tried some of them. I have been using the serial netcdf option by default, as we could never set up the parallel netcdf option on our cluster. Turning chunking to 'true' generates a database in comparable time that is quick to access even without transformation. This seems to be the way ahead. I had a few questions though -
Hi Aakash
Ok, this may become a bottleneck if you use very many cores (say more than a few hundred or so), where round-robin IO is not very efficient. That is why we implemented collective IO.
I am not sure it is exactly the same, as it is generated through different code and likely with different chunking, but it is probably very similar.
I have not tried, but I assume the merged version reduces the number of file accesses and improves the access pattern within the file, so I could imagine a small factor (somewhere between 2 and 5, I would guess).
This should work fine, but I have not actually tried. You should be able to inspect the file layout using standard netcdf or hdf5 tools like ncdump or h5dump to verify.
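As a sketch of that kind of inspection, assuming the database is HDF5-backed and `h5py` is available (the file path, dataset name, and chunk shape below are made up for illustration; `ncdump -hs` or `h5dump -p -H` report the same layout from the command line):

```python
import os
import tempfile
import h5py  # assumption: h5py is installed

# Create a toy chunked dataset standing in for one field of a database.
path = os.path.join(tempfile.mkdtemp(), "toy_db.h5")
with h5py.File(path, "w") as f:
    # Chunk along the element axis: one whole time series per chunk,
    # which favours extracting a full seismogram for a single element.
    f.create_dataset("displacement", shape=(1000, 64),
                     chunks=(1000, 1), dtype="f4")

# Inspect the chunk layout, as h5dump -p -H would show it.
with h5py.File(path, "r") as f:
    print(f["displacement"].chunks)  # -> (1000, 1)
```

Checking that the chunks span the full time axis is a quick way to verify that per-element extraction will not touch many scattered chunks.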
Because it potentially slows down file IO in the parallel run, which is crucial when running on very many cores, whereas a slower transpose in the serial part is then more acceptable. Cheers,
Thanks a lot Martin for your comments and suggestions.
I am trying to transform the axisem-generated databases for faster access in iterative inversion processes, as suggested by the developers. My axisem jobs run on our cluster, and I will need the database on the cluster for rapid access to millions of different sources (hypocenter + moment tensor). I have run both global-scale domains and domains localized in depth and laterally. See the table below for details of some test run times. For me, the transformation of the database is the bottleneck for our science applications.
The issue I face is that transforming the generated database takes many days, whereas the job to generate the database takes hours (depending on the number of cores). I believe the transformation of the database is a single-core job and hence slow (and machine-dependent). Please let me know if you have any advice. Partly I also wanted to share these tests with others.