As dask processes datasets in chunks, it would be useful to support writing FITS files in chunks. The alternative, aggregating all chunks into a single large array before writing to disk, is untenable at the data sizes we currently encounter.
In the reading case, each chunk is handled as follows:
xarray-fits/xarrayfits/fits.py, lines 66 to 72 at 3360d94
Either the `data` or `section` attribute is accessed, depending on whether the file is memory-mapped on a local filesystem or accessed remotely (both paths are sketched below). Presumably these same attributes can be used to support chunked writes.

One concern I have is whether writing from multiple threads/processes will be handled properly. This is probably OK in the remote case, and in the memory-mapped case I expect the OS to handle paging the writes between memory and disk.
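For concreteness, here is a minimal sketch of the two read paths, assuming an image HDU. This is an illustration of the pattern, not the actual code at lines 66 to 72, and `read_chunk` is a hypothetical helper:

```python
from astropy.io import fits

def read_chunk(filename, hdu_index, slices, use_memmap=True):
    """Hypothetical helper reading one chunk from a FITS image HDU.

    `slices` is a tuple of slice objects describing the chunk extent.
    """
    with fits.open(filename, memmap=use_memmap) as hdul:
        hdu = hdul[hdu_index]
        if use_memmap:
            # Local filesystem: `data` is a memory-mapped ndarray, so
            # slicing only pages in the bytes belonging to this chunk.
            # Copy before the file closes and the mapping goes away.
            return hdu.data[slices].copy()
        # Remote access: `section` reads just the requested region
        # instead of materialising the full array.
        return hdu.section[slices]
```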
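On the writing side, one possible shape for the local case, adapted from the astropy documentation's recipe for creating a large FITS file from scratch and then filling it through the memory map. `create_empty_fits` and `write_chunk` are hypothetical names, and the coarse per-file lock is just one way to address the concurrency concern above:

```python
import numpy as np
from astropy.io import fits

def create_empty_fits(filename, shape, dtype=np.float64):
    """Pre-allocate a FITS file on disk without materialising the array."""
    # Build a header from a tiny stub of the right dtype, then rewrite
    # the NAXISn keywords to describe the full shape (NAXIS1 is the
    # fastest-varying axis, i.e. the last numpy axis).
    stub = fits.PrimaryHDU(data=np.zeros((1,) * len(shape), dtype=dtype))
    header = stub.header
    for i, size in enumerate(reversed(shape)):
        header[f"NAXIS{i + 1}"] = size
    header.tofile(filename, overwrite=True)
    # Grow the file to its final size: header plus the raw data bytes.
    nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize
    with open(filename, "rb+") as f:
        f.seek(len(header.tostring()) + nbytes - 1)
        f.write(b"\0")

def write_chunk(filename, chunk, slices, lock=None):
    """Write one chunk into the pre-allocated file via the memory map."""
    # A per-file lock serialises updates; coarse, but it sidesteps two
    # writers flushing overlapping memmap pages at the same time.
    if lock is not None:
        lock.acquire()
    try:
        with fits.open(filename, mode="update", memmap=True) as hdul:
            hdul[0].data[slices] = chunk
    finally:
        if lock is not None:
            lock.release()
```

With something like this, each dask chunk could be written by a delayed `write_chunk` call, e.g. sharing a `dask.distributed.Lock` keyed on the filename; the remote case would presumably need a different write path entirely.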
/cc @bennahugo @o-smirnov