compare to kerchunk? #26

Open
martindurant opened this issue Apr 3, 2024 · 5 comments

Comments

@martindurant

Kerchunk allows "scanning" FITS datasets and representing them as zarr to be loaded by xarray, including concatenating or otherwise combining multiple HDUs or files into a single logical dataset. What you have here might be independently useful, but I thought I should draw your attention to it.

https://fsspec.github.io/kerchunk/reference.html#kerchunk.fits.process_file

@sjperkins
Member

@martindurant Thanks! I'll take a look -- it would be great to have less code to maintain.

@martindurant
Author

Do note that the FITS backend is the least documented and tested, so your expertise over at kerchunk would be greatly appreciated, if you have the scope to offer it :)

@sjperkins
Member

I definitely think kerchunk is the way forward and it's been on my radar for a while. The other radio astronomy context where I can see it being useful is establishing a view over a CASA Measurement Set v2.0, although I suspect this would be far more challenging than FITS.

Can kerchunk write back to FITS or other formats like HDF5 from a zarr dataset? I took a quick look through the docs again but it wasn't clear. This is something I'll probably attempt in xarray-fits with the caveats mentioned below:

FWIW I'm all for zarrifying radio astronomy data processing, for e.g.

In my context I've had to provide scientists with a hybrid environment that gives them the option to use all the fantastic new technology while still allowing them to fall back to older software + formats, where necessary.

@sjperkins
Member

Heh, I suspected that, in the general case, the coordinates would be as big as the data

@martindurant
Author

> Can kerchunk write back to FITS or other formats like HDF5 from a zarr dataset?

You can, in principle, write new chunks to a kerchunked dataset, and they would be stored in a zarr layout elsewhere, like https://martindurant.github.io/blog/mutable-kerchunk/ (see also array-lake, which is a production version of this idea, but not open source). Xarray can, of course, already write to various output formats with or without rechunking, if you wanted to convert all the data.
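As a rough illustration of the "new chunks stored elsewhere" idea, here is a minimal sketch of a layered key/value store: reads fall back to a read-only base (standing in for references into the original files), while writes land in a separate overlay. The `OverlayStore` class and the chunk keys are hypothetical and are not part of kerchunk's API; the linked blog post may implement this differently.

```python
from collections.abc import MutableMapping


class OverlayStore(MutableMapping):
    """Hypothetical store: reads fall back to `base`, writes go to `overlay`."""

    def __init__(self, base, overlay=None):
        self.base = base  # original chunks, never modified
        self.overlay = {} if overlay is None else overlay  # new/updated chunks
        self.deleted = set()  # base keys hidden by a delete

    def __getitem__(self, key):
        if key in self.overlay:
            return self.overlay[key]
        if key in self.deleted:
            raise KeyError(key)
        return self.base[key]

    def __setitem__(self, key, value):
        self.deleted.discard(key)
        self.overlay[key] = value

    def __delitem__(self, key):
        if key not in self:
            raise KeyError(key)
        self.overlay.pop(key, None)
        if key in self.base:
            self.deleted.add(key)  # hide it without touching the base

    def __iter__(self):
        yield from self.overlay
        for key in self.base:
            if key not in self.overlay and key not in self.deleted:
                yield key

    def __len__(self):
        return sum(1 for _ in self)


# Made-up chunk keys and payloads, for illustration only.
base = {"data/0.0": b"original chunk 0", "data/0.1": b"original chunk 1"}
store = OverlayStore(base)
store["data/0.1"] = b"rewritten chunk 1"  # shadows the original chunk
store["data/0.2"] = b"new chunk 2"        # exists only in the overlay
```

The original data is untouched, and only the overlay would need to be persisted.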

> Heh, I suspected that, in the general case, the coordinates would be as big as the data

This would be true for materialised coordinate arrays, but xarray now has flexible index types, effectively the analytical solution. Whether anyone has explicitly adapted that to astro/fits WCS, I don't know, but it was definitely part of the conversation.
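To make the materialised-versus-analytical distinction concrete, here is a minimal sketch for a single linear axis, using the FITS convention `world = CRVAL + CDELT * (pixel - CRPIX)` with 1-based pixel numbers. The `LinearAxis` class and the header values are hypothetical; this is not xarray's index API, and real sky projections are not linear, so treat it only as an illustration of storing a formula instead of an array.

```python
class LinearAxis:
    """Hypothetical lazy coordinate: computes values on demand from three numbers."""

    def __init__(self, crval, cdelt, crpix, size):
        self.crval = crval  # world value at the reference pixel
        self.cdelt = cdelt  # world increment per pixel
        self.crpix = crpix  # reference pixel (1-based, FITS convention)
        self.size = size    # number of pixels along the axis

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if index < 0:
            index += self.size
        if not 0 <= index < self.size:
            raise IndexError(index)
        # Python indices are 0-based, FITS pixel numbers are 1-based.
        return self.crval + self.cdelt * ((index + 1) - self.crpix)


# Made-up header values for a frequency axis, for illustration only.
axis = LinearAxis(crval=1.4e9, cdelt=1.0e6, crpix=1.0, size=4096)

first = axis[0]
last = axis[-1]

# Materialising the coordinate allocates one value per pixel.
materialised = list(axis)
```

The lazy object holds four numbers regardless of axis length, whereas the materialised list grows with the data.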
