Add tutorial notebook for open_virtual_dataset
#903
Comments
Thanks for opening this @ayushnag! |
I just tried running this new functionality in an Openscapes Jupyterhub instance. Used A |
Oh! I see now that it is declared in the |
I think doing |
You should be able to do |
After installing via 1) A numpy version warning upon import of
|
Definitely could be problematic, so try creating a new env instead, and see if you get the same problems. |
@danielfromearth (1) and (2) are related - VirtualiZarr has a hard requirement to use numpy 2.0.0 or later because it makes a lot of use of the new variable-length string style internally. I think something in your environment is not compiled with numpy 2. |
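For anyone hitting the same warning, a rough first check is to compare the installed NumPy version against 2.0.0 before importing VirtualiZarr. The sketch below uses only the standard library. The helper names `parse_release`, `meets_minimum`, and `installed_numpy_version` are hypothetical, and pre-release suffixes such as `rc1` are simply ignored. It only reports NumPy's own version, so it will not detect another package that was compiled against NumPy 1.x.

```python
from __future__ import annotations

from importlib import metadata


def parse_release(version: str) -> tuple[int, ...]:
    """Extract the leading numeric components, e.g. '1.26.4' -> (1, 26, 4)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first component with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "2.0.0") -> bool:
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    have = parse_release(installed)
    need = parse_release(minimum)
    # Pad with zeros so that '2' and '2.0.0' compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def installed_numpy_version() -> str | None:
    """Return the installed numpy version string, or None if numpy is absent."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("numpy is not installed in this environment")
    elif meets_minimum(found):
        print(f"numpy {found} satisfies the >= 2.0.0 requirement")
    else:
        print(f"numpy {found} is older than 2.0.0")
```

If this reports a 2.x version and the warning still appears, the likelier culprit is a compiled dependency built against NumPy 1.x, which is where a fresh environment helps.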
@danielfromearth I'm currently using hatch and the pyproject.toml to experiment on my local desktop. I would avoid using the 2i2c Jupyterhub for now since it has dependency locks. |
@battistowx, I've also tried it locally without issue, but part of my goal with testing was to try it in-region on AWS West 2. I'm still not sure how to modify the environment in 2i2c Jupyterhub, and if there are dependency locks, is there a way around that for testing purposes? |
@danielfromearth great timing!! I was testing the latest image for the hub, try restarting your instance in the admin console and use |
Okay, just tried this in an instance of

result = earthaccess.open_virtual_mfdataset(
    granules=results,
    access="indirect",
    concat_dim="time",
    parallel=False,
    preprocess=None
)

The following new error is being raised by
This is out of my depth. Is it missing some optional dependency, something that provides a "Chunk Manager" that will know how to handle |
@danielfromearth You need to also pass in the arguments |
This comment was marked as outdated.
Please disregard my previous (now-hidden) comment! I discovered a typo that prevented the code from working (I was opening the same group twice and then trying to merge it with itself). |
Sorry @danielfromearth - VirtualiZarr's xarray-at-the-top design means that sometimes obscure errors are thrown from deep inside xarray, that VirtualiZarr doesn't have control to re-raise with clearer messages. I have issues to track ways to make them clearer, but changing xarray is a more involved process than changing VirtualiZarr. (xref zarr-developers/VirtualiZarr#114 and pydata/xarray#8778) For I've tried to document this here but if you think any of this could be clearer in the VirtualiZarr docs please raise an issue there :) |
before opening an issue in

File /srv/conda/envs/notebook/lib/python3.11/site-packages/xarray/core/indexing.py:369, in IndexCallable.__getitem__(self, key)
    368 def __getitem__(self, key: Any) -> Any:
--> 369     return self.getter(key)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/xarray/core/indexing.py:1508, in NumpyIndexingAdapter._oindex_get(self, indexer)
   1506 def _oindex_get(self, indexer: OuterIndexer):
   1507     key = _outer_to_numpy_indexer(indexer, self.array.shape)
-> 1508     return self.array[key]

File /srv/conda/envs/notebook/lib/python3.11/site-packages/virtualizarr/manifests/array.py:214, in ManifestArray.__getitem__(self, key)
    212     return self
    213 else:
--> 214     raise NotImplementedError(f"Doesn't support slicing with {indexer}")

Does it ring a bell? It worked for a few granules 🤔 |
@betolink Could you share the code you used to get the error? |
Could be zarr-developers/VirtualiZarr#51, but I would need more context. Also please raise new issues on VirtualiZarr!! That way other people will see them and have the opportunity to jump in and help. |
I've run into a few interesting errors as well when certain dependencies are not installed. I'll document those in VirtualiZarr too! Also, I'd love to be a part of this tutorial project and help where I can! |
@battistowx What specific errors have you run into? Perhaps that is an error in the earthaccess dependencies, not VirtualiZarr, that we need to add to the |
@ayushnag Yes, adding Zarr was an easy fix and that error appeared when importing |
@battistowx We can fix most of the dependency problems with an added integration test that checks that |
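The thread does not spell out exactly what the integration test would verify. As one possible shape, assuming the goal is simply to catch packages that are missing from the environment, a check could report which top-level modules cannot be found. The function name `missing_modules` is hypothetical, and the module names in the usage block are just the packages mentioned in this thread, not a confirmed dependency list.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be located.

    Only top-level names are supported here; nothing is actually imported,
    so this does not catch errors raised while a package initializes.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names, taken from the packages discussed in this thread.
    wanted = ["numpy", "xarray", "zarr", "virtualizarr"]
    missing = missing_modules(wanted)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found")
```

A real integration test would more likely import the packages and open a small granule, since a module can be present and still fail at import time.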
@ayushnag No worries, we already put large warning banners in our in-region notebooks to warn the user of this, and so far that seems to be an effective way of doing it. Currently, I'm testing MERRA-2 and some other collections that are Cloud OPeNDAP-enabled at GES DISC. I'll record my testing experience in a separate issue so that we can keep this thread focused on the tutorial. |
The newly added open_virtual_dataset and open_virtual_mfdataset functions need a tutorial notebook to show example usage. The current sections I have planned are:

cc @betolink @TomNicholas @danielfromearth