list available backends and basic descriptors #7000

Merged on Oct 17, 2022 (42 commits)
Changes from 26 commits

Commits
b48dc55
add backend desc and docs properties to BackendEntrypoint
JessicaS11 Sep 6, 2022
ea0af7f
add avail_engines function to api
JessicaS11 Sep 6, 2022
88fcd7d
Merge branch 'pydata:main' into listbackends
JessicaS11 Sep 6, 2022
f9e61d1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 6, 2022
7aa2d70
convert engine list to pandas dataframe
JessicaS11 Sep 8, 2022
c8dfb2f
alphabetize (loosely) input/output functions in api.rst
JessicaS11 Sep 8, 2022
e2fc425
add new function to api docs
JessicaS11 Sep 8, 2022
a0fc0c2
add new properties to docstring
JessicaS11 Sep 8, 2022
d405717
add new available backend properties to docs
JessicaS11 Sep 8, 2022
73580dc
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 8, 2022
6af2891
remove redundant 'backend' from attribute names
JessicaS11 Sep 8, 2022
897c204
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 8, 2022
623bb31
turn properties into attributes as per PR review
JessicaS11 Sep 8, 2022
33412c4
fix docstring formatting for attributes
JessicaS11 Sep 8, 2022
06861fa
add str to BackendEntrypoint class
JessicaS11 Sep 8, 2022
1dd69ac
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 8, 2022
14fa13d
Merge branch 'main' into listbackends
JessicaS11 Sep 9, 2022
4fdfad1
update output type for avail_engines function
JessicaS11 Sep 9, 2022
a6b6f55
Merge branch 'main' into listbackends
JessicaS11 Sep 12, 2022
02de430
update docs and code based on PR review
JessicaS11 Sep 12, 2022
b410eec
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 12, 2022
0a48f20
commit unsaved changes for last commit
JessicaS11 Sep 12, 2022
d076fba
Merge branch 'main' into listbackends
JessicaS11 Sep 13, 2022
c3c157f
remove api function and update docs post dev discussion to just retur…
JessicaS11 Sep 15, 2022
3d37077
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 15, 2022
f40660e
remove missed imports
JessicaS11 Sep 15, 2022
1d78879
Apply suggestions from code review - move comment
JessicaS11 Sep 19, 2022
089393f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 19, 2022
331acf5
add list engines typing
JessicaS11 Sep 20, 2022
98b85dc
change str to repr and improve layout
JessicaS11 Sep 20, 2022
fc4c93a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 20, 2022
b6e6a23
Merge branch 'main' into listbackends
JessicaS11 Sep 22, 2022
db9c422
Merge branch 'main' into listbackends
dcherian Oct 3, 2022
1a7707d
Merge branch 'main' into listbackends
dcherian Oct 13, 2022
b978583
deal with issues from changes on main
JessicaS11 Oct 13, 2022
0fdff0c
actually put release notes in right place
JessicaS11 Oct 13, 2022
28a0799
Merge branch 'main' into listbackends
JessicaS11 Oct 13, 2022
3b7fece
Merge branch 'main' into listbackends
JessicaS11 Oct 13, 2022
988af43
Merge branch 'main' into listbackends
dcherian Oct 17, 2022
6f88789
Fix whats-new
dcherian Oct 17, 2022
fdd1346
Apply suggestions from code review
dcherian Oct 17, 2022
bf5ad18
Revert "Apply suggestions from code review"
dcherian Oct 17, 2022
51 changes: 26 additions & 25 deletions doc/api.rst
@@ -588,60 +588,60 @@ Dataset methods
.. autosummary::
:toctree: generated/

open_dataset
load_dataset
open_dataset
open_mfdataset
open_rasterio
open_zarr
Dataset.to_netcdf
Dataset.to_pandas
Dataset.as_numpy
Dataset.to_zarr
save_mfdataset
Dataset.as_numpy
Dataset.from_dataframe
Dataset.from_dict
Dataset.to_array
Dataset.to_dataframe
Dataset.to_dask_dataframe
Dataset.to_dict
Dataset.from_dataframe
Dataset.from_dict
Dataset.to_netcdf
Dataset.to_pandas
Dataset.to_zarr
Dataset.chunk
Dataset.close
Dataset.compute
Dataset.persist
Dataset.load
Dataset.chunk
Dataset.unify_chunks
Dataset.filter_by_attrs
Dataset.info
Dataset.load
Dataset.persist
Dataset.unify_chunks

DataArray methods
-----------------

.. autosummary::
:toctree: generated/

open_dataarray
load_dataarray
open_dataarray
DataArray.as_numpy
DataArray.from_cdms2
DataArray.from_dict
DataArray.from_iris
DataArray.from_series
DataArray.to_cdms2
DataArray.to_dataframe
DataArray.to_dataset
DataArray.to_dict
DataArray.to_index
DataArray.to_iris
DataArray.to_masked_array
DataArray.to_netcdf
DataArray.to_numpy
DataArray.to_pandas
DataArray.to_series
DataArray.to_dataframe
DataArray.to_numpy
DataArray.as_numpy
DataArray.to_index
DataArray.to_masked_array
DataArray.to_cdms2
DataArray.to_iris
DataArray.from_iris
DataArray.to_dict
DataArray.from_series
DataArray.from_cdms2
DataArray.from_dict
DataArray.chunk
DataArray.close
DataArray.compute
DataArray.persist
DataArray.load
DataArray.chunk
DataArray.unify_chunks

Coordinates objects
@@ -1086,6 +1086,7 @@ Advanced API
Dataset.set_close
backends.BackendArray
backends.BackendEntrypoint
backends.list_engines

These backends provide a low-level interface for lazily loading data from
external file-formats or protocols, and can be manually invoked to create
22 changes: 21 additions & 1 deletion doc/internals/how-to-add-new-backend.rst
@@ -16,6 +16,9 @@ If you also want to support lazy loading and dask see :ref:`RST lazy_loading`.
Note that the new interface for backends is available from Xarray
version >= 0.18 onwards.

You can see which backends are currently available in your working environment
with :py:func:`~xarray.backends.list_engines`.

.. _RST backend_entrypoint:

BackendEntrypoint subclassing
@@ -26,7 +29,9 @@ it should implement the following attributes and methods:

- the ``open_dataset`` method (mandatory)
- the ``open_dataset_parameters`` attribute (optional)
- the ``guess_can_open`` method (optional).
- the ``guess_can_open`` method (optional)
- the ``description`` attribute (optional)
- the ``url`` attribute (optional).

This is what a ``BackendEntrypoint`` subclass should look like:

@@ -55,6 +60,10 @@ This is what a ``BackendEntrypoint`` subclass should look like:
return False
return ext in {".my_format", ".my_fmt"}

description = "Use .my_format files in Xarray"

url = "https://link_to/your_backend/documentation"

``BackendEntrypoint`` subclass methods and attributes are detailed in the following.

.. _RST open_dataset:
@@ -168,6 +177,17 @@ that always returns ``False``.
Backend ``guess_can_open`` takes as input the ``filename_or_obj`` parameter of
Xarray :py:meth:`~xarray.open_dataset`, and returns a boolean.

.. _RST properties:

description and url
^^^^^^^^^^^^^^^^^^^^

``description`` is used to provide a short text description of the backend.
``url`` is used to include a link to the backend's documentation or code.

These attributes are surfaced when a user prints a :py:class:`~xarray.backends.BackendEntrypoint` instance.
If ``description`` or ``url`` is not defined, it defaults to an empty string.
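
As a rough illustration of the behaviour described above, the sketch below copies the ``__str__`` logic from this PR's ``common.py`` diff into a stand-in base class so it runs without xarray installed. The class names ``BackendEntrypointSketch``, ``MyBackendEntrypoint`` and ``BareBackendEntrypoint`` are hypothetical; a real backend would subclass ``xarray.backends.BackendEntrypoint`` instead.

```python
class BackendEntrypointSketch:
    """Stand-in for xarray's BackendEntrypoint; only the printing logic is copied."""

    # Both attributes are optional and default to an empty string.
    description: str = ""
    url: str = ""

    def __str__(self) -> str:
        txt = f"Backend type: {type(self).__name__}"
        if self.description:
            txt += f"\n{self.description}"
        if self.url:
            txt += f"\nLearn more at {self.url}"
        return txt


class MyBackendEntrypoint(BackendEntrypointSketch):
    # Values taken from the how-to example in this diff.
    description = "Use .my_format files in Xarray"
    url = "https://link_to/your_backend/documentation"


class BareBackendEntrypoint(BackendEntrypointSketch):
    pass  # neither attribute set, so only the type line is printed


print(MyBackendEntrypoint())
print(BareBackendEntrypoint())
```

With this logic, an unset ``description`` or ``url`` simply drops its line from the printed text rather than leaving a blank line.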

.. _RST decoders:

Decoders
4 changes: 4 additions & 0 deletions doc/whats-new.rst
@@ -216,6 +216,10 @@ New Features
By `Joe Hamman <https://github.com/jhamman>`_.
- Upload development versions to `TestPyPI <https://test.pypi.org>`_.
By `Justus Magin <https://github.com/keewis>`_.
- Improve overall documentation around available backends, including adding docstrings for :py:func:`xarray.backends.list_engines`.
Add :py:meth:`__str__` to surface the new :py:class:`BackendEntrypoint` ``description``
and ``url`` attributes. (:issue:`6577`, :pull:`7000`)
By `Jessica Scheick <https://github.com/jessicas11>`_.

Breaking changes
~~~~~~~~~~~~~~~~
24 changes: 23 additions & 1 deletion xarray/backends/common.py
@@ -369,10 +369,32 @@ class BackendEntrypoint:
- ``guess_can_open`` method: it shall return ``True`` if the backend is able to open
``filename_or_obj``, ``False`` otherwise. The implementation of this
method is not mandatory.

Attributes
----------

open_dataset_parameters : tuple, default None
A list of ``open_dataset`` method parameters.
The setting of this attribute is not mandatory.
description : str
A short string describing the engine.
The setting of this attribute is not mandatory.
url : str
A string with the URL to the backend's documentation.
The setting of this attribute is not mandatory.
"""

    open_dataset_parameters: tuple | None = None
    """list of ``open_dataset`` method parameters"""
    description: str = ""
    url: str = ""

    def __str__(self) -> str:
        txt = f"Backend type: {type(self).__name__}"
        if self.description:
            txt += f"\n{self.description}"
        if self.url:
            txt += f"\nLearn more at {self.url}"
        return txt

Contributor

Suggested change
    description: str = ""
    url: str = ""

    def __str__(self) -> str:
        txt = f"Backend type: {type(self).__name__}"
        if self.description:
            txt += f"\n{self.description}"
        if self.url:
            txt += f"\nLearn more at {self.url}"
        return txt

    def __repr__(self) -> str:
        description = "Base entrypoint for backends"
        url = r"https://xarray.dev/"
        return f"<{type(self).__name__}>\n {description}\n {url}"

Here's how I would format it. I don't think the attributes are necessary; they can live in the repr. The entrypoints xarray supports should be updated with meaningful descriptions and links.

# The single repr:
xr.backends.list_engines()["zarr"]
Out[19]: 
<ZarrBackendEntrypoint>
  Base entrypoint for backends
  https://xarray.dev/

# list_engines is quite readable with the indents:
xr.backends.list_engines()
Out[18]: 
{'netcdf4': <NetCDF4BackendEntrypoint>
   Base entrypoint for backends
   https://xarray.dev/,
 'h5netcdf': <H5netcdfBackendEntrypoint>
   Base entrypoint for backends
   https://xarray.dev/,
 'scipy': <ScipyBackendEntrypoint>
   Base entrypoint for backends
   https://xarray.dev/,
 'pseudonetcdf': <PseudoNetCDFBackendEntrypoint>
   Base entrypoint for backends
   https://xarray.dev/,
 'pydap': <PydapBackendEntrypoint>
   Base entrypoint for backends
   https://xarray.dev/,
 'store': <StoreBackendEntrypoint>
   Base entrypoint for backends
   https://xarray.dev/,
 'zarr': <ZarrBackendEntrypoint>
   Base entrypoint for backends
   https://xarray.dev/}

Collaborator

But then all external backends will have to re-implement the repr and might not follow the indentation. Personally I like the attributes approach more.

Contributor

I think that's okay, it's easy enough to copy/paste the base repr.
I suspect the url will be a wasted attribute as some engines might not have a homepage.

Contributor Author

I took an approach somewhere in between. If we make the descriptions part of the base repr instead of attributes, then it will either have filler text (as above) or funny spacing (if we let description/url = "") if each Backend doesn't fill in the repr fields. I did change it to __repr__ and add the nice indentation (though the dict doesn't print with newlines after each key in my output window, which makes it slightly confusing to look at):

{'netcdf4': <NetCDF4BackendEntrypoint>, 'scipy': <ScipyBackendEntrypoint>
  this is how you use scipy as a backend
  Learn more at scipy.org, 'pydap': <PydapBackendEntrypoint>, 'rasterio': <RasterioBackend>, 'store': <StoreBackendEntrypoint>, 'zarr': <ZarrBackendEntrypoint>}

Happy to make further adjustments but trying to include all suggestions!

Member

I think whether or not the description and url live in the attributes is not a big deal, and it would be great to just get this merged as is so we can include it in the upcoming release! 😀

> (though the dict doesn't print with newlines after each key in my output window, which makes it slightly confusing to look at)

I think you're going to have this problem with any solution that still uses a dict of reprs. (Not sure why that doesn't seem to be the case in @Illviljan's example though.)
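
The run-together output discussed above follows from how the built-in `dict` repr works: it joins `key: repr(value)` pairs with `, ` and adds no line breaks of its own, so a multi-line value repr leaves the next key trailing on its last line. Here is a minimal sketch using a hypothetical stand-in class `Entry` (not xarray code) that shows the effect and one way to print one entry per line. Shells with their own pretty printer (IPython, for example) may wrap long dicts item by item, which could explain the difference from the earlier example. That is a guess, not something confirmed in this thread.

```python
class Entry:
    """Hypothetical stand-in for a backend entrypoint with a multi-line repr."""

    def __init__(self, description: str = "", url: str = "") -> None:
        self.description = description
        self.url = url

    def __repr__(self) -> str:
        # Same shape as the repr discussed in the thread: extra lines only when set.
        txt = f"<{type(self).__name__}>"
        if self.description:
            txt += f"\n  {self.description}"
        if self.url:
            txt += f"\n  Learn more at {self.url}"
        return txt


engines = {"a": Entry(), "b": Entry("demo backend", "example.org")}

# The plain dict repr: entries separated by ", " with no newline after each one.
dict_text = repr(engines)
print(dict_text)

# Formatting one entry per line keeps each key next to its own description.
per_entry_text = "\n".join(f"{name}: {entry!r}" for name, entry in engines.items())
print(per_entry_text)
```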


def open_dataset(
self,
13 changes: 13 additions & 0 deletions xarray/backends/plugins.py
@@ -99,6 +99,19 @@ def build_engines(entrypoints):
@functools.lru_cache(maxsize=1)
def list_engines():
# New selection mechanism introduced with Python 3.10. See GH6514.
"""
Return a dictionary of available engines and their BackendEntrypoint objects.

Returns
-------
dictionary

Notes
-----
This function lives in the backends namespace (``engs = xr.backends.list_engines()``).
More information about each backend, where provided, is available via ``engs["eng_name"]``.

"""
if sys.version_info >= (3, 10):
entrypoints = entry_points(group="xarray.backends")
else:
Expand Down
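
The hunk above stops at the `else:` branch. As a rough sketch of the discovery step it begins, the snippet below queries the `xarray.backends` entry-point group using only the standard library. The helper name `backend_entry_points` and the pre-3.10 fallback (`entry_points().get(...)`) are assumptions for illustration, not the code elided from this diff.

```python
import sys
from importlib.metadata import entry_points


def backend_entry_points(group: str = "xarray.backends") -> list:
    """Return the entry points registered under ``group`` (hypothetical helper)."""
    if sys.version_info >= (3, 10):
        # Selection by keyword, available from Python 3.10.
        return list(entry_points(group=group))
    # Assumed fallback: before 3.10, entry_points() returns a dict keyed by group.
    return list(entry_points().get(group, []))


# Names of the backends that installed packages advertise; empty if none are installed.
names = sorted(ep.name for ep in backend_entry_points())
print(names)
```

The list reflects whatever packages are installed in the current environment, so it may well be empty when no xarray backends are present.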