Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CMIP6 endpoint, using "cryo" coverage #510

Merged
merged 17 commits into from
Jan 13, 2025
Merged

Add CMIP6 endpoint, using "cryo" coverage #510

merged 17 commits into from
Jan 13, 2025

Conversation

Joshdpaul
Copy link
Contributor

@Joshdpaul Joshdpaul commented Jan 10, 2025

This PR is a first shot at the monthly CMIP6 endpoint, which fetches data from the cmip6_monthly_cryo_test coverage which currently includes the complete set of "V1" variables. There are a few known issues with this endpoint (listed below) but this should make for a decent start.

To test:

Fire up the API, being sure to set your environment like so:

export API_GS_BASE_URL=http://earthmaps.io/geoserver/
export API_RAS_BASE_URL=https://zeus.snap.uaf.edu/rasdaman/
  1. Check out the documentation page at http://127.0.0.1:5000/cmip6/.
    Known issues: lacks sources/citations. Variable definitions are a bit clunky and may require some reformatting.

  2. Test some requests, make sure all the options work, try to break it. This is a pan-Arctic coverage, so anything north of 50 degrees latitude is fair game! Make sure to test siconc (sea ice concentration) on definite land points and definite sea points - is the behavior as you would expect?

  3. Specifically do a test with a year range of 2014-2015 (e.g., http://127.0.0.1:5000/cmip6/point/65/-147/2014/2015?vars=pr), and review the notes that start here. Again, is this what you would expect? @charparr solved this problem in a different way for the wet days per year route here. Known issue: Some projected data will still return all 0's if a given scenario lacks data. This happens because all models do not have data for every scenario. I am open to suggestions on how we can improve this part! It's difficult to know what to do when Rasdaman returns 0s for empty arrays instead of NaNs or -9999s. 🤷🏻

  4. Note all the TODO items in cmip6.py - these are specific spots where I had to make changes to accomodate the strings that have appeared in the Rasdaman coverage encodings. If we ever fix that issue in the coverage, I believe we can zip thru here and just delete these TODO sections and the route should work the same as the older cmip6_monthly coverage. I was hoping not to bake that stuff into the cake too much here.

  5. Review the new validate_cmip6_latlon() function, and verify that a bad request gets routed correctly and shows the map with pan-Arctic bounding box, e.g. http://127.0.0.1:5000/cmip6/point/49/-147 spits out:
    image

  6. Besides bugs with the code, is there anything missing here? Do we want any kind of decadal summaries or mmm functionality here?

One question I think is worth discussing, is how this CMIP6 endpoint is organized around a specific dataset (with multiple variables/themes), while the other endpoints are organized around variables or themes (which may contain one or more datasets). Sometimes this list looks weird to me for that reason:

image

Copy link
Member

@charparr charparr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome to see the addition of the CMIP6 cryo stuff! I like how readable the code is here and you've documented why certain things are happening. Nice. I've left a few comments regarding the docs for which you might consider making some changes. Also, let me know if you'd accept a commit here to grab the bounding box from the metadata vs. hard-coding from the config. I'd be happy to do that!

config.py Outdated Show resolved Hide resolved
"""
Make an async request for CMIP6 monthly data for a range of models, scenarios, and years at a specified point

Args:
lat (float): latitude
lon (float): longitude
var_coord (int): variable coordinate from dim_encoding, if specified
time_slice (str): time slice for the data request, if specified
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to have time slicing available here!

@@ -78,24 +96,52 @@ def package_cmip6_monthly_data(point_data_list, var_id=None):
if var_id != None:
varname = var_id
else:

# TODO: fix cryo coverage so we can delete this line below
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these comments - this will help us update the code if/when the "string-representation-of-dicts' issue becomes resolved on the Rasdaman side. However, this seems to work great as is!

routes/cmip6.py Outdated Show resolved Hide resolved
routes/cmip6.py Outdated
Comment on lines 163 to 174
# Responses from Rasdaman include the same array length for both
# historical and projected data, representing every possible year
# in the request. This means both the historical and projected data
# arrays may include nodata years populated with 0s if the year range
# spans 2014 -2015 (2014 is the last year for historical data, and
# 2015 is the first year of projected data).

# The code below replaces 0s with -9999 for nodata years depending on year.
# If the scenario is historical and the year is greater than 2014,
# all 0 values are replaced with -9999 and will be pruned from the response.
# If the scenario is not historical, and the year is less than 2015,
# all 0 values are replaced with -9999 and will be pruned from the response.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like how you explained the "why" of what is happening in the code that follows! I think this issue might be relevant to this quest: ua-snap/rasdaman-ingest#24

Again, this works great at the moment. I think ultimately we'd want to add an "invalid_dimensions" object to our metadata block in our ingest recipes, and then use the describe coverage request to fetch that, and then have a pre-baked map of the dict keys that we expect to contain zero data, and pop them out of of the dictionary.

<strong>evspsbl</strong>: Evaporation, including sublimation and transpiration &mdash; Evaporation at surface (also known as evapotranspiration) in kg m-2 s-1; flux of water into the atmosphere due to conversion of both liquid and solid phases to vapor (from underlying surface and vegetation).
</p>
<p>
<strong>hfls</strong>: Surface upward latent heat flux &mdash; The surface latent heat flux in W m-2; is the exchange of heat between the surface and the air on account of evaporation (including sublimation).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the units receive some special typographic treatment like:

W m-2

Not sure, but might be nice if/when we are copy-pasting info from the docs

<strong>rsds</strong>: Surface downwelling shortwave radiation &mdash; Surface solar irradiance in W m-2, for UV calculations.
</p>
<p>
<strong>sfcWind</strong>: Surface wind &mdash; near-surface (usually, 10 meters) wind speed, in m s-1.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear to me what "usually" means in this instance and others. I suggest going into more detail or removing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definition is essentially copied from the CMIP6 docs and is particularly messy. Not sure how much we want to alter the "official" definitions. I will bring this up to the group.

<strong>tas</strong>: Near surface air temperature &mdash; mean near-surface (usually, 2 meter) air temperature, in degrees K.
</p>
<p>
<strong>tasmax</strong>: Maximum near surface air temperature &mdash; maxmimum near-surface (usually, 2 meter) air temperature, in degrees K.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantic point: SI convention is to use just K or kelvin and not degrees K. K is not referred to in degrees in SI convention.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants