-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CMIP6 endpoint, using "cryo" coverage #510
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome to see the addition of the CMIP6 cryo stuff! I like how readable the code is here and you've documented why certain things are happening. Nice. I've left a few comments regarding the docs for which you might consider making some changes. Also, let me know if you'd accept a commit here to grab the bounding box from the metadata vs. hard-coding from the config. I'd be happy to do that!
""" | ||
Make an async request for CMIP6 monthly data for a range of models, scenarios, and years at a specified point | ||
|
||
Args: | ||
lat (float): latitude | ||
lon (float): longitude | ||
var_coord (int): variable coordinate from dim_encoding, if specified | ||
time_slice (str): time slice for the data request, if specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad to have time slicing available here!
@@ -78,24 +96,52 @@ def package_cmip6_monthly_data(point_data_list, var_id=None): | |||
if var_id != None: | |||
varname = var_id | |||
else: | |||
|
|||
# TODO: fix cryo coverage so we can delete this line below |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these comments - this will help us update the code if/when the "string-representation-of-dicts' issue becomes resolved on the Rasdaman side. However, this seems to work great as is!
routes/cmip6.py
Outdated
# Responses from Rasdaman include the same array length for both | ||
# historical and projected data, representing every possible year | ||
# in the request. This means both the historical and projected data | ||
# arrays may include nodata years populated with 0s if the year range | ||
# spans 2014 -2015 (2014 is the last year for historical data, and | ||
# 2015 is the first year of projected data). | ||
|
||
# The code below replaces 0s with -9999 for nodata years depending on year. | ||
# If the scenario is historical and the year is greater than 2014, | ||
# all 0 values are replaced with -9999 and will be pruned from the response. | ||
# If the scenario is not historical, and the year is less than 2015, | ||
# all 0 values are replaced with -9999 and will be pruned from the response. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like how you explained the "why" of what is happening in the code that follows! I think this issue might be relevant to this quest: ua-snap/rasdaman-ingest#24
Again, this works great at the moment. I think ultimately we'd want to add an "invalid_dimensions" object to our metadata block in our ingest recipes, and then use the describe coverage request to fetch that, and then have a pre-baked map of the dict keys that we expect to contain zero data, and pop them out of of the dictionary.
<strong>evspsbl</strong>: Evaporation, including sublimation and transpiration — Evaporation at surface (also known as evapotranspiration) in kg m-2 s-1; flux of water into the atmosphere due to conversion of both liquid and solid phases to vapor (from underlying surface and vegetation). | ||
</p> | ||
<p> | ||
<strong>hfls</strong>: Surface upward latent heat flux — The surface latent heat flux in W m-2; is the exchange of heat between the surface and the air on account of evaporation (including sublimation). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the units receive some special typographic treatment like:
W m-2
Not sure, but might be nice if/when we are copy-pasting info from the docs
<strong>rsds</strong>: Surface downwelling shortwave radiation — Surface solar irradiance in W m-2, for UV calculations. | ||
</p> | ||
<p> | ||
<strong>sfcWind</strong>: Surface wind — near-surface (usually, 10 meters) wind speed, in m s-1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unclear to me what "usually" means in this instance and others. I suggest going into more detail or removing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definition is essentially copied from the CMIP6 docs and is particularly messy. Not sure how much we want to alter the "official" definitions. I will bring this up to the group.
<strong>tas</strong>: Near surface air temperature — mean near-surface (usually, 2 meter) air temperature, in degrees K. | ||
</p> | ||
<p> | ||
<strong>tasmax</strong>: Maximum near surface air temperature — maxmimum near-surface (usually, 2 meter) air temperature, in degrees K. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedantic point: SI convention is to use just K
or kelvin
and not degrees K. K is not referred to in degrees in SI convention.
This PR is a first shot at the monthly CMIP6 endpoint, which fetches data from the
cmip6_monthly_cryo_test
coverage which currently includes the complete set of "V1" variables. There are a few known issues with this endpoint (listed below) but this should make for a decent start.To test:
Fire up the API, being sure to set your environment like so:
Check out the documentation page at
http://127.0.0.1:5000/cmip6/
.Known issues: lacks sources/citations. Variable definitions are a bit clunky and may require some reformatting.
Test some requests, make sure all the options work, try to break it. This is a pan-Arctic coverage, so anything north of 50 degrees latitude is fair game! Make sure to test
siconc
(sea ice concentration) on definite land points and definite sea points - is the behavior as you would expect?Specifically do a test with a year range of 2014-2015 (e.g.,
http://127.0.0.1:5000/cmip6/point/65/-147/2014/2015?vars=pr
), and review the notes that start here. Again, is this what you would expect? @charparr solved this problem in a different way for the wet days per year route here. Known issue: Some projected data will still return all 0's if a given scenario lacks data. This happens because all models do not have data for every scenario. I am open to suggestions on how we can improve this part! It's difficult to know what to do when Rasdaman returns 0s for empty arrays instead of NaNs or -9999s. 🤷🏻Note all the
TODO
items incmip6.py
- these are specific spots where I had to make changes to accomodate the strings that have appeared in the Rasdaman coverage encodings. If we ever fix that issue in the coverage, I believe we can zip thru here and just delete theseTODO
sections and the route should work the same as the oldercmip6_monthly
coverage. I was hoping not to bake that stuff into the cake too much here.Review the new
validate_cmip6_latlon()
function, and verify that a bad request gets routed correctly and shows the map with pan-Arctic bounding box, e.g.http://127.0.0.1:5000/cmip6/point/49/-147
spits out:Besides bugs with the code, is there anything missing here? Do we want any kind of decadal summaries or
mmm
functionality here?One question I think is worth discussing, is how this CMIP6 endpoint is organized around a specific dataset (with multiple variables/themes), while the other endpoints are organized around variables or themes (which may contain one or more datasets). Sometimes this list looks weird to me for that reason: