Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add multidimensional spatial means tutorial #860

Merged
merged 3 commits into from
Jan 20, 2025
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 51 additions & 0 deletions docs/src/tutorials/spatial_mean.md
Original file line number Diff line number Diff line change
Expand Up @@ -141,3 +141,54 @@ We've also seen how to use the `cellarea` function to compute the area of each c

We've seen that the spatial mean is not the same as the arithmetic mean, and that we need to account for the area of each cell when computing the average.

## Bonus: Computing spatial means across dimensions

As a next step, we would like to know how precipitation will change in Chile until the end of the 21st century. To do this, we can use climate model outputs. This data can come from multiple climate models (GCMs) and under different socio-economic scenarios (SSPs). We'll use additional dimensions to keep track of these.

First we define a simple function takes an SSP (socioeconomic scenario) and a GCM (climate model) as input, and downloads the appropriate climate data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
First we define a simple function takes an SSP (socioeconomic scenario) and a GCM (climate model) as input, and downloads the appropriate climate data.
First we define a simple function which takes an SSP (socioeconomic scenario) and a GCM (climate model) as input, and downloads the appropriate climate data.


````@example zonal
using Dates
getfutureprec(ssp, gcm) = Raster(WorldClim{Future{Climate, CMIP6, gcm, ssp}}, :prec, date = Date(2090))
````

We will leverage some tools from [DimensionalData](https://github.com/rafaqz/DimensionalData.jl), which is the package that underlies Rasters.jl. Rather than having a seperate Raster for each combination of GCM and SSP, `gcm` and `ssp` will be additional dimensions, and our Raster will be 4-dimensional (X-Y-gcm-ssp).

To do this, we first define two dimensions that correspond to the SSPs and GCMs we are interested in, then use the `@d` macro from [DimensionalData](https://github.com/rafaqz/DimensionalData.jl) to preserve these dimensions as we get the data, and then combine all Rasters into a single object using `Rasters.combine`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should link to the docstring of @d directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use DocumenterInterLinks? I've never tried it but seems like this is what it's made for

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we should try it, Rasters/DD doc links came up on slack too it might make things easier

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apparently, we don't support that yet in DVpress (never saw the benefit or was even aware of this, until now). I might take a look later this week.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had some code to do that in my DocumenterMarkdown PR, will see about resurrecting it


````@example cellarea
SSPs = Dim{:ssp}([SSP126, SSP370]) # SSP126 is a low-emission scenario, SSP370 is a high-emission scenario
GCMs = Dim{:gcm}([GFDL_ESM4, IPSL_CM6A_LR]) # These are different general circulation (climate) models

precip_future = (@d getfutureprec.(SSPs, GCMs)) |> RasterSeries |> Rasters.combine
````

Since the format of WorldClim's datasets for future climate is slightly different from the dataset for the historical period, this actually returned a 5-dimensional raster, with a `Band` dimension that represents months. Here we'll just select the 6th month, matching the selection above. We will also replace the `NaN` missing value by the more standard `missing` using [`replace_missing`](@ref).

````@example cellarea
precip_future = precip_future[Band = 6]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could keep the Band information and do the next steps for all months at the same time?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, I just wanted to keep it simple. At the end of the whole thing it just prints a 2x2 dimArray and that's easier to read than a 12x2x2 dimarray

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense and I would add this info to the tutorial:

We are restricting here to one Band for simplicity but the analysis would also work for all Bands simultaneously.

precip_future = replace_missing(precip_future)
````

On our 4-dimensional raster, functions like `crop` and `mask`, as well as broadcasting, will still work.

Here we repeat the procedure from above to mask out areas so we only have data for Chile, and then multiply by the cell area.

````@example cellarea
masked_precip_future = mask(crop(precip_future; to = chile); with = chile)

precip_litres_future = masked_precip_future .* areas
````

Now we calculate the average precipitation for each SSP and each GCM. Annoyingly, the future WorldClim doesn't have data for all land pixels, so we have to re-calculate the total area.

````@example cellarea
masked_areas_future = mask(areas, with = masked_precip_future[ssp = 1, gcm = 1])
total_area_f = sum(skipmissing(masked_areas_future))

avg_prec_future = map(eachslice(precip_litres_future; dims = (:ssp, :gcm))) do slice
sum(skipmissing(slice)) / total_area_f
end
````

Which shows us that June rainfall in Chile will be slightly lower in the future, especially under the high-emission SSP370 scenario.
Loading