add multidimensional spatial means tutorial #860

Merged: 3 commits, Jan 20, 2025
57 changes: 54 additions & 3 deletions docs/src/tutorials/spatial_mean.md
@@ -88,13 +88,13 @@ You can see here that cells are largest towards the equator, and smallest away f

## Computing the spatial mean

- Now we can compute the average precipitation per square meter. First, we compute total precipitation per grid cell:
+ Now we can compute the average precipitation per square meter. First, we compute total precipitation over each grid cell. (The units of this Raster will be m^2 * mm, which happens to be equal to liters.)

````@example cellarea
precip_per_area = masked_precip .* masked_areas
````

- We can sum this to get the total precipitation per square meter across Chile:
+ We can sum this to get the total precipitation across Chile:

````@example cellarea
total_precip = sum(skipmissing(precip_per_area))
@@ -106,7 +106,7 @@ We can also sum the areas to get the total area of Chile (in this raster, at lea
total_area = sum(skipmissing(masked_areas))
````

- And we can convert that to an average by dividing by the total area:
+ And we can convert that to an average (in mm) by dividing by the total area:

````@example cellarea
avg_precip = total_precip / total_area
@@ -141,3 +141,54 @@ We've also seen how to use the `cellarea` function to compute the area of each c

We've seen that the spatial mean is not the same as the arithmetic mean, and that we need to account for the area of each cell when computing the average.
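
To make the difference concrete, here is a minimal sketch using plain Julia vectors instead of Rasters. The values and the variable names `precip`, `areas`, and `valid` are made up for illustration and are not taken from the WorldClim data; the point is only that larger cells should count for more.

```julia
# Hypothetical toy values, not taken from the WorldClim data
precip = [10.0, 20.0, missing, 40.0]   # precipitation per cell, in mm
areas  = [1.0, 1.0, 2.0, 2.0]          # cell areas, in m^2

# Keep only the cells where precipitation is available
valid = .!ismissing.(precip)

# Arithmetic mean: every cell counts equally, regardless of its size
arith_mean = sum(precip[valid]) / count(valid)

# Area-weighted mean: total volume (mm * m^2) divided by total area (m^2)
weighted_mean = sum(precip[valid] .* areas[valid]) / sum(areas[valid])

println("arithmetic mean:    ", arith_mean)
println("area-weighted mean: ", weighted_mean)
```

With these toy numbers the area-weighted mean is 27.5 mm, while the arithmetic mean is about 23.3 mm, because the wettest cell is also one of the largest.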

## Bonus: Computing spatial means across dimensions

As a next step, we would like to know how precipitation in Chile is projected to change by the end of the 21st century. To do this, we can use climate model outputs. This is a bit more complicated than calculating historical precipitation, because the projections can come from multiple climate models (GCMs), each of which can be run under different socio-economic scenarios (SSPs). Here, we'll show how to use additional dimensions to keep track of this type of data.

To start, we define a simple function that takes an SSP (socioeconomic scenario) and a GCM (climate model) as input, and returns the appropriate climate data.

````@example cellarea
using Dates
getfutureprec(ssp, gcm) = Raster(WorldClim{Future{Climate, CMIP6, gcm, ssp}}, :prec, date = Date(2090))
````

Rather than having a separate Raster object for each combination of GCM and SSP, we will do our analysis on a single Raster, which will have `gcm` and `ssp` as additional dimensions. In total, our Raster will have four dimensions: X, Y, gcm, and ssp.

To accomplish this, we will leverage some tools from [DimensionalData](https://github.com/rafaqz/DimensionalData.jl), which is the package that underlies Rasters.jl. We start by defining two dimensions that correspond to the SSPs and GCMs we are interested in, then use the `@d` macro from [DimensionalData](https://github.com/rafaqz/DimensionalData.jl) to preserve these dimensions as we get the data, and then combine all Rasters into a single object using `Rasters.combine`.

````@example cellarea
SSPs = Dim{:ssp}([SSP126, SSP370]) # SSP126 is a low-emission scenario, SSP370 is a high-emission scenario
GCMs = Dim{:gcm}([GFDL_ESM4, IPSL_CM6A_LR]) # These are different general circulation (climate) models

precip_future = (@d getfutureprec.(SSPs, GCMs)) |> RasterSeries |> Rasters.combine
````

Since the format of WorldClim's datasets for future climate is slightly different from the dataset for the historical period, this actually returns a 5-dimensional raster, with a `Band` dimension that represents months. Here we'll just select the 6th month, matching the selection above (but note that the analysis would also work for all Bands simultaneously). We will also replace the `NaN` missing value with the more standard `missing` using [`replace_missing`](@ref).

````@example cellarea
precip_future = precip_future[Band = 6]
precip_future = replace_missing(precip_future)
````

Review comments on the line `precip_future = precip_future[Band = 6]`:

Contributor: We could keep the Band information and do the next steps for all months at the same time?

Collaborator Author: We could, I just wanted to keep it simple. At the end of the whole thing it just prints a 2x2 dimArray and that's easier to read than a 12x2x2 dimarray

Contributor: That makes sense and I would add this info to the tutorial:

> We are restricting here to one Band for simplicity but the analysis would also work for all Bands simultaneously.

On our 4-dimensional raster, functions like `crop` and `mask`, as well as broadcasting, will still work.

Here we repeat the procedure from above to mask out areas so we only have data for Chile, and then multiply by the cell area.

````@example cellarea
masked_precip_future = mask(crop(precip_future; to = chile); with = chile)

precip_litres_future = masked_precip_future .* areas
````

Now we calculate the average precipitation for each SSP and each GCM. Annoyingly, the future WorldClim dataset doesn't have data for all land pixels, so we have to re-calculate the total area.

````@example cellarea
masked_areas_future = mask(areas, with = masked_precip_future[ssp = 1, gcm = 1])
total_area_f = sum(skipmissing(masked_areas_future))

avg_prec_future = map(eachslice(precip_litres_future; dims = (:ssp, :gcm))) do slice
    sum(skipmissing(slice)) / total_area_f
end
````

This shows us that June rainfall in Chile is projected to be slightly lower in the future, especially under the high-emission SSP370 scenario.
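
For reference, the quantity computed for each scenario and model can be written out as follows. The symbols are introduced here for illustration only and do not appear in the tutorial: $P_{i,s,g}$ is the precipitation (in mm) in cell $i$ under scenario $s$ and model $g$, $A_i$ is the area of cell $i$ (in m^2), and $V$ is the set of cells that have data.

```math
\bar{P}_{s,g} = \frac{\sum_{i \in V} P_{i,s,g} \, A_i}{\sum_{i \in V} A_i}
```

Note that the code above takes $V$ from the first `ssp`/`gcm` slice only, so this reading assumes that every slice is missing data in the same cells.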