-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy path10_ana_seasonal_precipitation.Rmd
325 lines (244 loc) · 13.6 KB
/
10_ana_seasonal_precipitation.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
---
title: "Historical precipitation patterns and seasonal forecasts"
author: "Linda Petutschnig"
date: "31 5 2023"
output: html_document
---
The aim of this indicator is to predict whether the upcoming six months will have favorable conditions for mosquito breeding. First, it assesses the months that can be defined as 'rainy season' for all individual locations of the AOI. Rainy season months are considered high risk months. In addition, it calculates which months in the next half year are expected to receive more precipitation than in an average year. If a month that is already part of the rainy season is expected to receive exceptionally high precipitation, the final risk is especially high.
To define the rainy season, we use 30 year CHIRPS data (1990-2019) to establish precipitation patterns for each location. Monthly averages are calculated, and the yearly mean precipitation is derived as the average of these monthly values, to establish a baseline value.
For each location, we also calculate the standard deviation of the monthly averages, describing the deviation from a distribution where every month would receieve the same amount of precipitation. Months with precipitation one standard deviation above the baseline are considered part of the rainy season and receive one point. Months with precipitation close to the baseline receive 0 points, indicating a shoulder season. Months with precipitation one standard deviation below the baseline are assigned -1 point.
Regarding precipitation forecasts, we use anomalies expressed in m s-1, which represent the depth of water equivalent in meters which fall per second. It is the depth the water would have if it were spread evenly over the grid box. Positive and negative anomalies indicate above-average and below-average precipitation, respectively. By comparing these anomalies with the global mean and standard deviation of each months' forecasts for the entire area, we assign points accordingly: +1 for forecasted precipitation that is > 1 standard deviation above the global mean, 0 for forecasted precipitation that is near the global mean, and - 1 for forecasted precipitation that is > 1 standard deviation below the global mean.
To combine both information layers, we sum the points obtained for each location from historical and forecast data. This process is carried out for the months of February to July 2020. The resulting six-layered output is further summarized into one final layer, for easier integration into the overall assessment.
This script performs the following tasks:
- Loads pre-processed CHIRPS precipitation data and manipulates in several ways for data analysis
- Calculates averages
- Makes plots to explore the data
- Load seasonal precipitation forecast data
- Calculate the final indicator as described above
- Writes results into hexagons and exports
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Load libraries
```{r load libraryies, message = FALSE, warning = FALSE}
library(httr)
library(utils)
library(raster)
library(terra)
library(sf)
library(R.utils)
library(tidyr)
library(dplyr)
library(stringr)
library(lubridate)
library(tsibble)
library(ggplot2)
library(here)
library(purrr)
library(timetk)
source(here("00_hexagonify.R"))
```
## Preparing the data for time series analysis
```{r prepare data, message = FALSE, error = FALSE, warning = FALSE}
# Loads the previously prepared brick
prec_aoi_brick <- terra::rast(here::here("data/rasters", "perc_aoi_brick_90_19.tif"))
# Convert raster cells to points
prec_90_19_points <- rasterToPoints(raster::brick(prec_aoi_brick), spatial = TRUE) %>%
st_as_sf() %>%
mutate(id = row_number())
# Convert to long
# Transforming the data into long format
# We want one Year column and a value column for our analysis.
# The pivot_long function from the tidyr package is used to transform the data from wide to long format.
prec_long <- prec_90_19_points %>%
pivot_longer(
cols = starts_with("chirps.v2.0"),
names_to = "Month",
values_to = "Precipitation"
)
# Cleans up the Year column
# The Year column names are currently long strings like "Plasmodium..falciparum..Incidence..2000".
# Only need the last four characters indicating the year are needed.
prec_long$Month <- str_sub(prec_long$Month,-7,-1)
prec_long$Month <- gsub("\\.(?=\\d{2}$)", "-", prec_long$Month, perl=TRUE)
# Converts the Year column into date format and adds 1st
prec_long$Month <- as_date(paste0(prec_long$Month, "-01"))
# Converting the long format data table to a time series tibble by id
# Every geographical point is now represented as 360 individual rows (1990 to 2019, monthly) that share the same id.
prec_tsbl <- as_tsibble(prec_long, key = id, index = Month)
# Calculates the monthly precipitation averaging over all locations
avg <- prec_long %>%
group_by(Month) %>%
summarise(avg_value = mean(Precipitation, na.rm = TRUE))
# Creates a time series plot with each location represented by one line.
ggplot() +
geom_line(data = prec_long, aes(x = Month, y = Precipitation, group =id), color = "darkseagreen", alpha = 0.1) +
geom_line(data = avg, aes(x = Month, y = avg_value), color = "salmon", linewidth = 2) +
labs(x = "Month", y = "Precipitation", title = "Precipitation monthly 1990-2019") +
theme_minimal(base_size = 14)
```
## Average monthly rainfall per year
The next chunk prepares and plots a graph showing the average yearly rainfall pattern of the years 1990-2019 for all locations combined.
```{r average rainfall throughout the year, message = FALSE, error = FALSE, warning = FALSE}
# We want one column that contains only the month
# Extracts from the current column "Month" (that actually contains a full date) the month and writes it into a new column "month_name" as a number
prec_long$month_name <- month(prec_long$Month, label = FALSE)#, abbr = FALSE)
# Calculates the 30-year average precipitation per month for all locations at once
avg_monthly <- prec_long %>%
group_by(month_name) %>%
summarise(avg_value = mean(Precipitation, na.rm = TRUE))
# Creates a time series plot of the 30-year average precipitation per month for all locations at once
ggplot() +
geom_step(data = avg_monthly, aes(x = month_name, y = avg_value), color = "salmon", linewidth = 1) +
labs(x = "Month", y = "Precipitation", title = "Precipitation monthly 1990-2019 whole AOI") +
theme_minimal(base_size = 14)
```
## Zoom in on one randomly selected location
The next chunk prepares and plots precipitation data for one particular location/grid cell.
The green line displays the 30-year monthly-average while the red points represent the individual years.
}
```{r single areas, message = FALSE, error = FALSE, warning = FALSE}
id <- prec_long %>%
group_by(id)
random_cell = id %>%
subset(id == 10000)
monthly_summary = random_cell %>%
group_by(month_name) %>%
summarise(avg_value = mean(Precipitation, na.rm = TRUE))
ggplot() +
geom_point(data = random_cell, aes(x = month_name, y = Precipitation), color = "salmon") +
geom_line(data = monthly_summary, aes(x = month_name, y = avg_value), color = "darkseagreen", linewidth = 1) +
labs(x = "Month", y = "Precipitation", title = "Precipitation monthly 1990-2019") +
theme_minimal(base_size = 14)
```
## Load and prepare percipitation forecast data
```{r load precipitation forecasts, message = FALSE, error = FALSE, warning = FALSE}
# Load the precipitation forecast data
# Set the folder path
folder_path <- here("data/rasters")
# List all files in the folder with the .grib extension
file_names <- list.files(folder_path, pattern = "\\.grib$", full.names = TRUE)
# Loads the first element from the list (assuming that the folder contains only one grib file)
prec_fore <- rast(file_names[[1]])
# For more information on the data, see here: https://codes.ecmwf.int/grib/param-db/?id=173228
# prec_fore <- rast(here("data/rasters","seasonal_prec_forecast_2020_01_lead123456.grib"))
# Visualizes the raw forecast data
plot(prec_fore)
# Gets the unique layer names in the raster stack
names(prec_fore) %>%
unique()
# Gets the time information for each layer in the raster stack
time(prec_fore)
# Sets the correct layer names with the corresponding dates
# This makes it easier to identify each layer by its date/time
names(prec_fore) <- paste("SFC (Ground or water surface);", time(prec_fore))
# Calculates the standard deviation and average for all AOI (all layers)
std_dev <- global(prec_fore, fun = "sd")
avg <- global(prec_fore, fun = "mean")
# Creates a new raster with deviation information about which areas deviate more or less than one standard deviation from the global mean
# 1: Deviates more than one standard deviation above the global mean
# -1: Deviates more than one standard deviation below the global mean
# 0: Deviates within one standard deviation of the global mean
deviation_raster <- ifel(prec_fore > (avg$mean + std_dev$sd), 1,
ifel(prec_fore < (avg$mean - std_dev$sd), -1, 0))
# Visualizes the deviation raster
plot(deviation_raster)
# Extracts one band of the CHIRPS data to which the resolution of the forecast will be matched to
r <- prec_aoi_brick$`chirps-v2.0.1999.01`
#deviation_raster <- project(deviation_raster, r)
# Resamples the forecast resolution to the resolution of the CHIRPS data using bilinear interpolation
prec_fore_resample <- terra::resample(deviation_raster, rast(r), method = "bilinear")
# Visualizes the resampled deviation raster
plot(prec_fore_resample)
# Masks the bbox of the forecast to the exact AOI
# This ensures that the deviation raster only covers the area of interest ('r')
prec_fore_deviation_resample_mask <- prec_fore_resample %>%
mask(r)
# Visualizes the masked deviation raster
plot(prec_fore_deviation_resample_mask)
```
```{r rainy season, message = FALSE, error = FALSE, warning = FALSE}
# Start with all CHIRPS layers
# Make 12 raster stacks (all January's, all February's,...)
# Assuming you have a terra raster with 252 bands named "prec_aoi_brick"
# Get the unique last two digits (months) from the band names
months <- unique(substr(names(prec_aoi_brick), nchar(names(prec_aoi_brick))-1, nchar(names(prec_aoi_brick))))
# Create 12 separate raster stacks, one for each month
monthly_stacks <- list()
for (month in months) {
# Subset the bands that match the current month
bands <- grep(paste0("\\.", month), names(prec_aoi_brick), value = TRUE)
monthly_raster <- subset(prec_aoi_brick, subset = bands)
# Rename the layers to remove the last two digits (month part) from the layer names
names(monthly_raster) <- sub(paste0("\\.", month), "", names(monthly_raster))
# Append the monthly raster stack to the list
monthly_stacks[[month]] <- monthly_raster
}
# Monthly stack pixel means
# Calculates the average rain in Jan, Feb, ... for each pixel throughout the assessment period
monthly_means <- lapply(months, function(month){
monthly_stack <- monthly_stacks[[month]]
monthly_mean <- mean(monthly_stack, na.rm = TRUE)
return(monthly_mean)
})
# Makes a stack from the 12 layers
monthly_means_stack <- rast(monthly_means)
# Calculates the sd for the precipitation throughout the year per pixel
std_dev_per_pixel <- terra::stdev(monthly_means_stack, na.rm = TRUE)
# Calculates the mean precipitation per pixel throughout the year
mean_per_pixel <- terra::mean(monthly_means_stack, na.rm = TRUE)
# Classifies each pixel stack based on its deviation from the mean precipitation throughout the year
rainy_season <- ifel(monthly_means_stack > (mean_per_pixel$mean + std_dev_per_pixel$std), 1,
ifel(monthly_means_stack < (mean_per_pixel$mean - std_dev_per_pixel), -1, 0))
plot(rainy_season)
# Calculates standard deviation per pixel stack
# raster_stacks_std_devs <- lapply(monthly_means, function(raster_stack){
# std_dev <- terra::stdev(raster_stack, na.rm = TRUE)
# return(std_dev)
# })
# Integrate with forecast:
# Overlay Jan + Jan, Feb + Feb , etc.
# Calulates total per cell for the 6 months
# Use lapply to perform the addition for each layer and return a list of raster layers
summarized_layers <- lapply(1:6, function(i) {
# Get the i'th layer from each stack
layer1 <- prec_fore_deviation_resample_mask[[i]]
layer2 <- rainy_season[[i]]
# Performs the addition and returns the result as a raster layer
layer_sum <- layer1 + layer2
return(layer_sum)
})
# Converts the list of raster layers to a new raster stack
risk_jan_jul_2020 <- rast(summarized_layers)
# Visualizes the forecast risk
plot(risk_jan_jul_2020)
# While having a forecast per month holds more information, for the sake of integration with the rest of the assessment, I will use the sum of the six months.
risk_sum <- sum(risk_jan_jul_2020)
# Visualizes the final risk layer
plot(risk_sum)
```
## Hexagonifies the precipitation results
```{r integration with hexagons, message = FALSE, error = FALSE, warning = FALSE}
# Loads hexagon layer from database
aoi_shape <- st_read(here("data/geopackages","AOI_hex5.gpkg"))
sf_use_s2(FALSE)
# Aligns CRSs'
risk_sum <- project(risk_sum, aoi_shape)
# Calls the hexagonify function to aggregate the precipitation risk values into a hexagonal grid.
aoi_shape$prec_risk <- hexagonify(raster(risk_sum), aoi_shape, mean)
# Map
# Defines the color palette for the map
pal <- c('#60b564','#fed400')
library(tmap)
# Sets the map to an interactive mode
tmap_mode("view")
# Creates the map object
tm_shape(aoi_shape) +
tm_polygons(col = "prec_risk", palette = pal, popup.vars = c("Risk of wet rainy season"="prec_risk")) +
tm_borders(lwd = 0.2) +
tm_layout(title = "Risk of wet rainy season") +
tm_fill(col = '#BFE1F4') +
tm_borders(lwd = 1, col = "black")
#Writes the hexagons file into the db
st_write(aoi_shape, here("data/geopackages","AOI_hex5_ind5.gpkg"), append = FALSE)
```