-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path04_interpolation.qmd
275 lines (223 loc) · 9.92 KB
/
04_interpolation.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
# Data interpolation
Variables which enter hydrological analysis like *temperature* or *precipitation* are measured **locally at point locations**. Many of these qualities are then recalculated on different domains using **interpolation** or **extrapolation** algorithms. We interpolate within the boundary given by the values which are known to us and extrapolate outside this boundary. Here the two common methods for interpolation are demonstrated.
## Inverse distance weighting (IDW)
IDW is a simple **deterministic** interpolation method, which is also non-parametric. The interpolated points are calculated with a weighted average of the values which are at disposal. The space is a weighted average of the distribution of points and the weight assigned to each point decrease with increasing distance from the interpolated point.
The point value $Z_p$ is calculated with the knowledge of $z_i$ and distance $d_i$ to the power of $P$.
$$
Z_L = \dfrac{\sum\limits_{i=1}^{n}\dfrac{z_i}{d_i^p}}{\sum\limits_{i=1}^{n}\dfrac{1}{d_i^p}}
$$
Let's imagine we have three values $z_1 = 10$, $z_2 = 15$, $z_3 = 20$ and we need to estimate
a value in a location $L$. The distances to the location from these points are:
$d_1 = 3.0$, $d_2 = 2.0$, $d_3 = 1.5$. The commonly used power $p$ is $2$.\
So the respective values of weights for these are:
$$
\begin{array}{c}
w_1 = \dfrac{10}{3.0^2}\\
w_2 = \dfrac{15}{2.0^2}\\
w_3 = \dfrac{20}{1.5^2}
\end{array}
$$
Hence the numerator and denominator look as follows:
$$
\begin{array}{l}
N = \dfrac{10}{2.0^2} + \dfrac{15}{2.0^2} + \dfrac{20}{1.5^2}\approx 2.5 + 2.2222 + 6.6666 = 11.3888\\
D = 0.25 + 0.1111 + 0.4444 = 0.8055\\
\end{array}
$$
Since this method is very simple, let's calculate it on our own, in a step-by-step manner.
1. First we generate some point in the spatial domain, which will represent our measurements.
2. Next we create a new domain, spatially regular to which we will interpolate.
3. Finally we will perform the calculations and visualize the results.
### Generation of random measurements network
We will create 25 points in the space and also assign some coordinate reference system.
These points will be used for both of the methods.
```{r, fig.align='center', message=FALSE}
library(sf) # Spatial Feature library for spatial data
library(scico) # Scientific palette collection called "scico"
n <- 30 # <1>
dom <- data.frame(x = runif(n, 0, 50), # <1>
y = runif(n, 0, 50), # <1>
value = rnorm(n, mean = 5)) |> # <1>
sf::st_as_sf(coords = c("x", "y"), # <2>
crs = 4326) # <2>
plot(dom, # <3>
nbreaks = n + 1, # <3>
pal = scico(n = n, # <3>
palette = "davos", # <3>
end = 0.9, # <3>
direction = -1), # <3>
main = "Precipitation [mm]", # <3>
pch = 20, # Point character selection # <3>
graticule = TRUE, # Display graticules # <3>
axes = TRUE, # Display plot axes # <3>
bgc = "#f0f0f033", # Background color # <3>
key.pos = 1) # Legend position # <3>
```
1. Create $25$ points with random values, which will serve as the computation origin,
2. store them as **SimpleFeatures** object with coordinate reference system via EPSG, search in <https://epsg.io>
3. Specify `sf:::plot.sf()` function and use scientific colormaps from `scico` package.
We do have the original points, which are scarcely distributed across the domain. Now we need a grid of new points, which will represent the centroids of a raster, on which we want to recalculate.
```{r, fig.align='center', fig.cap='Point location of Precipitation measuring stations'}
grid <- st_make_grid(x = dom, # <1>
cellsize = 2, # <1>
crs = 4326) |> # <1>
st_sf()
plot(x = dom, # <2>
breaks = seq(min(dom$value), max(dom$value), length.out = n), # <2>
pal = scico(n = n - 1, # <2>
palette = "davos", # <2>
end = 0.9, # <2>
direction = -1), # <2>
main = "Precipitation [mm]", # <2>
pch = 20, # Point character selection # <2>
graticule = TRUE, # Display graticules # <2>
axes = TRUE, # Display plot axes # <2>
bgc = "#f0f0f033", # Background color # <2>
key.pos = 1, reset = FALSE) # Legend position # <2>
plot(x = st_geometry(grid), lwd = 0.1, add = TRUE, reset = TRUE) # <2>
```
1. Construct a regular grid using extreme points as boundary limits.
2. Plot with the points.
Now we compute the distances between the points using the `outer()` function
```{r, eval=TRUE, fig.align='center'}
distances <- sapply(1:nrow(st_coordinates(grid)), function(i) {
sqrt((st_coordinates(grid)[i, "X"] - st_coordinates(dom)[, "X"])^2 +
(st_coordinates(grid)[i, "Y"] - st_coordinates(dom)[, "Y"])^2)
})
```
and use these distances for missing values computation
```{r, eval=TRUE, fig.align='center', fig.ext='svg'}
epsilon <- 1e-6 #<1>
distances <- distances + epsilon # <1>
power <- 2 #<2>
# Compute weighted sums for all grid points
weighted_sum <- sapply(1:ncol(distances), function(i) {
sum(dom$value / (distances[, i]^power))
})
# Compute weights sum for all grid points
weights_sum <- sapply(1:ncol(distances), function(i) {
sum(1 / (distances[, i]^power))
})
# Calculate interpolated values for all grid points
interpolated_values <- weighted_sum / weights_sum
# Create a spatial data frame for results
results <- data.frame(
x = st_coordinates(grid)[, "X"],
y = st_coordinates(grid)[, "Y"],
value = interpolated_values
) |>
sf::st_as_sf(coords = c("x", "y"), crs = 4326)
# Replace NaN values with NA for plotting
results$value[is.nan(results$value)] <- NA
# Join results with the grid for visualization
join <- st_join(grid, results)
```
1. Set a tolerance for when the distance is equal to 0.
2. A power for the distance.
At last we visualize the results.
```{r, eval=TRUE, fig.align='center', fig.ext='svg'}
plot(x = join,
breaks = seq(min(results$value), max(results$value), length.out = n),
lwd = 0.001,
pal = scico(n - 1,
palette = "davos",
end = 0.9,
direction = -1),
main = "Precipitation [mm]",
#pch = 20, # Point character selection
graticule = TRUE, # Display graticules
axes = TRUE, # Display plot axes
bgc = "#f0f0f033", # Background color
key.pos = 1,
reset = FALSE) # Legend position
```
## `gstat`
There are many libraries listed in CRAN **geostatistics** task view. One of these is called gstat, it was developed and is maintained by Edzer Pebesma, who is also behind the `raster` and `terra` packages. The gstat package contains functions no olny for interpolations.
```{r, eval=TRUE, fig.align='center'}
library(gstat)
idw_res <- gstat::idw(formula = value ~ 1,
locations = dom,
newdata = grid)
idw_res <- st_as_sf(idw_res)
plot(idw_res |>
dplyr::select(var1.pred),
breaks = seq(min(idw_res$var1.pred),
max(idw_res$var1.pred),
length.out = n),
lwd = 0.001,
pal = scico(n - 1,
palette = "davos",
end = 0.9,
direction = -1),
main = "Precipitation [mm]",
#pch = 20, # Point character selection
graticule = TRUE, # Display graticules
axes = TRUE, # Display plot axes
bgc = "#f0f0f033", # Background color
key.pos = 1,
reset = FALSE)
```
```{r, eval=FALSE}
power <- 2 # You can adjust the power parameter
weighted_sum <- apply(distances, 1, function(d) sum(dom$value / (d^power)))
weighted_sum
```
## Krigging
Another interpolation technique is called **Krigging**. Opposing to the Inverse Distance Weighted method.
```{r, eval=TRUE, fig.align='center'}
projected_crs <- 32633 # Replace with the appropriate UTM zone for your data
dom_projected <- st_transform(dom, crs = projected_crs)
grid_projected <- st_transform(grid, crs = projected_crs)
set.seed(123) # For reproducibility
idw_points <- st_centroid(idw_res)
sampled_points <- idw_points[sample(1:nrow(idw_points), 50), ]
combined_points <- rbind(
dom,
sampled_points |>
dplyr::select(value = var1.pred, geometry)
)
combined_points <- st_transform(combined_points, crs = projected_crs)
```
An essential part of kriging is definition of a variogram, which is a function
describing the degree of spatial dependence of a spatial process. Due to the `range`
parameter, the spatial data needs to be projected.
```{r, eval=TRUE, fig.align='center'}
variogram <- vgm(psill = 5.05, model = "Exp", range = 300000, nugget = 0.15)
```
```{r, eval=TRUE, fig.align='center'}
krig_res <- gstat::krige(
formula = value ~ 1,
locations = combined_points,
newdata = grid_projected,
model = variogram # Use the appropriate fitted variogram model
)
# Step 3: Reproject the kriged result back to the original CRS (WGS84)
krig_res <- st_transform(st_as_sf(krig_res), crs = 4326)
# Generate unique breaks
min_val <- min(krig_res$var1.pred, na.rm = TRUE)
max_val <- max(krig_res$var1.pred, na.rm = TRUE)
# Ensure that min and max are sufficiently different to create unique breaks
if (max_val - min_val > 0) {
breaks <- seq(min_val, max_val, length.out = 25)
} else {
breaks <- unique(c(min_val, max_val)) # Fallback for nearly constant data
}
# Use the scico color palette with the correct number of colors
colors <- scico(length(breaks) - 1,
palette = "davos",
end = 0.9,
direction = -1)
# Plotting using the 'pal' argument for sf objects
plot(
krig_res |> dplyr::select(var1.pred),
breaks = breaks,
pal = colors, # Use 'pal' to specify the color palette
lwd = 0.001,
main = "Precipitation [mm]",
graticule = TRUE,
axes = TRUE,
bgc = "#f0f0f033",
key.pos = 1,
reset = FALSE
)
```