-
Notifications
You must be signed in to change notification settings - Fork 9
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
add rmd file on estimating densities from point patterns
- Loading branch information
Showing
3 changed files
with
10,585 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,132 @@ | ||
--- | ||
title: Estimating densities from a point pattern | ||
maintainer: Thierry Onkelinx | ||
date: '2017-06-28' | ||
output: html_document | ||
params: | ||
cellsize: 20 | ||
--- | ||
|
||
In this example we focus on a set of 10450 coordinates in a small area. The goal is to estimate the local density of points, expressed as the number of point per unit area. The raw coordinates are given in [WGS84 (EPSG:4326)](https://epsg.io/4326), which is a geodetic coordinate system. That is not suited for calculating distances, so we need to re-project the points into a local projected coordinate system. In this case we use [Lambert72 (EPSG:3170)](https://epsg.io/31370). Next we calculate the density. To visualise the density, we have to transform the results back in to WGS84. | ||
|
||
The data used in this example is real data by centred to a different location for privacy reasons. The dataset is available on [GitHub](https://github.com/ThierryO/my_blog/tree/master/data/20170628). | ||
|
||
First we must read the data into R. Plotting the raw data helps to check errors in the data. | ||
|
||
```{r read-data, fig.cap = "Raw data"} | ||
points <- read.delim("point-pattern/points.txt", sep = " ") | ||
library(ggmap) | ||
map <- get_map( | ||
location = c(lon = mean(points$lon), lat = mean(points$lat)), | ||
zoom = 17, | ||
maptype = "satellite", | ||
source = "google" | ||
) | ||
ggmap(map) + | ||
geom_point(data = points, alpha = 0.1, colour = "blue", shape = 4) | ||
``` | ||
|
||
The next step is to convert the dataset in to a `SpatialPoints` object with WGS84 project and re-project it into Lambert72. `sp::CRS()` defines the coordinate systems. `sp::coordinates()<-` is an easy way to convert a `data.frame` into a `SpatialPointsDataFrame`, but without specifying a coordinate system. Therefore we need to override the `proj4string` slot with the correct coordinate system. `sp::spTransform()` converts the spatial object from the current coordinate system to another coordinate system. | ||
|
||
```{r reproject} | ||
library(sp) | ||
crs_wgs84 <- CRS("+init=epsg:4326") | ||
crs_lambert <- CRS("+init=epsg:31370") | ||
coordinates(points) <- ~lon + lat | ||
points@proj4string <- crs_wgs84 | ||
points_lambert <- spTransform(points, crs_lambert) | ||
``` | ||
|
||
Once we have the points into a projected coordinate system, we can calculate the densities. We start by defining a grid. `cellsize` is the dimension of the square grid cell in the units of the projected coordinate system. Meters in case of Lambert72. The boundaries of the grid are defined using `pretty()`, which turns a vector of numbers into a "pretty" vector with rounded numbers. The combination of the boundaries and the cell size determine the number of grid cells `n` in each dimension. `diff()` calculates the difference between to adjacent numbers of a vector. The density is calculated with `MASS::kde2d()` based on the vectors with the longitude and latitude, the number of grid cells in each dimension and the boundaries of the grid. This returns the grid as a list with elements `x` (a vector of longitude coordinates of the centroids), `y` (a vector of latitude coordinates of the centroids) and `z` (a matrix with densities). The values in `z` are densities for the 'average' point per unit area. When we multiply the value `z` with the area of the grid cell and sum all of them we get 1. So if we multiple `z` with the number of points we get the density of the points per unit area. | ||
|
||
We use [`dplyr::mutate()`](http://dplyr.tidyverse.org/) to convert it into a `data.frame`. The last two steps convert the centroids into a set of coordinates for square polygons. | ||
|
||
```{r density} | ||
library(MASS) | ||
library(dplyr) | ||
xlim <- range(pretty(points_lambert$lon)) + c(-100, 100) | ||
ylim <- range(pretty(points_lambert$lat)) + c(-100, 100) | ||
n <- c( | ||
diff(xlim), | ||
diff(ylim) | ||
) / params$cellsize + 1 | ||
dens <- kde2d( | ||
x = points_lambert$lon, | ||
y = points_lambert$lat, | ||
n = n, | ||
lims = c(xlim, ylim) | ||
) | ||
dx <- diff(dens$x[1:2]) | ||
dy <- diff(dens$y[1:2]) | ||
sum(dens$z * dx * dy) | ||
dens <- expand.grid( | ||
lon = dens$x, | ||
lat = dens$y | ||
) %>% | ||
mutate( | ||
density = as.vector(dens$z) * length(points_lambert), | ||
id = seq_along(density) | ||
) %>% | ||
merge( | ||
data.frame( | ||
x = dx * (c(0, 0, 1, 1, 0) - 0.5), | ||
y = dy * (c(0, 1, 1, 0, 0) - 0.5) | ||
) | ||
) %>% | ||
mutate( | ||
lon = lon + x, | ||
lat = lat + y | ||
) | ||
``` | ||
|
||
In order to visualise the result, we have to re-project the coordinates back to WGS84. Then we can display the raster with a web based background image. | ||
|
||
```{r ggmap, fig.cap = "Static image of density"} | ||
coordinates(dens) <- ~lon + lat | ||
dens@proj4string <- crs_lambert | ||
dens_wgs <- spTransform(dens, crs_wgs84) %>% | ||
as.data.frame() | ||
ggmap(map) + | ||
geom_polygon(data = dens_wgs, aes(group = id, fill = density), alpha = 0.5) + | ||
scale_fill_gradientn( | ||
"density\n(#/m²)", | ||
colours = rev(rainbow(100, start = 0, end = .7)), | ||
limits = c(0, NA) | ||
) | ||
``` | ||
|
||
Using `leaflet` to generate a map was a bit more laborious. Using the `data.frame dens_wgs`directly failed. So we converted the `data.frame` in a `SpatialPolygonsDataframe`, which is a combination of a `SpatialPolygons` and a `data.frame`. The `SpatialPolygons` consists of a list of `Polygons`, one for each row of the `data.frame`. A `Polygons` object consist of a list of one or more `Polygon` object. In this example a single polygon which represents the grid cell. | ||
|
||
```{r convert-to-Spatial-Polygons} | ||
dens_sp <- lapply( | ||
unique(dens_wgs$id), | ||
function(i){ | ||
filter(dens_wgs, id == i) %>% | ||
select(lon, lat) %>% | ||
Polygon() %>% | ||
list() %>% | ||
Polygons(ID = i) | ||
} | ||
) %>% | ||
SpatialPolygons() %>% | ||
SpatialPolygonsDataFrame( | ||
data = dens_wgs %>% | ||
distinct(id, density), | ||
match.ID = FALSE | ||
) | ||
``` | ||
|
||
`leaflet` requires a predefined function with a colour pallet. We use `leaflet::colorNumeric()` to get a continuous pallet. Setting `stroke = FALSE` removes the borders of the polygon. `fillOpacity` sets the transparency of the polygons. | ||
|
||
```{r leaflet, fig.cap = "Dynamic map of density"} | ||
library(leaflet) | ||
pal <- colorNumeric( | ||
palette = rev(rainbow(100, start = 0, end = .7)), | ||
domain = c(0, dens_sp$density) | ||
) | ||
leaflet(dens_sp) %>% | ||
addTiles() %>% | ||
addPolygons(color = ~pal(density), stroke = FALSE, fillOpacity = 0.5) %>% | ||
addLegend(pal = pal, values = ~density) | ||
``` | ||
|
Oops, something went wrong.