-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathindex.Rmd
104 lines (73 loc) · 3.97 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
---
title: "DfT R Cookbook"
author: "Isi Avbulimen, Hannah Bougdah, Tamsin Forbes, Jack Marks, Tim Taylor, Francesca Bryden"
date: "`r Sys.Date()`"
site: bookdown::bookdown_site
output:
bookdown::gitbook:
df_print: kable
fig_width: 7
fig_height: 6
documentclass: book
bibliography: [book.bib, packages.bib]
biblio-style: apalike
link-citations: yes
github-repo: departmentfortransport/R-cookbook
description: "Guidance and code examples for R usage for DfT and beyond"
---
```{r include=FALSE}
require(dplyr)
require(knitr)
options(knitr.table.format = "html")
# Load example data for Road Casualties in Great Britain 1969-84
data(mtcars)
# Look at the data
mtcars <- tibble::as.tibble(mtcars) %>%
# Add rows for dplyr example
tibble::rownames_to_column(var = "car")
```
# {-}
```{r echo=FALSE, fig.alt="Title page"}
knitr::include_graphics(path = "image/book_image_short.png")
```
# Why do we need _another_ book?
`R` is a very flexible programming language, which inevitably means there are lots of ways to achieve the same result. This is true of all programming languages, but is particularly exaggerated in `R` which makes use of ['meta-programming'](http://adv-r.had.co.nz/).
For example, here is how to calculate a new variable using standard R and filter on a variable:
```{r}
# Calculate kilometers per litre from miles per gallon
mtcars$kpl <- mtcars$mpg * 0.425144
# Select cars with a horsepower greater than 250 & show only mpg and kpl columns
mtcars[mtcars$hp > 250, c("car", "mpg", "kpl")]
```
Here's the same thing using {tidyverse} style R:
```{r}
mtcars %>%
# Calculate kilometers per litre
dplyr::mutate(
kpl = mpg * 0.425144
) %>%
# Filter cars with a horsepower greater than 250
dplyr::filter(
hp > 250
) %>%
# Take only the car, mpg, and newly created kpl columns
dplyr::select(car, mpg, kpl)
```
These coding styles are quite different. As people write more code across the Department, it will become increasingly important that code can be handed over to other R users. It is much easier to pick up code written by others if it uses the same coding style you are familiar with.
This is the main motivation for this book, to establish a way of coding that represents a sensible default for those who are new to R that is readily transferable across DfT.
## Coding standards
Related to this, the Data Science team maintain a [coding standards document](https://department-for-transport.github.io/ds-processes/Coding_standards/r.html), that outlines some best practices when writing R code. This is not prescriptive and goes beyond the scope of this document, but might be useful for managing your R projects.
```{r include=FALSE}
# automatically create a bib database for R packages
knitr::write_bib(c(
.packages(), "bookdown", "knitr", "rmarkdown"
), "packages.bib")
```
## Data
The data used in this book is all derived from opensource data. As well as being availble fro the data folder on the github site [here](https://github.com/departmentfortransport/R-cookbook/tree/master/data) you can also find the larger data sets at the links below.
- [Road Safety Data](https://data.gov.uk/dataset/cb7ae6f0-4be6-4935-9277-47e5ce24a11f/road-safety-data)
- Search and Rescue Helicopter data; SARH0112 record level data download available under **All data** at the bottom of this [webpage](https://www.gov.uk/government/statistical-data-sets/search-and-rescue-helicopter-sarh01).
- Pokemon data, not sure of original source, but borrowed from Matt Dray [here](https://github.com/matt-dray/beginner-r-feat-pkmn/tree/master/data)
- Port data; the port data can be downloaded by clicking on the table named [PORT0499](https://www.gov.uk/government/statistical-data-sets/port-and-domestic-waterborne-freight-statistics-port)
## Work in progress
This book is not static - new chapters can be added and current chapters can be amended. Please let us know if you have suggestions by raising an issue [here](https://github.com/department-for-transport/R-cookbook/issues).