---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# spanishflu
<!-- badges: start -->
[![Travis build status](https://travis-ci.org/markushlang/covid19vaccines.svg?branch=master)](https://travis-ci.org/markushlang/covid19vaccines)
<!-- badges: end -->
The `spanishflu` package contains ten datasets covering the Spanish flu in the United States:
- The first two datasets, `naval_forces_in_the_us_deaths` and `army_in_the_us_deaths`, can be used to visualize deaths from influenza and pneumonia in the military forces, where the Spanish flu first spread.
- The third dataset, `deaths_registered_in_certain_cities`, contains weekly death counts from larger U.S. cities between September 8, 1918 and March 15, 1919.
- The fourth, fifth, and sixth datasets, `deaths_registered_in_certain_states_1000`, `deaths_registered_in_certain_cities_1000`, and `deaths_registered_in_nyc_boroughs_1000`, contain mortality figures per 1,000 population for U.S. states, cities of 100,000 population or more, and New York City's five boroughs.
- The seventh dataset, `deaths_by_sex_age`, breaks down deaths from influenza and pneumonia per 1,000 population by sex and age for all U.S. states except Hawaii.
- The eighth, ninth, and tenth datasets, `death_estimated_vs_excess_states`, `death_estimated_vs_excess_cities`, and `death_estimated_vs_excess_nyc_boroughs`, help distinguish "excess" deaths from influenza and pneumonia from "normal" deaths from the same causes in U.S. states, larger cities, and New York City's boroughs.
The datasets in the package come from Alfred W. Crosby's (2003) book ["America's Forgotten Pandemic"](https://www.amazon.com/Americas-Forgotten-Pandemic-Influenza-1918/dp/0521541751), which provides key context for all of them. In addition to Crosby's data, I added data on non-pharmaceutical interventions by larger U.S. cities during the 1918 and 1919 outbreak from [Howard Markel and colleagues](https://jamanetwork.com/journals/jama/fullarticle/208354).
## Installation
You can install the beta version of `spanishflu` from [GitHub](https://github.com/markushlang/spanishflu) with:
```r
install.packages("devtools")
devtools::install_github("markushlang/spanishflu")
```
## Load the data
```{r load}
library(spanishflu)
data("deaths_registered_in_certain_cities")
```
To preview the tibble that contains the data, print its first rows:
```{r load-doc}
head(deaths_registered_in_certain_cities)
```
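To see which datasets are available before loading one, base R's `data()` can list what a package ships. The following is a minimal sketch that assumes `spanishflu` is already installed; the variable name `dataset_names` is only illustrative, and the chunk is not evaluated when the README is knit.

```r
# List the datasets shipped with spanishflu, if the package is installed.
# data(package = ...) is base R; its "results" matrix has an "Item" column
# holding the dataset names.
dataset_names <- character(0)
if (requireNamespace("spanishflu", quietly = TRUE)) {
  dataset_names <- data(package = "spanishflu")$results[, "Item"]
}
print(dataset_names)
```

If the package is not installed, the vector stays empty rather than raising an error.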
## Example
```{r example, fig.cap = "Cumulative Reported Deaths from the Spanish Flu, Selected Cities", dpi = 300, fig.width = 8, fig.height = 8, message = FALSE, warning = FALSE}
# load packages
library(pacman)
pacman::p_load("tidyverse", "lubridate", "ggrepel", "paletteer", "scales", "prismatic")
pacman::p_load_gh("markushlang/spanishflu")
# prepare cumulative counts
flu_curve <- deaths_registered_in_certain_cities %>%
  select(date, city, deaths) %>%
  group_by(city) %>%
  arrange(date) %>%
  # treat missing weekly counts as zero so cumsum() does not propagate NA
  mutate(deaths = ifelse(is.na(deaths), 0, deaths)) %>%
  mutate(cu_deaths = cumsum(deaths)) %>%
  # start each city's curve once it has reached 10 cumulative deaths
  filter(cu_deaths > 9) %>%
  mutate(days_elapsed = date - min(date),
         end_label = ifelse(date == max(date), city, NA))
# create cumulative deaths plot
flu_curve %>%
  filter(city %in% c("New York", "Philadelphia", "Chicago",
                     "Boston", "Pittsburgh")) %>%
  ggplot(mapping = aes(x = days_elapsed, y = cu_deaths,
                       color = city, label = end_label,
                       group = city)) +
  geom_line(size = 0.8) +
  # label each line at its last observation instead of using a legend
  geom_text_repel(nudge_x = 1.1,
                  nudge_y = 0.1,
                  segment.color = NA) +
  guides(color = "none") +
  theme_minimal() +
  scale_color_manual(values = prismatic::clr_darken(paletteer_d("jcolors::default"), 0.2)) +
  scale_y_continuous(labels = scales::comma_format(accuracy = 1),
                     trans = "log2") +
  labs(x = "Days Since 10th Confirmed Death",
       y = "Cumulative Number of Deaths (log scale)",
       title = "Cumulative Deaths from the Spanish Flu, Selected U.S. Cities") +
  theme(plot.title = element_text(size = rel(1), face = "bold"),
        axis.text.y = element_text(size = rel(1)),
        axis.title.x = element_text(size = rel(0.75)),
        axis.title.y = element_text(size = rel(0.75)),
        axis.text.x = element_text(size = rel(1)),
        legend.text = element_text(size = rel(1))
  )
```
This example draws heavily on [Kieran Healy's Covid-19 tracking blog post](https://kieranhealy.org/blog/archives/2020/03/21/covid-19-tracking/).
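The data-preparation step in the example can be easier to follow on a tiny table. The sketch below repeats the same logic (zero-fill missing counts, cumulative sum per city, keep rows from the 10th death onward, count days since then) in base R. The data frame `toy`, its city names, and its numbers are invented for illustration and are not taken from the package; the chunk is not evaluated when the README is knit.

```r
# Hypothetical toy data with the three columns the example selects
# (date, city, deaths); all values are made up for illustration.
toy <- data.frame(
  date   = rep(as.Date("1918-09-14") + 7 * 0:3, times = 2),
  city   = rep(c("City A", "City B"), each = 4),
  deaths = c(4, NA, 8, 20, 1, 3, 5, 6)
)

# Sort by city and date, then treat missing weekly counts as zero.
toy <- toy[order(toy$city, toy$date), ]
toy$deaths[is.na(toy$deaths)] <- 0

# Running total of deaths within each city.
toy$cu_deaths <- ave(toy$deaths, toy$city, FUN = cumsum)

# Keep only rows where the running total has reached at least 10.
toy <- toy[toy$cu_deaths > 9, ]

# Days elapsed since each city's first retained row.
date_num <- as.numeric(toy$date)
toy$days_elapsed <- date_num - ave(date_num, toy$city, FUN = min)

print(toy)
```

Because the filter runs before `days_elapsed` is computed, day 0 for each city is the first week in which its cumulative count exceeds 9, which is what lets curves for cities with different outbreak dates share one x-axis.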