Merge pull request #28 from nhs-r-community/update-readme
Update readme
Lextuga007 authored Jan 22, 2024
2 parents cbbe8e3 + a8e8632 commit 9a8d9b8
Showing 4 changed files with 125 additions and 110 deletions.
6 changes: 4 additions & 2 deletions DESCRIPTION
@@ -13,7 +13,7 @@ BugReports: https://github.com/nhs-r-community/NHSRpopulation/issues
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
RoxygenNote: 7.3.0
Imports:
cli,
dplyr,
@@ -27,7 +27,9 @@ Depends:
Suggests:
rmarkdown,
knitr,
testthat (>= 3.0.0)
testthat (>= 3.0.0),
purrr,
tibble
Remotes: nhs-r-community/NHSRpostcodetools
VignetteBuilder: knitr
Config/testthat/edition: 3
35 changes: 1 addition & 34 deletions README.Rmd
@@ -20,12 +20,7 @@ knitr::opts_chunk$set(
<!-- badges: end -->

The goal of `NHSRpopulation` is to make population estimates for **Lower layer Super Output Areas (LSOA)** and their **Indices of Multiple Deprivation (IMD)** easily available in R.
Population estimates are broken down by age (0 to 90+) and gender (female/male).
Information about the original sources of the data and a transparent description of all transformations of the data made available in this package can be found in this repository, see `data-raw/imd.R` and `data-raw/lsoa.R`.
Main changes to the original data structures include (1) the transformation from wide to long data, (2) the addition of further information that was only available in variable names, and (3) renaming variables in a consistent way.

The current version of this package only includes LSOA population estimates and IMD scores for the year 2019 for England.
Because we store quite a lot in this package it is currently relatively large (~9 MB) compared to other packages.
In its first iteration this package used data saved from [https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019](https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019). It has subsequently moved to the API at <https://services1.arcgis.com/> to keep the data up to date (although the source is only updated every few years) and to give access to all the nations of the UK: Wales, Scotland and Northern Ireland as well as England.

## Installation

@@ -36,34 +31,6 @@ You can install the current version of `NHSRpopulation` from [GitHub](https://gi
remotes::install_github("nhs-r-community/NHSRpopulation")
```

## Example

```{r}
# Load the package
library(NHSRpopulation)
```

### Lower layer Super Output Areas (LSOA)

The LSOA population estimates are available in the dataset `lsoa`:

```{r}
# Show the first 6 rows of the dataset
# For further information about this dataset see the help file: help(lsoa)
head(lsoa)
```

### Indices of Multiple Deprivation (IMD)

The IMD scores (raw scores and ranked deciles) are available in the dataset `imd`:

```{r}
# Show the first 6 rows of the dataset
# For further information about this dataset see the help file: help(imd)
head(imd)
```


## Sources of Data

The original source of the data provided in this R package is available [here](https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/datasets/lowersuperoutputareamidyearpopulationestimates) and licenced under the [Open Government Licence v3.0](http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/).
82 changes: 8 additions & 74 deletions README.md
@@ -11,20 +11,14 @@ experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](h

The goal of `NHSRpopulation` is to make population estimates for **Lower
layer Super Output Areas (LSOA)** and their **Indices of Multiple
Deprivation (IMD)** easily available in R. Population estimates are
broken down by age (0 to 90+) and gender (female/male). Information
about the original sources of the data and a transparent description of
all transformations of the data made available in this package can be
found in this repository, see `data-raw/imd.R` and
`data-raw/lsoa.R`. Main changes to the original data structures include
(1) the transformation from wide to long data, (2) the addition of
further information that was only available in variable names, and (3)
renaming variables in a consistent way.

The current version of this package only includes LSOA population
estimates and IMD scores for the year 2019 for England. Because we store
quite a lot in this package it is currently relatively large (~9 MB)
compared to other packages.
Deprivation (IMD)** easily available in R. In its first iteration this
package used data saved from
<https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019>.
It has subsequently moved to the API at
<https://services1.arcgis.com/> to keep the data up to date (although
the source is only updated every few years) and to give access to all
the nations of the UK: Wales, Scotland and Northern Ireland as well as
England.

## Installation

@@ -36,66 +30,6 @@ You can install the current version of `NHSRpopulation` from
remotes::install_github("nhs-r-community/NHSRpopulation")
```

## Example

``` r
# Load the package
library(NHSRpopulation)
#>
#> ── This is NHSRpopulation 0.0.2 ────────────────────────────────────────────────
#> ℹ Please report any issues or ideas at:
#> ℹ https://github.com/nhs-r-community/NHSRpopulation/issues
```

### Lower layer Super Output Areas (LSOA)

The LSOA population estimates are available in the dataset `lsoa`:

``` r
# Show the first 6 rows of the dataset
# For further information about this dataset see the help file: help(lsoa)
head(lsoa)
#> lsoa_year lsoa_code lsoa_name la_year la_code la_name age
#> 1 2019 E01000001 City of London 001A 2019 E09000001 City of London 0
#> 2 2019 E01000001 City of London 001A 2019 E09000001 City of London 1
#> 3 2019 E01000001 City of London 001A 2019 E09000001 City of London 2
#> 4 2019 E01000001 City of London 001A 2019 E09000001 City of London 3
#> 5 2019 E01000001 City of London 001A 2019 E09000001 City of London 4
#> 6 2019 E01000001 City of London 001A 2019 E09000001 City of London 5
#> gender est_year n
#> 1 f 2019 2
#> 2 f 2019 9
#> 3 f 2019 4
#> 4 f 2019 12
#> 5 f 2019 11
#> 6 f 2019 5
```

### Indices of Multiple Deprivation (IMD)

The IMD scores (raw scores and ranked deciles) are available in the
dataset `imd`:

``` r
# Show the first 6 rows of the dataset
# For further information about this dataset see the help file: help(imd)
head(imd)
#> lsoa_year lsoa_code lsoa_name la_year la_code
#> 1 2011 E01000001 City of London 001A 2019 E09000001
#> 2 2011 E01000002 City of London 001B 2019 E09000001
#> 3 2011 E01000003 City of London 001C 2019 E09000001
#> 4 2011 E01000005 City of London 001E 2019 E09000001
#> 5 2011 E01000006 Barking and Dagenham 016A 2019 E09000002
#> 6 2011 E01000007 Barking and Dagenham 015A 2019 E09000002
#> la_name imd_year imd_score imd_decile
#> 1 City of London 2019 6.208 9
#> 2 City of London 2019 5.143 10
#> 3 City of London 2019 19.402 5
#> 4 City of London 2019 28.652 3
#> 5 Barking and Dagenham 2019 19.837 5
#> 6 Barking and Dagenham 2019 31.576 3
```

## Sources of Data

The original source of the data provided in this R package is available
112 changes: 112 additions & 0 deletions vignettes/get-started.Rmd
@@ -0,0 +1,112 @@
---
title: "Getting started using the package"
output: rmarkdown::html_vignette
bibliography: "references.bib"
link-citations: TRUE
vignette: >
  %\VignetteIndexEntry{get-started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  echo = TRUE,
  eval = FALSE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

## Indices of Multiple Deprivation (IMD)

To get the IMD scores (raw scores and ranked deciles) for a set of postcodes, first generate some random example postcodes:

```{r}
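library(purrr)
library(tibble)
library(PostcodesioR)
library(NHSRpopulation)

# Draw 10 random postcodes; pluck(1) takes the first element of the
# list returned by random_postcode()
postcodes <- purrr::map_chr(
  1:10,
  .f = ~ PostcodesioR::random_postcode() |>
    purrr::pluck(1)
)

# The same postcodes as a one-column tibble (the column is named `value`)
tibble_postcodes <- postcodes |>
  tibble::as_tibble()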
```
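
The chunk above needs internet access, so here is a minimal offline sketch of the same pattern in base R. `fake_random_postcode()` is a hypothetical stand-in (not part of `PostcodesioR`), and its `country` field is only an assumption used to show that the first element is the one extracted.

```r
# Hypothetical stand-in for PostcodesioR::random_postcode(): it returns a
# list whose first element is a postcode (the other field is illustrative).
fake_random_postcode <- function() {
  list(postcode = "HD1 2UT", country = "England")
}

# Base R equivalent of purrr::map_chr(1:3, ~ f() |> purrr::pluck(1)):
# call the generator three times and keep the first element each time.
postcodes_demo <- vapply(
  1:3,
  function(i) fake_random_postcode()[[1]],
  character(1)
)

postcodes_demo
```

`vapply()` with `character(1)` plays the same role as `map_chr()`: it fails loudly if any call does not return a single string.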

Then pass the vector to the `get_imd()` function (returning just the first five columns):

```{r}
NHSRpopulation::get_imd(postcodes) |>
  dplyr::select(1:5)
```

Or with a column of a data frame (again returning just the first five columns):

```{r}
NHSRpopulation::get_imd(tibble_postcodes$value) |>
  dplyr::select(1:5)
```

This function can also be used to fix postcodes that are terminated or invalid:

```{r}
postcodes <- c("HD1 2UT", "HD1 2UU", "HD1 2UV")
NHSRpopulation::get_imd(postcodes) |>
  dplyr::select(1:5)
```

Currently, although the corrected postcode is returned in the column `new_postcode`, the IMD is not overwritten.
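
Before looking postcodes up, it can help to flag strings that are not even shaped like a postcode. The following is a rough sketch only: the pattern is a simplified assumption rather than the official postcode specification, and `looks_like_postcode()` is a hypothetical helper that is not part of `NHSRpopulation`.

```r
# Simplified shape of a UK postcode (an assumption, not the official rule):
# one or two letters, a digit, an optional digit or letter, an optional
# space, then a digit followed by two letters.
postcode_pattern <- "^[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}$"

# Hypothetical helper: TRUE when a string is shaped like a postcode
looks_like_postcode <- function(x) {
  grepl(postcode_pattern, toupper(trimws(x)))
}

looks_like_postcode(c("HD1 2UT", "hd12uu", "not a postcode"))
#> [1]  TRUE  TRUE FALSE
```

A string can pass this check and still be terminated or never have existed, so the check complements the lookup rather than replacing it.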

## Lower layer Super Output Areas (LSOA)

To return the `IMD`, `imd_decile` and `imd_quintile` for LSOAs, pass the LSOA codes as a vector:

```{r}
# Example LSOAs from each England Decile group
lsoa_imd <- c(
  "E01000002",
  "E01000001",
  "E01000117",
  "E01000119",
  "E01000069",
  "E01000070",
  "E01000066",
  "E01000005",
  "E01000008",
  "E01000048"
)
NHSRpopulation::get_lsoa(lsoa_imd) |>
  head(10) # first 10 rows
```
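
As a small illustration of how deciles and quintiles relate, the sketch below assumes that each quintile groups two adjacent deciles (1 and 2 become 1, up to 9 and 10 becoming 5). Whether the package derives `imd_quintile` this way is not confirmed here, and `decile_to_quintile()` is a hypothetical helper.

```r
# Assumption: a quintile pairs adjacent deciles (1-2 -> 1, ..., 9-10 -> 5)
decile_to_quintile <- function(decile) {
  stopifnot(all(decile %in% 1:10))
  as.integer(ceiling(decile / 2))
}

decile_to_quintile(1:10)
#> [1] 1 1 2 2 3 3 4 4 5 5
```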

Or from a column of a data frame:

```{r}
tibble_lsoa_imd <- lsoa_imd |>
  tibble::as_tibble()
NHSRpopulation::get_lsoa(tibble_lsoa_imd$value) |>
  head(10)
```

The function returns everything in those LSOAs. If you would like to return some random postcodes from each decile instead:

```{r}
NHSRpopulation::get_lsoa(lsoa_imd, return = "random")
```

Or just the first postcode that appears in each decile:

```{r}
NHSRpopulation::get_lsoa(lsoa_imd, return = "first")
```
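
To make the difference between the "first" and "random" options concrete, here is a toy sketch in base R of the two ways of selecting rows per decile. It is not how `get_lsoa()` is implemented, and the data frame, its column names and the postcodes in it are made-up placeholders.

```r
# Toy data: deciles and postcodes are made-up placeholders
toy <- data.frame(
  imd_decile = c(1, 1, 2, 2, 3),
  postcode = c("AA1 1AA", "AA1 1AB", "BB2 2BA", "BB2 2BB", "CC3 3CA"),
  stringsAsFactors = FALSE
)

# "first": keep the first row seen for each decile
first_per_decile <- toy[!duplicated(toy$imd_decile), ]

# "random": sample one row index within each decile
set.seed(1)
random_rows <- vapply(
  split(seq_len(nrow(toy)), toy$imd_decile),
  function(idx) idx[sample.int(length(idx), 1)],
  integer(1)
)
random_per_decile <- toy[random_rows, ]
```

Both results have one row per decile; only the "random" version depends on the seed.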
