Commit

Ref & acknowledge @AmeliaMN paper
hadley committed Oct 27, 2017
1 parent 0f956d6 commit 73151e7
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions factors.Rmd
@@ -6,8 +6,6 @@ In R, factors are used to work with categorical variables, variables that have a

Historically, factors were much easier to work with than characters. As a result, many of the functions in base R automatically convert characters to factors. This means that factors often crop up in places where they're not actually helpful. Fortunately, you don't need to worry about that in the tidyverse, and can focus on situations where factors are genuinely useful.

For more historical context on factors, I recommend [_stringsAsFactors: An unauthorized biography_](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/) by Roger Peng, and [_stringsAsFactors = \<sigh\>_](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh) by Thomas Lumley.

### Prerequisites

To work with factors, we'll use the __forcats__ package, which provides tools for dealing with **cat**egorical variables (and it's an anagram of factors!). It provides a wide range of helpers for working with factors. forcats is not part of the core tidyverse, so we need to load it explicitly.
@@ -17,6 +15,10 @@ library(tidyverse)
library(forcats)
```

### Learning more

If you want to learn more about factors, I recommend reading Amelia McNamara and Nicholas Horton’s paper, [_Wrangling categorical data in R_](https://peerj.com/preprints/3163/). This paper lays out some of the history discussed in [_stringsAsFactors: An unauthorized biography_](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/) and [_stringsAsFactors = \<sigh\>_](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh), and compares the tidy approaches to categorical data outlined in this book with base R methods. An early version of the paper helped motivate and scope the forcats package; thanks Amelia & Nick!

## Creating factors

Imagine that you have a variable that records month:
