Following Ariane Aumaitre's instructions, I ran a small analysis of the
lyrics on Taylor Swift's album folklore. Unfortunately, I could not use
the package I had planned to use, so I downloaded the album's lyrics one
song at a time with the geniusr package. Using geniusr requires creating
an account on Genius.
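The download step is not shown in the post, so the following is only a rough sketch of how it might look. It assumes geniusr is installed and that a Genius API token is available in the `GENIUS_API_TOKEN` environment variable; `download_lyrics` is a hypothetical helper name, and the only song ID used is the one for cardigan that appears in the output further down.

```r
library(dplyr)

# Hypothetical helper: fetch the lyrics for each Genius song ID and stack
# the results into one table. geniusr::get_lyrics_id() does the download.
download_lyrics <- function(song_ids) {
  song_ids %>%
    lapply(function(id) geniusr::get_lyrics_id(song_id = id)) %>%
    bind_rows()
}

# Not run here: needs network access and a valid token.
# folklore <- download_lyrics(c(5793984))  # cardigan; other IDs not listed
# readr::write_rds(folklore, "folklore.rds")
```

Saving the result to an `.rds` file means the lyrics only have to be downloaded once.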
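library(tidyverse)
library(tidytext)
# Load the lyrics downloaded earlier with geniusr
folklore <- read_rds("folklore.rds")
# Tokenize: one row per word instead of one row per lyric line
folklore_tokens <- folklore %>%
  unnest_tokens(word, line)
folklore_tokens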
## # A tibble: 4,915 × 6
##    section_name section_artist song_name artist_name  song_id word
##    <chr>        <chr>          <chr>     <chr>        <chr>   <chr>
##  1 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 vintage
##  2 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 tee
##  3 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 brand
##  4 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 new
##  5 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 phone
##  6 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 high
##  7 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 heels
##  8 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 on
##  9 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 cobblestones
## 10 Verse 1      Taylor Swift   cardigan  Taylor Swift 5793984 when
## # … with 4,905 more rows
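To see the tokenization step in isolation, here is a minimal sketch on made-up data (`toy_lyrics` is a hypothetical stand-in, not part of the folklore table). With its default settings, `unnest_tokens()` splits the text into words, lowercases them, and strips punctuation.

```r
library(dplyr)
library(tidytext)

# Made-up stand-in for the lyrics table: one row per line of text
toy_lyrics <- tibble(
  song_name = c("song a", "song a"),
  line      = c("Hello, hello World!", "Goodbye world")
)

# One row per word; the `line` column is replaced by `word`
toy_tokens <- toy_lyrics %>%
  unnest_tokens(word, line)

print(toy_tokens)
```

The other columns (here `song_name`) are repeated for every word, which is why the folklore output above keeps the section and song information on each row.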
# Most frequent words, stop words still included
folklore_tokens %>%
  count(word, sort = TRUE)
## # A tibble: 1,019 × 2
##    word      n
##    <chr> <int>
##  1 you     243
##  2 i       235
##  3 the     178
##  4 and     122
##  5 me       97
##  6 a        92
##  7 to       90
##  8 in       85
##  9 my       85
## 10 your     64
## # … with 1,009 more rows
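For readers less familiar with dplyr, `count(word, sort = TRUE)` is shorthand for grouping, summarising, and sorting. A small sketch on a made-up word list (the `toy_tokens` data is hypothetical) shows the two forms side by side:

```r
library(dplyr)

# Made-up tokens, one word per row
toy_tokens <- tibble(word = c("you", "i", "you", "the", "i", "you"))

# Short form, as used in the post
counted <- toy_tokens %>%
  count(word, sort = TRUE)

# Long form: group, count rows per group, sort by descending count
manual <- toy_tokens %>%
  group_by(word) %>%
  summarise(n = n()) %>%
  arrange(desc(n))

print(counted)
```

As in the folklore counts, the top of such a table is dominated by function words, which motivates removing stop words next.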
# Drop common English stop words and the filler vocalizations "ooh" and "ah"
tidy_folklore <- folklore_tokens %>%
  anti_join(stop_words) %>%
  filter(!word %in% c("ooh", "ah"))
## Joining, by = "word"
tidy_folklore %>%
  count(word, sort = TRUE)
## # A tibble: 735 × 2
##    word       n
##    <chr>  <int>
##  1 time      38
##  2 love      13
##  3 mad       13
##  4 call      11
##  5 hope      11
##  6 woman     11
##  7 mine      10
##  8 heart      9
##  9 pulled     9
## 10 signs      9
## # … with 725 more rows
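The stop-word step can be checked on a tiny example. This sketch uses a made-up word list (`toy_tokens` is hypothetical) and the `stop_words` data set that ships with tidytext. Passing `by = "word"` is optional, but it makes the join key explicit and suppresses the "Joining, by" message.

```r
library(dplyr)
library(tidytext)

# Made-up tokens mixing stop words, a filler sound, and content words
toy_tokens <- tibble(word = c("you", "the", "love", "ooh", "time"))

toy_tidy <- toy_tokens %>%
  anti_join(stop_words, by = "word") %>%  # keep rows with no match in stop_words
  filter(!word %in% c("ooh", "ah"))       # drop filler sounds by hand

print(toy_tidy)
```

`anti_join()` keeps only the rows of the left table that have no match in the right one, so only the content words remain.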