---
title: "Introducing humdrumR"
subtitle: "Georgia Tech, humdrumR Workshop"
author: "Nat Condit-Schultz"
date: "May 11, 2023"
output:
  html_document:
    df_print: paged
    toc: true
    toc_float: true
    theme: flatly
---
In this notebook, we'll get started with humdrumℝ!
Start by loading the package (you'll need to install it, if you haven't already!):
```{r}
library(humdrumR)
humdrumR_version
```
# Musical Tools
Let's start by familiarizing ourselves with a few of the common pitch and rhythm manipulation functions that humdrumR provides.
There are a *bunch* of "pitch functions" and "rhythm functions"; each one corresponds to a different pitch/rhythm data representation.
We will focus on these:

+ *Pitch*
    + `kern()`
    + `semits()`
    + `solfa()`
+ *Rhythm*
    + `duration()`
    + `recip()`

> Use `?rhythmFunctions` or `?pitchFunctions` to see the complete list of functions.
All of these functions take input vectors, use regular-expression matching to (attempt to) interpret the vector as pitch/rhythm information, then output in *their* format.
In some cases, you might want to tell a humdrumR function how to interpret its input: you can do this using the `Exclusive` argument, which tells the function what the exclusive interpretation is supposed to be.
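To get a feel for what "regular-expression matching" means here, the chunk below is a toy sketch in base R. It is *not* humdrumR's actual parser: the function name `guess_format()`, its `exclusive` argument, and the three patterns are all made up for illustration. It only shows the general idea of guessing a format from the shape of each token, with an optional override.

```{r}
# Toy illustration only: NOT how humdrumR is implemented.
# guess_format() and its patterns are hypothetical.
guess_format <- function(x, exclusive = NULL) {
  # If the caller tells us the format, skip the guessing entirely.
  if (!is.null(exclusive)) return(rep(exclusive, length(x)))

  x <- as.character(x)
  fmt <- rep(NA_character_, length(x))

  # Plain integers look like semitone counts.
  fmt[grepl('^-?[0-9]+$', x)] <- 'semits'
  # Letter name, optional accidentals, octave digit, e.g. 'Ab5'.
  fmt[grepl('^[A-G][#b]*[0-9]$', x)] <- 'pitch'
  # Optional duration digits and dots, then a letter, e.g. '4c' or '2A'.
  fmt[grepl('^[0-9]*\\.*([a-g]+|[A-G]+)[#n-]*$', x)] <- 'kern'

  fmt
}

guess_format(c('Ab5', '7', '4c', '8r'))
guess_format('4', exclusive = 'kern')
```

Tokens that match none of the patterns (like the rest `'8r'`) come back as `NA`, which is the same spirit in which unparseable tokens end up null in the real functions.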
## Pitch
Let's start by applying `kern()` to various data tokens:
```{r, results='hold'}
kern('Ab5')
kern('fi', Key = 'G:')
kern(0:12)
heyjude <- c('4c', '2A', '8r', '8A', '8c', '8d', '2G')
kern(heyjude)
```
See what is happening? `kern()`, like other humdrumR functions, uses regular expressions to determine (guess) what type of input you are giving it, and then interprets (parses) the input appropriately.
Notice that the rest token (`'r'`) ends up as `.` (`NA`), because `kern()` doesn't know how to parse it.
Let's try a different pitch function:
```{r, results='hold'}
solfa('Ab5')
solfa('fi', Key = 'G:')
solfa(0:12)
solfa(heyjude, Key = 'F:')
```
All the pitch functions have arguments `generic` (drop accidentals, leaving only the generic step) and `simple` (drop octave information):
```{r}
kern(c('Ab5', 'F#3', 'C4'))
kern(c('Ab5', 'F#3', 'C4'), generic = TRUE)
kern(c('Ab5', 'F#3', 'C4'), simple = TRUE)
kern(c('Ab5', 'F#3', 'C4'), generic = TRUE, simple = TRUE)
```
## Rhythm
For rhythmic (duration) information, `recip()` and `duration()` are the two most used functions:
```{r}
duration(c('4a', '4a', '8b', '8c', '4d'))
recip(c(.25, .25, .5, .125, .125, .25))
```
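The relationship between the two representations is simple arithmetic: a reciprocal value `n` means "one `n`-th of a whole note," and each augmentation dot adds half of the previous value. Here is a minimal base-R sketch of that arithmetic. The helper `recip_to_duration()` is hypothetical (it is not part of humdrumR), and it assumes plain inputs like `'4'` or `'4.'` with no pitch attached.

```{r}
# Hypothetical helper, for illustration only (not a humdrumR function).
recip_to_duration <- function(recip) {
  base  <- sub('\\.+$', '', recip)       # strip trailing dots: '4.' -> '4'
  dots  <- nchar(recip) - nchar(base)    # how many dots were stripped
  denom <- as.numeric(base)
  # n dots multiply the undotted value by (2 - 0.5^n): 1, 1.5, 1.75, ...
  (1 / denom) * (2 - 0.5^dots)
}

recip_to_duration(c('4', '8', '2', '4.'))
```

So a quarter note (`'4'`) is 0.25 of a whole note, and a dotted quarter (`'4.'`) is 0.375.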
# Humdrum Data
These few basic tools are enough for us to get started with some real humdrum data.
Use the function `readHumdrum()` to import the Bach chorales, which are a very easy (homogeneous) dataset to work with.
```{r}
chorales <- readHumdrum('ChoralesBH/KernOnly/.*krn')
```
We can look at the data:
```{r}
chorales
```
Or individual pieces, by using single-bracket indices:
```{r}
chorales[4]
```
We can also index spines or records using double-bracket indices, `[[records, spines]]`:
```{r}
chorales[4][[, 2]]
chorales[[40:60,]]
```
We can learn more about the data set using the summary commands documented at `?humSummary`.
```{r}
census(chorales)
spines(chorales)
interpretations(chorales)
reference(chorales)
```
## With and Within
Ok, now what if we want to work with the actual content of data?
We will need to think about the content of the underlying, "under the hood," humdrum data table.
Remember each row in the data table represents **one** token.
The original humdrum data token itself is saved in a field (column) called `Token`---we will be referring to `Token` a lot!
Lots of other information is recorded in other fields.
You can see all the fields using the `fields()` command:
```{r}
fields(chorales)
```
So *every single token* in the humdrum data actually has 47 different pieces of information associated with it.
All 47 "fields" are vectors of the same length, so we can use them in vectorized operations!
When we print out `chorales` we are, by default, seeing the original tokens as they appeared in the imported data.
However, we can *see* other fields by using the `$` operator, for example:
```{r}
chorales[1]
chorales[1]$Spine
chorales[1]$Record
chorales$Key
```
If we actually want to access these 47 vectors, we can use `with()`:
```{r}
with(chorales, tally(solfa(Token, simple = TRUE))) |> barplot()
with(chorales, mean(semits(Token), na.rm = TRUE))
with(chorales, {
  semits <- semits(Token)
  hist(semits[Spine == 1], col = rgb(1,0,0,.2), xlim = c(-20, 30), breaks = seq(-24,24,2), ylim = c(0,6000))
  hist(semits[Spine == 2], col = rgb(1,1,0,.2), add = TRUE, breaks = seq(-24,24,2))
  hist(semits[Spine == 3], col = rgb(1,0,1,.2), add = TRUE, breaks = seq(-24,24,2))
  hist(semits[Spine == 4], col = rgb(0,1,0,.2), add = TRUE, breaks = seq(-24,24,2))
})
```
When I'm inside my `with()` call, I can "see" the fields of the table, including `Token`.
I can treat the fields just like any vector, and apply any R function, like `kern()` or `table()`.
The cool thing is, I can also see all those other 46 fields!
So I could do something like this:
```{r}
with(chorales, tally(kern(Token, simple = TRUE), Spine))
```
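If the idea of "seeing" fields inside `with()` feels abstract, it may help to try it on an ordinary data frame first, since base R's own `with()` works in the same way. The table below is a made-up stand-in (six invented tokens, two spines), not real chorale data.

```{r}
# Toy stand-in for the humdrum table: one row per token (made-up values).
toy <- data.frame(
  Token = c('4c', '4e', '8r', '8g', '2c', '4G'),
  Spine = c(1, 1, 1, 2, 2, 2),
  stringsAsFactors = FALSE
)

# Inside with(), column names behave like ordinary vectors...
with(toy, nchar(Token))

# ...so we can tabulate them, or use one column to index another.
with(toy, table(Spine))
with(toy, Token[Spine == 2])
```

Because every column has one entry per token, any vectorized R function can combine them freely.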
### Adding Fields
We can take things to the next level by *adding* our own new fields to our data using the `within()` function.
`within()` acts much like `with()`, except it (tries to) put your output back into the humdrumR-data object.
We kept calling `kern(Token, simple = TRUE)`---to save time we could save that output as a new field:
```{r}
chorales <- within(chorales, Kern <- kern(Token, simple = TRUE))
chorales
```
We now have two data fields: `Token` and `Kern`!
Notice that I had to assign the name `Kern` (or any other name you choose) "within" `chorales`, and I also had to resave `chorales` itself.
This might seem redundant, but it's actually a nice safety valve: you can check that things work first before saving them for real.
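The same "returns a modified copy" behavior can be seen with base R's `within()` on a plain data frame. This is only an analogy on made-up data (the `IsRest` column is invented for the example), but it shows why the result has to be assigned back.

```{r}
# Made-up three-token table, for illustration only.
toy <- data.frame(Token = c('4c', '8r', '2A'), stringsAsFactors = FALSE)

# Without reassignment, we get a modified copy and the original is untouched...
within(toy, IsRest <- grepl('r', Token))
names(toy)

# ...so we assign the result back to keep the new column.
toy <- within(toy, IsRest <- grepl('r', Token))
names(toy)
```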
We can see that the `Token` field is still there by using `$` to activate it:
```{r}
chorales$Token
```
We might want to extract our rhythm information as well:
```{r}
chorales <- within(chorales, Duration <- duration(Token))
```
This is going to be a common pattern for `**kern` data: split the pitch and rhythm information into two separate fields.
```{r}
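# flag tokens that carry a fermata (marked with ';' in **kern)
chorales <- within(chorales, Fermata <- grepl(';', Token))
with(chorales, tally(Kern, Fermata)) |> barplot(beside = TRUE)
# scatterplot of pitch (in semitones) against duration
with(chorales, plot(duration(Token), semits(Token)))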
```
# Subset
We can subset our data using...`subset()`.
For example, maybe we are only interested in notes with durations of a quarter-note or more.
```{r}
subset(chorales, Duration >= .25)$Token
```
We could make histograms of the pitches in different subsets...
```{r}
with(subset(chorales, Duration >= .25), hist(semits(Token)))
with(subset(chorales, Duration < .25), hist(semits(Token)))
```
As a shortcut, you can also write the subsetting expression directly into `with`/`within`:
```{r}
with(chorales, subset = Duration >= .25, hist(semits(Token)))
```
# Groupby
In many cases, we'll want to break down (group) our data into more categories.
We can do this with the `by` argument to `with`/`within` (and to many other humdrumR functions).
```{r}
with(chorales,
     subset = Spine < 5,
     by = Spine,
     # Kern is the simple-pitch field we created above with kern(Token, simple = TRUE)
     barplot(tally(Kern),
             main = paste('Spine', Spine[1])))
```
```{r}
# (Need to use semits() to get pitch as number => create a new field)
chorales <- within(chorales, Semits <- semits(Token))
# normalize pitch for each voice
chorales <- within(chorales,
                   ScaledSemits <- (Semits - mean(Semits)) / sd(Semits),
                   by = Spine)
# correlate duration with normalized pitch
with(chorales, cor(Duration, ScaledSemits))
with(chorales,
     plot(jitter(Duration), ScaledSemits),
     abline(lm(ScaledSemits ~ Duration))
)
```
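To see what grouping changes in the normalization step, here is a small base-R sketch on made-up numbers. It uses `ave()` as a stand-in for a grouped calculation; this is an analogy, not how humdrumR evaluates `by` internally.

```{r}
# Made-up data: a high voice (spine 2) and a low voice (spine 1).
toy <- data.frame(
  Semits = c(12, 14, 16, -5, -3, -1),
  Spine  = c(2, 2, 2, 1, 1, 1)
)

# Grouped: each voice is centered and scaled using its OWN mean and sd.
# ave() applies FUN within each level of Spine and returns a full-length vector.
toy$Scaled <- ave(toy$Semits, toy$Spine,
                  FUN = function(x) (x - mean(x)) / sd(x))

# Ungrouped, for contrast: one mean and sd for all voices together.
toy$ScaledAll <- (toy$Semits - mean(toy$Semits)) / sd(toy$Semits)

toy
```

In the grouped column both voices run from -1 to 1, so the difference in register between them is gone; in the ungrouped column the low voice is all negative and the high voice all positive.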
# Null data and the Active field
One thing you might've noticed is that `with` and `within` ignore non-data tokens---anything starting with `*`, `!`, or `=`, or any of those pesky null tokens `.`.
This is an option that can be controlled with the `dataTypes` argument to `with`/`within`.
However, it is often the case that data that is null in one field is not null in another field.
For example, if you run the command `kern(Token)`, all the *rests* will be output as `NA` (you'll see a `.` token);
however, if you run the command `recip(Token)`, the rests have durations, so the duration of rests is not null.
So, how does humdrumR decide which tokens are null?
Based on the "active field."
We've already worked with the active field before...even if you weren't aware.
The `$` operator sets the active field!
The active field is the field that prints when you look at a humdrumR data object on the command line.
```{r}
chorales$Kern # Kern <- kern(Token)
chorales$Duration # Duration <- duration(Token)
```
What this means is that, if we do `with(chorales$Token, ...)`, all the originally non-null data points (including rests) will be "visible" to `with`.
However, if we do `with(chorales$Kern, ...)`, the rest tokens will be considered null, and thus won't be visible inside `with`.
We can show this by comparing the lengths of the two calls:
```{r}
with(chorales$Token, length(Token))
with(chorales$Kern, length(Token))
```
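The logic can be mimicked on a toy table in base R. Everything here is invented for illustration: the four tokens, and the helper `visible()`, which simply drops rows that are `NA` in whichever column we treat as "active." It is a rough sketch of the idea, not humdrumR's implementation.

```{r}
# Made-up table: the rest ('8r') has a duration but no pitch.
toy <- data.frame(
  Token    = c('4c', '8r', '8d', '2G'),
  Kern     = c('c', NA, 'd', 'G'),
  Duration = c(0.25, 0.125, 0.125, 0.5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only rows that are non-null in the active column.
visible <- function(data, active) data[!is.na(data[[active]]), ]

nrow(visible(toy, 'Token'))     # all four tokens are visible
nrow(visible(toy, 'Kern'))      # the rest drops out, leaving three
nrow(visible(toy, 'Duration'))  # rests have durations, so four again
```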
-----------------------------------------------------------------------------------
This whole business with the active field and null tokens is confusing at first, but once you get the hang of it, it becomes quite easy to control which data points you want to include, or not, in your various analyses.