forked from proback/BeyondMLR
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path08-Introduction-to-Multilevel-Models.Rmd
1444 lines (1201 loc) · 130 KB
/
08-Introduction-to-Multilevel-Models.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
---
title: "Chapter 8"
subtitle: "Introduction to Multilevel Models"
output:
pdf_document:
number_sections: yes
html_document: default
---
# Introduction to Multilevel Models {#ch-multilevelintro}
```{r,include=F}
if(knitr::is_html_output()){options(knitr.table.format = "html")} else {options(knitr.table.format = "latex")}
```
## Learning Objectives
After finishing this chapter, you should be able to:
- Recognize when response variables and covariates have been collected at multiple (nested) levels.
- Apply exploratory data analysis techniques to multilevel data.
- Write out a multilevel statistical model, including assumptions about variance components, in both by-level and composite forms.
- Interpret model parameters (including fixed effects and variance components) from a multilevel model, including cases in which covariates are continuous, categorical, or centered.
- Understand the taxonomy of models, including why we start with an unconditional means model.
- Select a final model, using criteria such as AIC, BIC, and deviance.
## Case Study: Music Performance Anxiety {#cs:music}
Stage fright can be a serious problem for performers, and understanding the personality underpinnings of performance anxiety is an important step in determining how to minimize its impact. @Miller2010 studied the emotional state of musicians before performances and factors which may affect their emotional state. Data was collected by having 37 undergraduate music majors from a competitive undergraduate music program fill out diaries prior to performances over the course of an academic year. In particular, study participants completed a Positive Affect Negative Affect Schedule (PANAS) before each performance. The PANAS instrument provided two key outcome measures: negative affect (a state measure of anxiety) and positive affect (a state measure of happiness). We will focus on negative affect as our primary response measuring performance anxiety.
Factors which were examined for their potential relationships with performance anxiety included: performance type (solo, large ensemble, or small ensemble); audience (instructor, public, students, or juried); if the piece was played from memory; age; gender; instrument (voice, orchestral, or keyboard); and, years studying the instrument. In addition, the personalities of study participants were assessed at baseline through the Multidimensional Personality Questionnaire (MPQ). The MPQ provided scores for one lower-order factor (absorption) and three higher-order factors: positive emotionality (PEM---a composite of well-being, social potency, achievement, and social closeness); negative emotionality (NEM---a composite of stress reaction, alienation, and aggression); and, constraint (a composite of control, harm avoidance, and traditionalism).
Primary scientific hypotheses of the researchers included:
- Lower music performance anxiety will be associated with lower levels of a subject's negative emotionality.
- Lower music performance anxiety will be associated with lower levels of a subject's stress reaction.
- Lower music performance anxiety will be associated with greater number of years of study.
## Initial Exploratory Analyses {#explore}
### Data Organization {#organizedata1}
Our examination of the data from @Miller2010 in `musicdata.csv` will focus on the following key variables:
- `id` = unique musician identification number
- `diary` = cumulative total of diaries filled out by musician
- `perf_type` = type of performance (Solo, Large Ensemble, or Small Ensemble)
- `audience` = who attended (Instructor, Public, Students, or Juried)
- `memory` = performed from Memory, using Score, or Unspecified
- `na` = negative affect score from PANAS
- `gender` = musician gender
- `instrument` = Voice, Orchestral, or Piano
- `mpqab` = absorption subscale from MPQ
- `mpqpem` = positive emotionality (PEM) composite scale from MPQ
- `mpqnem` = negative emotionality (NEM) composite scale from MPQ
Sample rows containing selected variables from our data set are illustrated in Table \@ref(tab:table1chp8); note that each subject (id) has one row for each unique diary entry.
```{r load_packages8, include=FALSE}
library(MASS)
library(gridExtra) # enable multiple plotting
library(mnormt) # contour plot?
library(lme4) #for models
library(knitr) #for kable
library(tidyverse)
```
```{r,include=FALSE}
#making a data frame for table 1 so it can be made using kable
Obs <- c(1,2,3,495,496,497)
id <- c(1,1,1,43,43,43)
diary <-c(1,2,3,2,3,4)
perf_type <-c("Solo","Large Ensemble","Large Ensemble","Solo","Small Ensemble","Solo")
memory <- c("Unspecified","Memory","Memory","Score","Memory","Score")
na <- c(11,19,14,13,19,11)
gender <-c("Female","Female","Female","Female","Female","Female")
instrument <-c("voice","voice","voice","voice","voice","voice")
mpqab <-c(16,16,16,31,31,31)
mpqpem <-c(52,52,52,64,64,64)
mpqnem <- c(16,16,16,17,17,17)
```
```{r table1chp8,echo=FALSE}
table1chp8 <- data.frame(Obs,id,diary,perf_type,memory,na,gender,instrument,mpqab,mpqpem,mpqnem)
kable(table1chp8, booktabs=T,caption="A snapshot of selected variables from the first three and the last three observations in the Music Performance Anxiety case study.")
```
As with any statistical analysis, our first task is to explore the data, examining distributions of individual responses and predictors using graphical and numerical summaries, and beginning to discover relationships between variables. With multilevel models, exploratory analyses must eventually account for the level at which each variable is measured. In a two-level study such as this one, __Level One__ will refer to variables measured at the most frequently occurring observational unit, while __Level Two__ will refer to variables measured on larger observational units. For example, in our study on music performance anxiety, many variables are measured at every performance. These "Level One" variables include:
- negative affect (our response variable)
- performance characteristics (type, audience, if music was performed from memory)
- number of previous performances with a diary entry
However, other variables measure characteristics of study participants that remain constant over all performances for a particular musician; these are considered "Level Two" variables and include:
- demographics (age and gender of musician)
- instrument used and number of previous years spent studying that instrument
- baseline personality assessment (MPQ measures of positive emotionality, negative emotionality, constraint, stress reaction, and absorption)
### Exploratory Analyses: Univariate Summaries {#explore1}
Because of this data structure---the assessment of some variables on a performance-by-performance basis and others on a subject-by-subject basis---we cannot treat our data set as consisting of 497 independent observations. Although negative affect measures from different subjects can reasonably be assumed to be independent (unless, perhaps, the subjects frequently perform in the same ensemble group), negative affect measures from different performances by the same subject are not likely to be independent. For example, some subjects tend to have relatively high performance anxiety across all performances, so that knowing their score for Performance 3 was 20 makes it more likely that their score for Performance 5 is somewhere near 20 as well. Thus, we must carefully consider our exploratory data analysis, recognizing that certain plots and summary statistics may be useful but imperfect in light of the correlated observations.
First, we will examine each response variable and potential covariate individually. Continuous variables can be summarized using histograms and summaries of center and spread; categorical variables can be summarized with tables and possibly bar charts. When examining Level One covariates and responses, we will begin by considering all 497 observations, essentially treating each performance by each subject as independent even though we expect observations from the same musician to be correlated. Although these plots will contain dependent points, since each musician provides data for up to 15 performances, general patterns exhibited in these plots tend to be real. Alternatively, we can calculate mean scores across all performances for each of the 37 musicians so that we can more easily consider each plotted point to be independent. The disadvantage of this approach would be lost information which, in a study such as this with a relatively small number of musicians each being observed over many performances, could be considerable. In addition, if the sample sizes varied greatly by subject, a mean based on 1 observation would be given equal weight to a mean based on 15 observations. Nevertheless, both types of exploratory plots typically illustrate similar relationships.
In Figure \@ref(fig:mli-hist1) we see histograms for the primary response (negative affect); plot (a) shows all 497 (dependent) observations, while plot (b) shows the mean negative affect for each of the 37 musicians across all their performances. Through plot (a), we see that performance anxiety (negative affect) across all performances follows a right skewed distribution with a lower bound of 10 (achieved when all 10 questions are answered with a 1). Plot (b) shows that mean negative affect is also right-skewed (although not as smoothly decreasing in frequency), with range 12 to 23.
```{r, include=FALSE}
#Getting started
music = read.csv("data/musicdata.csv")
dim(music) # should be 497 x 18
head(music) # examine first 6 rows
```
```{r, include=FALSE, warning=FALSE, message=FALSE}
#Getting started-2
music %>% count(id) # number of diary entries for each subject
music %>% count(diary) # number of subjects with diary entry of a given number
select <- dplyr::select
keydata <- music %>%
dplyr::select(id, diary, perform_type, memory, audience, na, gender,
instrument, mpqab, mpqpem, mpqnem)
keydata
# Create Level2 data set by picking off one observation per subject,
# which would be easier if every subject had a diary entry labeled '1'
# - should be 37 rows and 6 columns (one per L2 variable)
music.lev2 <- keydata %>%
group_by(id) %>%
filter(row_number() == 1) %>%
select(id, gender:mpqnem)
# Add average across all performances for each subject for EDA plots
meanbysubj <- music %>% group_by(id) %>%
summarise(meanbysubj = mean(na, na.rm = TRUE))
music.lev2 <- music.lev2 %>%
left_join(meanbysubj, by = "id")
# Exploratory data analysis
# Summarize Level 1 covariates (and responses) by ignoring within subject
# correlation and pretending all observations are independent
music %>% count(perform_type)
music %>% count(memory)
music %>% count(audience)
# create ggplot theme for plots
# theme with grid, grey background
theme.1 <- theme(axis.title.x = element_text(size = 14),
axis.title.y = element_text(size = 14),
plot.title=element_text(hjust=.9,face="italic",size=12))
theme1 <- theme(axis.line = element_line(size = .5, color = "black"),
panel.background = element_rect(fill="white"),
panel.grid.major = element_blank(),panel.grid.minor = element_blank(),
axis.title.x = element_text(size = 14),axis.title.y = element_text(size = 14),
plot.title=element_text(hjust=.9,face="italic",size=12))
## Histogram of negative affect frequencies
na.all <- ggplot(data=music,aes(x=na)) +
geom_histogram(binwidth = 2, fill = "white",color = "black") +
theme.1 + xlim(10,35) +
xlab("Negative Affect") + ylab("Frequency") + labs(title="(a)")
na.mean <- ggplot(data=music.lev2,aes(x=meanbysubj)) +
geom_histogram(binwidth = 2, fill = "white", color = "black") +
theme.1 + xlim(10,35) +
xlab("Mean Negative Affect") + ylab("Frequency") + labs(title="(b)")
mli.hist1 <- grid.arrange(na.all,na.mean,ncol=1)
```
```{r mli-hist1,fig.align="center",out.width="60%",fig.cap= 'Histogram of the continuous Level One response (negative effect). Plot (a) contains all 497 performances across the 37 musicians, while plot (b) contains one observation per musician (the mean negative affect across all performances).', echo=FALSE}
grid.arrange(na.all,na.mean,ncol=1)
```
We can also summarize categorical Level One covariates across all (possibly correlated) observations to get a rough relative comparison of trends. 56.1\% of the 497 performances in our data set were solos, while 27.3\% were large ensembles and 16.5\% were small emsembles. The most common audience type was a public performance (41.0\%), followed by instructors (30.0\%), students (20.1\%), and finally juried recitals (8.9\%). In 30.0\% of performances, the musician played by memory, while 55.1\% used the score and 14.9\% of performances were unspecified.
To generate an initial examination of Level Two covariates, we consider a data set with just one observation per subject, since Level Two variables are constant over all performances from the same subject. Then, we can proceed as we did with Level One covariates---using histograms to illustrate the distributions of continuous covariates (see Figure \@ref(fig:mli-histmat1)) and tables to summarize categorical covariates. For example, we learn that the majority of subjects have positive emotionality scores between 50 and 60, but that several subjects fall into a lengthy lower tail with scores between 20 and 50. A summary of categorical Level Two covariates reveals that among the 37 subjects (26 female and 11 male), 17 play an orchestral instrument, 15 are vocal performers, and 5 play a keyboard instrument.
```{r,include=FALSE}
# Summarize Level 2 covariates using data set with one observation per subject,
music.lev2 %>% ungroup(id) %>% count(gender)
music.lev2 %>% ungroup(id) %>% count(instrument)
nem1 <- ggplot(data=music.lev2,aes(x=mpqnem)) +
geom_histogram(binwidth = 5, fill = "white",color = "black") + theme.1 +
xlab("NEM") + ylab("Frequency") + labs(title="(a)")
pem1 <- ggplot(data=music.lev2,aes(x=mpqpem)) +
geom_histogram(binwidth = 5, fill = "white", color = "black") + theme.1 +
xlab("PEM") + ylab("") + labs(title="(b)")
abs <- ggplot(data=music.lev2,aes(x=mpqab)) +
geom_histogram(binwidth = 5, fill = "white", color = "black ") + theme.1 +
xlab("Absorption") + ylab("") + labs(title="(c)")
mli.histmat1 <- grid.arrange(nem1,pem1,abs,ncol=3)
```
```{r mli-histmat1, fig.cap= "Histograms of the 3 continuous Level Two covariates (negative emotionality (NEM), positive emotionality (PEM), and absorption). Each plot contains one observation per musician.", fig.align="center",out.width="60%", echo=FALSE}
grid.arrange(nem1,pem1,abs,ncol=3)
```
### Exploratory Analyses: Bivariate Summaries {#explore2}
The next step in an initial exploratory analysis is the examination of numerical and graphical summaries of relationships between model covariates and responses. In examining these bivariate relationships, we hope to learn: (1) if there is a general trend suggesting that as the covariate increases the response either increases or decreases, (2) if subjects at certain levels of the covariate tend to have similar mean responses (low variability), and (3) if the variation in the response differs at different levels of the covariate (unequal variability).
As with individual variables, we will begin by treating all 497 performances recorded as independent observations, even though blocks of 15 or so performances were performed by the same musician. For categorical Level One covariates, we can generate boxplots against negative affect as in Figure \@ref(fig:mli-boxscatmat1), plots (a) and (b). From these boxplots, we see that lower levels of performance anxiety seem to be associated with playing in large ensembles and playing in front of an instructor. For our lone continuous Level One covariate (number of previous performances), we can generate a scatterplot against negative affect as in plot (c) from Figure \@ref(fig:mli-boxscatmat1), adding a fitted line to illustrate general trends upward or downward. From this scatterplot, we see that negative affect seems to decrease slightly as a subject has more experience.
To avoid the issue of dependent observations in our three plots from Figure \@ref(fig:mli-boxscatmat1), we could generate separate plots for each subject and examine trends within and across subjects. These "lattice plots" are illustrated in Figures \@ref(fig:mli-lattice1), \@ref(fig:mli-lattice2), and \@ref(fig:mli-lattice3); we discuss such plots more thoroughly in Chapter \@ref(ch-lon). While general trends are difficult to discern from these lattice plots, we can see the variety in subjects in sample size distributions and overall level of performance anxiety. In particular, in Figure \@ref(fig:mli-lattice3), we notice that linear fits for many subjects illustrate the same same slight downward trend displayed in the overall scatterplot in Figure \@ref(fig:mli-boxscatmat1), although some subjects experience increasing anxiety and others exhibit non-linear trends. Having an idea of the range of individual trends will be important when we begin to draw overall conclusions from this study.
```{r, include=FALSE}
# Look at relationships among Level 1 covariates and primary response
# (again ignoring correlation). Boxplots for categorical covariates and
# scatterplots and lattice plot for continuous covariates.
# boxplot of negative affect by performance type
box.perform <- ggplot(data=music,aes(factor(perform_type),na)) + geom_boxplot() +
theme.1 + coord_flip() + ylab("Negative affect") + xlab("") + labs(title="(a)")
# boxplot of negative affect by audience
box.audience <- ggplot(data=music,aes(factor(audience),na)) + geom_boxplot() +
theme.1 + coord_flip() + ylab("Negative affect") + xlab("") + labs(title="(b)")
# scatterplot of negative affect versus number of previous performances
## flip coordinates
scatter.previous <- ggplot(data=music, aes(x=previous,y=na)) + geom_point() +
theme.1 + geom_smooth(method="lm",color="black") + ylab("Negative affect") +
xlab("Previous Performances") + labs(title="(c)")
# all three together
mli.boxscatmat1 <- grid.arrange(box.perform,box.audience,scatter.previous,ncol=2)
```
```{r mli-boxscatmat1,fig.align="center",out.width="60%",echo=FALSE, fig.cap='Boxplots of two categorical Level One covariates (performance type (a) and audience type (b)) vs. model response, and scatterplot of one continuous Level One covariate (number of previous diary entries (c)) vs. model response (negative affect). Each plot contains one observation for each of the 497 performances.',warning=FALSE}
# all three together
grid.arrange(box.perform,box.audience,scatter.previous,ncol=2)
```
```{r mli-lattice1, fig.align="center",out.width="60%", fig.cap='Lattice plot of performance type vs. negative affect, with separate dotplots by subject.', echo=FALSE, warning=FALSE}
# Lattice plot for NA vs. Performance Type
ggplot(music,aes(x=factor(perform_type),y=na)) + theme.1 +
geom_dotplot(binaxis="y",stackdir="center",binwidth=25/30) +
facet_wrap(~id,ncol=5) +
theme(strip.text.x=element_blank()) + coord_flip() +
labs(x="Performance Type",y="Negative Affect")
```
```{r mli-lattice2,fig.align="center",out.width="60%", fig.cap='Lattice plot of audience type vs. negative affect, with separate dotplots by subject.',echo=FALSE,warning=FALSE}
# Lattice plot for NA vs. Audience
ggplot(music,aes(x=factor(audience),y=na)) + theme.1 +
geom_dotplot(binaxis="y",stackdir="center",binwidth=25/30) +
facet_wrap(~id,ncol=5) +
theme(strip.text.x=element_blank()) + coord_flip() +
labs(x="Audience",y="Negative Affect")
```
```{r, mli-lattice3,fig.align="center",out.width="60%", fig.cap='Lattice plot of previous performances vs. negative affect, with separate scatterplots with fitted lines by subject.',echo=FALSE, warning=FALSE}
# Lattice plot for NA vs. Previous Performances
ggplot(music,aes(x=previous,y=na)) + theme.1 +
geom_point() + geom_smooth(method="lm",color="black") +
facet_wrap(~id,ncol=5) +
theme(strip.text.x=element_blank()) + ylim(10,35) +
labs(x="Previous Performances",y="Negative Affect")
```
In Figure \@ref(fig:mli-boxmat1), we use boxplots to examine the relationship between our primary categorical Level Two covariate (instrument) and our continuous model response. Plot (a) uses all 497 performances, while plot (b) uses one observation per subject (the mean performance anxiety across all performances) regardless of how many performances that subject had. Naturally, plot (b) has a more condensed range of values, but both plots seem to support the notion that performance anxiety is slightly lower for vocalists and maybe a bit higher for keyboardists
```{r, include=FALSE}
# Look at relationships among Level 2 covariates and negative affect
# (again ignoring correlation)
instr.all <- ggplot(data=music,aes(factor(instrument),na)) + geom_boxplot() +
coord_flip() + theme.1 + ylab("Negative Affect") + xlab("") + labs(title="(a)") +
ylim(10,35)
instr.mean <- ggplot(data=music.lev2,aes(factor(instrument),meanbysubj)) +
geom_boxplot() + coord_flip() + theme.1 + ylab("Mean Negative Affect") +
xlab("") + labs(title="(b)") + ylim(10,35)
mli.boxmat1 <- grid.arrange(instr.all,instr.mean,ncol=1)
```
```{r mli-boxmat1,fig.align="center",out.width="60%",fig.cap= 'Boxplots of the categorical Level Two covariate (instrument) vs. model response (negative affect). Plot (a) is based on all 497 observations from all 37 subjects, while plot (b) uses only one observation per subject.', echo=FALSE, warning=FALSE}
grid.arrange(instr.all,instr.mean,ncol=1)
```
In Figure \@ref(fig:mli-scatmat1), we use scatterplots to examine the relationships between continuous Level Two covariates and our model response. Performance anxiety appears to vary little with a subject's positive emotionality, but there is some evidence to suggest that performance anxiety increases with increasing negative emotionality and absorption level. Plots based on mean negative affect, with one observation per subject, support conclusions based on plots with all observations from all subjects; indeed the overall relationships are in the same direction and of the same magnitude.
```{r, include=FALSE}
pem2.all <- ggplot(data=music,aes(x=mpqpem,y=na)) + geom_point() +
geom_smooth(method="lm",color="black") + theme.1 + ylab("Negative Affect") +
xlab("PEM") + labs(title="(a1)")
nem2.all <- ggplot(data=music,aes(x=mpqnem,y=na)) + geom_point() +
geom_smooth(method="lm",color="black") + theme.1 + ylab("") + xlab("NEM") +
labs(title="(b1)")
abs2.all <- ggplot(data=music,aes(x=mpqab,y=na)) + geom_point() +
geom_smooth(method="lm",color="black") + theme.1 + ylab("") +
xlab("Absorption") + labs(title="(c1)")
pem2.mean <- ggplot(data=music.lev2,aes(x=mpqpem,y=meanbysubj)) + geom_point() +
geom_smooth(method="lm",color="black") + theme.1 + ylab("Mean Negative Affect") +
xlab("PEM") + labs(title="(a2)")
nem2.mean <- ggplot(data=music.lev2,aes(x=mpqnem,y=meanbysubj)) + geom_point() +
geom_smooth(method="lm",color="black") + theme.1 + ylab("") + xlab("NEM") +
labs(title="(b2)")
abs2.mean <- ggplot(data=music.lev2,aes(x=mpqab,y=meanbysubj)) + geom_point() +
geom_smooth(method="lm",color="black") + theme.1 + ylab("") +
xlab("Absorption") + labs(title="(c2)")
mli.scatmat1 <- grid.arrange(pem2.all,nem2.all,abs2.all,
pem2.mean,nem2.mean,abs2.mean,ncol=3)
```
```{r mli-scatmat1,fig.align="center",out.width="60%",fig.cap=' Scatterplots of continuous Level Two covariates (positive emotionality (PEM), negative emotionality (NEM), and absorption) vs. model response (negative affect). The top plots (a1, b1, c1) are based on all 497 observations from all 37 subjects, while the bottom plots (a2, b2, c2) use only one observation per subject.', echo=FALSE, warning=FALSE}
grid.arrange(pem2.all,nem2.all,abs2.all,
pem2.mean,nem2.mean,abs2.mean,ncol=3)
```
Of course, any graphical analysis is exploratory, and any notable trends at this stage should be checked through formal modeling. At this point, a statistician begins to ask familiar questions such as:
- which characteristics of individual performances are most associated with performance anxiety?
- which characteristics of study participants are most associated with performance anxiety?
- are any of these associations statistically significant?
- does the significance remain after controlling for other covariates?
- how do we account for the lack of independence in performances by the same musician?
As you might expect, answers to these questions will arise from proper consideration of variability and properly identified statistical models.
## Two level modeling: preliminary considerations {#twolevelmodeling}
### Ignoring the two level structure (not recommended) {#multregr}
Armed with any statistical software package, it would be relatively simple to take our complete data set of 497 observations and run an OLS multiple linear regression model seeking to explain variability in negative affect with a number of performance-level or musician-level covariates. As an example, output from a model with two binary covariates (Does the subject play an orchestral instrument? and, Was the performance a large ensemble?) is presented below. Do you see any problems with this approach?
```{r, include=FALSE}
hook_output <- knit_hooks$get("output")
knit_hooks$set(output = function(x, options) {
lines <- options$output.lines
if (is.null(lines)) {
return(hook_output(x, options)) # pass to default hook
}
x <- unlist(strsplit(x, "\n"))
more <- "..."
if (length(lines)==1) { # first n lines
if (length(x) > lines) {
# truncate the output, but add ....
x <- c(head(x, lines), more)
}
} else {
x <- c(more, x[lines], more)
}
# paste these lines together
x <- paste(c(x, ""), collapse = "\n")
hook_output(x, options)
})
```
```{r verb1, echo=FALSE, comment=NA}
# Try using multiple linear regression to analyze data even though
# many assumptions are violated.
# First generate indicator variables for two covariates used in Section 3
music <- music %>%
mutate(orch = ifelse(instrument=="orchestral instrument",1,0),
large = ifelse(perform_type=="Large Ensemble",1,0) )
# Run the regression model, assuming linear, indep, normal, equal var
modelc0 = lm(na ~ orch + large + orch:large, data=music)
tmp <- capture.output(summary(modelc0))
cat(tmp[c(3:4, 9:15, 18:20)], sep='\n')
# residual diagnostic plots
# plot(modelc0)
```
Other than somewhat skewed residuals, residual plots (not shown) do not indicate any major problems with the OLS multiple regression model. However, another key assumption in these models is the independence of all observations. While we might reasonably conclude that responses from different study participants are independent (although possibly not if they are members of the same ensemble group), it is not likely that the 15 or so observations taken over multiple performances from a single subject are similarly independent. If a subject begins with a relatively high level of anxiety (compared to other subjects) before her first performance, chances are good that she will have relatively high anxiety levels before subsequent performances. Thus, OLS multiple linear regression using all 497 observations is not advisable for this study (or multilevel data sets in general).
### A two-stage modeling approach (better but imperfect) {#twostage}
If we assume that the 37 study participants can reasonably be considered to be independent, we could use traditional OLS regression techniques to analyze data from this study if we could condense each subject's set of responses to a single meaningful outcome. Candidates for this meaningful outcome include a subject's last performance anxiety measurement, average performance anxiety, minimum anxiety level, etc. For example, in clinical trials, data is often collected over many weekly or monthly visits for each patient, except that many patients will drop out early for many reasons (e.g., lack of efficacy, side effects, personal reasons). In these cases, treatments are frequently compared using "last-value-carried-forward" methods---the final visit of each patient is used as the primary outcome measure, regardless of how long they remained in the study. However, "last-value-carried-forward" and other summary measures feel inadequate, since we end up ignoring much of the information contained in the multiple measures for each individual. A more powerful solution is to model performance anxiety at multiple levels.
We will begin by considering all performances by a single individual. For instance, consider the 15 performances for which Musician \#22 recorded a diary, illustrated in Table \@ref(tab:table2chp8).
```{r table2chp8,echo=FALSE}
table2chp8 <- music[music$id==22,c("id","diary","perform_type","audience","na","instrument")]
kable(table2chp8, booktabs=T, caption="Data from the 15 performances of Musician 22")
```
Does this musician tend to have higher anxiety levels when he is playing in a large ensemble or playing in front of fellow students? Which factor is the biggest determinant of anxiety for a performance by Musician \#22? We can address these questions through OLS multiple linear regression applied to only Musician \#22's data, using appropriate indicator variables for factors of interest.
Let $Y_{j}$ be the performance anxiety score of Musician \#22 before performance $j$. Consider the observed performances for Musician \#22 to be a random sample of all conceivable performances by that subject. If we are initially interested in examining the effect of playing in a large ensemble, we can model the performance anxiety for Musician \#22 according to the model:
\begin{equation}
Y_{22j}=a_{22}+b_{22}\textstyle{LargeEns}_{22j}+\epsilon_{22j} \textrm{ where } \epsilon_{22j}\sim N(0,\sigma^2) \textrm{ and }
(\#eq:level1a)
\end{equation}
\[ \textstyle{LargeEns}_{j} =
\left\lbrace
\begin{tabular}{l l}
1 & if `perf\_type` = Large Ensemble \\
0 & if `perf\_type` = Solo or Small Ensemble.
\end{tabular}\right.
\]
The parameters in this model ($a_{22}$, $b_{22}$, and $\sigma^2$) can be estimated through OLS methods. $a_{22}$ represents the true intercept for Musician \#22---the expected anxiety score for Musician \#22 when performance type is a Solo or Small Ensemble ($\textstyle{LargeEns}=0$), or the true average anxiety for Musician \#22 over all Solo or Small Ensemble performances he may conceivably give. $b_{22}$ represents the true slope for Musician \#22---the expected increase in performance anxiety for Musician \#22 when performing as part of a Large Ensemble rather than in a Small Ensemble or as a Solo, or the true average difference in anxiety scores for Musician \#22 between Large Ensemble performances and other types. Finally, the $\epsilon_{22j}$ terms represent the deviations of Musician \#22's actual performance anxiety scores from the expected scores under this model---the part of Musician \#22's anxiety before performance $j$ that is not explained by performance type. The variability in these deviations from the regression model is denoted by $\sigma^2$.
For Subject 22, we estimate $\hat{a}_{22}=24.5$, $\hat{b}_{22}=-7.8$, and $\hat{\sigma}=4.8$. Thus, according to our simple linear regression model, Subject 22 had an estimated anxiety score of 24.5 before Solo and Small Ensemble performances, and 16.7 (7.8 points lower) before Large Ensemble performances. With an $R^2$ of 0.425, the regression model explains a moderate amount (42.5\%) of the performance-to-performance variability in anxiety scores for Subject 22, and the trend toward lower scores for large ensemble performances is statistically significant at the 0.05 level (t(13)=-3.10, p=.009).
```{r, include=FALSE}
# Add new indicators to music data set
music <- music %>%
mutate(students = ifelse(audience=="Student(s)",1,0),
juried = ifelse(audience=="Juried Recital",1,0),
public = ifelse(audience=="Public Performance",1,0),
solo = ifelse(perform_type=="Solo",1,0),
memory1 = ifelse(memory=="Memory",1,0),
female = ifelse(gender=="Female",1,0),
vocal = ifelse(instrument=="voice",1,0) )
#Model id22 (for Musician #22 only)
music %>% filter(id==22) %>%
select(id, diary, perform_type, audience, na, instrument)
id22 <- music %>% filter(id==22) # pick off data for Musician 22
```
```{r verb2, echo=FALSE, comment=NA}
regr.id22 = lm(na ~ large, data=id22)
tmp <- capture.output(summary(regr.id22))
cat(tmp[c(3:4, 9:13, 16:18)], sep='\n')
```
We could continue trying to build a better model for Subject 22, adding indicators for audience and memory, and even adding a continuous variable representing the number of previous performance where a diary was kept. As our model R-square value increased, we would be explaining a larger proportion of Subject 22's performance-to-performance variability in anxiety. It would not, however, improve our model to incorporate predictors for age, gender, or even negative emotionality based on the MPQ---why is that?
For the present time, we will model Subject 22's anxiety scores for his 15 performances using the model given by Equation \@ref(eq:level1a), with a lone indicator variable for performing in a Large Ensemble. We can then proceed to fit the OLS regression model in Equation \@ref(eq:level1a) to examine the effect of performing in a Large Ensemble for each of the 37 subjects in this study. These are called __Level One models__. As displayed in Figure \@ref(fig:mli-histmat2), there is considerable variability in the fitted intercepts and slopes among the 37 subjects. Mean performance anxiety scores for Solos and Small Ensembles range from 11.6 to 24.5, with a median score of 16.7, while mean differences in performance anxiety scores for Large Ensembles compared to Solos and Small Ensembles range from -7.9 to 5.0, with a median difference of -1.7. Can these differences among individual musicians be explained by (performance-invariant) characteristics associated with each individual, such as gender, age, instrument, years studied, or baseline levels of personality measures? Questions like these can be addressed through further statistical modeling.
```{r, include=FALSE}
# LVCF analysis (can't analyze large with only 1 obs per subject)
lvcf.na <- music %>%
group_by(id) %>%
filter(row_number() == n() ) %>%
select(id, na) %>%
rename (lvcf.na = na)
music.lev2 <- music.lev2 %>%
left_join(lvcf.na, by = "id") %>%
mutate(orch = ifelse(instrument=="orchestral instrument",1,0) )
lvcf.model=lm(lvcf.na ~ orch, data = music.lev2)
summary(lvcf.model)
# histogram for the intercepts by subject
# "by" line runs "coefficients" function for each subject
# "coefficients" puts int and slope from regression of na on large into list
# "[[1]]" picks off the first value from the list (the intercept)
int <- by(music, music$id, function(data)
coefficients(lm(na ~ large, data = data))[[1]])
summary(int) # summary statistics for 37 intercepts
music.lev2 <- music.lev2 %>%
ungroup(id) %>%
mutate(int = int[1:37])
inthist <- ggplot(data=music.lev2,aes(x=int)) +
geom_histogram(color="black",fill="white",binwidth=2) + theme.1 +
xlab("Intercepts from 37 subjects") + ylab("Frequency") + labs(title="(a)")
# histogram for fitted rate of change
# same as above, except second coefficient (slope) picked off
rate <- by(music, music$id, function(data)
coefficients(lm(na ~ large, data = data))[[2]])
summary(rate)
music.lev2 <- music.lev2 %>%
mutate(rate = rate[1:37])
slopehist <- ggplot(data=music.lev2,aes(x=rate)) +
geom_histogram(color="black",fill="white",binwidth=2) + theme.1 +
xlab("Slopes from 37 subjects") + ylab("Frequency") + labs(title="(b)")
mli.histmat2 <- grid.arrange(inthist,slopehist,ncol=1)
```
```{r mli-histmat2,fig.align="center",out.width="60%", fig.cap='Histograms of intercepts and slopes from fitting simple regression models by subject, where each model contained a single binary predictor indicating if a performance was part of a large ensemble.',echo=FALSE, warning=FALSE}
grid.arrange(inthist,slopehist,ncol=1)
```
As an illustration, we can consider whether or not there are significant relationships between individual regression parameters (intercepts and slopes) and instrument played. From a modeling perspective, we would build a system of two __Level Two models__ to predict the fitted intercept ($a_{i}$) and fitted slopes ($b_{i}$) for Subject $i$:
\begin{align}
a_{i} & = \alpha_{0}+\alpha_{1}\textstyle{Orch}_{i}+u_{i}
(\#eq:level2s0) \\
b_{i} & = \beta_{0}+\beta_{1}\textstyle{Orch}_{i}+v_{i}
(\#eq:level2s1)
\end{align}
where $\textstyle{Orch}_{i}=1$ if Subject $i$ plays an orchestral instrument and $\textstyle{Orch}_{i}=0$ if Subject $i$ plays keyboard or is a vocalist. Note that, at Level Two, our response variables are not observed measurements such as performance anxiety scores, but rather the fitted regression coefficients from the Level One models fit to each subject. (Well, in our theoretical model, the responses are actually the true intercepts and slopes from Level One models for each subject, but in reality, we have to use our estimated slopes and intercepts.)
Exploratory data analysis (see boxplots by instrument in Figure \@ref(fig:mli-boxmat2) suggests that subjects playing orchestral instruments have higher intercepts than vocalists or keyboardists, and that orchestral instruments are associated with slight lower (more negative) slopes, although with less variability that the slopes of vocalists and keyboardists. These trends are borne out in regression modeling. If we fit Equations \@ref(eq:level2s0) and \@ref(eq:level2s1) using fitted intercepts and slopes as our response variables, we obtain the following estimated parameters: $\hat{\alpha}_{0}=16.3$, $\hat{\alpha}_{1}=1.4$, $\hat{\beta}_{0}=-0.8$, and $\hat{\beta}_{1}=-1.4$. Thus, the intercept ($a_{i}$) and slope ($b_{i}$) for Subject $i$ can be modeled as:
\begin{align}
\hat{a}_{i} & = 16.3+1.4\textstyle{Orch}_{i}+u_{i}
(\#eq:level2s0hat) \\
\hat{b}_{i} & = -0.8-1.4\textstyle{Orch}_{i}+v_{i}
(\#eq:level2s1hat)
\end{align}
where $a_{i}$ is the true mean negative affect when Subject $i$ is playing solos or small ensembles, and $b_{i}$ is the true mean difference in negative affect for Subject $i$ between large ensembles and other performance types. Based on these models, average performance anxiety before solos and small ensembles is 16.3 for vocalists and keyboardists, but 17.7 (1.4 points higher) for orchestral instrumentalists. Before playing in large ensembles, vocalists and instrumentalists have performance anxiety (15.5) which is 0.8 points lower, on average, than before solos and small ensembles, while subjects playing orchestral instruments experience an average difference of 2.2 points, producing an average performance anxiety of 15.5 before playing in large ensembles just like subjects playing other instruments. However, the difference between orchestral instruments and others does not appear to be statistically significant for either intercepts (t=1.424, p=0.163) or slopes (t=-1.168, p=0.253).
```{r, include=FALSE}
# Descriptive statistics of the estimates obtained by fitting the
# linear model by id.
mean(int)
sd(int)
mean(rate,na.rm=T) # issues here since 7 NAs among rates
sd(rate,na.rm=T)
# correlation between slopes and intercepts
cor(int, rate,use="pairwise.complete.obs")
# Fit Level Two models to fitted intercepts and slopes from Level One
model.int = lm(int~music.lev2$orch)
summary(model.int)
model.rate = lm(rate~music.lev2$orch)
summary(model.rate)
# Plot the relationships between instrument and intercept and slope
int.box <- ggplot(data=music.lev2,aes(factor(orch),int)) + geom_boxplot() +
theme.1 + coord_flip() + ylab("Fitted Intercepts") + xlab("Orchestral") +
labs(title="(a)")
slope.box <- ggplot(data=music.lev2,aes(factor(orch),rate)) + geom_boxplot() +
theme.1 + coord_flip() + ylab("Fitted Slopes") + xlab("Orchestral") +
labs(title="(b)")
mli.boxmat2 <- grid.arrange(int.box,slope.box,ncol=1)
```
```{r mli-boxmat2,fig.align="center",out.width="60%", fig.cap='Boxplots of fitted intercepts (plot (a)) and slopes (plot (b)) by orchestral instrument (1) vs. keyboard or vocalist (0).',echo=FALSE, warning=FALSE}
grid.arrange(int.box,slope.box,ncol=1)
```
This two stage modeling process does have some drawbacks. Among other things, (1) it weights every subject the same regardless of the number of diary entries we have, (2) it responds to missing individual slopes (from 7 subjects who never performed in a large ensemble) by simply dropping those subjects, and (3) it does not share strength effectively across individuals. These issues can be better handled through a unified multilevel modeling framework which we will develop over the remainder of this section.
## Two level modeling: a unified approach {#twolevelmodelingunified}
### Our framework {#ourframework}
For the unified approach, we will still envision two levels of models as in section \@ref(twostage), but we will use likelihood-based methods for parameter estimation rather than ordinary least squares to address the drawbacks associated with the two-stage approach. To illustrate the unified approach, we will first generalize the models presented in section \@ref(twostage). Let $Y_{ij}$ be the performance anxiety score of the $i^{th}$ subject before performance $j$. If we are initially interested in examining the effects of playing in a large ensemble and playing an orchestral instrument, then we can model the performance anxiety for Subject $i$ in performance $j$ with the following system of equations:
- Level One:
\begin{equation}
Y_{ij} = a_{i}+b_{i}\textstyle{LargeEns}_{ij}+\epsilon_{ij}
(\#eq:level1large)
\end{equation}
- Level Two:
\begin{align*}
a_{i} & = \alpha_{0}+\alpha_{1}\textstyle{Orch}_{i}+u_{i} \\
b_{i} & = \beta_{0}+\beta_{1}\textstyle{Orch}_{i}+v_{i},
\end{align*}
In this system, there are 4 key __fixed effects__ to estimate: $\alpha_{0}$, $\alpha_{1}$, $\beta_{0}$ and $\beta_{1}$. Fixed effects are the fixed but unknown population effects associated with certain covariates. The intercepts and slopes for each subject from Level One, $a_{i}$ and $b_{i}$, don't need to be formally estimated as we did in section \@ref(twostage); they serve to conceptually connect Level One with Level Two. In fact, by substituting the two Level Two equations into the Level One equation, we can view this two-level system of models as a single __Composite Model__ without $a_{i}$ and $b_{i}$:
\begin{align*}
Y_{ij} & = [\alpha_{0}+\alpha_{1}\textstyle{Orch}_{i}+\beta_{0}\textstyle{LargeEns}_{ij}+\beta_{1}\textstyle{Orch}_{i}\textstyle{LargeEns}_{ij}] \\
& \textrm{} + [u_{i}+v_{i}\textstyle{LargeEns}_{ij}+\epsilon_{ij}]
\end{align*}
From this point forward, when building multilevel models, we will use Greek letters (such as $\alpha_{0}$) to denote final fixed effects model parameters to be estimated empirically, and Roman letters (such as $a_{0}$) to denote preliminary fixed effects parameters at lower levels. Variance components that will be estimated empirically will be denoted with $\sigma$ or $\rho$, while terms such as $\epsilon$ and $u_{i}$ represent error terms. In our framework, we can estimate final parameters directly without first estimating preliminary parameters, which can be seen with the Composite Model formulation (although we can obtain estimates of preliminary parameters in those occasional cases when they are of interest to us). Note that when we model a slope term like $b_{i}$ from Level One using Level Two covariates like $\textstyle{Orch}_{i}$, the resulting Composite Model contains an **interaction term**, denoting that the effect of $\textstyle{LargeEns}_{ij}$ depends on the instrument played.
Furthermore, with a binary predictor at Level Two such as instrument, we can write out what our Level Two model looks like for those who play keyboard or are vocalists ($\textstyle{Orch}_{i}=0$) and those who play orchestral instruments ($\textstyle{Orch}_{i}=1$):
- Keyboardists and Vocalists ($\textstyle{Orch}_{i}=0$)
\begin{align*}
a_{i} & = \alpha_{0}+u_{i} \\
b_{i} & = \beta_{0}+v_{i}
\end{align*}
- Orchestral instrumentalists ($\textstyle{Orch}_{i}=1$)
\begin{align*}
a_{i} & = (\alpha_{0}+\alpha_{1})+u_{i} \\
b_{i} & = (\beta_{0}+\beta_{1})+v_{i}
\label{eq:level2byorch}
\end{align*}
Writing the Level Two model in this manner helps us interpret the model parameters from our two-level model. In this case, even the Level One covariate is binary, so that we can write out expressions for mean performance anxiety based on our model for four different combinations of instrument played and performance type:
- Keyboardists or vocalists playing solos or small ensembles: $\alpha_{0}$
- Keyboardists or vocalists playing large ensembles: $\alpha_{0}+\beta_{0}$
- Orchestral instrumentalists playing solos or small ensembles: $\alpha_{0}+\alpha_{1}$
- Orchestral instrumentalists playing large ensembles: $\alpha_{0}+\alpha_{1}+\beta_{0}+\beta_{1}$
### Random vs. fixed effects
Before we can use likelihood-based methods to estimate our model parameters, we still must define the distributions of our error terms. The error terms $\epsilon_{ij}$, $u_{i}$, and $v_{i}$ represent random effects in our model. In multilevel models, it is important to distinguish between fixed and random effects. Typically, __fixed effects__ describe levels of a factor that we are specifically interested in drawing inferences about, and which would not change in replications of the study. For example, in our music performance anxiety case study, the levels of performance type will most likely remain as solos, small ensembles, and large ensembles even in replications of the study, and we wish to draw specific conclusions about differences between these three types of performances. Thus, performance type would be considered a fixed effect. On the other hand, __random effects__ describe levels of a factor which can be thought of as a sample from a larger population of factor levels; we are not typically interested in drawing conclusions about specific levels of a random effect, although we are interested in accounting for the influence of the random effect in our model. For example, in our case study the different musicians included can be thought of as a random sample from a population of performing musicians. Although our goal is not to make specific conclusions about differences between any two musicians, we do want to account for inherent differences among musicians in our model, and by doing so, we will be able to draw more precise conclusions about our fixed effects of interest. Thus, musician would be considered a random effect.
### Distribution of errors: the multivariate normal distribution {#MVN}
As part of our multilevel model, we must provide probability distributions to describe the behavior of random effects. Typically, we assume that random effects follow a normal distribution with mean 0 and a variance parameter which must be estimated from the data. For example, at Level One, we will assume that the errors associated with each performance of a particular musician can be described as: $\epsilon_{ij}\sim N(0,\sigma^2)$. At Level Two, we have one error term ($u_{i}$) associated with subject-to-subject differences in intercepts, and one error term ($v_{i}$) associated with subject-to-subject differences in slopes. That is, $u_{i}$ represents the deviation of Subject $i$ from the mean performance anxiety before solos and small ensembles after accounting for his or her instrument, and $v_{i}$ represents the deviation of Subject $i$ from the mean difference in performance anxiety between large ensembles and other performance types after accounting for his or her instrument.
In modeling the random behavior of $u_{i}$ and $v_{i}$, we must also account for the possibility that random effects at the same level might be correlated. Subjects with higher baseline performance anxiety have a greater capacity for showing decreased anxiety in large ensembles as compared to solos and small ensembles, so we might expect that subjects with larger intercepts (performance anxiety before solos and small ensembles) would have smaller slopes (indicating greater decreases in anxiety before large ensembles). In fact, our fitted Level One intercepts and slopes in this example actually show evidence of a fairly strong negative correlation ($r=-0.525$, see Figure \@ref(fig:mli-scat1)).
```{r mli-scat1,fig.align="center",out.width="60%",fig.cap='Scatterplot with fitted regression line for estimated intercepts and slopes (one point per subject).',echo=FALSE, warning=FALSE}
# Plot the relationship between intercepts and slopes
ggplot(data=music.lev2,aes(x=int,y=rate)) + geom_point() +
geom_smooth(method="lm",color="black") + theme.1 + ylab("Fitted slopes") +
xlab("Fitted intercepts")
```
To allow for this correlation, the error terms at Level Two can be assumed to follow a multivariate normal distribution in our unified multilevel model. Mathematically, we can express this as:
\[ \left[ \begin{array}{c}
u_{i} \\ v_{i}
\end{array} \right] \sim N \left( \left[
\begin{array}{c}
0 \\ 0
\end{array} \right], \left[
\begin{array}{cc}
\sigma_{u}^{2} & \\
\rho_{uv}\sigma_{u}\sigma_v & \sigma_{v}^{2}
\end{array} \right] \right) \]
Note that the correlation $\rho_{uv}$ between the error terms is simply the covariance $\sigma_{uv}=\rho_{uv}\sigma_{u}\sigma_{v}$ converted to a $[-1,1]$ scale through the relationship:
\begin{equation}
\rho_{uv} = \frac{\rho\sigma_{u}\sigma_{v}}{\sigma_{u}\sigma_{v}}
(\#eq:cor)
\end{equation}
With this expression, we are allowing each error term to have its own variance (around a mean of 0) and each pair of error terms to have its own covariance (or correlation). Thus, if there are $n$ equations at Level Two, we can have $n$ variance terms and $n(n-1)/2$ covariance terms for a total of $n + n(n-1)/2$ variance components. These variance components are organized in matrix form, with variance terms along the diagonal and covariance terms in the off-diagonal. In our small example, we have $n=2$ equations at Level Two, so we have 3 variance components to estimate---2 variance terms ($\sigma_{u}^{2}$ and $\sigma_{v}^{2}$) and 1 covariance term ($\sigma_{uv}$).
The multivariate normal distribution with $n=2$ is illustrated in Figure \@ref(fig:contour-boundary) for two cases: (a) the error terms are uncorrelated ($\sigma_{uv}=\rho_{uv}=0$), and (b) the error terms are positively correlated ($\sigma_{uv}>0$ and $\rho_{uv} > 0$). In general, if the errors in intercepts ($u_{i}$) are placed on the x-axis and the errors in slopes ($v_{i}$) are placed on the y-axis, then $\sigma_{u}^{2}$ measures spread in the x-direction and $\sigma_{v}^{2}$ measures spread in the y-direction, while $\sigma_{uv}$ measures tilt. Positive tilt ($\sigma_{uv}>0$) indicates a tendency for errors from the same subject to both be positive or both be negative, while negative tilt ($\sigma_{uv}<0$) indicates a tendency for one error from a subject to be positive and the other to be negative. In Figure \@ref(fig:contour-boundary), $\sigma_{u}^{2}=4$ and $\sigma_{v}^{2}=1$, so both contour plots show a greater range of errors in the x-direction than the y-direction. Internal ellipses in the contour plot indicate pairs of $u_{i}$ and $v_{i}$ that are more likely. In Figure \@ref(fig:contour-boundary) (a) $\sigma_{uv}=\rho_{uv}=0$, so the axes of the contour plot correspond to the x- and y-axes, but in Figure \@ref(fig:contour-boundary) (b) $\sigma_{uv}=1.5$, so the contour plot tilts up, reflecting a tendency for high values of $u_{i}$ to be associated with high values of $v_{i}$.
```{r, include=FALSE}
#Code for next plot
e0 <- seq(-8,8,length=51)
e1 <- seq(-4,4,length=51)
xy <- expand.grid(e0,e1)
Sigma <- matrix(c(4,0,0,1),2,2)
Mu <- c(0,0)
z <- dmnorm(xy, Mu, Sigma)
zframe <- data.frame(xy, z)
density <- xy[z==max(z),]
con.1 <- ggplot(data=zframe,aes(x=Var1,y=Var2,z=z)) + theme.1 +
geom_contour(stat="contour",lineend="butt",linejoin="round",linemitre=1,
na.rm=FALSE,colour="black") + labs(x="u",y="v",title="(a)") +
scale_y_continuous(limits=c(-5,5))
# Positive correlation
Sigma <- matrix(c(4,1.5,1.5,1),2,2)
Mu <- c(0,0)
z <- dmnorm(xy, Mu, Sigma)
zframe <- data.frame(xy, z)
density <- xy[z==max(z),]
con.2 <- ggplot(data=zframe,aes(x=Var1,y=Var2,z=z)) + theme.1 +
geom_contour(stat="contour",lineend="butt",linejoin="round",linemitre=1,
na.rm=FALSE,colour="black") + labs(x="u",y="v",title="(b)") +
scale_y_continuous(limits=c(-5,5))
```
```{r contour-boundary,fig.align="center",out.width="60%", fig.cap='Contour plots illustrating multivariate normal density with (a) no correlation between error terms, and (b) positive correlation between error terms.',echo=FALSE, warning=FALSE}
grid.arrange(con.1,con.2, ncol=2)
```
### Technical issues when estimating and testing parameters (Optional) {#multileveltechnical}
Now, our two-level model has 8 parameters that need to be estimated: 4 fixed effects ($\alpha_{0}$, $\alpha_{1}$, $\beta_{0}$, and $\beta_{1}$), and 4 variance components ($\sigma^{2}$, $\sigma_{u}^{2}$, $\sigma_{v}^{2}$, and $\sigma_{uv}$). Note that we use the term __variance components__ to signify model parameters that describe the behavior of random effects. We can use statistical software, such as the lmer() function from the lme4 package in R, to obtain parameter estimates using our 497 observations. The most common methods for estimating model parameters---both fixed effects and variance components---are maximum likelihood (ML) and restricted maximum likelihood (REML). The method of maximum likelihood (ML) was introduced in Chapter \@ref(ch-beyondmost), where parameter estimates are chosen to maximize the value of the likelihood function based on observed data. Restricted maximum likelihood (REML) is conditional on the fixed effects, so that the part of the data used for estimating variance components is separated from that used for estimating fixed effects. Thus REML, by accounting for the loss in degrees of freedom from estimating the fixed effects, provides an unbiased estimate of variance components, while ML estimators for variance components are biased under assumptions of normality, since they use estimated fixed effects rather than the true values. REML is preferable when the number of parameters is large or the primary interest is obtaining estimates of model parameters, either fixed effects or variance components associated with random effects. ML should be used if nested fixed effects models are being compared using a likelihood ratio test, although REML is fine for nested models of random effects (with the same fixed effects model). In this text, we will typically report REML estimates unless we are specifically comparing nested models with the same random effects. In most case studies and most models we consider, there is very little difference between ML and REML parameter estimates. Additional details are beyond the scope of this book [@Singer2003].
Note that the multilevel output shown beginning in the next section contains no p-values for performing hypothesis tests. This is primarily because the exact distribution of the test statistics under the null hypothesis (no fixed effect) is unknown, primarily because the exact degrees of freedom is not known [@Bates2015]. Finding good approximate distributions for test statistics (and thus good approximate p-values) in multilevel models is an area of active research. In most cases, we can simply conclude that t-values (ratios of parameter estimates to estimated standard errors) with absolute value above 2 indicate significant evidence that a particular model parameter is different than 0. Certain software packages will report p-values corresponding to hypothesis tests for parameters of fixed effects; these packages are typically using conservative assumptions, large-sample results, or approximate degrees of freedom for a t-distribution. In section \@ref(multreg-boot), we introduced the bootstrap as a non-parametric, computational approach for producing confidence intervals for model parameters. In addition, in section \@ref(threelevel-paraboot), we will introduce a method called the parametric bootstrap which is being used more frequently by researchers to better approximate the distribution of the likelihood test statistic and produce more accurate p-values by simulating data under the null hypothesis [@Efron2012].
### An initial model with parameter interpretations {#initialmodel}
The output below contains likelihood-based estimates of our 8 parameters from a two-level model applied to the music performance anxiety data:
```{r verb3, echo=FALSE, comment=NA, warning=FALSE, message = FALSE}
#Model 0
model0 <- lmer(na ~ orch + large + orch:large +
(large|id), REML=T, data=music)
tmp <- capture.output(summary(model0))
model0b <- lmer(na ~ orch + large + orch:large +
(large|id), REML=F, data=music)
tmpb <- capture.output(summary(model0b))
cat(" ", tmp[1], "\n",
"A) ", tmp[2], "\n",
" ", tmp[3], "\n",
"B) ", tmp[5], "\n",
" ", tmpb[4], "\n",
"B2)", tmpb[5], "\n",
" ", tmpb[6], "\n",
" ", tmpb[7], "\n",
" ", tmp[11], "\n",
" ", tmp[12], "\n",
"C) ", tmp[13], "\n",
"D) ", tmp[14], "\n",
"E) ", tmp[15], "\n",
"F) ", tmp[16], "\n",
" ", tmp[17], "\n",
" ", tmp[18], "\n",
" ", tmp[19], "\n",
"G) ", tmp[20], "\n",
"H) ", tmp[21], "\n",
"I) ", tmp[22], "\n",
"J) ", tmp[23])
```
This output (except for the capital letters along the left column) was specifically generated by the lmer() function in R; multilevel modeling results from other packages will contain similar elements. Because we will use lmer() output to summarize analyses of case studies in this and following sections, we will spend a little time now orienting ourselves to the most important features in this output.
- A: How our multilevel model is written in R, based on the composite model formulation. For more details, see section \@ref(notesr8).
- B: Measures of model performance. Since this model was fit using REML, this line only contains the REML criterion.
- B2: If the model is fit with ML instead of REML, the measures of performance will contain AIC, BIC, deviance, and the log-likelihood.
- C: Estimated variance components ($\hat{\sigma}_{u}^2$ and $\hat{\sigma}_{u}$) associated with the intercept equation in Level Two.
- D: Estimated variance components ($\hat{\sigma}_{v}^2$ and $\hat{\sigma}_{v}$) associated with the large ensemble effect equation in Level Two, along with the estimated correlation ($\hat{\rho}_{uv}$) between the two Level Two error terms.
- E: Estimated variance components ($\hat{\sigma}^2$ and $\hat{\sigma}$) associated with the Level One equation.
- F: Total number of performances where data was collected (Level One observations = 497) and total number of subjects (Level Two observations = 37).
- G: Estimated fixed effect ($\hat{\alpha}_{0}$) for intercept term, along with its standard error and t-value (which is the ratio of the estimated coefficient to its standard error). As described in section \@ref(multileveltechnical), no p-value testing the significance of the coefficient is provided because the exact null distribution of the t-value is unknown.
- H: Estimated fixed effect ($\hat{\alpha}_{1}$) for the orchestral instrument effect, along with its standard error and t-value .
- I: Estimated fixed effect ($\hat{\beta}_{0}$) for the large ensemble effect, along with its standard error and t-value.
- J: Estimated fixed effect ($\hat{\beta}_{1}$) for the interaction between orchestral instruments and large ensembles, along with its standard error and t-value.
Assuming the 37 musicians in this study are representative of a larger population of musicians, parameter interpretations for our 8 model parameters are given below:
- Fixed effects:
- $\hat{\alpha}_{0} = 15.9$. The estimated mean performance anxiety for solos and small ensembles (Large=0) for keyboard players and vocalists (Orch=0) is 15.9.
- $\hat{\alpha}_{1} = 1.7$. Orchestral instrumentalists have an estimated mean performance anxiety for solos and small ensembles which is 1.7 points higher than keyboard players and vocalists.
- $\hat{\beta}_{0} = -0.9$. Keyboard players and vocalists have an estimated mean decrease in performance anxiety of 0.9 points when playing in large ensembles instead of solos or small ensembles.
- $\hat{\beta}_{1} = -1.4$. Orchestral instrumentalists have an estimated mean decrease in performance anxiety of 2.3 points when playing in large ensembles instead of solos or small ensembles, 1.4 points greater than the mean decrease among keyboard players and vocalists.
- Variance components
- $\hat{\sigma}_{u} = 2.4$. The estimated standard deviation of performance anxiety levels for solos and small ensembles is 2.4 points, after controlling for instrument played.
- $\hat{\sigma}_{v} = 0.7$. The estimated standard deviation of differences in performance anxiety levels between large ensembles and other performance types is 0.7 points, after controlling for instrument played.
- $\hat{\rho}_{uv} = -0.64$. The estimated correlation between performance anxiety scores for solos and small ensembles and increases in performance anxiety for large ensembles is -0.64, after controlling for instrument played. Those subjects with higher performance anxiety scores for solos and small ensembles tend to have greater decreases in performance anxiety for large ensemble performances.
- $\hat{\sigma} = 4.7.$ The estimated standard deviation in residuals for the individual regression models is 4.7 points.
Table \@ref(tab:table3chp8) shows a side-by-side comparison of estimated coefficients from the approaches described to this point. Underlying assumptions, especially regarding the error and correlation structure, differ, and differences in estimated effects are potentially meaningful. Note that some standard errors are greatly *underestimated* under independence, and that no Level One covariates (such as performance type) can be analyzed under a method such as last-visit-carried-forward which uses one observation per subject. Moving forward, we will employ the unified multilevel approach to maximize the information being used to estimate model parameters and to remain faithful to the structure of the data.
```{r, include=FALSE}
Variable <- c("Intercept","Orch","Large","Orch*Large")
Independence <-c("15.72(0.36)","1.79(0.55)","-0.28(0.79)","-1.71(1.06)")
TwoStage <- c("16.28(0.67)","1.41(0.99)","-0.77(0.85)","-1.41(1.20)")
LVCF <- c("15.20(1.25)","1.45(1.84)","-","-")
Multilevel <- c("15.93(0.64)","1.69(0.95)","-0.91(0.85)","-1.42(1.10)")
```
```{r table3chp8,echo=FALSE}
table3chp8 <- data.frame(Variable,Independence,TwoStage,LVCF,Multilevel)
kable(table3chp8, booktabs=T, caption="Comparison of estimated coefficients and standard errors from the approaches mentioned in this section.")
```
Two level modeling as done with the music performance anxiety data usually involves fitting a number of models. Subsequent sections will describe a process of starting with the simplest two-level models and building toward a final model which addresses the research questions of interest.
## Building a multilevel model {#sec:buildmodel}
### Model building strategy {#buildstrategy}
Initially, it is advisable to first fit some simple, preliminary models, in part to establish a baseline for evaluating larger models. Then, we can build toward a final model for description and inference by attempting to add important covariates, centering certain variables, and checking model assumptions. In this study, we are particularly interested in Level Two covariates---those subject-specific variables that provide insight into why individuals react differently in anxiety-inducing situations. To get more precise estimates of the effect of Level Two covariates, we also want to control for Level One covariates that describe differences in individual performances.
Our strategy for building multilevel models will begin with extensive exploratory data analysis at each level. Then, after examining models with no predictors to assess variability at each level, we will first focus on creating a Level One model, starting simple and adding terms as necessary. Next, we will move to Level Two models, again starting simple and adding terms as necessary, beginning with the equation for the intercept term. Finally, we will examine the random effects and variance components, beginning with a full set of error terms and then removing covariance terms and variance terms where advisable (for instance, when parameter estimates are failing to converge or producing impossible or unlikely values). This strategy follows closely with that described by @Bryk2002 and used by @Singer2003. Singer and Willett further find that the modeled error structure rarely matters in practical contexts. Other model building approaches are certainly possible. @Diggle2002, for example, begins with a saturated fixed effects model, determines variance components based on that, and then simplifies the fixed part of the model after fixing the random part.
### An initial model: unconditional means or random intercepts {#modela}
The first model fit in almost any multilevel context should be the **unconditional means model**, also called a **random intercepts model**. In this model, there are no predictors at either level; rather, the purpose of the unconditional means model is to assess the amount of variation at each level---to compare variability within subject to variability between subjects. Expanded models will then attempt to explain sources of between and within subject variability.
The unconditional means (random intercepts) model, which we will denote as Model A, can be specified either using formulations at both levels:
- Level One:
\begin{equation}
Y_{ij} = a_{i}+\epsilon_{ij} \textrm{ where } \epsilon_{ij}\sim N(0,\sigma^2)
(\#eq:level1uncmean)
\end{equation}
- Level Two:
\begin{equation}
a_{i} = \alpha_{0}+u_{i} \textrm{ where } u_{i}\sim N(0,\sigma_{u}^{2})
(\#eq:level2uncmean)
\end{equation}
or as a composite model:
\begin{equation}
Y_{ij}=\alpha_{0}+u_{i}+\epsilon_{ij}
(\#eq:compuncmean)
\end{equation}
In this model, the performance anxiety scores of subject $i$ are not a function of performance type or any other Level One covariate, so that $a_{i}$ is the true mean response of all observations for subject $i$. On the other hand, $\alpha_{0}$ is the grand mean -- the true mean of all observations across the entire population. Our primary interest in the unconditional means model is the variance components -- $\sigma^2$ is the within-person variability, while $\sigma_{u}^{2}$ is the between-person variability. The name **random intercepts model** then arises from the Level Two equation for $a_{i}$: each subject's intercept is assumed to be a random value from a normal distribution centered at $\alpha_{0}$ with variance $\sigma_{u}^{2}$.
Using the composite model specification, the unconditional means model can be fit to the music performance anxiety data using statistical software:
```{r verb4, echo=FALSE, comment=NA}
#Model A (Unconditional means model)
model.a <- lmer(na~ 1 + (1|id), REML=T, data=music)
# print(summary(model.a), correlation=FALSE, show.resids=FALSE)
tmp <- capture.output(summary(model.a))
cat(tmp[c(2, 10:19)], sep='\n')
```
From this output, we obtain estimates of our three model parameters:
- $\hat{\alpha}_{0}=16.2=$ the estimated mean performance anxiety score across all performances and all subjects.
- $\hat{\sigma}^2=22.5=$ the estimated variance in within-person deviations.
- $\hat{\sigma}_{u}^{2}=5.0=$ the estimated variance in between-person deviations.
The relative levels of between- and within-person variabilities can be compared through the __intraclass correlation coefficient__:
\begin{equation}
\hat{\rho}=\frac{\textrm{Between-person variability}}{\textrm{Total variability}} = \frac{\hat{\sigma}_{u}^{2}}{\hat{\sigma}_{u}^{2}+\hat{\sigma}^2} = \frac{5.0}{5.0+22.5} = .182.
(\#eq:intracc)
\end{equation}
Thus, 18.2\% of the total variability in performance anxiety scores are attributable to differences among subjects. In this particular model, we can also say that the average correlation for any pair of responses from the same individual is a moderately low .182. As $\rho$ approaches 0, responses from an individual are essentially independent and accounting for the multilevel structure of the data becomes less crucial. However, as $\rho$ approaches 1, repeated observations from the same individual essentially provide no additional information and accounting for the multilevel structure becomes very important. With $\rho$ near 0, the __effective sample size__ (the number of independent pieces of information we have for modeling) approaches the total number of observations, while with $\rho$ near 1, the effective sample size approaches the number of subjects in the study.
## Binary covariates at Level One and Level Two {#modelb}
### Random slopes and intercepts model {#randomslopeandint}
The next step in model fitting is to build a good model for predicting performance anxiety scores at Level One (within subject). We will add potentially meaningful Level One covariates---those that vary from performance-to-performance for each individual. In this case, mirroring our model from section \@ref(twolevelmodeling), we will include a binary covariate for performance type:
\[ \textstyle{LargeEns}_{ij} =
\left\lbrace
\begin{tabular}{l l}
1 & if `perf\_type` = Large Ensemble \\
0 & if `perf\_type` = Solo or Small Ensemble.
\end{tabular}\right.
\]
and no other Level One covariates (for now). (Note that we may later also want to include an indicator variable for "Small Ensemble" to separate the effects of `Solo` performances and `Small Ensemble` performances.) The resulting model, which we will denote as Model B, can be specified either using formulations at both levels:
- Level One:
\begin{equation}
Y_{ij} = a_{i}+b_{i}\textstyle{LargeEns}_{ij}+\epsilon_{ij}
(\#eq:level1modelb)
\end{equation}
- Level Two:
\begin{align*}
a_{i} & = \alpha_{0}+u_{i} \\
b_{i} & = \beta_{0}+v_{i}
\end{align*}
or as a composite model:
\begin{equation}
Y_{ij}=[\alpha_{0}+\beta_{0}\textstyle{LargeEns}_{ij}]+[u_{i}+v_{i}\textstyle{LargeEns}_{ij}+\epsilon_{ij}]
(\#eq:compmodelb)
\end{equation}
where $\epsilon_{ij}\sim N(0,\sigma^2)$ and
\[ \left[ \begin{array}{c}
u_{i} \\ v_{i}
\end{array} \right] \sim N \left( \left[
\begin{array}{c}
0 \\ 0
\end{array} \right], \left[
\begin{array}{cc}
\sigma_{u}^{2} & \\
\rho\sigma_{u}\sigma_{v} & \sigma_{v}^{2}
\end{array} \right] \right). \]
as discussed in section \@ref(MVN).
In this model, performance anxiety scores for subject $i$ are assumed to differ (on average) for Large Ensemble performances as compared with Solos and Small Ensemble performances; the $\epsilon_{ij}$ terms capture the deviation between the true performance anxiety levels for subjects (based on performance type) and their observed anxiety levels. $\alpha_{0}$ is then the true mean performance anxiety level for Solos and Small Ensembles, and $\beta_{0}$ is the true mean difference in performance anxiety for Large Ensembles compared to other performance types. As before, $\sigma^2$ quantifies the within-person variability (the scatter of points around individuals' means by performance type), while now the between-person variability is partitioned into variability in Solo and Small Ensemble scores ($\sigma_{u}^{2}$) and variability in differences with Large Ensembles ($\sigma_{v}^{2}$).
Using the composite model specification, Model B can be fit to the music performance anxiety data, producing the following output:
```{r verb5, echo=FALSE, comment=NA}
#Model B (Add large as Level 1 covariate)
model.b <- lmer(na ~ large + (large |id),
REML=T, data=music)
# print(summary(model.b), correlation=FALSE, show.resids=FALSE)
tmp <- capture.output(summary(model.b))
cat(tmp[c(2, 10:21)], sep='\n')
```
From this output, we obtain estimates of our six model parameters (2 fixed effects and 4 variance components):
- $\hat{\alpha}_{0}=16.7=$ the mean performance anxiety level before solos and small ensemble performances.
- $\hat{\beta}_{0}=-1.7=$ the mean decrease in performance anxiety before large ensemble performances.
- $\hat{\sigma}^2=21.8=$ the variance in within-person deviations.
- $\hat{\sigma}_{u}^{2}=6.3=$ the variance in between-person deviations in performance anxiety scores before solos and small ensembles.
- $\hat{\sigma}_{v}^{2}=0.7=$ the variance in between-person deviations in increases (or decreases) in performance anxiety scores before large ensembles.
- $\hat{\rho}_{uv}=-0.76=$ the correlation in subjects' anxiety before solos and small ensembles and their differences in anxiety between large ensembles and other performance types.
We see that, on average, subjects had a performance anxiety level of 16.7 before solos and small ensembles, and their anxiety levels were 1.7 points lower, on average, before large ensembles, producing an average performance anxiety level before large ensembles of 15.0. According to the t-value listed in R, the difference between large ensembles and other performance types is statistically significant (t=-3.09).
This random slopes and intercepts model is illustrated in Figure \@ref(fig:mli-spag1). The thicker black line shows the overall trends given by our estimated fixed effects: an intercept of 16.7 and a slope of -1.7. Then, each subject is represented by a gray line. Not only do the subjects' intercepts differ (with variance 6.3), but their slopes differ as well (with variance 0.7). Additionally, subjects' slopes and intercepts are negatively associated (with correlation -0.76), so that subjects with greater intercepts tend to have steeper negative slopes. We can compare this model with the random intercepts model from section \@ref(modela), pictured in Figure \@ref(fig:mli-spag0). With no effect of large ensembles, each subject is represented by a gray line with the identical slope (0), but with varying intercepts (with variance 5.0).
```{r, include=FALSE}
# Plot random intercepts model
ints.a = fixef(model.a)[1] + ranef(model.a)[[1]][1]
n = length(music.lev2$id)
modela.plot = data.frame(id=music.lev2$id,ints.a=ints.a[[1]],slopes.a=rep(0,n))
# Picture variance components from unconditional means model
ggplot(data=music) + geom_boxplot(aes(factor(id),na)) + theme.1 +
xlab("Subject ID") + ylab("Negative Affect")
# Plot random intercepts and slopes model
ints.b = fixef(model.b)[1] + ranef(model.b)[[1]][1]
slopes.b = fixef(model.b)[2] + ranef(model.b)[[1]][2]
modelb.plot = data.frame(id=music.lev2$id,ints.b=ints.b[[1]],slopes.b=slopes.b[[1]])
```
```{r mli-spag1,fig.align="center",out.width="60%",fig.cap='The random slopes and intercepts model fitted to the music performance anxiety data. Each gray line represents one subject, and the thicker black line represents the trend across all subjects.',echo=FALSE, warning=FALSE}
ggplot() + theme.1 +
scale_x_continuous(name="Large Ensemble indicator", limits=c(0,1), breaks=c(0,1)) +
scale_y_continuous(name="Negative Affect", limits=c(10,25)) +
geom_abline(data=modelb.plot, aes(intercept=ints.b, slope=slopes.b),
color="dark gray") +
geom_abline(aes(intercept=fixef(model.b)[1], slope=fixef(model.b)[2]), size=1)
```
```{r mli-spag0, fig.align="center",out.width="60%", fig.cap='The random intercepts model fitted to the music performance anxiety data. Each gray line represents one subject, and the thicker black line represents the trend across all subjects.',echo=FALSE, warning=FALSE}
ggplot() + theme.1 +
scale_x_continuous(name="Large Ensemble indicator", limits=c(0,1), breaks=c(0,1)) +
scale_y_continuous(name="Negative Affect", limits=c(10,25)) +
geom_abline(data=modela.plot, aes(intercept=ints.a, slope=slopes.a),
color="dark gray") +
geom_abline(aes(intercept=fixef(model.a)[1], slope=0), size=1)
```
Figures \@ref(fig:mli-spag1) and \@ref(fig:mli-spag0) use **empirical Bayes estimates** for the intercepts ($a_{i}$) and slopes ($b_{i}$) of individual subjects. Empirical Bayes estimates are sometimes called "shrinkage estimates" since they combine individual-specific information with information from all subjects, thus "shrinking" the individual estimates toward the group averages. Empirical Bayes estimates are often used when a term such as $a_{i}$ involves both fixed and random components; further detail can be found in @Bryk2002 and @Singer2003.
### Pseudo $R^2$ values {#pseudoR2}
The estimated within-person variance $\hat{\sigma}^2$ decreased by 3.1\% (from 22.5 to 21.8) from the unconditional means model, implying that only 3.1\% of within-person variability in performance anxiety scores can be explained by performance type. This calculation is considered a __pseudo R-square__ value:
\begin{equation}
\textrm{Pseudo }R^2_{L1} = \frac{\hat{\sigma}^{2}(\textrm{Model A})-\hat{\sigma}^{2}(\textrm{Model B})}{\hat{\sigma}^{2}(\textrm{Model A})} = \frac{22.5-21.8}{22.5} = 0.031
(\#eq:pseudo)
\end{equation}
Values of $\hat{\sigma}_{u}^{2}$ and $\hat{\sigma}_{v}^{2}$ cannot be compared to between-person variability from Model A, since the inclusion of performance type has changed the interpretation of these values, although they can provide important benchmarks for evaluating more complex Level Two predictions. Finally, $\hat{\rho}_{uv}=-0.76$ indicates a strong negative relationship between a subject's performance anxiety before solos and small ensembles and their (typical) decrease in performance anxiety before large ensembles. As might be expected, subjects with higher levels of performance anxiety before solos and small ensembles tend to have smaller increases (or greater decreases) in performance anxiety before large ensembles; those with higher levels of performance anxiety before solos and small ensembles have more opportunity for decreases before large ensembles.
Pseudo $R^2$ values are not universally reliable as measures of model performance. Because of the complexity of estimating fixed effects and variance components at various levels of a multilevel model, it is not unusual to encounter situations in which covariates in a Level Two equation for, say, the intercept remain constant (while other aspects of the model change), yet the associated pseudo $R^2$ values differ or are negative. For this reason, pseudo $R^2$ values in multilevel models should be interpreted cautiously.
### Adding a covariate at Level Two {#modelc}
The initial two-level model described in section \@ref(initialmodel) essentially expands upon the random slopes and intercepts model by adding a binary covariate for instrument played at Level Two. We will denote this as Model C:
- Level One:
\begin{equation}
Y_{ij} = a_{i}+b_{i}\textstyle{LargeEns}_{ij}+\epsilon_{ij}
(\#eq:level1modelc)
\end{equation}
- Level Two:
\begin{align*}
a_{i} & = \alpha_{0}+\alpha_{1}\textstyle{Orch}_{i}+u_{i} \\
b_{i} & = \beta_{0}+\beta_{1}\textstyle{Orch}_{i}+v_{i},
\end{align*}
where $\epsilon_{ij}\sim N(0,\sigma^2)$ and
\[ \left[ \begin{array}{c}
u_{i} \\ v_{i}
\end{array} \right] \sim N \left( \left[
\begin{array}{c}
0 \\ 0
\end{array} \right], \left[
\begin{array}{cc}
\sigma_{u}^{2} & \\
\rho\sigma_{u}\sigma_{v} & \sigma_{v}^{2}
\end{array} \right] \right). \]
We found that there are no highly significant fixed effects in Model C (other than the intercept). In particular, we have no significant evidence that musicians playing orchestral instruments reported different performance anxiety scores, on average, for solos and small ensembles than keyboardists and vocalists, no evidence of a difference in performance anxiety by performance type for keyboard players and vocalists, and no evidence of an instrument effect in difference between large ensembles and other types.
Since no terms were added at Level One when expanding from the random slopes and intercepts model (Model B), no discernable changes should occur in explained within-person variability (although small changes could occur due to numerical estimation procedures used in likelihood-based parameter estimates). However, Model C expanded Model B by using the instrument which a subject plays to model both intercepts and slopes at Level Two. We can use pseudo R-square values for both intercepts and slopes to evaluate the impact on between-person variability of adding instrument to Model B.
\begin{equation}
\textrm{Pseudo }R^2_{L2_u} = \frac{\hat{\sigma}_{u}^{2}(\textrm{Model B})-\hat{\sigma}_{u}^{2}(\textrm{Model C})}{\hat{\sigma}_{u}^{2}(\textrm{Model B})} = \frac{6.33-5.66}{6.33} = 0.106
(\#eq:pseudol0)
\end{equation}
\begin{equation}
\textrm{Pseudo }R^2_{L2_v} = \frac{\hat{\sigma}_{v}^{2}(\textrm{Model B})-\hat{\sigma}_{v}^{2}(\textrm{Model C})}{\hat{\sigma}_{v}^{2}(\textrm{Model B})} = \frac{0.74-0.45}{0.74} = 0.392
(\#eq:pseudol1)
\end{equation}
$\textrm{Pseudo }R^2_{L2_u}$ describes the improvement in Model C over Model B in explaining subject-to-subject variability in intercepts, and $\textrm{Pseudo }R^2_{L2_v}$ describes the improvement in Model C over Model B in explaining subject-to-subject variability in slopes. Thus, the addition of instrument at Level Two has decreased the between-person variability in mean performance anxiety before solos and small ensembles by 10.6\%, and it has decreased the between-person variability in the effect of large ensembles on performance anxiety by 39.2\%.
We could also run a "random intercepts" version of Model C, with no error term in the equation for the slope at Level Two (and thus no covariance between errors at Level Two as well):
- Level One:
\begin{equation}
Y_{ij} = a_{i}+b_{i}\textstyle{LargeEns}_{ij}+\epsilon_{ij}
(\#eq:level1modelcrandint)
\end{equation}
- Level Two:
\begin{align*}
a_{i} & = \alpha_{0}+\alpha_{1}\textstyle{Orch}_{i}+u_{i} \\
b_{i} & = \beta_{0}+\beta_{1}\textstyle{Orch}_{i},
\end{align*}
where $\epsilon_{ij}\sim N(0,\sigma^2)$ and $u_{i}\sim N(0,\sigma_{u}^{2})$.
The output below contains REML estimates of our 6 parameters from this simplified version of Model C:
```{r verb6, echo=FALSE, comment=NA}
#Model C2 (Run as random intercepts model.)
model.c2 <- lmer(na ~ orch + large + orch:large +
(1|id), REML=T, data=music)
# print(summary(model.c2), correlation=FALSE, show.resids=FALSE)
tmp <- capture.output(summary(model.c2))
cat(tmp[c(2, 10:22)], sep='\n')
```
```{r verb6b, echo=FALSE, comment=NA}
#Model C2 vs Model C: AIC and BIC
model.c <- model0
AIC(model.c, model.c2)
BIC(model.c, model.c2)
```
Note that parameter estimates for the remaining 6 fixed effects and variance components closely mirror the corresponding parameter estimates from Model C. In fact, removing the error term on the slope has improved (reduced) both the AIC and BIC measures of overall model performance. Instead of assuming that the large ensemble effects, after accounting for instrument played, vary by individual, we are assuming that large ensemble effect is fixed across subjects. It is not unusual to run a two-level model like this, with an error term on the intercept equation to account for subject-to-subject differences, but with no error terms on other Level Two equations unless there is an _a priori_ reason to allow effects to vary by subject or if the model performs better after building in those additional error terms.
## Additional covariates: model comparison and interpretability {#sec:modeld}
Recall that we are particularly interested in this study in Level Two covariates---those subject-specific variables that provide insight into why individuals react differently in anxiety-inducing situations. In section \@ref(explore), we saw evidence that subjects with higher baseline levels of negative emotionality tend to have higher performance anxiety levels prior to performances. Thus, in our next step in model building, we will add negative emotionality as a Level Two predictor to Model C. With this addition, our new model can be expressed as a system of Level One and Level Two models:
- Level One:
\begin{equation}
Y_{ij} = a_{i}+b_{i}\textstyle{LargeEns}_{ij}+\epsilon_{ij}
(\#eq:level1modelcnegem)
\end{equation}
- Level Two:
\begin{align*}
a_{i} & = \alpha_{0}+\alpha_{1}\textstyle{Orch}_{i}+\alpha_{2}\textstyle{MPQnem}_{i}+u_{i} \\
b_{i} & = \beta_{0}+\beta_{1}\textstyle{Orch}_{i}+\beta_{2}\textstyle{MPQnem}_{i}+v_{i},
\end{align*}
or as a composite model:
\begin{align*}
Y_{ij} & = [\alpha_{0}+\alpha_{1}\textstyle{Orch}_{i}+\alpha_{2}\textstyle{MPQnem}_{i}+\beta_{0}\textstyle{LargeEns}_{ij} \\
& \textrm{} + \beta_{1}\textstyle{Orch}_{i}\textstyle{LargeEns}_{ij}+\beta_{2}\textstyle{MPQnem}_{i}\textstyle{LargeEns}_{ij}] \\
& \textrm{} + [u_{i}+v_{i}\textstyle{LargeEns}_{ij}+\epsilon_{ij}]
\end{align*}
where error terms are defined as in Model C.
From the R output below, we see that, as our exploratory analyses suggested, subjects with higher baseline levels of stress reaction, alienation, and aggression (as measured by the MPQ negative emotionality scale) had significantly higher levels of performance anxiety before solos and small ensembles (t=3.893). They also had somewhat greater differences between large ensembles and other performance types, controlling for instrument (t=-0.575), although this interaction was not statistically significant.
```{r verb7,echo=FALSE, comment=NA}
# Model D (Add negative emotionality as second Level 2 covariate)
model.d <- lmer(na ~ orch + mpqnem + large + orch:large +
mpqnem:large + (large|id), REML=T, data=music)
# print(summary(model.d), correlation=FALSE, show.resids=FALSE)
tmp <- capture.output(summary(model.d))
cat(tmp[c(2:3, 11:26)], sep='\n')
```
### Interpretation of parameter estimates {#interp:modeld}
Compared to Model C, the directions of the effects of instrument and performance type are consistent, but the effect sizes and levels of significance are reduced because of the relative importance of the negative emotionality term. Interpretations will also change slightly to acknowledge that we have controlled for a covariate. In addition, interpretations of fixed effects involving negative emotionality must acknowledge that this covariate is a continuous measure and not binary like instrument and performance type:
- $\hat{\alpha}_{0} = 11.57$. The estimated mean performance anxiety for solos and small ensembles (`large=0`) is 11.57 for keyboard players and vocalists (`orch=0`) with negative emotionality of 0 at baseline (`mpqnem=0`). Since the minimum negative emotionality score in this study was 11, this interpretation, while technically correct, is not practically meaningful.
- $\hat{\alpha}_{1} = 1.00$. Orchestral instrument players have an estimated mean anxiety level before solos and small ensembles which is 1.00 point higher than keyboardists and vocalists, controlling for the effects of baseline negative emotionality.
- $\hat{\alpha}_{2} = 0.15$. A one point increase in baseline negative emotionality is associated with an estimated 0.15 mean increase in anxiety levels before solos and small ensembles, after controlling for instrument.
- $\hat{\beta}_{0} = -0.28$. Keyboard players and vocalists (`orch=0`) with baseline negative emotionality levels of 0 (`mpqnem=0`) have an estimated mean decrease in anxiety level of 0.28 points before large ensemble performances compared to other performance types.
- $\hat{\beta}_{1} = -0.95$. Orchestral instrument players have an estimated mean difference in anxiety levels between large ensembles and other performance types that is 0.95 points lower than the difference for keyboard players and vocalists, controlling for the effects of baseline negative emotionality. For example, orchestral instrumentalists with `mpqnem=0` have a larger estimated mean decrease in anxiety level of 1.23 points before large ensemble performances, rather than the mean decrease of 0.28 in those who don't play orchestral instruments.
- $\hat{\beta}_{2} = -0.03$. A one point increase in baseline negative emotionality is associated with an estimated .03 mean decrease in the difference in anxiety levels for large ensembles and other performance types, after controlling for instrument.
Some of the detail in these parameter interpretations can be tricky---describing interaction terms, deciding if a covariate must be fixed at 0 or merely held constant, etc. Often it helps to write out models for special cases to isolate the effects of specific fixed effects. We will consider a few parameter estimates from above and see why the interpretations are written as they are.
- $\hat{\alpha}_{1}$. For solos and small ensembles (`LargeEns=0`), the following equations describe the fixed effects portion of the composite model for negative affect score for vocalists and keyboardsists (`Orch=0`) and orchestral instrumentalists (`Orch=1`):
\begin{align*}
\textstyle{Orch}=0: & \\
Y_{ij} & = \alpha_{0}+\alpha_{2}\textstyle{MPQnem}_{i} \\
\textstyle{Orch}=1: & \\
Y_{ij} & = (\alpha_{0}+\alpha_{1})+\alpha_{2}\textstyle{MPQnem}_{i}
\end{align*}
Regardless of the subjects' baseline negative emotionality (`MPQnem`), $\hat{\alpha}_{1}$ represents the estimated difference in performance anxiety between those playing orchestral instruments and others. This interpretation, however, only holds for solos and small ensembles. For large ensembles, the difference between those playing orchestral instruments and others is actually given by $\hat{\alpha}_{1}+\hat{\beta}_{1}$, holding `MPQnem` constant (Show!).
- $\hat{\beta}_{0}$. Because `LargeEns` interacts with both `Orch` and `MPQnem` in Model C, $\hat{\beta}_{0}$ only describes the estimated difference between large ensembles and other performance types when both `Orch=0` and `MPQnem=0`, thus removing the effects of the interaction terms. If, for instance, `Orch=1` and `MPQnem=20`, then the difference between large ensembles and other performance types is given by $\hat{\beta}_{0}+\hat{\beta}_{1}+20\hat{\beta}_{2}$.
- $\hat{\beta}_{1}$. As with $\hat{\alpha}_{1}$, we consider equations describing the fixed effects portion of the composite model for negative affect score for vocalists and keyboardsists (`Orch=0`) and orchestral instrumentalists (`Orch=1`), except here we leave _LargeEns_ as an unknown rather than restricting the model to solos and small ensembles:
\begin{align*}
\textstyle{Orch}=0: & & \\
Y_{ij} & = \alpha_{0}+\alpha_{2}\textstyle{MPQnem}_{i}+\beta_{0}\textstyle{LargeEns}_{ij} \\
& \textrm{} +\beta_{2}\textstyle{MPQnem}_{i}\textstyle{LargeEns}_{ij} \\
\textstyle{Orch}=1: & \\
Y_{ij} & = (\alpha_{0}+\alpha_{1})+\alpha_{2}\textstyle{MPQnem}_{i}+(\beta_{0}+\beta_{1})\textstyle{LargeEns}_{ij} \\