<!DOCTYPE html>
<html lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<title>Chapter 9 Hypothesis Testing | Statistical Inference via Data Science</title>
<meta name="description" content="An open-source and fully-reproducible electronic textbook for teaching statistical inference using tidyverse data science tools." />
<meta name="generator" content="bookdown 0.22 and GitBook 2.6.7" />
<meta property="og:title" content="Chapter 9 Hypothesis Testing | Statistical Inference via Data Science" />
<meta property="og:type" content="book" />
<meta property="og:url" content="https://moderndive.com/" />
<meta property="og:image" content="https://moderndive.com//images/logos/book_cover.png" />
<meta property="og:description" content="An open-source and fully-reproducible electronic textbook for teaching statistical inference using tidyverse data science tools." />
<meta name="github-repo" content="moderndive/ModernDive_book" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="Chapter 9 Hypothesis Testing | Statistical Inference via Data Science" />
<meta name="twitter:site" content="@ModernDive" />
<meta name="twitter:description" content="An open-source and fully-reproducible electronic textbook for teaching statistical inference using tidyverse data science tools." />
<meta name="twitter:image" content="https://moderndive.com//images/logos/book_cover.png" />
<meta name="author" content="Chester Ismay and Albert Y. Kim Foreword by Kelly S. McConville Adapted by William R. Morgan" />
<meta name="date" content="2021-07-28" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<link rel="apple-touch-icon-precomposed" sizes="152x152" href="images/logos/favicons/apple-touch-icon.png" />
<link rel="shortcut icon" href="images/logos/favicons/favicon.ico" type="image/x-icon" />
<link rel="prev" href="8-confidence-intervals.html"/>
<link rel="next" href="10-inference-for-regression.html"/>
<script src="libs/header-attrs-2.9/header-attrs.js"></script>
<script src="libs/jquery-2.2.3/jquery.min.js"></script>
<link href="libs/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-table.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-clipboard.css" rel="stylesheet" />
<link href="libs/anchor-sections-1.0.1/anchor-sections.css" rel="stylesheet" />
<script src="libs/anchor-sections-1.0.1/anchor-sections.js"></script>
<script src="libs/kePrint-0.0.1/kePrint.js"></script>
<link href="libs/lightable-0.0.1/lightable.css" rel="stylesheet" />
<script src="libs/htmlwidgets-1.5.3/htmlwidgets.js"></script>
<link href="libs/dygraphs-1.1.1/dygraph.css" rel="stylesheet" />
<script src="libs/dygraphs-1.1.1/dygraph-combined.js"></script>
<script src="libs/dygraphs-1.1.1/shapes.js"></script>
<script src="libs/moment-2.8.4/moment.js"></script>
<script src="libs/moment-timezone-0.2.5/moment-timezone-with-data.js"></script>
<script src="libs/moment-fquarter-1.0.0/moment-fquarter.min.js"></script>
<script src="libs/dygraphs-binding-1.1.1.6/dygraphs.js"></script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-89938436-1', 'auto');
ga('send', 'pageview');
</script>
<style type="text/css">
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<style type="text/css">
/* Used with Pandoc 2.11+ new --citeproc when CSL is used */
div.csl-bib-body { }
div.csl-entry {
clear: both;
}
.hanging div.csl-entry {
margin-left:2em;
text-indent:-2em;
}
div.csl-left-margin {
min-width:2em;
float:left;
}
div.csl-right-inline {
margin-left:2em;
padding-left:1em;
}
div.csl-indent {
margin-left: 2em;
}
</style>
<link rel="stylesheet" href="style.css" type="text/css" />
</head>
<body>
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<ul class="summary">
<li class="chapter" data-level="" data-path="index.html"><a href="index.html"><i class="fa fa-check"></i>Welcome to ModernDive</a></li>
<li class="chapter" data-level="" data-path="foreword.html"><a href="foreword.html"><i class="fa fa-check"></i>Foreword</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html"><i class="fa fa-check"></i>Preface</a>
<ul>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#introduction-for-students"><i class="fa fa-check"></i>Introduction for students</a>
<ul>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#what-we-hope-you-will-learn-from-this-book"><i class="fa fa-check"></i>What we hope you will learn from this book</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#datascience-pipeline"><i class="fa fa-check"></i>Data/science pipeline</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#reproducible-research"><i class="fa fa-check"></i>Reproducible research</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#final-note-for-students"><i class="fa fa-check"></i>Final note for students</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#introduction-for-instructors"><i class="fa fa-check"></i>Introduction for instructors</a>
<ul>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#resources"><i class="fa fa-check"></i>Resources</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#why-did-we-write-this-book"><i class="fa fa-check"></i>Why did we write this book?</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#who-is-this-book-for"><i class="fa fa-check"></i>Who is this book for?</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#connect-and-contribute"><i class="fa fa-check"></i>Connect and contribute</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#acknowledgements"><i class="fa fa-check"></i>Acknowledgements</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#about-this-book"><i class="fa fa-check"></i>About this book</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="about-the-authors.html"><a href="about-the-authors.html"><i class="fa fa-check"></i>About the authors</a></li>
<li class="chapter" data-level="1" data-path="1-getting-started.html"><a href="1-getting-started.html"><i class="fa fa-check"></i><b>1</b> Getting Started with Data in R</a>
<ul>
<li class="chapter" data-level="1.1" data-path="1-getting-started.html"><a href="1-getting-started.html#r-rstudio"><i class="fa fa-check"></i><b>1.1</b> What are R and RStudio?</a>
<ul>
<li class="chapter" data-level="1.1.1" data-path="1-getting-started.html"><a href="1-getting-started.html#installing"><i class="fa fa-check"></i><b>1.1.1</b> Installing R and RStudio</a></li>
<li class="chapter" data-level="1.1.2" data-path="1-getting-started.html"><a href="1-getting-started.html#using-r-via-rstudio"><i class="fa fa-check"></i><b>1.1.2</b> Using R via RStudio</a></li>
</ul></li>
<li class="chapter" data-level="1.2" data-path="1-getting-started.html"><a href="1-getting-started.html#code"><i class="fa fa-check"></i><b>1.2</b> How do I code in R?</a>
<ul>
<li class="chapter" data-level="1.2.1" data-path="1-getting-started.html"><a href="1-getting-started.html#programming-concepts"><i class="fa fa-check"></i><b>1.2.1</b> Basic programming concepts and terminology</a></li>
<li class="chapter" data-level="1.2.2" data-path="1-getting-started.html"><a href="1-getting-started.html#messages"><i class="fa fa-check"></i><b>1.2.2</b> Errors, warnings, and messages</a></li>
<li class="chapter" data-level="1.2.3" data-path="1-getting-started.html"><a href="1-getting-started.html#tips-code"><i class="fa fa-check"></i><b>1.2.3</b> Tips on learning to code</a></li>
</ul></li>
<li class="chapter" data-level="1.3" data-path="1-getting-started.html"><a href="1-getting-started.html#packages"><i class="fa fa-check"></i><b>1.3</b> What are R packages?</a>
<ul>
<li class="chapter" data-level="1.3.1" data-path="1-getting-started.html"><a href="1-getting-started.html#package-installation"><i class="fa fa-check"></i><b>1.3.1</b> Package installation</a></li>
<li class="chapter" data-level="1.3.2" data-path="1-getting-started.html"><a href="1-getting-started.html#package-loading"><i class="fa fa-check"></i><b>1.3.2</b> Package loading</a></li>
<li class="chapter" data-level="1.3.3" data-path="1-getting-started.html"><a href="1-getting-started.html#package-use"><i class="fa fa-check"></i><b>1.3.3</b> Package use</a></li>
</ul></li>
<li class="chapter" data-level="1.4" data-path="1-getting-started.html"><a href="1-getting-started.html#rfishbase"><i class="fa fa-check"></i><b>1.4</b> Explore your first datasets</a>
<ul>
<li class="chapter" data-level="1.4.1" data-path="1-getting-started.html"><a href="1-getting-started.html#rfishpackage"><i class="fa fa-check"></i><b>1.4.1</b> <code>rfishbase</code> package</a></li>
<li class="chapter" data-level="1.4.2" data-path="1-getting-started.html"><a href="1-getting-started.html#fishbasedataframe"><i class="fa fa-check"></i><b>1.4.2</b> <code>fishbase</code> data frame</a></li>
<li class="chapter" data-level="1.4.3" data-path="1-getting-started.html"><a href="1-getting-started.html#exploredataframes"><i class="fa fa-check"></i><b>1.4.3</b> Exploring data frames</a></li>
<li class="chapter" data-level="1.4.4" data-path="1-getting-started.html"><a href="1-getting-started.html#identification-vs-measurement-variables"><i class="fa fa-check"></i><b>1.4.4</b> Identification and measurement variables</a></li>
<li class="chapter" data-level="1.4.5" data-path="1-getting-started.html"><a href="1-getting-started.html#help-files"><i class="fa fa-check"></i><b>1.4.5</b> Help files</a></li>
</ul></li>
<li class="chapter" data-level="1.5" data-path="1-getting-started.html"><a href="1-getting-started.html#conclusion"><i class="fa fa-check"></i><b>1.5</b> Conclusion</a>
<ul>
<li class="chapter" data-level="1.5.1" data-path="1-getting-started.html"><a href="1-getting-started.html#additional-resources"><i class="fa fa-check"></i><b>1.5.1</b> Additional resources</a></li>
<li class="chapter" data-level="1.5.2" data-path="1-getting-started.html"><a href="1-getting-started.html#whats-to-come"><i class="fa fa-check"></i><b>1.5.2</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="part"><span><b>I Data Science with tidyverse</b></span></li>
<li class="chapter" data-level="2" data-path="2-viz.html"><a href="2-viz.html"><i class="fa fa-check"></i><b>2</b> Data Visualization</a>
<ul>
<li class="chapter" data-level="" data-path="2-viz.html"><a href="2-viz.html#needed-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="2.1" data-path="2-viz.html"><a href="2-viz.html#grammarofgraphics"><i class="fa fa-check"></i><b>2.1</b> The grammar of graphics</a>
<ul>
<li class="chapter" data-level="2.1.1" data-path="2-viz.html"><a href="2-viz.html#components-of-the-grammar"><i class="fa fa-check"></i><b>2.1.1</b> Components of the grammar</a></li>
<li class="chapter" data-level="2.1.2" data-path="2-viz.html"><a href="2-viz.html#gapminder"><i class="fa fa-check"></i><b>2.1.2</b> Gapminder data</a></li>
<li class="chapter" data-level="2.1.3" data-path="2-viz.html"><a href="2-viz.html#other-components"><i class="fa fa-check"></i><b>2.1.3</b> Other components</a></li>
<li class="chapter" data-level="2.1.4" data-path="2-viz.html"><a href="2-viz.html#ggplot2-package"><i class="fa fa-check"></i><b>2.1.4</b> ggplot2 package</a></li>
</ul></li>
<li class="chapter" data-level="2.2" data-path="2-viz.html"><a href="2-viz.html#FiveNG"><i class="fa fa-check"></i><b>2.2</b> Five named graphs - the 5NG</a></li>
<li class="chapter" data-level="2.3" data-path="2-viz.html"><a href="2-viz.html#scatterplots"><i class="fa fa-check"></i><b>2.3</b> 5NG#1: Scatterplots</a>
<ul>
<li class="chapter" data-level="2.3.1" data-path="2-viz.html"><a href="2-viz.html#geompoint"><i class="fa fa-check"></i><b>2.3.1</b> Scatterplots via <code>geom_point</code></a></li>
<li class="chapter" data-level="2.3.2" data-path="2-viz.html"><a href="2-viz.html#overplotting"><i class="fa fa-check"></i><b>2.3.2</b> Overplotting</a></li>
<li class="chapter" data-level="2.3.3" data-path="2-viz.html"><a href="2-viz.html#summary"><i class="fa fa-check"></i><b>2.3.3</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.4" data-path="2-viz.html"><a href="2-viz.html#linegraphs"><i class="fa fa-check"></i><b>2.4</b> 5NG#2: Linegraphs</a>
<ul>
<li class="chapter" data-level="2.4.1" data-path="2-viz.html"><a href="2-viz.html#geomline"><i class="fa fa-check"></i><b>2.4.1</b> Linegraphs via <code>geom_line</code></a></li>
<li class="chapter" data-level="2.4.2" data-path="2-viz.html"><a href="2-viz.html#summary-1"><i class="fa fa-check"></i><b>2.4.2</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.5" data-path="2-viz.html"><a href="2-viz.html#facets"><i class="fa fa-check"></i><b>2.5</b> Facets</a></li>
<li class="chapter" data-level="2.6" data-path="2-viz.html"><a href="2-viz.html#histograms"><i class="fa fa-check"></i><b>2.6</b> 5NG#3: Histograms</a>
<ul>
<li class="chapter" data-level="2.6.1" data-path="2-viz.html"><a href="2-viz.html#geomhistogram"><i class="fa fa-check"></i><b>2.6.1</b> Histograms via <code>geom_histogram</code></a></li>
<li class="chapter" data-level="2.6.2" data-path="2-viz.html"><a href="2-viz.html#adjustbins"><i class="fa fa-check"></i><b>2.6.2</b> Adjusting the bins</a></li>
<li class="chapter" data-level="2.6.3" data-path="2-viz.html"><a href="2-viz.html#summary-2"><i class="fa fa-check"></i><b>2.6.3</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.7" data-path="2-viz.html"><a href="2-viz.html#boxplots"><i class="fa fa-check"></i><b>2.7</b> 5NG#4: Boxplots</a>
<ul>
<li class="chapter" data-level="2.7.1" data-path="2-viz.html"><a href="2-viz.html#geomboxplot"><i class="fa fa-check"></i><b>2.7.1</b> Boxplots via <code>geom_boxplot</code></a></li>
<li class="chapter" data-level="2.7.2" data-path="2-viz.html"><a href="2-viz.html#summary-3"><i class="fa fa-check"></i><b>2.7.2</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.8" data-path="2-viz.html"><a href="2-viz.html#geombar"><i class="fa fa-check"></i><b>2.8</b> 5NG#5: Barplots</a>
<ul>
<li class="chapter" data-level="2.8.1" data-path="2-viz.html"><a href="2-viz.html#barplots-via-geom_bar-or-geom_col"><i class="fa fa-check"></i><b>2.8.1</b> Barplots via <code>geom_bar</code> or <code>geom_col</code></a></li>
<li class="chapter" data-level="2.8.2" data-path="2-viz.html"><a href="2-viz.html#must-avoid-pie-charts"><i class="fa fa-check"></i><b>2.8.2</b> Must avoid pie charts!</a></li>
<li class="chapter" data-level="2.8.3" data-path="2-viz.html"><a href="2-viz.html#two-categ-barplot"><i class="fa fa-check"></i><b>2.8.3</b> Two categorical variables</a></li>
<li class="chapter" data-level="2.8.4" data-path="2-viz.html"><a href="2-viz.html#summary-4"><i class="fa fa-check"></i><b>2.8.4</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.9" data-path="2-viz.html"><a href="2-viz.html#data-vis-conclusion"><i class="fa fa-check"></i><b>2.9</b> Conclusion</a>
<ul>
<li class="chapter" data-level="2.9.1" data-path="2-viz.html"><a href="2-viz.html#summary-table"><i class="fa fa-check"></i><b>2.9.1</b> Summary table</a></li>
<li class="chapter" data-level="2.9.2" data-path="2-viz.html"><a href="2-viz.html#function-argument-specification"><i class="fa fa-check"></i><b>2.9.2</b> Function argument specification</a></li>
<li class="chapter" data-level="2.9.3" data-path="2-viz.html"><a href="2-viz.html#additional-resources-1"><i class="fa fa-check"></i><b>2.9.3</b> Additional resources</a></li>
<li class="chapter" data-level="2.9.4" data-path="2-viz.html"><a href="2-viz.html#whats-to-come-3"><i class="fa fa-check"></i><b>2.9.4</b> What’s to come</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="3" data-path="3-wrangling.html"><a href="3-wrangling.html"><i class="fa fa-check"></i><b>3</b> Data Wrangling</a>
<ul>
<li class="chapter" data-level="" data-path="3-wrangling.html"><a href="3-wrangling.html#wrangling-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="3.1" data-path="3-wrangling.html"><a href="3-wrangling.html#piping"><i class="fa fa-check"></i><b>3.1</b> The pipe operator: <code>%>%</code></a></li>
<li class="chapter" data-level="3.2" data-path="3-wrangling.html"><a href="3-wrangling.html#filter"><i class="fa fa-check"></i><b>3.2</b> <code>filter</code> rows</a></li>
<li class="chapter" data-level="3.3" data-path="3-wrangling.html"><a href="3-wrangling.html#slice-rows"><i class="fa fa-check"></i><b>3.3</b> <code>slice</code> rows</a></li>
<li class="chapter" data-level="3.4" data-path="3-wrangling.html"><a href="3-wrangling.html#select"><i class="fa fa-check"></i><b>3.4</b> <code>select</code> variables</a>
<ul>
<li class="chapter" data-level="3.4.1" data-path="3-wrangling.html"><a href="3-wrangling.html#rename"><i class="fa fa-check"></i><b>3.4.1</b> <code>rename</code> variables</a></li>
</ul></li>
<li class="chapter" data-level="3.5" data-path="3-wrangling.html"><a href="3-wrangling.html#summarize"><i class="fa fa-check"></i><b>3.5</b> <code>summarize</code> variables</a></li>
<li class="chapter" data-level="3.6" data-path="3-wrangling.html"><a href="3-wrangling.html#groupby"><i class="fa fa-check"></i><b>3.6</b> <code>group_by</code> rows</a>
<ul>
<li class="chapter" data-level="3.6.1" data-path="3-wrangling.html"><a href="3-wrangling.html#grouping-by-more-than-one-variable"><i class="fa fa-check"></i><b>3.6.1</b> Grouping by more than one variable</a></li>
</ul></li>
<li class="chapter" data-level="3.7" data-path="3-wrangling.html"><a href="3-wrangling.html#mutate"><i class="fa fa-check"></i><b>3.7</b> <code>mutate</code> existing variables</a></li>
<li class="chapter" data-level="3.8" data-path="3-wrangling.html"><a href="3-wrangling.html#arrange"><i class="fa fa-check"></i><b>3.8</b> <code>arrange</code> and sort rows</a></li>
<li class="chapter" data-level="3.9" data-path="3-wrangling.html"><a href="3-wrangling.html#joins"><i class="fa fa-check"></i><b>3.9</b> <code>join</code> data frames</a></li>
<li class="chapter" data-level="3.10" data-path="3-wrangling.html"><a href="3-wrangling.html#wrangling-conclusion"><i class="fa fa-check"></i><b>3.10</b> Conclusion</a>
<ul>
<li class="chapter" data-level="3.10.1" data-path="3-wrangling.html"><a href="3-wrangling.html#summary-table-1"><i class="fa fa-check"></i><b>3.10.1</b> Summary table</a></li>
<li class="chapter" data-level="3.10.2" data-path="3-wrangling.html"><a href="3-wrangling.html#additional-resources-2"><i class="fa fa-check"></i><b>3.10.2</b> Additional resources</a></li>
<li class="chapter" data-level="3.10.3" data-path="3-wrangling.html"><a href="3-wrangling.html#whats-to-come-1"><i class="fa fa-check"></i><b>3.10.3</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="4" data-path="4-tidy.html"><a href="4-tidy.html"><i class="fa fa-check"></i><b>4</b> Data Importing and “Tidy” Data</a>
<ul>
<li class="chapter" data-level="" data-path="4-tidy.html"><a href="4-tidy.html#tidy-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="4.1" data-path="4-tidy.html"><a href="4-tidy.html#csv"><i class="fa fa-check"></i><b>4.1</b> Importing data</a>
<ul>
<li class="chapter" data-level="4.1.1" data-path="4-tidy.html"><a href="4-tidy.html#using-the-console"><i class="fa fa-check"></i><b>4.1.1</b> Using the console</a></li>
<li class="chapter" data-level="4.1.2" data-path="4-tidy.html"><a href="4-tidy.html#using-rstudios-interface"><i class="fa fa-check"></i><b>4.1.2</b> Using RStudio’s interface</a></li>
</ul></li>
<li class="chapter" data-level="4.2" data-path="4-tidy.html"><a href="4-tidy.html#tidy-data-ex"><i class="fa fa-check"></i><b>4.2</b> “Tidy” data</a>
<ul>
<li class="chapter" data-level="4.2.1" data-path="4-tidy.html"><a href="4-tidy.html#tidy-definition"><i class="fa fa-check"></i><b>4.2.1</b> Definition of “tidy” data</a></li>
<li class="chapter" data-level="4.2.2" data-path="4-tidy.html"><a href="4-tidy.html#converting-to-tidy-data"><i class="fa fa-check"></i><b>4.2.2</b> Converting to “tidy” data</a></li>
</ul></li>
<li class="chapter" data-level="4.3" data-path="4-tidy.html"><a href="4-tidy.html#case-study-tidy"><i class="fa fa-check"></i><b>4.3</b> Case study: Weight loss data</a></li>
<li class="chapter" data-level="4.4" data-path="4-tidy.html"><a href="4-tidy.html#tidyverse-package"><i class="fa fa-check"></i><b>4.4</b> <code>tidyverse</code> package</a></li>
<li class="chapter" data-level="4.5" data-path="4-tidy.html"><a href="4-tidy.html#tidy-data-conclusion"><i class="fa fa-check"></i><b>4.5</b> Conclusion</a>
<ul>
<li class="chapter" data-level="4.5.1" data-path="4-tidy.html"><a href="4-tidy.html#additional-resources-3"><i class="fa fa-check"></i><b>4.5.1</b> Additional resources</a></li>
<li class="chapter" data-level="4.5.2" data-path="4-tidy.html"><a href="4-tidy.html#whats-to-come-2"><i class="fa fa-check"></i><b>4.5.2</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="part"><span><b>II Data Modeling with moderndive</b></span></li>
<li class="chapter" data-level="5" data-path="5-regression.html"><a href="5-regression.html"><i class="fa fa-check"></i><b>5</b> Basic Regression</a>
<ul>
<li class="chapter" data-level="" data-path="5-regression.html"><a href="5-regression.html#reg-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="5.1" data-path="5-regression.html"><a href="5-regression.html#model1"><i class="fa fa-check"></i><b>5.1</b> One numerical explanatory variable</a>
<ul>
<li class="chapter" data-level="5.1.1" data-path="5-regression.html"><a href="5-regression.html#model1EDA"><i class="fa fa-check"></i><b>5.1.1</b> Exploratory data analysis</a></li>
<li class="chapter" data-level="5.1.2" data-path="5-regression.html"><a href="5-regression.html#model1table"><i class="fa fa-check"></i><b>5.1.2</b> Simple linear regression</a></li>
<li class="chapter" data-level="5.1.3" data-path="5-regression.html"><a href="5-regression.html#model1points"><i class="fa fa-check"></i><b>5.1.3</b> Observed/fitted values and residuals</a></li>
</ul></li>
<li class="chapter" data-level="5.2" data-path="5-regression.html"><a href="5-regression.html#model2"><i class="fa fa-check"></i><b>5.2</b> One categorical explanatory variable</a>
<ul>
<li class="chapter" data-level="5.2.1" data-path="5-regression.html"><a href="5-regression.html#model2EDA"><i class="fa fa-check"></i><b>5.2.1</b> Exploratory data analysis</a></li>
<li class="chapter" data-level="5.2.2" data-path="5-regression.html"><a href="5-regression.html#model2table"><i class="fa fa-check"></i><b>5.2.2</b> Linear regression</a></li>
<li class="chapter" data-level="5.2.3" data-path="5-regression.html"><a href="5-regression.html#model2points"><i class="fa fa-check"></i><b>5.2.3</b> Observed/fitted values and residuals</a></li>
</ul></li>
<li class="chapter" data-level="5.3" data-path="5-regression.html"><a href="5-regression.html#reg-related-topics"><i class="fa fa-check"></i><b>5.3</b> Related topics</a>
<ul>
<li class="chapter" data-level="5.3.1" data-path="5-regression.html"><a href="5-regression.html#correlation-is-not-causation"><i class="fa fa-check"></i><b>5.3.1</b> Correlation is not necessarily causation</a></li>
<li class="chapter" data-level="5.3.2" data-path="5-regression.html"><a href="5-regression.html#leastsquares"><i class="fa fa-check"></i><b>5.3.2</b> Best-fitting line</a></li>
<li class="chapter" data-level="5.3.3" data-path="5-regression.html"><a href="5-regression.html#underthehood"><i class="fa fa-check"></i><b>5.3.3</b> <code>get_regression_x()</code> functions</a></li>
</ul></li>
<li class="chapter" data-level="5.4" data-path="5-regression.html"><a href="5-regression.html#reg-conclusion"><i class="fa fa-check"></i><b>5.4</b> Conclusion</a>
<ul>
<li class="chapter" data-level="5.4.1" data-path="5-regression.html"><a href="5-regression.html#additional-resources-basic-regression"><i class="fa fa-check"></i><b>5.4.1</b> Additional resources</a></li>
<li class="chapter" data-level="5.4.2" data-path="5-regression.html"><a href="5-regression.html#whats-to-come-4"><i class="fa fa-check"></i><b>5.4.2</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="6" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html"><i class="fa fa-check"></i><b>6</b> Multiple Regression</a>
<ul>
<li class="chapter" data-level="" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#mult-reg-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="6.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4"><i class="fa fa-check"></i><b>6.1</b> One numerical and one categorical explanatory variable</a>
<ul>
<li class="chapter" data-level="6.1.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4EDA"><i class="fa fa-check"></i><b>6.1.1</b> Exploratory data analysis</a></li>
<li class="chapter" data-level="6.1.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4interactiontable"><i class="fa fa-check"></i><b>6.1.2</b> Interaction model</a></li>
<li class="chapter" data-level="6.1.3" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4table"><i class="fa fa-check"></i><b>6.1.3</b> Parallel slopes model</a></li>
<li class="chapter" data-level="6.1.4" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4points"><i class="fa fa-check"></i><b>6.1.4</b> Observed/fitted values and residuals</a></li>
</ul></li>
<li class="chapter" data-level="6.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model3"><i class="fa fa-check"></i><b>6.2</b> Two categorical explanatory variables</a>
<ul>
<li class="chapter" data-level="6.2.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model3EDA"><i class="fa fa-check"></i><b>6.2.1</b> Exploratory data analysis</a></li>
<li class="chapter" data-level="6.2.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model3table"><i class="fa fa-check"></i><b>6.2.2</b> Regression lines</a></li>
<li class="chapter" data-level="6.2.3" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model3points"><i class="fa fa-check"></i><b>6.2.3</b> Observed/fitted values and residuals</a></li>
</ul></li>
<li class="chapter" data-level="6.3" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#mult-reg-related-topics"><i class="fa fa-check"></i><b>6.3</b> Related topics</a>
<ul>
<li class="chapter" data-level="6.3.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model-selection"><i class="fa fa-check"></i><b>6.3.1</b> Model selection using visualizations</a></li>
<li class="chapter" data-level="6.3.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#rsquared"><i class="fa fa-check"></i><b>6.3.2</b> Model selection using R-squared</a></li>
</ul></li>
<li class="chapter" data-level="6.4" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#mult-reg-conclusion"><i class="fa fa-check"></i><b>6.4</b> Conclusion</a>
<ul>
<li class="chapter" data-level="6.4.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#additional-resources-4"><i class="fa fa-check"></i><b>6.4.1</b> Additional resources</a></li>
<li class="chapter" data-level="6.4.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#whats-to-come-5"><i class="fa fa-check"></i><b>6.4.2</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="part"><span><b>III Statistical Inference with infer</b></span></li>
<li class="chapter" data-level="7" data-path="7-sampling.html"><a href="7-sampling.html"><i class="fa fa-check"></i><b>7</b> Sampling</a>
<ul>
<li class="chapter" data-level="" data-path="7-sampling.html"><a href="7-sampling.html#sampling-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="7.1" data-path="7-sampling.html"><a href="7-sampling.html#sampling-activity"><i class="fa fa-check"></i><b>7.1</b> Sampling bowl activity</a>
<ul>
<li class="chapter" data-level="7.1.1" data-path="7-sampling.html"><a href="7-sampling.html#what-proportion-of-this-bowls-balls-are-red"><i class="fa fa-check"></i><b>7.1.1</b> What proportion of this bowl’s balls are red?</a></li>
<li class="chapter" data-level="7.1.2" data-path="7-sampling.html"><a href="7-sampling.html#using-the-shovel-once"><i class="fa fa-check"></i><b>7.1.2</b> Using the shovel once</a></li>
<li class="chapter" data-level="7.1.3" data-path="7-sampling.html"><a href="7-sampling.html#student-shovels"><i class="fa fa-check"></i><b>7.1.3</b> Using the shovel 33 times</a></li>
<li class="chapter" data-level="7.1.4" data-path="7-sampling.html"><a href="7-sampling.html#sampling-what-did-we-just-do"><i class="fa fa-check"></i><b>7.1.4</b> What did we just do?</a></li>
</ul></li>
<li class="chapter" data-level="7.2" data-path="7-sampling.html"><a href="7-sampling.html#sampling-simulation"><i class="fa fa-check"></i><b>7.2</b> Virtual sampling</a>
<ul>
<li class="chapter" data-level="7.2.1" data-path="7-sampling.html"><a href="7-sampling.html#using-the-virtual-shovel-once"><i class="fa fa-check"></i><b>7.2.1</b> Using the virtual shovel once</a></li>
</ul></li>
<li class="chapter" data-level="7.3" data-path="7-sampling.html"><a href="7-sampling.html#sampling-framework"><i class="fa fa-check"></i><b>7.3</b> Sampling framework</a>
<ul>
<li class="chapter" data-level="7.3.1" data-path="7-sampling.html"><a href="7-sampling.html#terminology-and-notation"><i class="fa fa-check"></i><b>7.3.1</b> Terminology and notation</a></li>
<li class="chapter" data-level="7.3.2" data-path="7-sampling.html"><a href="7-sampling.html#sampling-definitions"><i class="fa fa-check"></i><b>7.3.2</b> Statistical definitions</a></li>
<li class="chapter" data-level="7.3.3" data-path="7-sampling.html"><a href="7-sampling.html#moral-of-the-story"><i class="fa fa-check"></i><b>7.3.3</b> The moral of the story</a></li>
</ul></li>
<li class="chapter" data-level="7.4" data-path="7-sampling.html"><a href="7-sampling.html#sampling-case-study"><i class="fa fa-check"></i><b>7.4</b> Case study: Polls</a></li>
<li class="chapter" data-level="7.5" data-path="7-sampling.html"><a href="7-sampling.html#sampling-conclusion-central-limit-theorem"><i class="fa fa-check"></i><b>7.5</b> Central Limit Theorem</a></li>
<li class="chapter" data-level="7.6" data-path="7-sampling.html"><a href="7-sampling.html#sampling-conclusion"><i class="fa fa-check"></i><b>7.6</b> Conclusion</a>
<ul>
<li class="chapter" data-level="7.6.1" data-path="7-sampling.html"><a href="7-sampling.html#sampling-conclusion-table"><i class="fa fa-check"></i><b>7.6.1</b> Sampling scenarios</a></li>
<li class="chapter" data-level="7.6.2" data-path="7-sampling.html"><a href="7-sampling.html#additional-resources-5"><i class="fa fa-check"></i><b>7.6.2</b> Additional resources</a></li>
<li class="chapter" data-level="7.6.3" data-path="7-sampling.html"><a href="7-sampling.html#whats-to-come-6"><i class="fa fa-check"></i><b>7.6.3</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="8" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html"><i class="fa fa-check"></i><b>8</b> Bootstrapping and Confidence Intervals</a>
<ul>
<li class="chapter" data-level="" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#CI-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="8.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#resampling-tactile"><i class="fa fa-check"></i><b>8.1</b> Pennies activity</a>
<ul>
<li class="chapter" data-level="8.1.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#what-is-the-average-year-on-us-pennies-in-2019"><i class="fa fa-check"></i><b>8.1.1</b> What is the average year on US pennies in 2019?</a></li>
<li class="chapter" data-level="8.1.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#resampling-once"><i class="fa fa-check"></i><b>8.1.2</b> Resampling once</a></li>
<li class="chapter" data-level="8.1.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#student-resamples"><i class="fa fa-check"></i><b>8.1.3</b> Resampling 35 times</a></li>
<li class="chapter" data-level="8.1.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-what-did-we-just-do"><i class="fa fa-check"></i><b>8.1.4</b> What did we just do?</a></li>
</ul></li>
<li class="chapter" data-level="8.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#resampling-simulation"><i class="fa fa-check"></i><b>8.2</b> Computer simulation of resampling</a>
<ul>
<li class="chapter" data-level="8.2.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#virtually-resampling-once"><i class="fa fa-check"></i><b>8.2.1</b> Virtually resampling once</a></li>
<li class="chapter" data-level="8.2.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#bootstrap-35-replicates"><i class="fa fa-check"></i><b>8.2.2</b> Virtually resampling 35 times</a></li>
<li class="chapter" data-level="8.2.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#bootstrap-1000-replicates"><i class="fa fa-check"></i><b>8.2.3</b> Virtually resampling 1000 times</a></li>
</ul></li>
<li class="chapter" data-level="8.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-build-up"><i class="fa fa-check"></i><b>8.3</b> Understanding confidence intervals</a>
<ul>
<li class="chapter" data-level="8.3.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#percentile-method"><i class="fa fa-check"></i><b>8.3.1</b> Percentile method</a></li>
<li class="chapter" data-level="8.3.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#se-method"><i class="fa fa-check"></i><b>8.3.2</b> Standard error method</a></li>
</ul></li>
<li class="chapter" data-level="8.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#bootstrap-process"><i class="fa fa-check"></i><b>8.4</b> Constructing confidence intervals</a>
<ul>
<li class="chapter" data-level="8.4.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#original-workflow"><i class="fa fa-check"></i><b>8.4.1</b> Original workflow</a></li>
<li class="chapter" data-level="8.4.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#infer-workflow"><i class="fa fa-check"></i><b>8.4.2</b> <code>infer</code> package workflow</a></li>
<li class="chapter" data-level="8.4.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#percentile-method-infer"><i class="fa fa-check"></i><b>8.4.3</b> Percentile method with <code>infer</code></a></li>
<li class="chapter" data-level="8.4.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#infer-se"><i class="fa fa-check"></i><b>8.4.4</b> Standard error method with <code>infer</code></a></li>
</ul></li>
<li class="chapter" data-level="8.5" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#one-prop-ci"><i class="fa fa-check"></i><b>8.5</b> Interpreting confidence intervals</a>
<ul>
<li class="chapter" data-level="8.5.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ilyas-yohan"><i class="fa fa-check"></i><b>8.5.1</b> Did the net capture the fish?</a></li>
<li class="chapter" data-level="8.5.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#shorthand"><i class="fa fa-check"></i><b>8.5.2</b> Precise and shorthand interpretation</a></li>
<li class="chapter" data-level="8.5.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-width"><i class="fa fa-check"></i><b>8.5.3</b> Width of confidence intervals</a></li>
</ul></li>
<li class="chapter" data-level="8.6" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#case-study-two-prop-ci"><i class="fa fa-check"></i><b>8.6</b> Case study: Is yawning contagious?</a>
<ul>
<li class="chapter" data-level="8.6.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#mythbusters-study-data"><i class="fa fa-check"></i><b>8.6.1</b> <em>Mythbusters</em> study data</a></li>
<li class="chapter" data-level="8.6.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#sampling-scenario"><i class="fa fa-check"></i><b>8.6.2</b> Sampling scenario</a></li>
<li class="chapter" data-level="8.6.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-build"><i class="fa fa-check"></i><b>8.6.3</b> Constructing the confidence interval</a></li>
<li class="chapter" data-level="8.6.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#interpreting-the-confidence-interval"><i class="fa fa-check"></i><b>8.6.4</b> Interpreting the confidence interval</a></li>
</ul></li>
<li class="chapter" data-level="8.7" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-conclusion"><i class="fa fa-check"></i><b>8.7</b> Conclusion</a>
<ul>
<li class="chapter" data-level="8.7.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#bootstrap-vs-sampling"><i class="fa fa-check"></i><b>8.7.1</b> Comparing bootstrap and sampling distributions</a></li>
<li class="chapter" data-level="8.7.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#theory-ci"><i class="fa fa-check"></i><b>8.7.2</b> Theory-based confidence intervals</a></li>
<li class="chapter" data-level="8.7.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#additional-resources-6"><i class="fa fa-check"></i><b>8.7.3</b> Additional resources</a></li>
<li class="chapter" data-level="8.7.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#whats-to-come-7"><i class="fa fa-check"></i><b>8.7.4</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="9" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html"><i class="fa fa-check"></i><b>9</b> Hypothesis Testing</a>
<ul>
<li class="chapter" data-level="" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#nhst-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="9.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-activity"><i class="fa fa-check"></i><b>9.1</b> Promotions activity</a>
<ul>
<li class="chapter" data-level="9.1.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#does-gender-affect-promotions-at-a-bank"><i class="fa fa-check"></i><b>9.1.1</b> Does gender affect promotions at a bank?</a></li>
<li class="chapter" data-level="9.1.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#shuffling-once"><i class="fa fa-check"></i><b>9.1.2</b> Shuffling once</a></li>
<li class="chapter" data-level="9.1.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#shuffling-16-times"><i class="fa fa-check"></i><b>9.1.3</b> Shuffling 16 times</a></li>
<li class="chapter" data-level="9.1.4" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-what-did-we-just-do"><i class="fa fa-check"></i><b>9.1.4</b> What did we just do?</a></li>
</ul></li>
<li class="chapter" data-level="9.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#understanding-ht"><i class="fa fa-check"></i><b>9.2</b> Understanding hypothesis tests</a></li>
<li class="chapter" data-level="9.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-infer"><i class="fa fa-check"></i><b>9.3</b> Conducting hypothesis tests</a>
<ul>
<li class="chapter" data-level="9.3.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#infer-workflow-ht"><i class="fa fa-check"></i><b>9.3.1</b> <code>infer</code> package workflow</a></li>
<li class="chapter" data-level="9.3.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#comparing-infer-workflows"><i class="fa fa-check"></i><b>9.3.2</b> Comparison with confidence intervals</a></li>
<li class="chapter" data-level="9.3.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#only-one-test"><i class="fa fa-check"></i><b>9.3.3</b> “There is only one test”</a></li>
</ul></li>
<li class="chapter" data-level="9.4" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-interpretation"><i class="fa fa-check"></i><b>9.4</b> Interpreting hypothesis tests</a>
<ul>
<li class="chapter" data-level="9.4.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#trial"><i class="fa fa-check"></i><b>9.4.1</b> Two possible outcomes</a></li>
<li class="chapter" data-level="9.4.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#types-of-errors"><i class="fa fa-check"></i><b>9.4.2</b> Types of errors</a></li>
<li class="chapter" data-level="9.4.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#choosing-alpha"><i class="fa fa-check"></i><b>9.4.3</b> How do we choose alpha?</a></li>
</ul></li>
<li class="chapter" data-level="9.5" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-case-study"><i class="fa fa-check"></i><b>9.5</b> Case study: Are action or romance movies rated higher?</a>
<ul>
<li class="chapter" data-level="9.5.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#imdb-data"><i class="fa fa-check"></i><b>9.5.1</b> IMDb ratings data</a></li>
<li class="chapter" data-level="9.5.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#sampling-scenario-1"><i class="fa fa-check"></i><b>9.5.2</b> Sampling scenario</a></li>
<li class="chapter" data-level="9.5.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#conducting-the-hypothesis-test"><i class="fa fa-check"></i><b>9.5.3</b> Conducting the hypothesis test</a></li>
</ul></li>
<li class="chapter" data-level="9.6" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#nhst-conclusion"><i class="fa fa-check"></i><b>9.6</b> Conclusion</a>
<ul>
<li class="chapter" data-level="9.6.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#theory-hypo"><i class="fa fa-check"></i><b>9.6.1</b> Theory-based hypothesis tests</a></li>
<li class="chapter" data-level="9.6.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#when-inference-is-not-needed"><i class="fa fa-check"></i><b>9.6.2</b> When inference is not needed</a></li>
<li class="chapter" data-level="9.6.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#problems-with-p-values"><i class="fa fa-check"></i><b>9.6.3</b> Problems with p-values</a></li>
<li class="chapter" data-level="9.6.4" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#additional-resources-7"><i class="fa fa-check"></i><b>9.6.4</b> Additional resources</a></li>
<li class="chapter" data-level="9.6.5" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#whats-to-come-8"><i class="fa fa-check"></i><b>9.6.5</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="10" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html"><i class="fa fa-check"></i><b>10</b> Inference for Regression</a>
<ul>
<li class="chapter" data-level="" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#inf-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="10.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-refresher"><i class="fa fa-check"></i><b>10.1</b> Regression refresher</a>
<ul>
<li class="chapter" data-level="10.1.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#teaching-evaluations-analysis"><i class="fa fa-check"></i><b>10.1.1</b> Teaching evaluations analysis</a></li>
<li class="chapter" data-level="10.1.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#sampling-scenario-2"><i class="fa fa-check"></i><b>10.1.2</b> Sampling scenario</a></li>
</ul></li>
<li class="chapter" data-level="10.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-interp"><i class="fa fa-check"></i><b>10.2</b> Interpreting regression tables</a>
<ul>
<li class="chapter" data-level="10.2.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-se"><i class="fa fa-check"></i><b>10.2.1</b> Standard error</a></li>
<li class="chapter" data-level="10.2.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-test-statistic"><i class="fa fa-check"></i><b>10.2.2</b> Test statistic</a></li>
<li class="chapter" data-level="10.2.3" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#p-value"><i class="fa fa-check"></i><b>10.2.3</b> p-value</a></li>
<li class="chapter" data-level="10.2.4" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#confidence-interval"><i class="fa fa-check"></i><b>10.2.4</b> Confidence interval</a></li>
<li class="chapter" data-level="10.2.5" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-table-computation"><i class="fa fa-check"></i><b>10.2.5</b> How does R compute the table?</a></li>
</ul></li>
<li class="chapter" data-level="10.3" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-conditions"><i class="fa fa-check"></i><b>10.3</b> Conditions for inference for regression</a>
<ul>
<li class="chapter" data-level="10.3.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#residuals-refresher"><i class="fa fa-check"></i><b>10.3.1</b> Residuals refresher</a></li>
<li class="chapter" data-level="10.3.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#linearity-of-relationship"><i class="fa fa-check"></i><b>10.3.2</b> Linearity of relationship</a></li>
<li class="chapter" data-level="10.3.3" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#independence-of-residuals"><i class="fa fa-check"></i><b>10.3.3</b> Independence of residuals</a></li>
<li class="chapter" data-level="10.3.4" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#normality-of-residuals"><i class="fa fa-check"></i><b>10.3.4</b> Normality of residuals</a></li>
<li class="chapter" data-level="10.3.5" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#equality-of-variance"><i class="fa fa-check"></i><b>10.3.5</b> Equality of variance</a></li>
<li class="chapter" data-level="10.3.6" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#what-is-the-conclusion"><i class="fa fa-check"></i><b>10.3.6</b> What’s the conclusion?</a></li>
</ul></li>
<li class="chapter" data-level="10.4" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#infer-regression"><i class="fa fa-check"></i><b>10.4</b> Simulation-based inference for regression</a>
<ul>
<li class="chapter" data-level="10.4.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#confidence-interval-for-slope"><i class="fa fa-check"></i><b>10.4.1</b> Confidence interval for slope</a></li>
<li class="chapter" data-level="10.4.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#hypothesis-test-for-slope"><i class="fa fa-check"></i><b>10.4.2</b> Hypothesis test for slope</a></li>
</ul></li>
<li class="chapter" data-level="10.5" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#inference-conclusion"><i class="fa fa-check"></i><b>10.5</b> Conclusion</a>
<ul>
<li class="chapter" data-level="10.5.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#theory-regression"><i class="fa fa-check"></i><b>10.5.1</b> Theory-based inference for regression</a></li>
<li class="chapter" data-level="10.5.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#summary-of-statistical-inference"><i class="fa fa-check"></i><b>10.5.2</b> Summary of statistical inference</a></li>
<li class="chapter" data-level="10.5.3" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#additional-resources-8"><i class="fa fa-check"></i><b>10.5.3</b> Additional resources</a></li>
<li class="chapter" data-level="10.5.4" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#whats-to-come-9"><i class="fa fa-check"></i><b>10.5.4</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="part"><span><b>IV Conclusion</b></span></li>
<li class="chapter" data-level="11" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html"><i class="fa fa-check"></i><b>11</b> Tell Your Story with Data</a>
<ul>
<li class="chapter" data-level="11.1" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#review"><i class="fa fa-check"></i><b>11.1</b> Review</a>
<ul>
<li class="chapter" data-level="" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#story-packages"><i class="fa fa-check"></i>Needed packages</a></li>
</ul></li>
<li class="chapter" data-level="11.2" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#seattle-house-prices"><i class="fa fa-check"></i><b>11.2</b> Case study: Seattle house prices</a>
<ul>
<li class="chapter" data-level="11.2.1" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#house-prices-EDA-I"><i class="fa fa-check"></i><b>11.2.1</b> Exploratory data analysis: Part I</a></li>
<li class="chapter" data-level="11.2.2" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#house-prices-EDA-II"><i class="fa fa-check"></i><b>11.2.2</b> Exploratory data analysis: Part II</a></li>
<li class="chapter" data-level="11.2.3" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#house-prices-regression"><i class="fa fa-check"></i><b>11.2.3</b> Regression modeling</a></li>
<li class="chapter" data-level="11.2.4" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#house-prices-making-predictions"><i class="fa fa-check"></i><b>11.2.4</b> Making predictions</a></li>
</ul></li>
<li class="chapter" data-level="11.3" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#data-journalism"><i class="fa fa-check"></i><b>11.3</b> Case study: Effective data storytelling</a>
<ul>
<li class="chapter" data-level="11.3.1" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#bechdel-test-for-hollywood-gender-representation"><i class="fa fa-check"></i><b>11.3.1</b> Bechdel test for Hollywood gender representation</a></li>
<li class="chapter" data-level="11.3.2" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#us-births-in-1999"><i class="fa fa-check"></i><b>11.3.2</b> US Births in 1999</a></li>
<li class="chapter" data-level="11.3.3" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#scripts-of-r-code"><i class="fa fa-check"></i><b>11.3.3</b> Scripts of R code</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#concluding-remarks"><i class="fa fa-check"></i>Concluding remarks</a></li>
</ul></li>
<li class="appendix"><span><b>Appendix</b></span></li>
<li class="chapter" data-level="A" data-path="A-appendixA.html"><a href="A-appendixA.html"><i class="fa fa-check"></i><b>A</b> Statistical Background</a>
<ul>
<li class="chapter" data-level="A.1" data-path="A-appendixA.html"><a href="A-appendixA.html#appendix-stat-terms"><i class="fa fa-check"></i><b>A.1</b> Basic statistical terms</a>
<ul>
<li class="chapter" data-level="A.1.1" data-path="A-appendixA.html"><a href="A-appendixA.html#mean"><i class="fa fa-check"></i><b>A.1.1</b> Mean</a></li>
<li class="chapter" data-level="A.1.2" data-path="A-appendixA.html"><a href="A-appendixA.html#median"><i class="fa fa-check"></i><b>A.1.2</b> Median</a></li>
<li class="chapter" data-level="A.1.3" data-path="A-appendixA.html"><a href="A-appendixA.html#appendix-sd-variance"><i class="fa fa-check"></i><b>A.1.3</b> Standard deviation and variance</a></li>
<li class="chapter" data-level="A.1.4" data-path="A-appendixA.html"><a href="A-appendixA.html#five-number-summary"><i class="fa fa-check"></i><b>A.1.4</b> Five-number summary</a></li>
<li class="chapter" data-level="A.1.5" data-path="A-appendixA.html"><a href="A-appendixA.html#distribution"><i class="fa fa-check"></i><b>A.1.5</b> Distribution</a></li>
<li class="chapter" data-level="A.1.6" data-path="A-appendixA.html"><a href="A-appendixA.html#outliers"><i class="fa fa-check"></i><b>A.1.6</b> Outliers</a></li>
</ul></li>
<li class="chapter" data-level="A.2" data-path="A-appendixA.html"><a href="A-appendixA.html#appendix-normal-curve"><i class="fa fa-check"></i><b>A.2</b> Normal distribution</a></li>
<li class="chapter" data-level="A.3" data-path="A-appendixA.html"><a href="A-appendixA.html#appendix-log10-transformations"><i class="fa fa-check"></i><b>A.3</b> log10 transformations</a></li>
</ul></li>
<li class="chapter" data-level="B" data-path="B-appendixB.html"><a href="B-appendixB.html"><i class="fa fa-check"></i><b>B</b> Inference Examples</a>
<ul>
<li class="chapter" data-level="" data-path="B-appendixB.html"><a href="B-appendixB.html#needed-packages-1"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="B.1" data-path="B-appendixB.html"><a href="B-appendixB.html#inference-mind-map"><i class="fa fa-check"></i><b>B.1</b> Inference mind map</a></li>
<li class="chapter" data-level="B.2" data-path="B-appendixB.html"><a href="B-appendixB.html#one-mean"><i class="fa fa-check"></i><b>B.2</b> One mean</a>
<ul>
<li class="chapter" data-level="B.2.1" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement"><i class="fa fa-check"></i><b>B.2.1</b> Problem statement</a></li>
<li class="chapter" data-level="B.2.2" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses"><i class="fa fa-check"></i><b>B.2.2</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.2.3" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data"><i class="fa fa-check"></i><b>B.2.3</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.2.4" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods"><i class="fa fa-check"></i><b>B.2.4</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.2.5" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods"><i class="fa fa-check"></i><b>B.2.5</b> Traditional methods</a></li>
<li class="chapter" data-level="B.2.6" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results"><i class="fa fa-check"></i><b>B.2.6</b> Comparing results</a></li>
</ul></li>
<li class="chapter" data-level="B.3" data-path="B-appendixB.html"><a href="B-appendixB.html#one-proportion"><i class="fa fa-check"></i><b>B.3</b> One proportion</a>
<ul>
<li class="chapter" data-level="B.3.1" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement-1"><i class="fa fa-check"></i><b>B.3.1</b> Problem statement</a></li>
<li class="chapter" data-level="B.3.2" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses-1"><i class="fa fa-check"></i><b>B.3.2</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.3.3" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data-1"><i class="fa fa-check"></i><b>B.3.3</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.3.4" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods-1"><i class="fa fa-check"></i><b>B.3.4</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.3.5" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods-1"><i class="fa fa-check"></i><b>B.3.5</b> Traditional methods</a></li>
<li class="chapter" data-level="B.3.6" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results-1"><i class="fa fa-check"></i><b>B.3.6</b> Comparing results</a></li>
</ul></li>
<li class="chapter" data-level="B.4" data-path="B-appendixB.html"><a href="B-appendixB.html#two-proportions"><i class="fa fa-check"></i><b>B.4</b> Two proportions</a>
<ul>
<li class="chapter" data-level="B.4.1" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement-2"><i class="fa fa-check"></i><b>B.4.1</b> Problem statement</a></li>
<li class="chapter" data-level="B.4.2" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses-2"><i class="fa fa-check"></i><b>B.4.2</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.4.3" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data-2"><i class="fa fa-check"></i><b>B.4.3</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.4.4" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods-2"><i class="fa fa-check"></i><b>B.4.4</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.4.5" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods-2"><i class="fa fa-check"></i><b>B.4.5</b> Traditional methods</a></li>
<li class="chapter" data-level="B.4.6" data-path="B-appendixB.html"><a href="B-appendixB.html#test-statistic-2"><i class="fa fa-check"></i><b>B.4.6</b> Test statistic</a></li>
<li class="chapter" data-level="B.4.7" data-path="B-appendixB.html"><a href="B-appendixB.html#state-conclusion-2"><i class="fa fa-check"></i><b>B.4.7</b> State conclusion</a></li>
<li class="chapter" data-level="B.4.8" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results-2"><i class="fa fa-check"></i><b>B.4.8</b> Comparing results</a></li>
</ul></li>
<li class="chapter" data-level="B.5" data-path="B-appendixB.html"><a href="B-appendixB.html#two-means-independent-samples"><i class="fa fa-check"></i><b>B.5</b> Two means (independent samples)</a>
<ul>
<li class="chapter" data-level="B.5.1" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement-3"><i class="fa fa-check"></i><b>B.5.1</b> Problem statement</a></li>
<li class="chapter" data-level="B.5.2" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses-3"><i class="fa fa-check"></i><b>B.5.2</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.5.3" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data-3"><i class="fa fa-check"></i><b>B.5.3</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.5.4" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods-3"><i class="fa fa-check"></i><b>B.5.4</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.5.5" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods-3"><i class="fa fa-check"></i><b>B.5.5</b> Traditional methods</a></li>
<li class="chapter" data-level="B.5.6" data-path="B-appendixB.html"><a href="B-appendixB.html#test-statistic-3"><i class="fa fa-check"></i><b>B.5.6</b> Test statistic</a></li>
<li class="chapter" data-level="B.5.7" data-path="B-appendixB.html"><a href="B-appendixB.html#compute-p-value-1"><i class="fa fa-check"></i><b>B.5.7</b> Compute <span class="math inline">\(p\)</span>-value</a></li>
<li class="chapter" data-level="B.5.8" data-path="B-appendixB.html"><a href="B-appendixB.html#state-conclusion-3"><i class="fa fa-check"></i><b>B.5.8</b> State conclusion</a></li>
<li class="chapter" data-level="B.5.9" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results-3"><i class="fa fa-check"></i><b>B.5.9</b> Comparing results</a></li>
</ul></li>
<li class="chapter" data-level="B.6" data-path="B-appendixB.html"><a href="B-appendixB.html#two-means-paired-samples"><i class="fa fa-check"></i><b>B.6</b> Two means (paired samples)</a>
<ul>
<li class="chapter" data-level="" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement-4"><i class="fa fa-check"></i>Problem statement</a></li>
<li class="chapter" data-level="B.6.1" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses-4"><i class="fa fa-check"></i><b>B.6.1</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.6.2" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data-4"><i class="fa fa-check"></i><b>B.6.2</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.6.3" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods-4"><i class="fa fa-check"></i><b>B.6.3</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.6.4" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods-4"><i class="fa fa-check"></i><b>B.6.4</b> Traditional methods</a></li>
<li class="chapter" data-level="B.6.5" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results-4"><i class="fa fa-check"></i><b>B.6.5</b> Comparing results</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="C" data-path="C-appendixC.html"><a href="C-appendixC.html"><i class="fa fa-check"></i><b>C</b> Tips and Tricks</a>
<ul>
<li class="chapter" data-level="" data-path="C-appendixC.html"><a href="C-appendixC.html#needed-packages-2"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="C.1" data-path="C-appendixC.html"><a href="C-appendixC.html#data-wrangling"><i class="fa fa-check"></i><b>C.1</b> Data wrangling</a>
<ul>
<li class="chapter" data-level="C.1.1" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-missing-values"><i class="fa fa-check"></i><b>C.1.1</b> Dealing with missing values</a></li>
<li class="chapter" data-level="C.1.2" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-reordering-bars"><i class="fa fa-check"></i><b>C.1.2</b> Reordering bars in a barplot</a></li>
<li class="chapter" data-level="C.1.3" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-money-on-axis"><i class="fa fa-check"></i><b>C.1.3</b> Showing money on an axis</a></li>
<li class="chapter" data-level="C.1.4" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-changing-values"><i class="fa fa-check"></i><b>C.1.4</b> Changing values inside cells</a></li>
<li class="chapter" data-level="C.1.5" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-convert-numerical-categorical"><i class="fa fa-check"></i><b>C.1.5</b> Converting a numerical variable to a categorical one</a></li>
<li class="chapter" data-level="C.1.6" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-prop"><i class="fa fa-check"></i><b>C.1.6</b> Computing proportions</a></li>
<li class="chapter" data-level="C.1.7" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-commas"><i class="fa fa-check"></i><b>C.1.7</b> Dealing with %, commas, and $</a></li>
</ul></li>
<li class="chapter" data-level="C.2" data-path="C-appendixC.html"><a href="C-appendixC.html#interactive-graphics"><i class="fa fa-check"></i><b>C.2</b> Interactive graphics</a>
<ul>
<li class="chapter" data-level="C.2.1" data-path="C-appendixC.html"><a href="C-appendixC.html#interactive-linegraphs"><i class="fa fa-check"></i><b>C.2.1</b> Interactive linegraphs</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="D" data-path="D-appendixD.html"><a href="D-appendixD.html"><i class="fa fa-check"></i><b>D</b> Learning Check Solutions</a>
<ul>
<li class="chapter" data-level="D.1" data-path="D-appendixD.html"><a href="D-appendixD.html#chapter-1-solutions"><i class="fa fa-check"></i><b>D.1</b> Chapter 1 Solutions</a></li>
</ul></li>
<li class="chapter" data-level="E" data-path="E-appendixE.html"><a href="E-appendixE.html"><i class="fa fa-check"></i><b>E</b> Versions of R Packages Used</a></li>
<li class="chapter" data-level="" data-path="references.html"><a href="references.html"><i class="fa fa-check"></i>References</a></li>
</ul>
</nav>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./">Statistical Inference via Data Science</a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<section class="normal" id="section-">
<img src='https://moderndive.com/wide_format.png' alt="ModernDive">
<div id="hypothesis-testing" class="section level1" number="9">
<h1><span class="header-section-number">Chapter 9</span> Hypothesis Testing</h1>
<p>Now that we’ve studied confidence intervals in Chapter <a href="8-confidence-intervals.html#confidence-intervals">8</a>, let’s study another commonly used method for statistical inference: hypothesis testing. Hypothesis tests allow us to take a sample of data from a population and infer about the plausibility of competing hypotheses. For example, in the upcoming “promotions” activity in Section <a href="9-hypothesis-testing.html#ht-activity">9.1</a>, you’ll study the data collected from a psychology study in the 1970s to investigate whether gender-based discrimination in promotion rates existed in the banking industry at the time of the study.</p>
<p>The good news is we’ve already covered many of the necessary concepts to understand hypothesis testing in Chapters <a href="7-sampling.html#sampling">7</a> and <a href="8-confidence-intervals.html#confidence-intervals">8</a>. We will expand further on these ideas here and also provide a general framework for understanding hypothesis tests. By understanding this general framework, you’ll be able to adapt it to many different scenarios.</p>
<p>The same can be said for confidence intervals. There is one general framework that applies to <em>all</em> confidence intervals, and the <code>infer</code> package was designed around this framework. While the specifics may change slightly for different types of confidence intervals, the general framework stays the same.</p>
<p>We believe that this approach is much better for long-term learning than focusing on specific details for specific confidence intervals using theory-based approaches. As you’ll now see, we prefer this general framework for hypothesis tests as well.</p>
<p>If you’d like more practice or you’re curious to see how this framework applies to different scenarios, you can find fully worked-out examples for many common hypothesis tests and their corresponding confidence intervals in Appendix B. We recommend that you carefully review these examples as they also cover how the general frameworks apply to traditional theory-based methods like the <span class="math inline">\(t\)</span>-test and normal-theory confidence intervals. You’ll see there that these traditional methods are just approximations for the computer-based methods we’ve been focusing on. However, they also require conditions to be met for their results to be valid. Computer-based methods using randomization, simulation, and bootstrapping have far fewer restrictions. Furthermore, they help develop your computational thinking, which is one big reason they are emphasized throughout this book.</p>
<div id="nhst-packages" class="section level3 unnumbered">
<h3>Needed packages</h3>
<p>Let’s load all the packages needed for this chapter (this assumes you’ve already installed them). Recall from our discussion in Section <a href="4-tidy.html#tidyverse-package">4.4</a> that loading the <code>tidyverse</code> package by running <code>library(tidyverse)</code> loads the following commonly used data science packages all at once:</p>
<ul>
<li><code>ggplot2</code> for data visualization</li>
<li><code>dplyr</code> for data wrangling</li>
<li><code>tidyr</code> for converting data to “tidy” format</li>
<li><code>readr</code> for importing spreadsheet data into R</li>
<li>As well as the more advanced <code>purrr</code>, <code>tibble</code>, <code>stringr</code>, and <code>forcats</code> packages</li>
</ul>
<p>If needed, read Section <a href="1-getting-started.html#packages">1.3</a> for information on how to install and load R packages.</p>
<div class="sourceCode" id="cb342"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb342-1"><a href="9-hypothesis-testing.html#cb342-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
<span id="cb342-2"><a href="9-hypothesis-testing.html#cb342-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(infer)</span>
<span id="cb342-3"><a href="9-hypothesis-testing.html#cb342-3" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(moderndive)</span>
<span id="cb342-4"><a href="9-hypothesis-testing.html#cb342-4" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(nycflights13)</span>
<span id="cb342-5"><a href="9-hypothesis-testing.html#cb342-5" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggplot2movies)</span></code></pre></div>
</div>
<div id="ht-activity" class="section level2" number="9.1">
<h2><span class="header-section-number">9.1</span> Promotions activity</h2>
<p>Let’s start with an activity studying the effect of gender on promotions at a bank.</p>
<div id="does-gender-affect-promotions-at-a-bank" class="section level3" number="9.1.1">
<h3><span class="header-section-number">9.1.1</span> Does gender affect promotions at a bank?</h3>
<p>Say you are working at a bank in the 1970s and you are submitting your résumé to apply for a promotion. Will your gender affect your chances of getting promoted? To answer this question, we’ll focus on data from a study published in the <em>Journal of Applied Psychology</em> in 1974. This data is also used in the <a href="https://www.openintro.org/"><em>OpenIntro</em></a> series of statistics textbooks.</p>
<p>To begin the study, 48 bank supervisors were asked to assume the role of a hypothetical director of a bank with multiple branches. Every one of the bank supervisors was given a résumé and asked whether or not the candidate on the résumé was fit to be promoted to a new position in one of their branches.</p>
<p>However, these 48 résumés were identical in all respects except one: the name of the applicant at the top of the résumé. Of the supervisors, 24 were randomly given résumés with stereotypically “male” names, while the other 24 were randomly given résumés with stereotypically “female” names. Since only (binary) gender varied from résumé to résumé, researchers could isolate the effect of this variable on promotion rates.</p>
<p>While many people today (including us, the authors) disagree with such binary views of gender, it is important to remember that this study was conducted at a time when more nuanced views of gender were not as prevalent. Despite this imperfection, we decided to still use this example as we feel it presents ideas still relevant today about how we could study discrimination in the workplace.</p>
<p>The <code>moderndive</code> package contains the data on the 48 applicants in the <code>promotions</code> data frame. Let’s explore this data by looking at six randomly selected rows:</p>
<div class="sourceCode" id="cb343"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb343-1"><a href="9-hypothesis-testing.html#cb343-1" aria-hidden="true" tabindex="-1"></a>promotions <span class="sc">%>%</span> </span>
<span id="cb343-2"><a href="9-hypothesis-testing.html#cb343-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">sample_n</span>(<span class="at">size =</span> <span class="dv">6</span>) <span class="sc">%>%</span> </span>
<span id="cb343-3"><a href="9-hypothesis-testing.html#cb343-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">arrange</span>(id)</span></code></pre></div>
<pre><code># A tibble: 6 x 3
id decision gender
<int> <fct> <fct>
1 11 promoted male
2 26 promoted female
3 28 promoted female
4 36 not male
5 37 not male
6 46 not female</code></pre>
<p>The variable <code>id</code> acts as an identification variable for all 48 rows, the <code>decision</code> variable indicates whether the applicant was selected for promotion or not, while the <code>gender</code> variable indicates the gender of the name used on the résumé. Recall that this data does not pertain to 24 actual men and 24 actual women, but rather 48 identical résumés of which 24 were assigned stereotypically “male” names and 24 were assigned stereotypically “female” names.</p>
<p>Let’s perform an exploratory data analysis of the relationship between the two categorical variables <code>decision</code> and <code>gender</code>. Recall that we saw in Subsection <a href="2-viz.html#two-categ-barplot">2.8.3</a> that one way we can visualize such a relationship is by using a stacked barplot.</p>
<div class="sourceCode" id="cb345"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb345-1"><a href="9-hypothesis-testing.html#cb345-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(promotions, <span class="fu">aes</span>(<span class="at">x =</span> gender, <span class="at">fill =</span> decision)) <span class="sc">+</span></span>
<span id="cb345-2"><a href="9-hypothesis-testing.html#cb345-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>() <span class="sc">+</span></span>
<span id="cb345-3"><a href="9-hypothesis-testing.html#cb345-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x =</span> <span class="st">"Gender of name on résumé"</span>)</span></code></pre></div>
<div class="figure" style="text-align: center"><span id="fig:promotions-barplot"></span>
<img src="ModernDive_files/figure-html/promotions-barplot-1.png" alt="Barplot relating gender to promotion decision." width="\textwidth" />
<p class="caption">
FIGURE 9.1: Barplot relating gender to promotion decision.
</p>
</div>
<p>Observe in Figure <a href="9-hypothesis-testing.html#fig:promotions-barplot">9.1</a> that résumés with female names appear much less likely to be accepted for promotion. Let’s quantify these promotion rates by computing the proportion of résumés accepted for promotion for each group using the <code>dplyr</code> package for data wrangling. Note the use of the <code>tally()</code> function here, which is a shortcut for <code>summarize(n = n())</code> to get counts.</p>
<div class="sourceCode" id="cb346"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb346-1"><a href="9-hypothesis-testing.html#cb346-1" aria-hidden="true" tabindex="-1"></a>promotions <span class="sc">%>%</span> </span>
<span id="cb346-2"><a href="9-hypothesis-testing.html#cb346-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">group_by</span>(gender, decision) <span class="sc">%>%</span> </span>
<span id="cb346-3"><a href="9-hypothesis-testing.html#cb346-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">tally</span>()</span></code></pre></div>
<pre><code># A tibble: 4 x 3
# Groups: gender [2]
gender decision n
<fct> <fct> <int>
1 male not 3
2 male promoted 21
3 female not 10
4 female promoted 14</code></pre>
<p>So of the 24 résumés with male names, 21 were selected for promotion, for a proportion of 21/24 = 0.875 = 87.5%. On the other hand, of the 24 résumés with female names, 14 were selected for promotion, for a proportion of 14/24 = 0.583 = 58.3%. Comparing these two rates of promotion, it appears that the promotion rate for résumés with male names was 0.875 - 0.583 = 0.292 = 29.2 percentage points higher than the rate for résumés with female names. This is suggestive of an advantage for résumés with a male name on them.</p>
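<p>The arithmetic above can be checked with a minimal base R sketch. It does not use the <code>promotions</code> data frame itself; instead it assumes a small matrix, here called <code>counts</code>, typed in from the four counts in the <code>tally()</code> output. The names <code>counts</code>, <code>prop_promoted</code>, and <code>obs_diff</code> are ours, chosen only for illustration.</p>

```r
# Counts copied from the tally() output above
# (rows: gender of name on résumé, columns: decision)
counts <- matrix(
  c( 3, 21,
    10, 14),
  nrow = 2, byrow = TRUE,
  dimnames = list(gender   = c("male", "female"),
                  decision = c("not", "promoted"))
)

# Proportion promoted within each gender: promoted count / row total
prop_promoted <- counts[, "promoted"] / rowSums(counts)
round(prop_promoted, 3)   # male 0.875, female 0.583

# Difference in promotion rates, male minus female
obs_diff <- unname(prop_promoted["male"] - prop_promoted["female"])
round(obs_diff, 3)        # 0.292
```

<p>Dividing by <code>rowSums(counts)</code> rather than by a hard-coded 24 keeps the sketch valid if the two groups had different sizes.</p>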
<p>The question is, however, does this provide <em>conclusive</em> evidence that there is gender discrimination in promotions at banks? Could a difference in promotion rates of 29.2% still occur by chance, even in a hypothetical world where no gender-based discrimination existed? In other words, what is the role of <em>sampling variation</em> in this hypothesized world? To answer this question, we’ll again rely on a computer to run <em>simulations</em>.</p>
</div>
<div id="shuffling-once" class="section level3" number="9.1.2">
<h3><span class="header-section-number">9.1.2</span> Shuffling once</h3>
<p>First, try to imagine a hypothetical universe where no gender discrimination in promotions existed. In such a hypothetical universe, the gender of an applicant would have no bearing on their chances of promotion. Bringing things back to our <code>promotions</code> data frame, the <code>gender</code> variable would thus be an irrelevant label. If these <code>gender</code> labels were irrelevant, then we could randomly reassign them by “shuffling” them to no consequence!</p>
<p>To illustrate this idea, let’s narrow our focus to 6 arbitrarily chosen résumés of the 48 in Table <a href="9-hypothesis-testing.html#tab:compare-six">9.1</a>. The <code>decision</code> column shows that 3 résumés resulted in promotion while 3 didn’t. The <code>gender</code> column shows what the original gender of the résumé name was.</p>
<p>However, in our hypothesized universe of no gender discrimination, gender is irrelevant and thus it is of no consequence to randomly “shuffle” the values of <code>gender</code>. The <code>shuffled_gender</code> column shows one such possible random shuffling. Observe in the fourth column how the number of male and female names remains the same at 3 each, but they are now listed in a different order.</p>
<table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;">
<caption style="font-size: initial !important;">
<span id="tab:compare-six">TABLE 9.1: </span>One example of shuffling gender variable
</caption>
<thead>
<tr>
<th style="text-align:right;">
résumé number
</th>
<th style="text-align:left;">
decision
</th>
<th style="text-align:left;">
gender
</th>
<th style="text-align:left;">
shuffled gender
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:right;">
1
</td>
<td style="text-align:left;">
not
</td>
<td style="text-align:left;">
male
</td>
<td style="text-align:left;">
male
</td>
</tr>
<tr>
<td style="text-align:right;">
2
</td>
<td style="text-align:left;">
not
</td>
<td style="text-align:left;">
female
</td>
<td style="text-align:left;">
male
</td>
</tr>
<tr>
<td style="text-align:right;">
3
</td>
<td style="text-align:left;">
not
</td>
<td style="text-align:left;">
female
</td>
<td style="text-align:left;">
female
</td>
</tr>
<tr>
<td style="text-align:right;">
4
</td>
<td style="text-align:left;">
promoted
</td>
<td style="text-align:left;">
male
</td>
<td style="text-align:left;">
female
</td>
</tr>
<tr>
<td style="text-align:right;">
5
</td>
<td style="text-align:left;">
promoted
</td>
<td style="text-align:left;">
male
</td>
<td style="text-align:left;">
female
</td>
</tr>
<tr>
<td style="text-align:right;">
6
</td>
<td style="text-align:left;">
promoted
</td>
<td style="text-align:left;">
female
</td>
<td style="text-align:left;">
male
</td>
</tr>
</tbody>
</table>
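<p>A shuffle like the one in Table 9.1 can be sketched in base R with <code>sample()</code>, which returns a random permutation of its input when called without further arguments. The data frame <code>six</code> below is typed in from the table, and the seed is arbitrary, so the shuffled column you get will most likely differ from the one printed in the table.</p>

```r
# The six résumés from Table 9.1, typed in by hand
six <- data.frame(
  resume   = 1:6,
  decision = c("not", "not", "not", "promoted", "promoted", "promoted"),
  gender   = c("male", "female", "female", "male", "male", "female")
)

set.seed(76)  # arbitrary seed, only so the shuffle is reproducible

# sample() without replacement permutes the labels; no label is added or lost
six$shuffled_gender <- sample(six$gender)

six

# The label counts are unchanged by shuffling: 3 female and 3 male in both
table(six$gender)
table(six$shuffled_gender)
```

<p>Only the <code>gender</code> labels are permuted; the <code>decision</code> column stays in place, which is what makes the shuffle a stand-in for a world where gender has no bearing on the decision.</p>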
<p>Again, such random shuffling of the gender label only makes sense in our hypothesized universe of no gender discrimination. How could we extend this shuffling of the gender variable to all 48 résumés by hand? One way would be by using a standard deck of 52 playing cards, which we display in Figure <a href="9-hypothesis-testing.html#fig:deck-of-cards">9.2</a>.</p>
<div class="figure" style="text-align: center"><span id="fig:deck-of-cards"></span>
<img src="images/shutterstock/shutterstock_670789453.jpg" alt="Standard deck of 52 playing cards." width="100%" />
<p class="caption">
FIGURE 9.2: Standard deck of 52 playing cards.
</p>
</div>
<p>Since half the cards are red (diamonds and hearts) and the other half are black (spades and clubs), by removing two red cards and two black cards, we would end up with 24 red cards and 24 black cards. After shuffling these 48 cards as seen in Figure <a href="9-hypothesis-testing.html#fig:shuffling">9.3</a>, we can flip the cards over one-by-one, assigning “male” for each red card and “female” for each black card.</p>
<div class="figure" style="text-align: center"><span id="fig:shuffling"></span>
<img src="images/shutterstock/shutterstock_128283971.jpg" alt="Shuffling a deck of cards." width="100%" height="100%" />
<p class="caption">
FIGURE 9.3: Shuffling a deck of cards.
</p>
</div>
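<p>The card procedure can be imitated on a computer. The following is a rough base R sketch, not the method used to build any data frame in <code>moderndive</code>. The vector <code>decision</code> is rebuilt from the totals reported earlier (35 promoted, 13 not), and its row order is an assumption that does not match the rows of <code>promotions</code>; because the labels are shuffled at random, that order does not affect the logic.</p>

```r
# 24 red and 24 black cards remain after removing two of each colour
deck <- rep(c("red", "black"), each = 24)

set.seed(9)                    # arbitrary seed, only for reproducibility
shuffled_deck <- sample(deck)  # shuffle the 48 cards

# Flip the cards one by one: red means "male", black means "female"
shuffled_gender <- ifelse(shuffled_deck == "red", "male", "female")

# Decisions rebuilt from the reported totals (assumed order, see lead-in)
decision <- rep(c("promoted", "not"), times = c(35, 13))

# Cross-tabulate the shuffled labels against the unchanged decisions
table(gender = shuffled_gender, decision = decision)
```

<p>Each row of the resulting table still sums to 24, and each column still sums to the original totals; only the way the promotions are split between the two labels changes from shuffle to shuffle.</p>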
<p>We’ve saved one such shuffling in the <code>promotions_shuffled</code> data frame of the <code>moderndive</code> package. If you compare the original <code>promotions</code> and the shuffled <code>promotions_shuffled</code> data frames, you’ll see that while the <code>decision</code> variable is identical, the <code>gender</code> variable has changed.</p>
<p>Let’s repeat the same exploratory data analysis we did for the original <code>promotions</code> data on our <code>promotions_shuffled</code> data frame. Let’s create a barplot visualizing the relationship between <code>decision</code> and the new shuffled <code>gender</code> variable and compare this to the original unshuffled version in Figure <a href="9-hypothesis-testing.html#fig:promotions-barplot-permuted">9.4</a>.</p>
<div class="sourceCode" id="cb348"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb348-1"><a href="9-hypothesis-testing.html#cb348-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(promotions_shuffled, </span>
<span id="cb348-2"><a href="9-hypothesis-testing.html#cb348-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">aes</span>(<span class="at">x =</span> gender, <span class="at">fill =</span> decision)) <span class="sc">+</span></span>
<span id="cb348-3"><a href="9-hypothesis-testing.html#cb348-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>() <span class="sc">+</span> </span>
<span id="cb348-4"><a href="9-hypothesis-testing.html#cb348-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x =</span> <span class="st">"Gender of résumé name"</span>)</span></code></pre></div>
<div class="figure" style="text-align: center"><span id="fig:promotions-barplot-permuted"></span>
<img src="ModernDive_files/figure-html/promotions-barplot-permuted-1.png" alt="Barplots of relationship of promotion with gender (left) and shuffled gender (right)." width="\textwidth" />
<p class="caption">
FIGURE 9.4: Barplots of relationship of promotion with gender (left) and shuffled gender (right).
</p>
</div>
<p>It appears the difference in “male names” versus “female names” promotion rates is now different. Compared to the original data in the left barplot, the new “shuffled” data in the right barplot has promotion rates that are much more similar.</p>
<p>Let’s also compute the proportion of résumés accepted for promotion for each group:</p>
<div class="sourceCode" id="cb349"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb349-1"><a href="9-hypothesis-testing.html#cb349-1" aria-hidden="true" tabindex="-1"></a>promotions_shuffled <span class="sc">%>%</span> </span>
<span id="cb349-2"><a href="9-hypothesis-testing.html#cb349-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">group_by</span>(gender, decision) <span class="sc">%>%</span> </span>
<span id="cb349-3"><a href="9-hypothesis-testing.html#cb349-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">tally</span>() <span class="co"># Same as summarize(n = n())</span></span></code></pre></div>
<pre><code># A tibble: 4 x 3
# Groups: gender [2]
gender decision n
<fct> <fct> <int>
1 male not 6
2 male promoted 18
3 female not 7
4 female promoted 17</code></pre>
<p>So in this hypothetical universe of no discrimination, <span class="math inline">\(18/24 = 0.75 = 75\%\)</span> of “male” résumés were selected for promotion. On the other hand, <span class="math inline">\(17/24 = 0.708 = 70.8\%\)</span> of “female” résumés were selected for promotion.</p>
<p>Let’s next compare these two values. It appears that résumés with stereotypically male names were selected for promotion at a rate that was <span class="math inline">\(0.75 - 0.708 = 0.042\)</span>, or 4.2 percentage points, higher than résumés with stereotypically female names.</p>
<p>Observe how this difference in rates is not the same as the difference in rates of 0.292 = 29.2% we originally observed. This is once again due to <em>sampling variation</em>. How can we better understand the effect of this sampling variation? By repeating this shuffling several times!</p>
</div>
<div id="shuffling-16-times" class="section level3" number="9.1.3">
<h3><span class="header-section-number">9.1.3</span> Shuffling 16 times</h3>
<p>We recruited 16 groups of our friends to repeat this shuffling exercise. They recorded their shuffled gender labels in a <a href="https://docs.google.com/spreadsheets/d/1Q-ENy3o5IrpJshJ7gn3hJ5A0TOWV2AZrKNHMsshQtiE/">shared spreadsheet</a>; we display a snapshot of the first 10 rows and 5 columns in Figure <a href="9-hypothesis-testing.html#fig:tactile-shuffling">9.5</a>.</p>
<div class="figure" style="text-align: center"><span id="fig:tactile-shuffling"></span>
<img src="images/sampling/promotions/shared_spreadsheet.png" alt="Snapshot of shared spreadsheet of shuffling results (m for male, f for female)." width="100%" />
<p class="caption">
FIGURE 9.5: Snapshot of shared spreadsheet of shuffling results (m for male, f for female).
</p>
</div>
<p>For each of these 16 columns of <em>shuffles</em>, we computed the difference in promotion rates, and in Figure <a href="9-hypothesis-testing.html#fig:null-distribution-1">9.6</a> we display their distribution in a histogram. We also mark the observed difference in promotion rate that occurred in real life of 0.292 = 29.2% with a dark line.</p>
<div class="figure" style="text-align: center"><span id="fig:null-distribution-1"></span>
<img src="ModernDive_files/figure-html/null-distribution-1-1.png" alt="Distribution of shuffled differences in promotions." width="\textwidth" />
<p class="caption">
FIGURE 9.6: Distribution of shuffled differences in promotions.
</p>
</div>
<p>Before we discuss the distribution of the histogram, we emphasize the key thing to remember: this histogram represents differences in promotion rates that one would observe in our <em>hypothesized universe</em> of no gender discrimination.</p>
<p>Observe first that the histogram is roughly centered at 0. Saying that the difference in promotion rates is 0 is equivalent to saying that both genders had the same promotion rate. In other words, the center of these 16 values is consistent with what we would expect in our hypothesized universe of no gender discrimination.</p>
<p>However, while the values are centered at 0, there is variation about 0. This is because even in a hypothesized universe of no gender discrimination, you will still likely observe small differences in promotion rates because of chance <em>sampling variation</em>. Looking at the histogram in Figure <a href="9-hypothesis-testing.html#fig:null-distribution-1">9.6</a>, such differences could even be as extreme as -0.292 or 0.208.</p>
<p>Turning our attention to what we observed in real life: the difference of 0.292 = 29.2% is marked with a vertical dark line. Ask yourself: in a hypothesized world of no gender discrimination, how likely would it be that we observe a difference this large? While opinions here may differ, in our opinion, not often! Now ask yourself: what do these results say about our hypothesized universe of no gender discrimination?</p>
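<p>The shuffling activity can also be imitated on a computer. What follows is only a minimal base R sketch, not the procedure used to produce Figure 9.6. It assumes 24 résumés per gender with 21 male and 14 female promotions (counts chosen because they reproduce the observed difference of 0.292), and the names <code>gender</code>, <code>decision</code>, and <code>diff_in_props()</code> are our own illustrative choices.</p>

```r
# Assumed data: 24 male and 24 female resumes, 21 and 14 promoted.
# These counts are an assumption consistent with the observed 0.292.
set.seed(76)  # arbitrary seed so the sketch is reproducible

gender   <- rep(c("male", "female"), each = 24)
decision <- c(rep(c("promoted", "not"), times = c(21, 3)),
              rep(c("promoted", "not"), times = c(14, 10)))

# Difference in promotion rates: male rate minus female rate
diff_in_props <- function(decision, gender) {
  mean(decision[gender == "male"] == "promoted") -
    mean(decision[gender == "female"] == "promoted")
}

# Difference observed in the (assumed) real-life data
obs_diff <- diff_in_props(decision, gender)
round(obs_diff, 3)  # 0.292

# 16 shuffles: sample(gender) reorders the gender labels without
# replacement, breaking any link between gender and decision
shuffled_diffs <- replicate(16, diff_in_props(decision, sample(gender)))

# How many shuffled differences are at least as large as the observed one?
sum(shuffled_diffs >= obs_diff)
```

<p>Calling <code>hist(shuffled_diffs)</code> followed by <code>abline(v = obs_diff)</code> would draw a histogram in the spirit of Figure 9.6, although the individual values will differ from our friends' shuffles because each shuffle is random.</p>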
<!--
v2 TODO: Consider adding;
Now each of our 33 friends does the following:
1. Takes the two decks of cards.
2. Shuffles the cards corresponding to gender.
3. Assigns the shuffled cards to the original deck of supervisors' decisions.
4. Count how many cards fall into each of the four categories:
- Promoted males
- Non-promoted males
- Promoted females
- Non-promoted females
5. Determines the proportion of promoted males out of 24.
6. Determines the proportion of promoted females out of 24.
7. Subtracts those two proportions to get a new value of the test statistic, assuming the null hypothesis is true.
Let's see what this leads to for our friends in terms of results and label where the observed test statistic falls in relation to our friends' statistics:
```r
obs_diff_prop <- promotions %>%
specify(decision ~ gender, success = "promoted") %>%
calculate(stat = "diff in props", order = c("male", "female"))
obs_diff_prop
```
We see that of the 33 samples we selected only one is close to as extreme as what we observed. Thus, we might guess that we are starting to see some data suggesting that gender discrimination might be at play. Many of the statistics calculated appear close to 0 with the vast remainder appearing around values of a difference of -0.1 and 0.1. So what further evidence would we need to make this suggestion a little clearer? More simulations! As we've done before in Chapters \@ref(sampling) and \@ref(confidence-intervals), we'll use the computer to simulate these permutations and calculations many times. Let's do just that with the `infer` package in the next section.
-->
</div>
<div id="ht-what-did-we-just-do" class="section level3" number="9.1.4">
<h3><span class="header-section-number">9.1.4</span> What did we just do?</h3>
<p>What we just demonstrated in this activity is the statistical procedure known as <em>hypothesis testing</em> using a <em>permutation test</em>. The term “permutation” is the mathematical term for “shuffling”: taking a series of values and reordering them randomly, as you did with the playing cards.</p>
<p>In fact, permutations are another form of <em>resampling</em>, like the bootstrap method you performed in Chapter <a href="8-confidence-intervals.html#confidence-intervals">8</a>. While the bootstrap method involves resampling <em>with</em> replacement, permutation methods involve resampling <em>without</em> replacement.</p>
<p>Think of our exercise involving the slips of paper representing pennies and the hat in Section <a href="8-confidence-intervals.html#resampling-tactile">8.1</a>: after sampling a penny, you put it back in the hat. Now think of our deck of cards. After drawing a card, you laid it out in front of you, recorded the color, and then you <em>did not</em> put it back in the deck.</p>
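<p>The contrast between the two kinds of resampling can be seen in a few lines of base R. This is a small sketch under our own assumptions: the vector <code>cards</code> is a hypothetical stand-in for the 48 gender cards, labeled m and f as in the shared spreadsheet.</p>

```r
set.seed(2024)  # arbitrary seed so the sketch is reproducible

# Hypothetical deck: 24 "m" cards and 24 "f" cards
cards <- rep(c("m", "f"), each = 24)

# Permutation: resampling WITHOUT replacement (the default for sample()),
# so every card is used exactly once, only the order changes
permuted <- sample(cards)

# Bootstrap: resampling WITH replacement, so a card can be drawn
# several times while others are never drawn
bootstrapped <- sample(cards, replace = TRUE)

table(permuted)      # always 24 f and 24 m
table(bootstrapped)  # counts generally differ from 24 and 24
```

<p>Because a permutation only reorders the cards, the numbers of m and f cards never change; a bootstrap resample carries no such guarantee.</p>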
<p>In our previous example, we tested the validity of the hypothesized universe of no gender discrimination. The evidence contained in our observed sample of 48 résumés was somewhat inconsistent with our hypothesized universe. Thus, we would be inclined to <em>reject</em> this hypothesized universe and declare that the evidence suggests there is gender discrimination.</p>
<p>Recall our case study on whether yawning is contagious from Section <a href="8-confidence-intervals.html#case-study-two-prop-ci">8.6</a>. Like that case study, the promotions example involves inference about an unknown difference in population proportions. This time, the parameter is <span class="math inline">\(p_{m} - p_{f}\)</span>, where <span class="math inline">\(p_{m}\)</span> is the population proportion of résumés with male names being recommended for promotion and <span class="math inline">\(p_{f}\)</span> is the equivalent for résumés with female names. This is one of the scenarios for inference we’ve seen so far, as listed in Table <a href="9-hypothesis-testing.html#tab:table-diff-prop">9.2</a>.</p>
<table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;">
<caption style="font-size: initial !important;">
<span id="tab:table-diff-prop">TABLE 9.2: </span>Scenarios of sampling for inference
</caption>
<thead>
<tr>
<th style="text-align:right;">
Scenario
</th>
<th style="text-align:left;">
Population parameter
</th>
<th style="text-align:left;">
Notation
</th>
<th style="text-align:left;">
Point estimate
</th>
<th style="text-align:left;">
Symbol(s)
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:right;width: 0.5in; ">
1
</td>
<td style="text-align:left;width: 0.7in; ">
Population proportion
</td>
<td style="text-align:left;width: 1in; ">
<span class="math inline">\(p\)</span>
</td>
<td style="text-align:left;width: 1.1in; ">
Sample proportion
</td>
<td style="text-align:left;width: 1in; ">
<span class="math inline">\(\widehat{p}\)</span>