\documentclass[a4paper,11pt,english]{article}
\usepackage{babel}
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{times}
\usepackage{amsmath}
\usepackage{microtype}
\usepackage{url}
\urlstyle{same}
\usepackage[pdftex,colorlinks=true,citecolor=black,
pagecolor=black,linkcolor=black,menucolor=black,
urlcolor=black]{hyperref}
\hypersetup{%
bookmarksopen=true,
bookmarksnumbered=true,
pdftitle={Bayesian data analysis},
pdfsubject={Reading instructions},
pdfauthor={Aki Vehtari},
pdfkeywords={Bayesian probability theory, Bayesian inference, Bayesian data analysis},
pdfstartview={FitH -32768}
}
% if not draft, smaller printable area makes the paper more readable
\topmargin -4mm
\oddsidemargin 0mm
\textheight 225mm
\textwidth 160mm
%\parskip=\baselineskip
\def\eff{\mathrm{rep}}
\DeclareMathOperator{\E}{E}
\DeclareMathOperator{\Var}{Var}
\DeclareMathOperator{\var}{var}
\DeclareMathOperator{\Sd}{Sd}
\DeclareMathOperator{\sd}{sd}
\DeclareMathOperator{\Bin}{Bin}
\DeclareMathOperator{\Beta}{Beta}
\DeclareMathOperator{\Invchi2}{Inv-\chi^2}
\DeclareMathOperator{\NInvchi2}{N-Inv-\chi^2}
\DeclareMathOperator{\logit}{logit}
\DeclareMathOperator{\N}{N}
\DeclareMathOperator{\U}{U}
\DeclareMathOperator{\tr}{tr}
%\DeclareMathOperator{\Pr}{Pr}
\DeclareMathOperator{\trace}{trace}
\DeclareMathOperator{\rep}{\mathrm{rep}}
\pagestyle{empty}
\begin{document}
\thispagestyle{empty}
\section*{Bayesian data analysis -- reading instructions 7}
\smallskip
{\bf Aki Vehtari}
\smallskip
\subsection*{Chapter 7}
Outline of Chapter 7
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item 7.1 Measures of predictive accuracy
\item 7.2 Information criteria and cross-validation (read instead the article mentioned below)
\item 7.3 Model comparison based on predictive performance (read instead the article mentioned below)
\item 7.4 Model comparison using Bayes factors
\item 7.5 Continuous model expansion / sensitivity analysis
\item 7.6 Example (may be skipped)
\end{list}
\noindent
Instead of Sections 7.2 and 7.3 it's better to read
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item Aki Vehtari, Andrew Gelman and Jonah Gabry (2017). Practical
Bayesian model evaluation using leave-one-out cross-validation and
WAIC. In Statistics and Computing, 27(5):1413-1432,
doi:10.1007/s11222-016-9696-4. \href{http://arxiv.org/abs/1507.04544}{arXiv preprint arXiv:1507.04544}.
\end{list}
In Sections 7.2 and 7.3 of BDA, a multiplier of $-2$ is used for
historical reasons. After the book was published, we concluded that it
causes too much confusion, and we now recommend not multiplying by
$-2$. The above paper no longer uses the $-2$ multiplier.

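As a reminder of the scale in question, the key quantities from
Section 7.1 (on the log score scale, without the $-2$) are
\begin{align*}
  \mathrm{elpd} &= \E_f\bigl[\log p_{\mathrm{post}}(\tilde{y})\bigr],\\
  \mathrm{lppd} &= \sum_{i=1}^n \log \left( \frac{1}{S} \sum_{s=1}^S
                   p(y_i \mid \theta^s) \right),
\end{align*}
where $f$ is the true data-generating distribution, $\tilde{y}$ is a
future observation, and $\theta^s$, $s=1,\dots,S$, are posterior draws.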
\noindent
See also extra material at \url{https://avehtari.github.io/modelselection/}
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item Videos, slides, notebooks, references
\item The most relevant for the course is the first part of the
  talk ``Model assessment, comparison and selection'', given at the
  Master class in Bayesian statistics, CIRM, Marseille
\end{list}
% Matlab demos
% \begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
% \item esim6\_1.m: Posterior predictive checking - light speed
% \item esim6\_2.m: Posterior predictive checking - sequential dependence
% \item esim6\_3.m: Posterior predictive checking - poor test statistic
% \item esim6\_4.m: Posterior predictive checking - marginal predictive p-value
% \end{list}
Find all the terms and symbols listed below. When reading the chapter
and the above-mentioned article, write down questions about anything
that is unclear to you or that you think might be unclear to others.
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item predictive accuracy/fit/error
\item external validation
\item cross-validation
\item information criteria
\item overfitting
\item measures of predictive accuracy
\item point prediction
\item scoring function
\item mean squared error
\item probabilistic prediction
\item scoring rule
\item logarithmic score
\item log-predictive density
\item out-of-sample predictive fit
\item elpd, elppd, lppd
\item deviance
\item within-sample predictive accuracy
\item adjusted within-sample predictive accuracy
\item AIC, DIC, WAIC (less important)
\item effective number of parameters
\item singular model
\item BIC (less important)
\item leave-one-out cross-validation
\item evaluating predictive error comparisons
\item bias induced by model selection
\item Bayes factors
\item continuous model expansion
\item sensitivity analysis
\end{list}
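Several of the terms above (lppd, WAIC, effective number of
parameters) become concrete with a small computation. The sketch
below, with hypothetical variable and function names of our own
choosing, computes the log pointwise predictive density and WAIC from
a matrix of pointwise log-likelihood values (rows are posterior draws,
columns are observations), following the definitions in the chapter
and in the paper above; it is illustrative only, not the
implementation used in the course.

```python
# Sketch: lppd and WAIC from a matrix of pointwise log-likelihoods,
# log_lik[s][i] = log p(y_i | theta^s) for posterior draw s.
import math

def lppd(log_lik):
    """Computed log pointwise predictive density: for each observation,
    average the likelihood over posterior draws, then log and sum."""
    S = len(log_lik)
    n = len(log_lik[0])
    total = 0.0
    for i in range(n):
        col = [log_lik[s][i] for s in range(S)]
        m = max(col)  # log-sum-exp for numerical stability
        total += m + math.log(sum(math.exp(v - m) for v in col) / S)
    return total

def waic(log_lik):
    """elpd_waic = lppd - p_waic, where p_waic (the effective number
    of parameters) is the summed posterior variance of the pointwise
    log-likelihood."""
    S = len(log_lik)
    p_waic = 0.0
    for i in range(len(log_lik[0])):
        col = [log_lik[s][i] for s in range(S)]
        mean = sum(col) / S
        p_waic += sum((v - mean) ** 2 for v in col) / (S - 1)
    return lppd(log_lik) - p_waic
```

When the posterior is degenerate (all draws give the same
log-likelihood), the variance penalty vanishes and WAIC equals the
lppd; with genuine posterior spread, WAIC is always smaller, which is
the correction for within-sample optimism.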
%fourth edition batching p. 287
\subsection*{Additional reading}
More theoretical details can be found in
\begin{itemize}
\item Aki Vehtari and Janne Ojanen (2012). A survey of Bayesian
predictive methods for model assessment, selection and
comparison. In Statistics Surveys,
6:142-228. \url{http://dx.doi.org/10.1214/12-SS102}
\end{itemize}
\noindent
See more experimental comparisons in
\begin{itemize}
\item Juho Piironen and Aki Vehtari (2017). Comparison of Bayesian predictive methods for model selection. Statistics and Computing, 27(3):711-735. doi:10.1007/s11222-016-9649-y. \url{http://link.springer.com/article/10.1007/s11222-016-9649-y}
\end{itemize}
\subsection*{Posterior probability of the model vs. predictive performance}
Gelman: ``To take a historical example, I don't find it useful, from a
statistical perspective, to say that in 1850, say, our posterior
probability that Newton's laws were true was 99\%, then in 1900 it was
50\%, then by 1920, it was 0.01\% or whatever. I'd rather say that
Newton's laws were a good fit to the available data and prior
information back in 1850, but then as more data and a clearer
understanding became available, people focused on areas of lack of fit
in order to improve the model.''

Newton's laws are still sufficient for prediction in specific contexts
(non-relativistic speeds, weak gravitational fields, and negligible
effects of air resistance or other friction). See more in the course
video 1.1 ``Introduction to uncertainty and modelling''
\url{https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=d841f429-9c3d-4d24-8228-a9f400efda7b}.
\end{document}
%%% Local Variables:
%%% TeX-PDF-mode: t
%%% TeX-master: t
%%% End: