-
Notifications
You must be signed in to change notification settings - Fork 558
/
Copy pathindex.Rmd
164 lines (124 loc) · 11 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
---
title: "Bayesian Data Analysis course"
date: "Page updated: `r format.Date(file.mtime('index.Rmd'),'%Y-%m-%d')`"
---
This is the web page for the Bayesian Data Analysis course at Aalto (CS-E5710) by [Aki Vehtari](https://users.aalto.fi/~ave/).
Aalto students should check [the 2024 specific course web page](Aalto2024.html) and [MyCourses](https://mycourses.aalto.fi/course/view.php?id=44799). **In 2024** Aalto course can be taken online except for the final project presentation. The lectures will be given on campus, but recorded and the recording will be made available online after the course. The registration for the course lectures will be used to estimate the need for the resources. If you are unable to register for the course at the moment in the Sisu, there is no need to email the lecturer. You can start taking the course and register before the end of the course. Sisu shows rooms on campus for the computer exercises, but you can join the TA sessions also online via Zulip and Zoom. You can choose which TA session to join each week separately, without a need to register for those sessions.
- [Aalto University Code of Academic Integrity and Handling Violations Thereof](https://into.aalto.fi/display/ensaannot/Aalto+University+Code+of+Academic+Integrity+and+Handling+Violations+Thereof)
All the course material is available in a [git repo](https://github.com/avehtari/BDA_course_Aalto) (and these pages are for easier navigation). All the material can be used in other courses. Text (except the BDA3 book) and videos licensed under CC-BY-NC 4.0. Code licensed under BSD-3.
<div style= "float:right;position: relative;">
![](bda_cover.png)
</div>
[The electronic version of the course book
Bayesian Data Analysis, 3rd ed, by Andrew Gelman, John Carlin, Hal
Stern, David Dunson, Aki Vehtari, and Donald Rubin](https://users.aalto.fi/~ave/BDA3.pdf) is available for non-commercial purposes. Hard copies are available from [the publisher](https://www.crcpress.com/Bayesian-Data-Analysis/Gelman-Carlin-Stern-Dunson-Vehtari-Rubin/p/book/9781439840955) and many book stores.
See also [home page for the
book](http://www.stat.columbia.edu/~gelman/book/), [errata for the
book](http://www.stat.columbia.edu/~gelman/book/errata_bda3.txt), and [chapter notes](BDA3_notes.html).
## Prerequisites
- Basic terms of probability theory
- probability, probability density, distribution
- sum, product rule, and Bayes' rule
- expectation, mean, variance, median
- in Finnish, see e.g. [Stokastiikka ja tilastollinen ajattelu](http://math.aalto.fi/~lleskela/LectureNotes003.html)
- in English, see e.g. Wikipedia and [Introduction to probability and statistics](https://ocw.mit.edu/courses/mathematics/18-05-introduction-to-probability-and-statistics-spring-2014/readings/)
- Some algebra and calculus
- Basic visualisation techniques (R or Python)
- histogram, density plot, scatter plot
- see e.g. [BDA R demos](demos.html#BDA_R_demos)
- see e.g. [BDA Python demos](demos.html#BDA_Python_demos)
This course has been designed so that there is strong emphasis in
computational aspects of Bayesian data analysis and using the latest
computational tools.
If you find BDA3 too difficult to start with, I recommend
- For regression models, their connection to statistical testing and causal analysis see [Gelman, Hill and Vehtari, "Regression and Other Stories"](https://avehtari.github.io/ROS-Examples/).
- Richard McElreath's [Statistical Rethinking, 2nd ed](https://xcelab.net/rm/statistical-rethinking/) book is easier than BDA3 and the 2nd ed is excellent. Statistical Rethinking doesn't go as deep in some details, math, algorithms and programming as BDA course. Richard's lecture videos of [Statistical Rethinking: A Bayesian Course Using R and Stan](https://github.com/rmcelreath/statrethinking_winter2019) are highly recommended even if you are following BDA3.
- For background prerequisites some students have found chapters 2, 4 and 5 in [Kruschke, "Doing Bayesian Data Analysis"](https://sites.google.com/site/doingbayesiandataanalysis/) useful.
## Course contents following BDA3
Bayesian Data Analysis, 3rd ed, by Andrew Gelman, John Carlin, Hal
Stern, David Dunson, Aki Vehtari, and Donald Rubin. [Home page for the
book](http://www.stat.columbia.edu/~gelman/book/). [Errata for the
book](http://www.stat.columbia.edu/~gelman/book/errata_bda3.txt). [Electronic edition for non-commercial purposes only](https://users.aalto.fi/~ave/BDA3.pdf).
- Background (Ch 1, Lecture 1)
- Single-parameter models (Ch 2, Lecture 2)
- Multiparameter models (Ch 3, Lecture 3)
- Computational methods (Ch 10 , Lecture 4)
- Markov chain Monte Carlo (Chs 11-12, Lectures 5-6)
- Extra material for Stan and probabilistic programming (see below, Lecture 6)
- Hierarchical models (Ch 5, Lecture 7)
- Model checking (Ch 6, Lectures 8-9)
+ \+ [Visualization in Bayesian workflow](https://doi.org/10.1111/rssa.12378)
- Evaluating and comparing models (Ch 7)
+ \+ [Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC](https://arxiv.org/abs/1507.04544) ([Journal link](https://doi.org/10.1007/s11222-016-9696-4))
+ \+ [Case studies](https://users.aalto.fi/~ave/casestudies.html)
+ \+ [Videos](https://users.aalto.fi/~ave/videos.html)
+ \+ [Cross-validation FAQ](https://users.aalto.fi/~ave/CV-FAQ.html)
- Decision analysis (Ch 9, Lecture 10)
- Large sample properties and Laplace approximation (Ch 4, Lecture 11-12)
- In addition you learn workflow for Bayesian data analysis
## How to study
Recommended way to go through the material is
- Read the reading instructions for a chapter in [chapter notes](BDA3_notes.html).
- Read the chapter in BDA3 and check that you find the terms listed in the reading instructions.
- Watch the corresponding [lecture video](https://aalto.cloud.panopto.eu/Panopto/Pages/Sessions/List.aspx#folderID=%224a7f385e-fdb1-4382-bfd0-af0700b7fc46%22) to get explanations for most important parts.
- Read corresponding additional information in the [chapter notes](BDA3_notes.html).
- Run the corresponding demos in [R demos](https://github.com/avehtari/BDA_R_demos) or
[Python demos](https://github.com/avehtari/BDA_py_demos).
- Read the exercise instructions and make the corresponding [assignments](https://github.com/avehtari/BDA_course_Aalto/tree/master/assignments). Demo codes in [R demos](https://github.com/avehtari/BDA_R_demos) and
[Python demos](https://github.com/avehtari/BDA_py_demos) have a lot of useful examples for handling data and plotting figures. If you have problems, visit TA sessions or ask in course slack channel.
- If you want to learn more, make also self study exercises listed below
## Slides and chapter notes
- [Slides](https://github.com/avehtari/BDA_course_Aalto/tree/master/slides)
- including code for reproducing some of the figures
- [Chapter notes](BDA3_notes.html)
- including reading instructions highlighting most important parts and terms
## Videos
The following video motivates why computational probabilistic methods and probabilistic programming are important part of modern Bayesian data analysis.
- [Computational probabilistic modeling in 15mins](https://www.youtube.com/watch?v=ukE5aqdoLZI)
Short video clips on selected introductory topics are available in [a Panopto folder](https://aalto.cloud.panopto.eu/Panopto/Pages/Sessions/List.aspx#folderID/=%22f0ec3a25-9e23-4935-873b-a9f401646812%22).
The 2022 lecture videos are in [a Panopto folder](https://aalto.cloud.panopto.eu/Panopto/Pages/Sessions/List.aspx#folderID=%224a7f385e-fdb1-4382-bfd0-af0700b7fc46%22). The 2023 lecture videos will be uploaded in another folder after each lecture.
## R and Python
We strongly recommend using R in the course as there are more packages for Stan and statistical analysis in R. If you are already fluent in Python, but not in R, then using Python may be easier, but it can still be more useful to learn also R. Unless you are already experienced and have figured out your preferred way to work with R, we recommend installing [RStudio Desktop](https://posit.co/download/rstudio-desktop/) or using [Aalto teaching JupyterHub](https://jupyter.cs.aalto.fi/). See [FAQ](FAQ.html) for frequently asked questions about R problems in this course. The [demo codes](demos.html) provide useful starting points for all the assignments.
- For learning R programming basics we recommend
- [Garrett Grolemund, Hands-On Programming with R](https://rstudio-education.github.io/hopr/)
- For learning basic and advanced plotting using R we recommend
- [Kieran Healy, Data Visualization - A practical introduction](https://socviz.co/)
- [Antony Unwin, Graphical Data Analysis with R](http://www.gradaanwr.net/)
## Demos
These demos include a lot of useful code for making the assignments.
- [R demos](demos.html#BDA_R_demos)
- [Python demos](demos.html#BDA_Python_demos)
## Self study exercises
Great self study BDA3 exercises for this course are listed below. Most of these have also [model solutions available](http://www.stat.columbia.edu/~gelman/book/solutions3.pdf).
- 1.1-1.4, 1.6-1.8 (model solutions for 1.1-1.6)
- 2.1-2.5, 2.8, 2.9, 2.14, 2.17, 2.22 (model solutions for 2.1-2.5, 2.7-2.13, 2.16, 2.17, 2.20, and 2.14 is in slides)
- 3.2, 3.3, 3.9 (model solutions for 3.1-3.3, 3.5, 3.9, 3.10)
- 4.2, 4.4, 4.6 (model solutions for 3.2-3.4, 3.6, 3.7, 3.9, 3.10)
- 5.1, 5.2 (model solutions for 5.3-5.5, 5.7-5.12)
- 6.1 (model solutions for 6.1, 6.5-6.7)
- 9.1
- 10.1, 10.2 (model solution for 10.4)
- 11.1 (model solution for 11.1)
## Stan
- [Stan home page](http://mc-stan.org/)
- [Introductory article in Journal of Statistical Software](http://www.stat.columbia.edu/~gelman/research/published/Stan-paper-aug-2015.pdf)
- [Documentation](https://mc-stan.org/users/documentation/)
- [RStan installation](https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started)
- [PyStan installation](https://pystan.readthedocs.io/en/latest/getting_started.html)
- Basics of Bayesian inference and Stan, Jonah Gabry & Lauren Kennedy [Part 1](https://www.youtube.com/watch?v=ZRpo41l02KQ&t=8s&list=PLuwyh42iHquU4hUBQs20hkBsKSMrp6H0J&index=6) and [Part 2](https://www.youtube.com/watch?v=6cc4N1vT8pk&t=0s&list=PLuwyh42iHquU4hUBQs20hkBsKSMrp6H0J&index=7)
## Extra reading
- [Dicing with the unknown](https://doi.org/10.1111/j.1740-9713.2004.00050.x)
- [Origin of word Bayesian](http://jeff560.tripod.com/b.html)
- [Model selection](https://avehtari.github.io/modelselection/)
- [Cross-validation FAQ](https://avehtari.github.io/modelselection/CV-FAQ.html)
## Acknowledgements
The course material has been greatly improved by the previous and
current course assistants (in alphabetical order): Michael Riis
Andersen, Paul Bürkner, Akash Daka, Alejandro Catalina, Kunal Ghosh,
Meenal Jhajharia, Andrew Johnson, Noa Kallioinen, Joona Karjalainen,
David Kohns, Juho Kokkala, Leevi Lindgren, Yann McLatchie, Måns
Magnusson, Anton Mallasto, Janne Ojanen, Topi Paananen, Markus
Paasiniemi, Juho Piironen, Anna Riha, Jaakko Riihimäki, Niko Siccha,
Eero Siivola, Tuomas Sivula, Teemu Säilynoja, Jarno Vanhatalo.
The web page has been made with rmarkdown’s site generator.
[](https://bayes.club/@avehtari){rel="me"}