---
title: "Agenda"
bibliography: packages.bib
embed-resources: true
---
## Intro
The course will alternate between face-to-face presentations and exercises (15-20 minutes each) on the covered topics. After each exercise, the instructor will present the "official" solution to the challenge. This not only offers a comparative learning experience but also sets the stage for the next topic. Each new chapter begins in a collaborative environment with the official solution to the previous challenge already implemented, ensuring a smooth transition and a cohesive learning path.
At the end of each day, participants will have the option to tackle a longer hands-on exercise covering the day's topics; the solution will be discussed and presented in the first half-hour of the next class.
At the end of the course, we will suggest a comprehensive exercise to be done independently at home. We will also send a questionnaire to participants to assess the retention of the concepts presented in the course.
The whole course will be conducted using the R programming language [@R-base], and it will introduce the main **Tidyverse** [@tidyverse2019] ecosystem of R packages and its philosophy of uniform, composable, functional, and human-first design for doing data science. In particular, the following packages will be introduced and used during the course:
- `{tidyverse}` [@R-tidyverse] to activate the ecosystem in the R session
- RStudio projects, together with the `{renv}` [@R-renv] and `{here}` [@R-here] packages, for project management, reproducibility, and portability. Alongside them, the native *pipe* will be introduced, with only a brief mention of `{magrittr}` [@R-magrittr], to chain instructions in a way that is easy to write, read, and understand.
- `{ggplot2}` [@R-ggplot2] for graphics production
- `{rio}` [@R-rio] for data reading/import and writing/export
- `{janitor}` [@R-janitor] and `{tidyr}` [@R-tidyr] for data cleaning at import, and `{dplyr}` [@R-dplyr], `{forcats}` [@R-forcats], `{lubridate}` [@R-lubridate], and (optionally, at the end of the course) `{stringr}` [@R-stringr] and `{glue}` [@R-glue] for manipulating and managing data sets and specific data types after import.
- `{gtsummary}` [@R-gtsummary] to create summary tables for data and models and to include table data in a report's narrative text sections directly, i.e., without copy-pasting them by hand.[^1]
[^1]: The `{flextable}` [@R-flextable] package will not be introduced explicitly because `{gtsummary}` can now manage MS Word tables as well. However, the course will explain how to convert a `gtsummary` table to other formats, `flextable` included.
<!-- -->
- R Markdown [@R-rmarkdown], `{knitr}` [@R-knitr], and Quarto [@R-quarto] to create reproducible, dynamic documents such as reports, articles, slides, and much more.
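
As a minimal sketch of what "easy to write, read, and understand" means for the native pipe (available since R 4.1.0), the base-R snippet below computes the same value twice: once with nested calls and once with `|>`. The vector `x` and the chain of functions are illustrative assumptions, not course material.

```r
# Illustrative values only (hypothetical, not course data)
x <- c(1, 4, 9, 16)

# Nested calls: must be read from the inside out
nested <- round(mean(sqrt(x)), 1)

# Native pipe (R >= 4.1.0): reads left to right, one step per line
piped <- x |>
  sqrt() |>
  mean() |>
  round(1)

# Both forms yield the same result
identical(nested, piped)
```

The piped version lists the steps in the order they are applied, which is the readability argument usually made for pipes; `{magrittr}`'s `%>%` follows the same left-to-right idea.
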
```{r}
#| include: false
knitr::write_bib(c(.packages(), "tidyverse", "ggplot2", "readr", "readxl", "writexl", "haven", "rio", "tidyr", "janitor", "dplyr", "forcats", "glue", "magrittr", "lubridate", "stringr", "renv", "here", "gtsummary", "flextable", "rmarkdown", "knitr", "quarto"), "packages.bib")
```
## Agenda
[240' each day]{.aside}
::: panel-tabset
## Day 1
### \[60'\] Intros:
- Course's objectives and philosophy
- Ts/TAs presentation
- Course organization, teaching materials, and personalized assistance
- R/RStudio
- Posit RStudio Cloud IDE presentation and setup
### EXERCISE \[15'\]
::: {.callout-note appearance="minimal"}
\[10'\] `BREAK`{=html}
:::
### \[20'\] Basic R and RStudio:
- R basics
- Packages
- Tidyverse
### EXERCISE \[15'\]
### SOLUTION \[5'\]
### \[30'\] Infrastructures[^2]
- R projects
- `{here}`
- Files organization (dev/ folder)
### EXERCISE \[20'\]
### SOLUTION \[5'\]
::: {.callout-note appearance="minimal"}
\[10'\] `BREAK`{=html}
:::
### \[20'\] Import and cleaning
- `rio`[^3] [^4]
### EXERCISE \[15'\]
### SOLUTION \[5'\]
::: {.callout-note appearance="minimal"}
\[10'\] Recap & Assignments
:::
## Day 2
::: {.callout-note appearance="minimal"}
\[10'\] Recap & Solutions
:::
### \[40'\] Local environments
- `{renv}`
### EXERCISE \[25'\]
### SOLUTION \[5'\]
::: {.callout-note appearance="minimal"}
\[10'\] `BREAK`{=html}
:::
### \[30'\] R Data Structures
- Base data structures
- Subsetting and Extractions
### EXERCISE \[25'\]
### SOLUTION \[5'\]
### \[30'\] Pipe
- Pipe
### EXERCISE \[20'\]
### SOLUTION \[5'\]
::: {.callout-note appearance="minimal"}
\[10'\] `BREAK`{=html}
:::
### \[30'\] Cleaning (optional, if time allows)
- Headers, variables' names, and missing data
### EXERCISE \[25'\]
### SOLUTION \[5'\]
::: {.callout-note appearance="minimal"}
\[10'\] Recap & Assignments
:::
## Day 3
::: {.callout-note appearance="minimal"}
\[10'\] Recap & Solutions
:::
### \[30'\] Transform shape of dataset [^5] [^6]
- Tidy format
- `select` (all_of...)
- `filter`
### EXERCISE \[15'\]
### SOLUTION \[5'\]
::: {.callout-note appearance="minimal"}
\[10'\] BREAK
:::
### \[30'\] Transform -- manage dataset[^7]
- mutate (across) data contents row-by-row
- mutate and summarize data by groups
::: {.callout-note appearance="minimal"}
\[10'\] BREAK
:::
### EXERCISE \[25'\]
### SOLUTION \[5'\]
### \[50'\] Modeling (summary statistics tables)[^8]
- Summary tables
- descriptive statistics
### EXERCISE \[25'\]
### SOLUTION \[5'\]
- Cross-tables
- Saving tables
### EXERCISE \[20'\]
### SOLUTION \[5'\]
::: {.callout-note appearance="minimal"}
\[10'\] RECAP & assignments
:::
## Day 4
::: {.callout-note appearance="minimal"}
\[10'\] RECAP & solutions
:::
### \[85'\] Visualization [^9]
- Intro to `{ggplot2}`
- Tidy data, and the layered grammar of graphics.
- Base template (data, aesthetics, and geometries)
### EXERCISE \[20'\]
### SOLUTION \[5'\]
::: {.callout-note appearance="minimal"}
\[10'\] `BREAK`{=html}
:::
### \[10'\] Visualization [^10]
- Intro to `{ggplot2}`
- Scales, Facets, Labels, and Themes
- Saving plots
::: {.callout-note appearance="minimal"}
\[10'\] BREAK
:::
### \[30'\] Transform -- manage main types[^11]
- factors
- dates/datetimes
- (strings (optional if time allows))
<!-- ### 7. \[60'\] Communicate (tools: R markdown and Quarto)[^11] -->
<!-- - \[20'\] (Visual) Quarto document structure -->
<!-- - \[10'\] types & creating -->
<!-- - \[5'\] header -->
<!-- - \[5'\] content -->
<!-- - \[15'\] Base Markdown syntax -->
<!-- - \[20'\] chunks, inline code, and inline tables content -->
<!-- - \[5'\] chunk options -->
::: {.callout-note appearance="minimal"}
\[10'\] BREAK
:::
::: {.callout-note appearance="minimal"}
\[10'\] RECAP & final assignments
:::
::: {.callout-note appearance="minimal"}
\[25'\] finale w/ OVERALL RECAP, next-month assignment, support access instructions, and final survey
:::
:::
[^2]: pkgs: {renv}, {here}
[^3]: pkgs: {readr}, {readxl}, {writexl}, {haven}
[^4]: pkg: {rio}
[^5]: pkgs: {janitor}, {tidyr}
[^6]: pkgs: {dplyr}
[^7]: pkgs: {dplyr}
[^8]: pkgs: {dplyr}, {gtsummary}
[^9]: pkgs: {magrittr}, {ggplot2}
[^10]: pkgs: {magrittr}, {ggplot2}
[^11]: pkgs: {dplyr}, {lubridate}, {stringr}, {glue}
<!-- [^11]: The *Communicate* section is flexible. Topics from this section can be omitted to allow for more in-depth discussion on earlier sections, if needed, based on the time available. -->
## Bibliography